Utility of team indices for predicting end of season ranking in two national polls.
A team's success during any given year may be evaluated using a number of different measures. Many teams consider a number one ranking in post-season polls to be the only hallmark of a successful season. For others, just being among the "Top 25" is enough to create a sense of validity for the season's efforts. For many teams as well as fans, coaches' polls provide the sport and its faithful with a sense of empirical credibility (Thomas, 1991). Since the Associated Press introduced national polls in 1949, 28 number one ranked teams have either won the national championship or finished as runner-up (Vitale & Douchant, 1994). Yet, national polls represent only one of many benchmarks used by teams to measure success. Many Division 1 coaches even consider polls to be more of an anathema than an actual measure of achievement. For these coaches and programs, other indicators such as team statistics (e.g., field goal percentage, free-throw percentage, number of rebounds) offer a much more valid and reliable means of monitoring progress throughout the competitive season.
Several aspects of the game have been offered as predictors of a team's performance (Cooper, DeNeve, & Mosteller, 1992; Ittenbach, Kloos, & Etheridge, 1992; Merskey, 1987; Pim, 1986; Wood, 1992). However, apart from a few studies utilizing probability models in football (Stern, 1991; Thompson, 1975), basketball (Schwertman, McCready, & Howard, 1991), or baseball (Albright, 1993), a general lack of statistical scrutiny of these statistics and polls exists in the literature (Ittenbach et al., 1992; Stefani, 1977). Given the widespread appeal of college basketball today, the acceptance of media polls as a criterion for success, and the perceived importance of team statistics for most Division 1 basketball programs, a systematic analysis of team statistics as they relate to final season rankings seems long overdue. The purpose of the present study, then, is twofold: first, to determine if team statistics can be used to predict final season rankings in two major media polls using the population of teams participating in the 1991 NCAA Division 1 Men's Basketball Tournament and, second, to identify the variables most predictive of those rankings.
Statistics for the 64 teams competing in the 1991 NCAA Division 1 Men's Basketball Tournament were gathered and analyzed. Indices of team performance, as reported in the NCAA Tournament section of the March 11 and March 14 issues of USA Today ("Men's Division," 1991; "NCAA by the Numbers," 1991), composed the data used in the secondary data analysis. Six predictor variables were used in the analyses: points per game, points allowed, field-goal percentage, number of free-throws, three-point field-goal percentage, and number of rebounds. Percentages were converted to whole number values (× 100) for use in the parametric analyses. USA Today/CNN and United Press International (UPI) final season cumulative point total rankings served as the two criterion variables. These two polls were selected for this study because all rankings are based on the perceptions of college basketball coaches.
Three different sets of analyses were conducted in this study. First, descriptive statistics were computed for each of the six predictor variables. Second, two full-model regression analyses were computed and tested for significance. Bonferroni's (Dunn, 1961) correction for multiple comparisons was used to maintain an experiment-wise error rate at the .10 level across both sets of analyses ($\alpha_{ADJ} < .05$). Third, beta weights were tested for statistical significance within the limits of the hypothesis-wise error rates ($\alpha_{ADJ} < .008$) and evaluated alongside their respective structure coefficients to identify the most stable and useful prediction models.
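The adjusted alpha levels quoted above can be reproduced with simple arithmetic. The sketch below assumes the usual Bonferroni rule of dividing the error rate by the number of tests, and it assumes that the .008 level comes from dividing the per-model .05 level across the six beta weights; the text does not spell out that second step. The function name is illustrative only.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test alpha under a Bonferroni correction (family alpha / number of tests)."""
    return family_alpha / n_tests

# Values from the text: .10 experiment-wise error rate, two regression models.
alpha_per_model = bonferroni_alpha(0.10, 2)

# Assumption: each model's alpha is split across its six beta weights.
alpha_per_beta = bonferroni_alpha(alpha_per_model, 6)

print(f"{alpha_per_model:.3f}")  # 0.050
print(f"{alpha_per_beta:.3f}")   # 0.008
```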
Descriptive statistics were computed for each of the six predictor variables and are presented in Table 1. Pearson correlations were not tested for statistical significance, only calculated for descriptive purposes.
[TABULAR DATA FOR TABLE 1 OMITTED]
Null hypothesis one, that no statistically significant relationship existed between the six predictor variables and the USA Today/CNN final season cumulative point total scale, was rejected, $F(6, 57) = 4.66$, $p < .01$. Consequently, a statistically significant relationship exists between the six scales, taken together, and the USA Today/CNN cumulative point total rankings ($R = .57$, $R^2 = .33$). Thirty-three percent of the variance in the criterion variable is accounted for by the variables in the regression model, leaving 67% of the variance unaccounted for, or accounted for by other influences.
Closer examination of the equation's beta weights and structure coefficients suggested that while two beta weights were found to be statistically significant (points per game and points allowed) using an adjusted alpha, only one variable, points per game, was actually considered to be a substantively strong predictor of final season ranked score performance (see Table 2).
[TABULAR DATA FOR TABLE 2 OMITTED]
Null hypothesis two, that no statistically significant relationship existed between the six predictor variables and the UPI final season ranked score results, was also rejected, $F(6, 57) = 9.26$, $p < .01$. Consequently, a statistically significant relationship exists between the six scales, taken together, and the UPI cumulative point total rankings ($R = .70$, $R^2 = .49$). Forty-nine percent of the variance in the criterion scale was accounted for by the variables in the regression model, leaving 51% of the variance unaccounted for, or accounted for by factors other than those contained in this study.
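As a rough consistency check on the two overall tests, the F statistic of a full-model regression can be recovered from its squared multiple correlation, the number of cases, and the number of predictors. The sketch below uses the standard relationship F = (R²/k) / ((1 − R²)/(n − k − 1)) with the rounded R² values from the text (n = 64 teams, k = 6 predictors). The function name is illustrative, and small differences from the reported F values are to be expected because R² is given to only two decimals.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic for a regression with k predictors fit to n cases."""
    df_model = k
    df_error = n - k - 1
    f_value = (r2 / df_model) / ((1.0 - r2) / df_error)
    return f_value, df_model, df_error

# Rounded R-squared values reported for the two polls.
f_usa, df1, df2 = f_from_r2(0.33, n=64, k=6)
f_upi, _, _ = f_from_r2(0.49, n=64, k=6)

print(df1, df2)                           # 6 57
print(round(f_usa, 2), round(f_upi, 2))   # 4.68 9.13
```

These values are close to the reported 4.66 and 9.26, and the degrees of freedom match the (6, 57) given for both tests.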
Closer examination of the equation's beta weights and structure coefficients suggested that while three beta weights were found to be statistically significant (points per game, points allowed, number of field goals) using an adjusted alpha, the three were not considered to comprise the best overall set of predictor variables. A fourth variable, number of rebounds, was considered to be a better predictor of UPI final season cumulative point total ranking than the variable points allowed. This decision was made primarily on the strength of the structure coefficient (see Table 2).
The hypothesis that the six indices (points per game, points allowed, number of field-goals, number of free-throws, number of three-point goals, and number of rebounds) would be useful as predictors of end of season standing in two national media polls was supported for both models. With multiple R-values of .57 and .70, and multiple $R^2$ values of .33 and .49, respectively, it can be assumed that both models fit their criterion scales reasonably well. Following is a brief discussion of the predictors for each model, individually and in combination with the other.
When one considers the USA Today/CNN scale specifically, only one variable, points per game, emerged as a statistically and substantively strong predictor. The strength of this variable is underscored when one considers that it alone is capable of generating almost half of the predictive power of all of the remaining variables combined ($.67^2 = .45$). The only other variable for which a beta weight was found to be statistically significant had a structure coefficient that suggested very little in the way of a unique contribution to the model (-.12). That is, much of its apparent strength was coming at the expense of one or more other variables (e.g., points per game). Support for this premise also comes from the moderately strong bivariate r-values between points allowed and the variables points per game (.60) and number of rebounds (.53), correlations for which collinearity may be an issue.
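For readers unfamiliar with structure coefficients, a minimal sketch follows. It assumes the usual definition: the correlation between each predictor and the scores predicted by the regression model, so that the squared coefficient expresses how much of the explained variance a predictor could reproduce by itself. The data are synthetic stand-ins, not the study's data, and the function name is illustrative.

```python
import numpy as np

def structure_coefficients(X, y):
    """Correlate each predictor with the model's predicted scores (y-hat)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])      # add intercept column
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)  # ordinary least squares
    y_hat = design @ coefs
    return np.array([np.corrcoef(X[:, j], y_hat)[0, 1] for j in range(X.shape[1])])

# Synthetic stand-in data (NOT the study's data): two correlated predictors.
rng = np.random.default_rng(1)
x1 = rng.normal(size=64)
x2 = 0.6 * x1 + 0.8 * rng.normal(size=64)
y = x1 - 0.5 * x2 + rng.normal(size=64)

r_s = structure_coefficients(np.column_stack([x1, x2]), y)
print(np.round(r_s, 2))

# Squaring the .67 structure coefficient quoted in the text:
print(round(0.67 ** 2, 2))  # 0.45
```

A predictor can carry a sizable beta weight and still have a small structure coefficient when it is correlated with other predictors, which is the pattern described above for points allowed.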
It is not surprising that points per game is the single best predictor of performance across an entire season. If basketball is a game of point production, then the more a given team scores, the greater the likelihood that the team will have a winning season. The fact that points allowed did not emerge as a substantive predictor, despite its statistical significance, has some support in the notion that while many teams do indeed try to hold their opponents to a minimum number of points, many teams care less about the number of points scored by their opponent than they do about their own production. In many respects, it all goes back to the notion that it doesn't matter how many points one's opponent scores as long as "our team" scores more.
When one considers the UPI scale, three variables (points per game, points allowed, and number of rebounds) emerged as substantively strong predictor variables. While the third variable, number of rebounds, failed to yield a statistically significant beta weight upon formal testing (p = .05), a standardized beta weight of .29 in combination with a structure coefficient of .36 suggested the possibility of a third stable predictor. It was certainly held to be as strong a predictor as the variable points allowed, for which a statistically significant beta weight (-.46) was countered by an extremely weak structure coefficient (.11). In the second model (UPI), nearly half of the variance in criterion scores is accounted for by the predictor variables, whereas only one-third is accounted for in the first model. Perhaps the presence of points allowed and number of rebounds as notable predictors is derived from their relationship to points per game (r = .60 and r = .42, respectively). After all, coaching axioms abound extolling the virtues of "controlling the boards." Adeptness at defensive rebounding will invariably decrease an opposing team's attempts-at-goal, while success at offensive rebounding will increase one's own opportunities to score.
When patterns of results are examined for both regression models collectively, it becomes apparent that one variable appears most salient: points per game. In both polls, USA Today/CNN and UPI, points per game was considered to be both statistically and substantively significant. In both cases, its structure coefficient accounted for nearly half of the model's predictive ability. The UPI model has the added advantage of having a second strong contributor (predictor), field-goal percentage, with a statistically significant standardized regression weight (.36) and a substantively strong structure coefficient (.68). The implication here is that all three variables, points per game, number of field goals, and number of rebounds, are working together to account for the substantially higher F- and multiple R-values of the UPI equation.
While both criterion variables consist of coaches' ratings that in all likelihood share a great deal of overlap, it is somewhat surprising that the measures of statistical significance across the two models are not more consistent with one another. The variance unaccounted for by the equations may reflect other, non-quantitative factors which come into play in the selection of the top-tier teams. The role of personal favorites and personal biases of voters cannot be discounted. That is, the respect coaches have for other coaches and programs may play a pronounced role in weekly or end of season rankings. This factor was not among the six indices. This human factor is likely the source of some variation not addressed by the objective reporting of team statistics. Perhaps voters are attracted to the sensational big wins of underdog teams. The assumption that voters may notice teams that consistently have relatively high scoring games might also help to explain this phenomenon. Another explanation for the prominence of points per game as an indicator of team success may be the evolution of the game from a slower, more deliberate contest into the fast paced, high scoring events witnessed in recent years. Those teams which pursue the faster paced game inevitably have more opportunities to score and, arguably, are likely to score more points than teams whose strategy is to decrease the tempo of the game. Also of interest is the number of points scored as a result of turnovers caused by an increased emphasis on defense and transition scoring. The addition of the shot clock and the growing number of coaches who encourage up-tempo games have also contributed to a faster paced game overall.
Several limitations should be noted. First, without a second data set on which to test these models, it is unclear how stable the measures of statistical significance and effect size are. That is, regression models built on samples of limited size will nearly always fit their observed data better than they will fit other, similar data sets. Some shrinkage may be expected when using these equations with other, related data sets. Second, it was unclear why the USA Today/CNN final season distribution resulted in less prominent measures of statistical and practical significance than the UPI distribution, especially since both polls consist of ratings of Division 1 coaches (presumably the same or similar coaches). Non-quantitative factors have been implicated as one possibility. A second possibility lies in the realization that the USA Today/CNN rankings are obtained after the tournament and may be influenced more heavily by impressions gathered during the tournament than by impressions gathered throughout the regular season. According to Ittenbach et al. (1992), this extra time may induce more error than usable information. A third limitation arises from the rather limited range of this population. All 64 teams have distinguished themselves in one major respect: they are proven winners, at least for the season in question. Because measures of association and correlation are sensitive to restricted ranges, it would be interesting to see these tests conducted on samples of data representing more than just the top tier of Division 1 Men's Basketball programs. Other samples might include additional NCAA Men's Divisions, NCAA Women's Divisions, the National Basketball Association, and perhaps even high school programs affiliated with the National Federation of high school sports. It is very likely that these models and predictors would both strengthen and stabilize beyond the values presented here.
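To give a sense of how much shrinkage might be expected, one common approximation is the adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − k − 1). This formula is not used in the study itself; it is applied here only as an illustrative assumption, with the reported R² values, n = 64 teams, and k = 6 predictors.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors fit to n cases."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Reported R-squared values for the USA Today/CNN and UPI models.
adj_usa = adjusted_r2(0.33, n=64, k=6)
adj_upi = adjusted_r2(0.49, n=64, k=6)

print(round(adj_usa, 2))  # 0.26
print(round(adj_upi, 2))  # 0.44
```

Under this approximation, the two models would be expected to account for roughly 26% and 44% of the variance in comparable new data, rather than 33% and 49%.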
In summary, an existing data set of statistics for the 64 teams competing in the 1991 NCAA Division 1 Men's Basketball Tournament was analyzed using two full-model regression analyses to determine if team statistics could be used to predict end of season performance as determined by two national media polls. The variable points per game emerged as a substantive predictor for the USA Today/CNN cumulative point scale. For UPI rankings, points per game was joined by number of field goals and number of rebounds as major contributors.
Albright, S. C. (1993). A statistical analysis of hitting streaks in baseball. Journal of the American Statistical Association, 88(424), 1175-1196.
Cooper, H., DeNeve, K. M., & Mosteller, F. (1992). Predicting professional sports game outcomes from intermediate game scores. Chance, 5(3-4), 18-22.
Deckard, L. (1991, June). $1 bill in television revenue causes stir among colleges. Amusement Business, 103(25), 12-14.
Dunn, O. J. (1961). Multiple comparisons among means. Journal of the American Statistical Association, 56, 52-64.
Ittenbach, R. F., Kloos, E. T., & Etheridge, J. D. (1992). Team performance and national polls: The 1990-1991 NCAA division 1 basketball season. Perceptual and Motor Skills, 74, 707-710.
Men's Div. 1 computer ratings. (1991, March 11). USA Today, 9(124), p. 4E.
Merskey, M. J. (1987, January). Coaching and teaching the free-throw shooter. The Basketball Clinic, pp. 8-11.
NCAA by the numbers. (1991, March 14). USA Today, 9(127), p. 8E.
Pim, R. (1986, November). The effect of personal fouls on winning and losing basketball games. The Coaching Clinic, pp. 14-16.
Schwertman, N. C., McCready, T. A., & Howard, L. (1991, February). Probability models for the NCAA regional basketball tournaments. The American Statistician, 45(1), 35-38.
Stefani, R. T. (1977). Football and basketball predictions using least squares. IEEE Transactions on Systems, Man, and Cybernetics, 7, 117-121.
Stern, H. (1991). On the probability of winning a football game. The American Statistician, 45(3), 179-183.
Thomas, R. M. (1991). The top 64 teams? That depends on who's counting. The New York Times, 140(48,538), p. B9.
Thompson, B. (1990). Don't forget the structure coefficients. Measurement and Evaluation in Counseling and Development, 22, 178-180.
Thompson, B., & Borrello, G. (1985). The importance of structure coefficients in regression research. Educational and Psychological Measurement, 45, 203-209.
Thompson, M. (1975). On any given Sunday: Fair competitor ordering with maximum likelihood methods. Journal of the American Statistical Association, 70(351), 536-541.
Vitale, D., & Douchant, M. (1994). Tourney time: It's awesome baby! Indianapolis, IN: Masters.
Wood, G. (1992). Predicting outcomes: Sports and stocks. Journal of Gambling Studies, 8(2), 201-222.
All correspondence regarding this manuscript should be addressed to Richard F. Ittenbach, Room 200, School of Education, The University of Mississippi, University, MS 38677. We wish to extend a special note of thanks to Roscoe Boyer for his help in the review and preparation of this manuscript.
Author: Ittenbach, Richard F.; Esters, Irvin G.
Publication: Journal of Sport Behavior
Date: Sep 1, 1995