# Adjusting Cesarean delivery rates for case mix

Cesarean delivery rates are one of the first measures used to judge hospital and health plan performance (National Committee for Quality Assurance 1995). There are good substantive reasons for reporting on cesareans: lower cesarean rates save money and might also reflect better quality. United States rates have increased fourfold since 1970 and are double those of many other developed countries with lower infant mortality rates (Notzon, Cnattingius, Bergsjo, et al. 1994). Both the government (U.S. Department of Health and Human Services 1990) and private groups (ACOG Committee 1994) are encouraging reductions in the rates.

Further, cesarean rates are attractive measures for methodological reasons. They are common enough to make good statistical comparisons of hospital and even physician style feasible (Burns, Geller, and Wholey 1995). The rate is the number of cesarean deliveries divided by the number of births, which is given by the number of birth certificates. So, in contrast to other operations, the denominator of potential operands is easily determined.

Any comparison of measures of utilization or outcomes, to be fair, requires adequate adjustment for case mix (Iezzoni 1994). The search for cesarean adjusters is facilitated by previous studies that uncovered many factors related to rates: age, parity, and birth weight (Parrish, Holt, Easterling, et al. 1994; Tussing and Wojtowycz 1992; Williams and Wroblewski 1991); clinical diagnoses (Hueston 1994); and anthropometric factors such as maternal size and weight gain and the size and sex of the infant (Witter, Caulfield, and Stoltzfus 1995). Other studies have found effects of nonclinical factors such as race and ethnicity (Braveman et al. 1995), patient insurance status, and hospital and physician characteristics (Burns, Geller, and Wholey 1995; Stafford 1991). The impact of nonclinical factors has implications for policy and quality improvement, but before taking action we need to adjust for clinical differences that should be related to cesareans.

This article is a case study of the problems remaining in developing a risk adjustment system using top-quality administrative data. In it, we estimate models that predict the probability of cesarean for deliveries in Washington state during 1989 and 1990, based solely on clinical factors. The clinical models range from simple (one to four factors) to complex (those requiring detailed clinical data). All adjustments are based on selected elements from administrative data (merged birth certificate and hospital discharge data) that were judged to be valid, objective, and independent of delivery management decisions.

METHODS

SOURCES OF DATA AND MATCHING

We obtained the Washington State Birth Event Record Data for calendar years 1989 and 1990. The Washington data were the most detailed available at the time, and have benefited from extensive study by researchers at the University of Washington (Jones, LoGerfo, Shy, et al. 1993). The State Department of Health created these data by matching all birth certificates with mothers' and babies' hospital discharge records. For the two years, there were 133,589 three-way matches, plus 23,722 two-way matches or unmatched single records, which we excluded. Systematic reasons for not matching include multiple births (matching the birth record to the correct twin's hospital record is hard) and home births. We found no differences between matched and unmatched records in the distributions of clinical or socioeconomic maternal characteristics or in method of delivery, but we found slightly more deaths in the unmatched birth records than in the matched ones.

Of the three-way matches, we excluded 1,067 multiple births, an additional 6,120 babies under 2,500 grams weight, and 32 cases with extensive missing data, leaving 126,370 singleton non-low-birthweight events for analysis. Multiple births and low-birthweight infants are analyzed elsewhere because the clinical issues involving their delivery and outcomes are distinct.

COMBINING HOSPITAL DISCHARGE AND BIRTH CERTIFICATE INFORMATION

Information on many variables is present in both the hospital discharge record and the birth certificate. We cross-checked and combined information from the two sources to create our analysis variables, including the three key variables: current cesarean delivery, prior cesarean delivery, and breech presentation.

Cesarean delivery is indicated in the hospital discharge record by ICD-9 codes of 74.xx, excepting 74.3x and 74.91, for 25,263 deliveries. In the birth certificate it is indicated by delivery method listed as primary or repeat cesarean in 21,669 births. The method of delivery is missing in 11 percent of birth certificates, of which 70 percent are in 7 of the 80 hospitals with deliveries in Washington. The mothers with missing method of delivery had cesarean rates similar to those of the other mothers based on the discharge record, but 93 percent are also missing birth certificate risk factor data. Both sources agree on cesarean in 21,195 cases (kappa = .96 on the 112,350 births with data); because most coding errors are omissions and length of stay was close to four days for cases with only one source indicating a cesarean, we coded the birth as a cesarean if either source indicated it was (25,581 cases).

Prior cesarean is indicated in the hospital discharge record by ICD-9 codes of 654.2x, "uterine scar from previous surgery." We dropped the 534 such cases that were first births, assuming that these were non-cesarean uterine scars, leaving 12,791 deliveries. On the birth certificate, delivery method was listed as repeat cesarean or vaginal birth after cesarean (VBAC) in 12,207 cases. Both sources agree on prior cesarean in 10,591 cases (kappa = .831); we coded prior cesarean if either source indicated it was (14,407 cases).

Breech, other malpresentation, or malposition is indicated in the discharge record by ICD-9 diagnosis codes of 652.xx or procedure codes 72.5x for 7,491 deliveries. On the birth certificate, breech/malpresentation was listed as a complication of labor and/or delivery for 3,844 cases. Both sources agree on breech/malpresentation in 3,086 cases (kappa = .525); we coded breech if either source indicated it was (8,249 cases, of which 962 were also prior cesareans).
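The either-source rule and the agreement statistics can be checked from the counts reported above. The sketch below computes Cohen's kappa from a two-by-two table; the function names are illustrative, and the denominator (all 126,370 analysis deliveries) is an assumption, since the article does not state it for these two variables. Under that assumption the reported counts reproduce the reported kappas.

```python
def cohen_kappa(both, only_a, only_b, neither):
    """Cohen's kappa for two binary sources, from the four cells of a 2x2 table."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n
    # Chance agreement from each source's marginal "yes" and "no" rates.
    yes_a = (both + only_a) / n
    yes_b = (both + only_b) / n
    expected = yes_a * yes_b + (1 - yes_a) * (1 - yes_b)
    return (observed - expected) / (1 - expected)


def agreement_from_counts(n_total, n_a, n_b, n_both):
    """Return (count flagged by either source, kappa) from marginal counts."""
    n_either = n_a + n_b - n_both
    kappa = cohen_kappa(n_both, n_a - n_both, n_b - n_both, n_total - n_either)
    return n_either, kappa


N_ANALYSIS = 126_370  # assumption: kappa taken over all analysis deliveries

# Discharge count, birth certificate count, and count where both agree.
prior_either, prior_kappa = agreement_from_counts(N_ANALYSIS, 12_791, 12_207, 10_591)
breech_either, breech_kappa = agreement_from_counts(N_ANALYSIS, 7_491, 3_844, 3_086)

print(prior_either, round(prior_kappa, 3))    # 14407 0.831
print(breech_either, round(breech_kappa, 3))  # 8249 0.525
```

The same check is not applied to current cesarean here, because method of delivery is missing from 11 percent of birth certificates and the article computes that kappa on the 112,350 births with data.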

Some variables related to cesarean rates were purposely omitted. We did not use race or ethnicity in the absence of clinical evidence that these variables should affect cesarean rates. Also, we wanted to be able to study the effects of race after adjustment for clinical factors, and so could not include it. For comparisons of provider style, we should not adjust for management decisions that lead to cesareans or for excuses for or outcomes of cesareans. So we did not adjust for utilization variables such as epidural or tocolysis because these accompany management decisions. In comparing hospital cesarean policies, one would not want to adjust for trial of labor rates. So we did not use amnionitis or gestational age as a predictor for mothers with prior cesareans or breech infants: amnionitis is very rare in elective cesareans (0.8 percent in these data), and so is an indicator for trial of labor. Gestational age was slightly less with cesareans, but we judged that the relationship mainly results from decisions to schedule cesareans.

For other potential clinical risk factors, we tested the validity of data from each source in four ways as shown below: whether coding practices are consistent across hospitals, whether the variable was unequivocally a risk factor as opposed to an outcome, whether overall prevalence was consistent with clinical intuition, and whether the recorded variable was associated with its known outcomes.

We did not use two important diagnoses that arise after the decision to undertake a trial of labor - dystocia and fetal distress (Stafford 1991). These diagnoses are not consistently determined or recorded; for example, fetal distress is less likely to be recorded in hospitals where electronic fetal monitoring is used selectively. Also, some providers may be more likely to diagnose or record them when a cesarean has been done. Analyses of variance for diagnoses show that recorded fetal distress and dystocia cluster more in Washington hospitals than do prior cesarean or breech, and their recorded average hospital rates are much more variable than could be due to chance.

Other potential predictors were dropped as a result of the tests: blood transfusions and other excessive bleeding (the timing could not be assessed); rubella test positive (we wanted "an infection during the pregnancy," but the high prevalence in some hospitals suggested that rubella+ was coded for all women with immunity); and mother's anemia (coded implausibly often in some hospitals on both birth certificates and discharge records, and we could not distinguish comorbidities from complications). Premature labor was coded less reliably than weeks gestation. Two dropped variables were based on just one source: Rh-sensitization (too prevalent in the mother's discharge data and not associated with either cesarean rates or outcomes) and the discharge diagnosis of "post dates leading to fetal problems" (coded in some hospitals but not in others).

Maternal hypertension was an example of a variable passing all tests. It is a risk factor, not a complication. Only 1.5 percent of the variance in the measure was between hospitals, indicating that coding was consistent. Consistent with clinical intuition about its prevalence, it was coded only in birth certificates in 2.3 percent of cases, only in hospital discharge records in 1.5 percent of cases, and in both in 2.2 percent of cases. In these three categories, cesarean rates were between 30 and 32 percent, whereas in the other 94 percent of records, rates were 20 percent. Such results reinforced our decision to treat conditions as present if mentioned in either source.

The other variables for which we treated the condition as present if either the hospital discharge record or the birth certificate mentioned it were: amnionitis/fever/prolonged rupture of membranes, pre-pregnancy diabetes, gestational diabetes, active herpes, hypertension, other fetal conditions, and other maternal risk factors (see footnotes to Table 1 for lists).

[TABULAR DATA FOR TABLE 1 OMITTED]

STATISTICAL METHODS

Our exploratory analysis of the risk of receiving a cesarean in terms of the mother's clinical characteristics used a 25 percent random sample. With so many cases, we did not need to use 50 percent of them as a training sample. All of our model specifications were developed at this stage with respecification of the model or variables freely allowed. For the final analyses reported here, we subsequently reestimated the risk of cesarean model using the remaining 75 percent of the data, and used that model to predict probabilities of cesarean for all the data. No respecifications of variables or models were allowed during the final analyses.

INDEPENDENT VARIABLES

Following clinical advice, we split mothers hierarchically into four categories: (1) prior cesarean, (2) breech/malpresentation in women without a prior cesarean, (3) first births without breech/malpresentation, and (4) all other deliveries.
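The hierarchy means that each delivery falls into the first category whose condition it meets, so a breech delivery to a mother with a prior cesarean counts as prior cesarean. A minimal sketch, with hypothetical argument names and labels:

```python
def clinical_category(prior_cesarean: bool, breech: bool, first_birth: bool) -> str:
    """Assign one of the four major clinical categories, checked in priority order."""
    if prior_cesarean:
        return "prior cesarean"
    if breech:
        return "breech/malpresentation"
    if first_birth:
        return "first birth"
    return "other"


# A breech presentation after a prior cesarean stays in the first category.
print(clinical_category(prior_cesarean=True, breech=True, first_birth=False))
```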

In our exploratory (25 percent) analysis, we developed models using the other clinical variables to explain the occurrence of cesarean in each clinical category. Our aim was to develop parsimonious models that fit the data well. The clinical variables ultimately chosen for inclusion are shown with their means in Table 1. We tried other variables, including smoking, alcohol use, number of prenatal visits, growth retardation, drug abuse, amniocentesis, and history of herpes, but none were significant in the 25 percent sample in the equations for any of the four clinical categories.

Variables were scaled based on tabulations of mean cesarean rates by level using the 25 percent sample. For example, the high cesarean rates for first births, lower and similar rates for second through fourth births, and slightly lower rates for higher order births led us to specify two dummy variables: one for first birth and one for second through fourth birth. Cesarean rates were higher when the interval from previous delivery was either long or short. We specified a dummy variable for intervals outside the 1.5- to 4.0-year range (set to 0 for first births or if the interval from a previous birth was unknown). Cesarean rates had a nearly linear relationship with mother's age except for very old and very young mothers. We specified age as a linear spline with hinge points at ages 16 and 39. Compared to medium-weight babies, cesarean rates were sharply higher for very large babies (over 4,750 grams), and somewhat higher for small babies (near our 2,500 gram cutoff). The relationships were well fit by quadratic functions of birth weight. Gestational age enters as a linear term and as an indicator variable for post-date deliveries (gestational age 42 weeks or more). Changes in clinical practice over the brief (two-year) period spanned by our data appeared linear, so we specified admission in years, centered on January 1, 1990.
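The scaling choices described above can be sketched as a function that turns one delivery's raw values into regressors. The variable names, the hinge-term form of the spline (one of several equivalent parameterizations), and the rescaling of birth weight to kilograms are assumptions made for illustration; the cutoffs are those given in the text.

```python
from datetime import date

CENTER = date(1990, 1, 1)  # centering date stated in the text


def age_spline(age, knots=(16.0, 39.0)):
    """Linear spline basis: the age itself plus one hinge term per knot."""
    return [float(age)] + [max(age - knot, 0.0) for knot in knots]


def build_features(age, birth_order, interval_years, weight_g, gest_weeks, admitted):
    """Return a dict of regressors for one delivery (names are illustrative)."""
    age_lin, age_over_16, age_over_39 = age_spline(age)
    # The interval dummy is 0 for first births or unknown intervals, as in the text.
    odd_interval = (
        birth_order > 1
        and interval_years is not None
        and not (1.5 <= interval_years <= 4.0)
    )
    weight_kg = weight_g / 1000.0  # rescaling to kilograms is an assumption
    return {
        "first_birth": int(birth_order == 1),
        "birth_2_to_4": int(2 <= birth_order <= 4),
        "interval_outside_range": int(odd_interval),
        "age": age_lin,
        "age_over_16": age_over_16,
        "age_over_39": age_over_39,
        "weight_kg": weight_kg,
        "weight_kg_sq": weight_kg ** 2,  # quadratic term in birth weight
        "gest_weeks": float(gest_weeks),
        "post_date": int(gest_weeks >= 42),
        "admit_years": (admitted - CENTER).days / 365.25,
    }


# Hypothetical delivery: third birth, long interval, large baby, post-date.
example = build_features(41, 3, 5.0, 4800, 42, date(1990, 7, 1))
print(example)
```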

We started with simple models and elaborated them as necessary to obtain a satisfactory fit as judged by the Hosmer-Lemeshow test (Lemeshow and Hosmer 1982). This test involves fitting a trial model, then ordering the observations by the predicted probability of cesarean and dividing them into ten equal-sized groups. We then test whether observed and predicted (expected) number of cesareans in each group are significantly different, using the chi-squared statistic. If they are different, we find a better-fitting model.
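The grouping and comparison steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the statistic sums the Pearson terms for cesareans and non-cesareans in each group, and the p-value uses the conventional degrees of freedom (number of groups minus 2), which the text does not specify. The helper `chi2_sf_even_df` is a hypothetical name for the closed-form chi-squared tail probability that holds when the degrees of freedom are even.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-squared upper-tail probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires positive even degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


def hosmer_lemeshow(y, p, groups=10):
    """Return (chi2, df, p_value) for 0/1 outcomes y and predicted probabilities p."""
    order = sorted(range(len(p)), key=lambda i: p[i])  # order by predicted risk
    n = len(order)
    chi2 = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]  # near-equal group sizes
        size = len(idx)
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        # Pearson terms for events and for non-events in this group.
        chi2 += (observed - expected) ** 2 / expected
        chi2 += (observed - expected) ** 2 / (size - expected)
    df = groups - 2  # conventional choice; an assumption here
    return chi2, df, chi2_sf_even_df(chi2, df)


# Toy data: every pair of deliveries has predicted risk 0.5 and one cesarean,
# so observed equals expected in each group and the fit is perfect.
chi2, df, p_value = hosmer_lemeshow([0, 1] * 10, [0.5] * 20)
print(chi2, df, p_value)
```

A small p-value signals that observed and expected counts differ by more than chance would allow, which in the procedure above prompts a search for a better-fitting model.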

We introduced the variables in Table 1 as part of our attempt to find well-fitting models. There were strong interactions between the categories and the other independent variables, so we found it necessary to fit separate equations for each of the four major clinical categories. Even then, some of the models estimated using logistic regression fit poorly, but using probit regression to estimate these models led to satisfactory fits.

The final equations included those variables that were significant in the 25 percent sample in the equations for any of the four major clinical categories. We summarize the fit of the final risk-of-cesarean models in several ways for each major clinical group. We display the observed and expected cesarean rates in each of the ten Hosmer-Lemeshow groups, together with the p-value for the differences between observed and expected. We also calculate R-squared on a natural probability scale and the C-statistic, which is the area under the ROC curve (Iezzoni et al. 1994: ch. 9).

CALCULATING PROVIDER CASE MIX FOR CESAREAN

We used the equations that describe the clinical risk of having a cesarean to calculate the expected probability of cesarean for each delivery. These expected probabilities, averaged over the deliveries for each hospital, are a measure of each hospital's obstetrical case mix.
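As a sketch of this averaging step, the function below takes one record per delivery and returns each hospital's case mix (mean expected probability) alongside its observed rate. The record layout and field names are hypothetical.

```python
from collections import defaultdict


def hospital_case_mix(deliveries):
    """deliveries: iterable of (hospital_id, expected_prob, had_cesarean 0/1)."""
    totals = defaultdict(lambda: [0, 0.0, 0])  # count, sum expected, sum observed
    for hospital, expected, cesarean in deliveries:
        entry = totals[hospital]
        entry[0] += 1
        entry[1] += expected
        entry[2] += cesarean
    return {
        hospital: {"n": n, "case_mix": exp_sum / n, "observed_rate": obs_sum / n}
        for hospital, (n, exp_sum, obs_sum) in totals.items()
    }


# Hypothetical records for two hospitals.
sample = [("A", 0.10, 0), ("A", 0.30, 1), ("B", 0.70, 1)]
summary = hospital_case_mix(sample)
print(summary)
```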

Going from simpler to more complicated models improves hospital adjustment only if the added factors have a strong impact on individual chances of getting a cesarean and if their prevalence varies from one hospital to another. We express the strength of simpler adjustments by the fraction of variance in the average full-clinical-model case mix in hospitals they explain. This is computed as the R-squared of regressions of average full clinical case mix on the average case mix from simpler adjustments, constrained to have slope 1 and intercept 0 and weighted by the number of deliveries performed.

[TABULAR DATA FOR TABLE 2 OMITTED]
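With slope fixed at 1 and intercept at 0, nothing is estimated, so the residual for each hospital is simply the full-model case mix minus the simpler case mix. One plausible reading of the statistic, offered as an assumption because the article does not write out the formula, is one minus the delivery-weighted residual sum of squares over the delivery-weighted variance of the full-model case mix:

```python
def constrained_r_squared(full, simple, weights):
    """Weighted R-squared of `full` on `simple` with slope 1 and intercept 0."""
    total_w = sum(weights)
    mean_full = sum(w * y for w, y in zip(weights, full)) / total_w
    # Residuals are full minus simple because no coefficients are fitted.
    sse = sum(w * (y - x) ** 2 for w, y, x in zip(weights, full, simple))
    sst = sum(w * (y - mean_full) ** 2 for w, y in zip(weights, full))
    return 1.0 - sse / sst


# Hypothetical hospital-level case mixes and delivery counts.
full_mix = [0.18, 0.22, 0.25]
simple_mix = [0.19, 0.21, 0.24]
n_deliveries = [500, 2000, 1500]
print(constrained_r_squared(full_mix, simple_mix, n_deliveries))
```

Unlike an ordinary R-squared, this quantity can be negative when the simpler scale tracks the full scale worse than a constant would.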

RESULTS

CLINICAL RISK OF CESAREAN

The columns of Table 2 are split into four sets of three. The sets represent the four major clinical categories. There are no total mother columns, because all of our estimates are made after the mothers are divided into these four categories. The first column of each set shows the risk of cesarean regression coefficients for each of the four major clinical categories, estimated using the 75 percent sample.

The middle column interprets these coefficients by showing the effect of a one-unit increase in each of the independent variables on the probability of cesarean for a delivery that, before the increase in the independent variable, had an average probability of cesarean. For example, consider a mother with a prior cesarean but no breech presentation. Suppose the mother's other clinical characteristics were such that her predicted probability of cesarean was the average for all prior cesarean deliveries, .69 (Table 1, top row). If such a mother had a breech presentation, her probability of cesarean would rise to .87 = .69 + .18 (Table 2, Breech row, Prior Cesarean Effect column = 18 percent).
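The calculation behind such an effect can be sketched for either link function: convert the average probability to an index value, add the coefficient, and convert back. Table 2 is not reproduced here, so the coefficient 0.63 below is hypothetical, chosen only so that a probit illustration lands near the 18-point example; the function name is also an assumption.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def effect_at_average(coef, p_avg, link="probit"):
    """Change in probability when `coef` is added to the index of a case at p_avg."""
    if link == "probit":
        index = _STD_NORMAL.inv_cdf(p_avg)
        return _STD_NORMAL.cdf(index + coef) - p_avg
    if link == "logit":
        index = math.log(p_avg / (1.0 - p_avg))
        return 1.0 / (1.0 + math.exp(-(index + coef))) - p_avg
    raise ValueError("link must be 'probit' or 'logit'")


# Hypothetical coefficient applied at the prior-cesarean average of .69.
effect = effect_at_average(0.63, 0.69, link="probit")
print(round(effect, 2))
```

Because both links are nonlinear, the same coefficient produces a smaller change for a delivery whose starting probability is already close to 0 or 1.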

Breech presentation for prior cesarean deliveries and first births for breech deliveries both increase the already high probability of cesarean by almost 20 percentage points. Other variables with large effects (greater than 15 percentage points) in one or more of the major clinical categories include fetal conditions (polyhydramnios, oligohydramnios or Rh-sensitization), prepregnancy diabetes, active herpes, placental or cord problems, and amnionitis. These variables are rare, but are strong indicators for cesarean. The last column in each set shows that most of the variables remain significant on the 75 percent sample (z scores > 1.96 imply p < .05). Some important and well-known predictors are highly significant, even though they may not be decisive in particular cases: mother's age (especially for nulliparous women), baby's weight (the effect rises sharply past 4,750 grams), and hypertension. The significance (i.e., the z-statistic) of such variables is smaller for prior cesarean and breech deliveries for two reasons: the numbers of mothers are smaller, and the impact of a predisposing factor is less if many mothers were going to get a cesarean anyway.

Cesarean rates for deliveries with prior cesareans decreased by 3 percentage points (see date of admission row), indicating an increase in successful VBAC rates over a single year. Cesarean rates for the other clinical categories decreased insignificantly.

Table 3 summarizes the fit of the equations shown in Table 2. Overall, the equations predict a wide range of probabilities of cesarean (averaging from 1 to 73 percent in the ten groups), and the observed group average rates match the predicted very closely (p = .86). It is striking that so many women (the bottom three groups, who are the multiparas with no serious complications) can be identified in advance as having such a low chance of cesarean (less than 2 percent). The R-squared and the area under the ROC curve are high in the combined group. The estimates of these statistics from the validation sample differed only in the third decimal place from those of the estimation sample, presumably because the models fit well and the sample size is so large.

Within the strata defined by the four major clinical categories, the ranges of predictions are not as broad, but the relationship between observed and expected averages is still close. In no case did the observed rates differ from the expected rates in a group by more than 1.4 percentage points. R-squared values within the high-risk clinical categories are low.

[TABULAR DATA FOR TABLE 4 OMITTED]

OVERVIEW OF VARIANCE EXPLAINED

Table 4 shows the predictive power of a variety of potential adjustment scales. The first column gives the R-squared values for models predicting cesarean (a zero-one variable) in individual delivery data.

Using prior cesarean alone explains 19 percent, and using prior cesarean together with nulliparity explains 23 percent of the variance of individual decisions. The simple partition of deliveries into our four major clinical categories (prior cesarean, breech, first birth, and other) explains over 30 percent of the variance in cesareans. Additional clinical details about each delivery (with the equations shown in Table 2) bring the total up to almost 37 percent.

The relative value of these adjustments on average hospital case mix is shown in the last column of Table 4. Differences from the values for individual explanations are due to how much the predictors differ from one hospital to another. Prior cesarean is an important predictor of individual method of delivery, but its prevalence does not differ much across Washington hospitals. By contrast, first births are more important for hospital adjustments because hospitals differ widely on the proportion of first births among deliveries.

IMPACT OF ADJUSTMENT ON HOSPITAL RANKINGS

Figure 1 shows the impact of full clinical adjustment on the cesarean rates of the 80 hospitals delivering babies in Washington in the two years of our data. The raw cesarean rates are on the horizontal axis, and adjusted rates are on the vertical axis. The size of the circles is proportional to the number of deliveries. If adjustment had no effect, the hospitals would line up on the 45-degree line. They do not do so, showing that adjustment has some effect on rankings. The adjustments (differences from the 45-degree line) are small except for a few small hospitals, whose very low unadjusted cesarean rates are due in part to their serving mothers with little expectation of cesarean. The observed slope is less than 45 degrees because hospitals with higher cesarean rates tend to have more difficult cases.

DISCUSSION

Our scales of the expected probability of cesarean reflect average obstetrical practice in Washington state during 1989-1990; they are useful because they give us a way to adjust cesarean rates to account for clinical case-mix differences across providers. However, the expected probability of cesarean is a norm, not an ideal rate; it does not say whether or not the procedure was appropriate. The average expected probability for our sample is equal to the actual cesarean rate (over 20 percent of our singleton non-low-birthweight deliveries), which many have argued is too high. For example, reflecting current practice, hospitals get credit for .692 of a cesarean for a typical woman with a prior cesarean, but with best practice, fewer than half of such women might get a cesarean. Conversely, there may be reasons for doing a cesarean in a particular case that are not captured by the clinical variables in our equations.

We developed the scales to assess variation in cesarean rates after adjusting for clinical factors, but such scales might have other uses. It is remarkable that 35 percent of mothers (the non-breech multiparas without serious complications) had predictably less than a 2 percent chance of getting a cesarean. Such information might be useful in deciding who is suitable for alternative birthing arrangements or for delivery in small rural hospitals (with adequate contingency plans). All variables in the scale except for the few concerning difficulties with labor would be known in advance. High cesarean rates on these low-risk mothers might also be used to pinpoint providers with unusual policies.

How should someone wanting to monitor providers prospectively, trying to give feedback to doctors, or trying to make a report card for patients use our results? Adjustment of rates did not greatly alter hospital rankings, but the adjustments are fair, improve face validity, and work surprisingly well in explaining which mothers get cesareans. So they should improve the acceptance of monitoring of rates. Providers mainly object to performance measures when they feel that their patients are sicker than most: adjusting for sickness improves acceptability, whether or not the results fully vindicate the providers.

A central analytic group that can collect data should rerun regressions on their data. The regression coefficients will change in a new data set because of the changed time and place, and because variables such as hypertension might differ if based on different data sources, definitions, or recording cultures (Notzon, Cnattingius, Bergsjo, et al. 1994).

Our main message is the care needed to screen and specify adjustment variables and the relative value of possible predictor variables in scales. For each variable that passes the tests, one should weigh the costs of collection against the added predictive power. Data already on-line are cheapest; hospitals could use such data to generate simple reports for each provider delivering there. Even if variables were not collected electronically yet, it might not be hard to design a system that would collect the major predictive variables at low cost.

The power of variables from our merged data shows the value of being opportunistic in putting together data from different sources. Agencies or researchers that have electronic files of both birth certificates and hospital discharge data can merge them to obtain better data. It was surprising how often diagnoses from each source appeared equally valid with respect to predicting cesareans and other outcomes. The observed results are consistent with random undercoding of diagnoses and procedures in each data set.

However, using two data sets can more than double the work. In addition to checking each variable separately, analysts have to decide how to handle inconsistencies in the data. In the case of cesareans, simple adjustments worked almost as well as complex, and neither greatly altered the raw hospital rankings. In a nonadversarial setting, such as "internal" quality improvement, adjustments using one data set should suffice. But in dealing with people who are challenged by the findings, or with picky researchers, making adjustments as precise as possible may keep people from rejecting findings because "the ratings did not control well for X." Such resistance undermined the HCFA mortality data (Berwick and Wald 1990).

Although diagnoses of dystocia are strongly associated with cesareans, they are too subjective to be used in adjustments. We chose instead to adjust for more objective diagnoses associated with dystocia such as maternal weight gain, infant birth weight, and malposition. Reducing the incidence of dystocia with different childbirth management strategies is one key to reducing unnecessary cesareans (Paul and Miller 1995). Other diagnoses may also be associated with provider management. For example, the prevalence of breech might be reduced by greater use of cephalic version prior to delivery (Gifford, Keeler, and Kahn 1995). Adjusting for such variables in reported cesarean rates reduces the incentives to try to affect them, but we have to judge how much providers can realistically do, before dropping them. Even if we do not adjust for dystocia or breech, collecting data on such indications - and on such processes as whether dystocia was defined according to agreed-upon guidelines or whether version was attempted - would help us to understand the causes of variation in cesarean rates.

Some variables kept in the scales are not essential because they are only weakly connected with cesareans (the extreme-age variables, date of admission). Further, one might worry about scales that encourage gaming of response. For example, we adjusted for missing risk data because women with missing risk data in Washington were more likely to get cesareans than the average, and such adjustments improved our retrospective analysis. However, one would not want to give future hospitals extra credit for not collecting or choosing not to code such data.

The central statistical group reporting adjusted rates can either collect electronic files with all of the data on each case or, more simply, collect rates for a few categories and directly standardize them to get an overall rate (based on the overall proportion of women in each category). Initially, providers might be given the choice between providing individual data or rates, with the carrot of a better understanding of their own behavior to induce them to provide individual data. With proper system design, there should be little incremental effort in getting the data electronically, and substantial payoff even if they contain only a few explanatory variables per case.
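Direct standardization from category rates can be sketched as a weighted sum: each provider's category-specific cesarean rate is weighted by the overall share of women in that category. The category labels, rates, and shares below are hypothetical and are not taken from the article's tables.

```python
def directly_standardized_rate(provider_rates, reference_shares):
    """Weight a provider's category-specific rates by the overall category shares."""
    if abs(sum(reference_shares.values()) - 1.0) > 1e-9:
        raise ValueError("reference shares must sum to 1")
    return sum(share * provider_rates[cat] for cat, share in reference_shares.items())


# Hypothetical provider rates and overall population shares for four categories.
rates = {"prior cesarean": 0.60, "breech": 0.80, "first birth": 0.15, "other": 0.03}
shares = {"prior cesarean": 0.11, "breech": 0.06, "first birth": 0.35, "other": 0.48}
standardized = directly_standardized_rate(rates, shares)
print(round(standardized, 4))
```

Two providers with identical category-specific rates receive the same standardized rate under this scheme, whatever their own mix of patients.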

The number of maternal subgroups whose rates are reported is closely related to the number of variables going into the case mix used for comparison of observed and expected cesarean rates. Because of cost and custom, hospitals have typically reported only primary cesarean rates, that is, rates for mothers with no prior cesarean, and repeat cesarean rates. If only two rates are to be reported, it is more informative to report repeat and nulliparous cesarean rates than to report the repeat and primary, because the primary rate is so heavily influenced by parity. The rare strong indications for cesareans, such as active herpes or placental problems, could be handled either by adjustment or by dropping such cases from the rate calculation.

For cesarean deliveries, as for any procedure, rates depend on the patient characteristics in the group under study. Depending on the data and resources available, either simple or complex adjustments for patient characteristics can be done. Although better data lead to better adjustments, not all variables related to procedure rates should be used. Proper adjustments may not alter reported results greatly, but they will improve their validity and acceptability.

REFERENCES

ACOG Committee (Committee on Obstetric Practice). 1994. "Vaginal Delivery After a Previous Cesarean Birth," Number 143: October. Washington, DC: The American College of Obstetricians and Gynecologists.

Berwick, D. M., and D. L. Wald. 1990. "Hospital Leaders' Opinions of the HCFA Mortality Data." Journal of the American Medical Association 263 (2): 247-49.

Braveman, P., S. Egerter, F. Edmunston, and M. Verdon. 1995. "Racial/Ethnic Differences in the Likelihood of Cesarean Delivery, California." American Journal of Public Health 85 (5): 625-30.

Burns, L. R., S. E. Geller, and D. R. Wholey. 1995. "The Effect of Physician Factors on the Cesarean Section Decision." Medical Care 33 (4): 365-82.

Gifford, D. S., E. Keeler, and K. Kahn. 1995. "Reductions in Cost and Cesarean Rate by Routine Use of External Cephalic Version: A Decision Analysis." Obstetrics and Gynecology 85 (6): 965-68.

Hueston, W. J. 1994. "Development of a Cesarean Delivery Risk Score." Obstetrics and Gynecology 84 (6): 965-68.

Iezzoni, L. I. (ed.). 1994. Risk Adjustment for Measuring Health Outcomes. Chicago: Health Administration Press.

Jones, L., J. LoGerfo, K. Shy, F. Connell, V. Holt, K. Parrish, and K. McCandless. 1993. "StORQS: Washington's Statewide Obstetrical Review and Quality System: Overview and Provider Evaluation." Quality Review Bulletin 19 (April): 110-18.

Lemeshow, S., and D. W. Hosmer, Jr. 1982. "A Review of Goodness of Fit Statistics for Use in the Development of Logistic Regression Models." American Journal of Epidemiology 115 (1): 92-106.

National Committee for Quality Assurance. 1995. "Technical Report: Report Card Pilot Project." Washington, DC: National Committee for Quality Assurance.

Notzon, F. C., S. Cnattingius, P. Bergsjo, S. Cole, S. Taffel, L. Irgens, and A. K. Daltveit. 1994. "Cesarean Section Delivery in the 1980s: International Comparison by Indication." American Journal of Obstetrics and Gynecology 170 (February): 495-504.

Parrish, K. M., V. L. Holt, T. R. Easterling, F. A. Connell, and J. P. LoGerfo. 1994. "Effect of Changes in Maternal Age, Parity, and Birth Weight Distribution on Primary Cesarean Delivery Rates." Journal of the American Medical Association 271 (6): 443-47.

Paul, R. H., and D. A. Miller. 1995. "Cesarean Birth: How to Reduce the Rate." American Journal of Obstetrics and Gynecology 172: 1903-11.

Stafford, R. 1991. "The Impact of Nonclinical Factors on Repeat Cesarean Section." Journal of the American Medical Association 265 (1): 59-63.

Tussing, A.D., and M. A. Wojtowycz. 1992. "The Cesarean Decision in New York State, 1986: Economic and Noneconomic Aspects." Medical Care 30 (6): 529-40.

United States Department of Health and Human Services. 1990. Healthy People 2000: National Health Promotion and Disease Prevention Objectives. Publication No. 9150212. Washington, DC: Government Printing Office.

Williams, R., and R. Wroblewski. 1991. "1984-1988 Maternal and Child Health Data Base: Descriptive Narrative." Community and Organization Research Institute (CORI), University of California, Santa Barbara.

Witter, F. R., L. E. Caulfield, and R.J. Stoltzfus. 1995. "Influence of Maternal Anthropomorphic Status and Birth Weight on the Risk of Cesarean Delivery." Obstetrics and Gynecology 85 (6): 947-51.

Emmett B. Keeler, Ph.D., Rolla Edward Park, Ph.D., Robert M. Bell, Ph.D., and Joan Keesey, B.A. are with RAND Health Sciences Program, Santa Monica, CA. Deidre Spelliscy Gifford, M.D., M.P.H., at RAND during the work on this project, is now at Brown University School of Medicine, Department of Obstetrics and Gynecology. Address correspondence and requests for reprints to Emmett B. Keeler, Ph.D., RAND, 1700 Main Street, Santa Monica, CA 90407-2138; tel. 310/393-7660, ext. 7239; fax 310/393-4818; e-mail emmett_keeler@rand.org. This article, submitted to Health Services Research on July 1, 1996, was revised and accepted for publication on November 5, 1996.

This article is a case study of the problems remaining in developing a risk adjustment system using top-quality administrative data. In it, we estimate models that predict the probability of cesarean for deliveries in Washington state during 1989 and 1990, based solely on clinical factors. The clinical models range from simple (one to four factors) to complex (those requiring detailed clinical data). All adjustments are based on selected elements from administrative data (merged birth certificate and hospital discharge data) that were judged to be valid, objective, and independent of delivery management decisions.

METHODS

SOURCES OF DATA AND MATCHING

We obtained the Washington State Birth Event Record Data for calendar years 1989 and 1990. The Washington data were the most detailed available at the time, and have benefited from extensive study by researchers at the University of Washington (Jones, LoGerfo, Shy, et al. 1993). The State Department of Health created these data by matching all birth certificates with mothers' and babies' hospital discharge records. For the two years, there were 133,589 three-way matches, plus 23,722 two-way matches or unmatched single records, which we excluded. Systematic reasons for not matching include multiple births (matching the birth record to the correct twin's hospital record is hard) and home births. We were unable to find differences in the distributions of clinical or socioeconomic maternal characteristics or method of delivery, but we found slightly more deaths in unmatched birth records than in the matched.

Of the three-way matches, we excluded 1,067 multiple births, an additional 6,120 babies under 2,500 grams weight, and 32 cases with extensive missing data, leaving 126,370 singleton non-low-birthweight events for analysis. Multiple births and low-birthweight infants are analyzed elsewhere because the clinical issues involving their delivery and outcomes are distinct.

COMBINING HOSPITAL DISCHARGE AND BIRTH CERTIFICATE INFORMATION

Information on many variables is present in both the hospital discharge record and the birth certificate. We cross-checked and combined information from the two sources to create our analysis variables, including the three key variables: current cesarean delivery, prior cesarean delivery, and breech presentation.

Cesarean delivery is indicated in the hospital discharge record by ICD-9 codes of 74.xx, excepting 74.3x and 74.91, for 25,263 deliveries. In the birth certificate it is indicated by delivery method listed as primary or repeat cesarean in 21,669 births. The method of delivery is missing in 11 percent of birth certificates, of which 70 percent are in 7 of the 80 hospitals with deliveries in Washington. The mothers with missing method of delivery had cesarean rates similar to those of the other mothers based on the discharge record, but 93 percent are also missing birth certificate risk factor data. Both sources agree on cesarean in 21,195 cases (kappa = .96 on the 112,350 births with data); because most coding errors are omissions and length of stay was close to four days for cases with only one source indicating a cesarean, we coded the birth as a cesarean if either source indicated it was (25,581 cases).

Prior cesarean is indicated in the hospital discharge record by ICD-9 codes of 654.2x, "uterine scar from previous surgery." We dropped the 534 such cases that were first births, assuming that these were non-cesarean uterine scars, leaving 12,791 deliveries. On the birth certificate, delivery method was listed as repeat cesarean or vaginal birth after cesarean (VBAC) in 12,207 cases. Both sources agree on prior cesarean in 10,591 cases (kappa = .831); we coded prior cesarean if either source indicated it was (14,407 cases).

Breech, other malpresentation, or malposition is indicated in the discharge record by ICD-9 diagnosis codes of 652.xx or procedure codes 72.5x for 7,491 deliveries. On the birth certificate, breech/malpresentation was listed as a complication of labor and/or delivery for 3,844 cases. Both sources agree on breech/malpresentation in 3,086 cases (kappa = .525); we coded breech if either source indicated it was (8,249 cases, of which 962 were also prior cesareans).
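The agreement figures above can be checked with a short calculation. The following is a minimal sketch, not the authors' code: it computes Cohen's kappa and the "either source" count from the reported marginal and joint counts. The denominator of 126,370 deliveries is an assumption (the text does not state the base for these two kappas), and the function names are hypothetical.

```python
def either_source(n_discharge: int, n_certificate: int, n_both: int) -> int:
    """Deliveries flagged by at least one source (inclusion-exclusion)."""
    return n_discharge + n_certificate - n_both


def cohen_kappa(n_discharge: int, n_certificate: int, n_both: int, n_total: int) -> float:
    """Cohen's kappa for two binary sources, from marginal and joint counts."""
    n_neither = n_total - either_source(n_discharge, n_certificate, n_both)
    observed = (n_both + n_neither) / n_total          # observed agreement
    p_d, p_c = n_discharge / n_total, n_certificate / n_total
    expected = p_d * p_c + (1 - p_d) * (1 - p_c)       # chance agreement
    return (observed - expected) / (1 - expected)


# Assumption: kappa is taken over the 126,370 analysis deliveries.
N_TOTAL = 126_370

# Counts as reported in the text: (discharge record, birth certificate, both).
reported = {
    "prior cesarean": (12_791, 12_207, 10_591),
    "breech/malpresentation": (7_491, 3_844, 3_086),
}

for label, (n_d, n_c, n_b) in reported.items():
    kappa = cohen_kappa(n_d, n_c, n_b, N_TOTAL)
    print(label, either_source(n_d, n_c, n_b), round(kappa, 3))
```

Under that assumed denominator, the sketch reproduces the reported "either source" counts (14,407 and 8,249) and the reported kappas (.831 and .525). The cesarean counts are not included because their kappa was computed on the 112,350 births with data in both sources, and the text does not give the joint counts on that base.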

Some variables related to cesarean rates were purposely omitted. We did not use race or ethnicity in the absence of clinical evidence that these variables should affect cesarean rates. Also, we wanted to be able to study the effects of race after adjustment for clinical factors, and so could not include it. For comparisons of provider style, we should not adjust for management decisions that lead to cesareans or for excuses for or outcomes of cesareans. So we did not adjust for utilization variables such as epidural or tocolysis because these accompany management decisions. In comparing hospital cesarean policies, one would not want to adjust for trial of labor rates. So we did not use amnionitis or gestational age as a predictor for mothers with prior cesareans or breech infants: amnionitis is very rare in elective cesareans (0.8 percent in these data), and so is an indicator for trial of labor. Gestational age was slightly less with cesareans, but we judged that the relationship mainly results from decisions to schedule cesareans.

For other potential clinical risk factors, we tested the validity of data from each source in four ways as shown below: whether coding practices are consistent across hospitals, whether the variable was unequivocally a risk factor as opposed to an outcome, whether overall prevalence was consistent with clinical intuition, and whether the recorded variable was associated with its known outcomes.

We did not use two important diagnoses that arise after the decision to undertake a trial of labor - dystocia and fetal distress (Stafford 1991). These diagnoses are not consistently determined or recorded; for example, fetal distress is less likely to be recorded in hospitals where electronic fetal monitoring is used selectively. Also, some providers may be more likely to diagnose or record them when a cesarean has been done. Analyses of variance for diagnoses show that recorded fetal distress and dystocia cluster more in Washington hospitals than do prior cesarean or breech, and their recorded average hospital rates are much more variable than could be due to chance.

Other potential predictors were dropped as a result of the tests: blood transfusions and other excessive bleeding (the timing could not be assessed); rubella test positive (we wanted "an infection during the pregnancy" but the high prevalence in some hospitals suggested that rubella+ was coded for all women with immunity); and mother's anemia (coded implausibly often in some hospitals on both birth certificates and discharge records, and we could not distinguish comorbidities from complications). Premature labor was coded less reliably than weeks gestation. Two further variables, each based on just one source, were also dropped: Rh-sensitization (too prevalent in the mother's discharge data and not associated with either cesarean rates or outcomes) and the discharge diagnosis of "post dates leading to fetal problems" (coded in some hospitals but not in others).

Maternal hypertension was an example of a variable passing all tests. It is a risk factor, not a complication. Only 1.5 percent of the variance in the measure was between hospitals, indicating that coding was consistent. Consistent with clinical intuition about its prevalence, it was coded only in birth certificates in 2.3 percent of cases, only in hospital discharge records in 1.5 percent of cases, and in both in 2.2 percent of cases. In these three categories, cesarean rates were between 30 and 32 percent, whereas in the other 94 percent of records, rates were 20 percent. Such results reinforced our decision to treat conditions as present if mentioned in either source.

The other variables for which we treated the condition as present if either the hospital discharge record or the birth certificate mentioned it were: amnionitis/fever/prolonged rupture of membranes, pre-pregnancy diabetes, gestational diabetes, active herpes, hypertension, other fetal conditions, and other maternal risk factors (see footnotes to Table 1 for lists).

[TABULAR DATA FOR TABLE 1 OMITTED]

STATISTICAL METHODS

Our exploratory analysis of the risk of receiving a cesarean in terms of the mother's clinical characteristics used a 25 percent random sample. With so many cases, we did not need to use 50 percent of them as a training sample. All of our model specifications were developed at this stage with respecification of the model or variables freely allowed. For the final analyses reported here, we subsequently reestimated the risk of cesarean model using the remaining 75 percent of the data, and used that model to predict probabilities of cesarean for all the data. No respecifications of variables or models were allowed during the final analyses.
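The split into an exploratory and an estimation sample can be sketched as follows. This is only an illustration under assumptions: the function name, the fixed seed, and the use of record identifiers are hypothetical, and the text does not say how the random sample was drawn.

```python
import random
from typing import Iterable, List, Tuple


def split_exploratory(record_ids: Iterable[int], fraction: float = 0.25,
                      seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly assign records to an exploratory sample and an estimation sample."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    ids = list(record_ids)
    rng.shuffle(ids)
    cut = round(len(ids) * fraction)
    return ids[:cut], ids[cut:]


# Hypothetical identifiers; model specification would use only `exploratory`,
# and the final coefficients would be reestimated on `estimation`.
exploratory, estimation = split_exploratory(range(1000))
print(len(exploratory), len(estimation))
```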

INDEPENDENT VARIABLES

Following clinical advice, we split mothers hierarchically into four categories: (1) prior cesarean, (2) breech/malpresentation in women without a prior cesarean, (3) first births without breech/malpresentation, and (4) all other deliveries.
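The hierarchy matters because a delivery can qualify for more than one category (for example, a breech presentation in a mother with a prior cesarean). A minimal sketch of the assignment rule, with a hypothetical function name and category labels:

```python
def clinical_category(prior_cesarean: bool, breech: bool, first_birth: bool) -> str:
    """Assign a delivery to the first category it qualifies for, in hierarchical order."""
    if prior_cesarean:
        return "prior cesarean"
    if breech:
        return "breech"
    if first_birth:
        return "first birth"
    return "other"


# A breech delivery with a prior cesarean falls in the prior-cesarean category.
print(clinical_category(prior_cesarean=True, breech=True, first_birth=False))
```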

In our exploratory (25 percent) analysis, we developed models using the other clinical variables to explain the occurrence of cesarean in each clinical category. Our aim was to develop parsimonious models that fit the data well. The clinical variables ultimately chosen for inclusion are shown with their means in Table 1. We tried other variables, including smoking, alcohol use, number of prenatal visits, growth retardation, drug abuse, amniocentesis, and history of herpes, but none were significant in the 25 percent sample in the equations for any of the four clinical categories.

Variables were scaled based on tabulations of mean cesarean rates by level using the 25 percent sample. For example, the high cesarean rates for first births, lower and similar rates for second through fourth births, and slightly lower rates for higher order births led us to specify two dummy variables: one for first birth and one for second through fourth birth. Cesarean rates were higher when the interval from previous delivery was either long or short. We specified a dummy variable for intervals outside the 1.5- to 4.0-year range (set to 0 for first births or if the interval from a previous birth was unknown). Cesarean rates had a nearly linear relationship with mother's age except for very old and very young mothers. We specified age as a linear spline with hinge points at ages 16 and 39. Compared to medium-weight babies, cesarean rates were sharply higher for very large babies (over 4,750 grams), and somewhat higher for small babies (near our 2,500 gram cutoff). The relationships were well fit by quadratic functions of birth weight. Gestational age enters as a linear term and as an indicator variable for post-date deliveries (gestational age 42 weeks or more). Changes in clinical practice over the brief (two-year) period spanned by our data appeared linear, so we specified admission in years, centered on January 1, 1990.
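The variable definitions in this paragraph can be written out as small functions. This is a sketch under assumptions: the spline basis shown is one of several equivalent parameterizations, the rescaling of birth weight to kilograms and the 365.25-day year are illustrative choices, and all function names are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def parity_dummies(birth_order: int) -> Tuple[int, int]:
    """(first birth, second through fourth birth); higher orders are the reference."""
    return (int(birth_order == 1), int(2 <= birth_order <= 4))


def interval_outside_range(interval_years: Optional[float], first_birth: bool) -> int:
    """1 if the interval from the previous delivery is outside 1.5 to 4.0 years;
    0 for first births or when the interval is unknown."""
    if first_birth or interval_years is None:
        return 0
    return int(interval_years < 1.5 or interval_years > 4.0)


def age_spline(age: float) -> Tuple[float, float, float]:
    """Linear spline segments with hinges at ages 16 and 39 (segments sum to age)."""
    return (min(age, 16.0), min(max(age - 16.0, 0.0), 23.0), max(age - 39.0, 0.0))


def birth_weight_terms(grams: float) -> Tuple[float, float]:
    """Linear and squared birth weight terms (kilograms used here as an assumption)."""
    kg = grams / 1000.0
    return (kg, kg * kg)


def gestation_terms(weeks: float) -> Tuple[float, int]:
    """Linear gestational age plus a post-date indicator (42 weeks or more)."""
    return (weeks, int(weeks >= 42))


def admission_years(admitted: date) -> float:
    """Admission date in years, centered on January 1, 1990."""
    return (admitted - date(1990, 1, 1)).days / 365.25


print(age_spline(41.0), interval_outside_range(5.0, False), gestation_terms(42))
```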

We started with simple models and elaborated them as necessary to obtain a satisfactory fit as judged by the Hosmer-Lemeshow test (Lemeshow and Hosmer 1982). This test involves fitting a trial model, then ordering the observations by the predicted probability of cesarean and dividing them into ten equal-sized groups. We then test whether observed and predicted (expected) number of cesareans in each group are significantly different, using the chi-squared statistic. If they are different, we find a better-fitting model.
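The test described above can be sketched in a few lines. This is an illustration rather than the authors' implementation: it assumes the common form of the statistic with groups minus 2 degrees of freedom, uses the closed-form chi-squared tail probability that holds for even degrees of freedom, and runs on simulated data (the predicted probabilities and outcomes below are synthetic).

```python
import math
import random
from typing import List, Sequence, Tuple


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper tail of the chi-squared distribution; exact closed form for even df."""
    assert df > 0 and df % 2 == 0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y: Sequence[int], p: Sequence[float],
                    groups: int = 10) -> Tuple[float, float, List[Tuple[float, float]]]:
    """Return (statistic, p-value, [(observed rate, expected rate) per group])."""
    order = sorted(range(len(p)), key=p.__getitem__)   # rank by predicted probability
    n = len(order)
    stat, table = 0.0, []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        mean_p = expected / len(idx)
        stat += (observed - expected) ** 2 / (len(idx) * mean_p * (1 - mean_p))
        table.append((observed / len(idx), mean_p))
    # Assumption: groups - 2 degrees of freedom, as is usual on the estimation sample.
    return stat, chi2_sf_even_df(stat, groups - 2), table


# Synthetic, well-calibrated data: outcomes are drawn from the predicted probabilities.
rng = random.Random(0)
p = [rng.uniform(0.01, 0.8) for _ in range(5000)]
y = [int(rng.random() < pi) for pi in p]
stat, p_value, table = hosmer_lemeshow(y, p)
print(round(stat, 2), round(p_value, 3))
```

A small p-value signals that observed and expected counts differ by more than chance, which in the procedure described above would prompt a search for a better-fitting model.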

We introduced the variables in Table 1 as part of our attempt to find well-fitting models. There were strong interactions between the categories and the other independent variables, so we found it necessary to fit separate equations for each of the four major clinical categories. Even then, some of the models estimated using logistic regression fit poorly, but using probit regression to estimate these models led to satisfactory fits.

The final equations included those variables that were significant in the 25 percent sample in the equations for any of the four major clinical categories. We summarize the fit of the final risk-of-cesarean models in several ways for each major clinical group. We display the observed and expected cesarean rates in each of the ten Hosmer-Lemeshow groups, together with the p-value for the differences between observed and expected. We also calculate R-squared on a natural probability scale and the C-statistic, which is the area under the ROC curve (Iezzoni 1994: ch. 9).
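The two summary statistics can be sketched as follows. This assumes that "R-squared on a natural probability scale" means one minus the ratio of the squared prediction error to the total variation in the zero-one outcome, and it computes the C-statistic by direct comparison of all (cesarean, non-cesarean) pairs, which is quadratic in the number of cases and suitable only for illustration. The data and function names are hypothetical.

```python
from typing import Sequence


def c_statistic(y: Sequence[int], p: Sequence[float]) -> float:
    """Share of (cesarean, non-cesarean) pairs in which the cesarean case has the
    higher predicted probability; ties count one half. Equals the ROC area."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = sum((a > b) + 0.5 * (a == b) for a in cases for b in controls)
    return concordant / (len(cases) * len(controls))


def r_squared_probability(y: Sequence[int], p: Sequence[float]) -> float:
    """1 - SSE/SST, with predictions and outcomes both on the probability scale."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


# Hypothetical outcomes and predicted probabilities for four deliveries.
y_demo = [0, 0, 1, 1]
p_demo = [0.10, 0.40, 0.35, 0.80]
print(c_statistic(y_demo, p_demo), r_squared_probability(y_demo, p_demo))
```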

CALCULATING PROVIDER CASE MIX FOR CESAREAN

We used the equations that describe the clinical risk of having a cesarean to calculate the expected probability of cesarean for each delivery. These expected probabilities, averaged over the deliveries for each hospital, are a measure of each hospital's obstetrical case mix.
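A minimal sketch of this averaging step, assuming each delivery record carries a hospital identifier, an expected probability from the risk equations, and a zero-one cesarean indicator. The record layout, field names, and values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hospital_case_mix(deliveries: Iterable[Tuple[str, float, int]]) -> Dict[str, dict]:
    """Average expected probability (case mix) and observed cesarean rate per hospital.

    Each delivery is (hospital_id, expected_probability, cesarean_indicator)."""
    totals = defaultdict(lambda: [0, 0.0, 0])   # deliveries, sum expected, cesareans
    for hospital, expected, cesarean in deliveries:
        t = totals[hospital]
        t[0] += 1
        t[1] += expected
        t[2] += cesarean
    return {
        h: {"deliveries": n, "case_mix": e / n, "observed_rate": c / n}
        for h, (n, e, c) in totals.items()
    }


# Hypothetical deliveries at two hospitals.
records = [("A", 0.69, 1), ("A", 0.02, 0), ("A", 0.15, 0), ("B", 0.02, 0), ("B", 0.03, 0)]
summary = hospital_case_mix(records)
print(summary)
```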

[TABULAR DATA FOR TABLE 2 OMITTED]

Going from simpler to more complicated models improves hospital adjustment only if the added factors have a strong impact on individual chances of getting a cesarean and if their prevalence varies from one hospital to another. We express the strength of simpler adjustments by the fraction of variance in the average full-clinical-model case mix in hospitals they explain. This is computed as the R-squared of regressions constrained to have slope 1 and intercept 0 of average full clinical case mix on the average case mix from simpler adjustments, weighted by the number of deliveries performed.
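With slope fixed at 1 and intercept at 0, the regression has no free parameters, so its R-squared reduces to one minus the weighted squared difference between the two case-mix measures, divided by the weighted variation of the full-model case mix. The sketch below assumes that reading; the hospital values are hypothetical.

```python
from typing import Sequence


def constrained_r_squared(full: Sequence[float], simple: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Weighted R-squared of `full` on `simple` with slope 1 and intercept 0."""
    total_w = sum(weights)
    mean_full = sum(w * f for w, f in zip(weights, full)) / total_w
    sse = sum(w * (f - s) ** 2 for w, f, s in zip(weights, full, simple))
    sst = sum(w * (f - mean_full) ** 2 for w, f in zip(weights, full))
    return 1.0 - sse / sst


# Hypothetical hospital averages: full-model case mix, simpler case mix, deliveries.
full_mix = [0.18, 0.20, 0.24]
simple_mix = [0.19, 0.20, 0.23]
n_deliveries = [100, 300, 100]
print(round(constrained_r_squared(full_mix, simple_mix, n_deliveries), 4))
```

Unlike an ordinary R-squared, this quantity can be negative if the simpler adjustment tracks the full case mix worse than a constant would.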

RESULTS

CLINICAL RISK OF CESAREAN

The columns of Table 2 are split into four sets of three. The sets represent the four major clinical categories. There are no total mother columns, because all of our estimates are made after the mothers are divided into these four categories. The first column of each set shows the risk of cesarean regression coefficients for each of the four major clinical categories, estimated using the 75 percent sample.

The middle column interprets these coefficients by showing the effect of a one-unit increase in each of the independent variables on the probability of cesarean for a delivery that, before the increase in the independent variable, had an average probability of cesarean. For example, consider a mother with a prior cesarean but no breech presentation. Suppose the mother's other clinical characteristics were such that her predicted probability of cesarean was the average for all prior cesarean deliveries, .69 (Table 1, top row). If such a mother had a breech presentation, her probability of cesarean would rise to .87 = .69 + .18 (Table 2, Breech row, Prior Cesarean Effect column = 18 percent).
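The conversion from a coefficient to an effect at the average probability can be sketched as follows, for either a probit or a logit model. The coefficient value of 0.63 is hypothetical, chosen only so that the probit case lands near the 18-point effect in the example; it is not taken from Table 2, and the function name is an assumption.

```python
import math
from statistics import NormalDist


def effect_at_mean(p_bar: float, beta: float, link: str = "probit") -> float:
    """Change in probability from a one-unit increase in a variable with
    coefficient `beta`, starting from the average probability `p_bar`."""
    if link == "probit":
        nd = NormalDist()
        return nd.cdf(nd.inv_cdf(p_bar) + beta) - p_bar
    if link == "logit":
        index = math.log(p_bar / (1.0 - p_bar)) + beta
        return 1.0 / (1.0 + math.exp(-index)) - p_bar
    raise ValueError("link must be 'probit' or 'logit'")


# Hypothetical coefficient; p_bar = .69 is the prior-cesarean average from the text.
effect = effect_at_mean(0.69, 0.63, link="probit")
print(round(effect, 2), round(0.69 + effect, 2))
```

Because the link function is nonlinear, the same coefficient implies a smaller change in probability for deliveries whose starting probability is far from one half.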

Breech presentation for prior cesarean deliveries and first births for breech deliveries both increase the already high probability of cesarean by almost 20 percentage points. Other variables with large effects (greater than 15 percentage points) in one or more of the major clinical categories include fetal conditions (polyhydramnios, oligohydramnios or Rh-sensitization), prepregnancy diabetes, active herpes, placental or cord problems, and amnionitis. These variables are rare, but are strong indicators for cesarean. The last column in each set shows that most of the variables remain significant on the 75 percent sample (z scores > 1.96 imply p < .05). Some important and well-known predictors are highly significant, even though they may not be decisive in particular cases: mother's age (especially for nulliparous women), baby's weight (the effect rises sharply past 4,750 grams), and hypertension. The significance (i.e., the z-statistic) of such variables is smaller for prior cesarean and breech deliveries for two reasons: the numbers of mothers are smaller, and the impact of a predisposing factor is less if many mothers were going to get a cesarean anyway.

Cesarean rates for deliveries with prior cesareans decreased by 3 percentage points (see date of admission row), indicating an increase in successful VBAC rates over a single year. Cesarean rates for the other clinical categories decreased insignificantly.

Table 3 summarizes the fit of the equations shown in Table 2. Overall, the equations predict a wide range of probabilities of cesarean (averaging from 1 to 73 percent in the ten groups), and the observed group average rates match the predicted very closely (p = .86). It is striking that so many women (the bottom three groups, who are the multiparas with no serious complications) can be identified in advance as having such a low chance of cesarean (less than 2 percent). The R-squared and the area under the ROC curve are high in the combined group. The estimates of these statistics from the validation sample differed only in the third decimal place from those of the estimation sample, presumably because the models fit well and the sample size is so large.

Within the strata defined by the four major clinical categories, the ranges of predictions are not as broad, but the relationship between observed and expected averages is still close. In no case did the observed rates differ from the expected rates in a group by more than 1.4 percentage points. R-squared values within the high-risk clinical categories are low.

Table 3: Fit of Model for Clinical Risk of Cesarean; Observed Cesarean Rates in Groups Ranked by Expected Rates (75 percent)

Expected                  Prior
Rate Decile      All      Cesarean    Breech    First Birth    Other
1                .01      .58         .40       .06            .01
2                .02      .63         .47       .07            .01
3                .02      .64         .50       .09            .01
4                .04      .65         .60       .11            .01
5                .07      .68         .61       .12            .02
6                .11      .69         .65       .15            .02
7                .16      .70         .65       .18            .03
8                .29      .76         .67       .24            .04
9                .57      .76         .71       .31            .05
10               .73      .87         .80       .51            .18
p-value(*)       .86      .99         .99       .86            .90
ROC area         .88      .60         .63       .73            .78
R-squared        .37      .03         .05       .12            .12

* Difference of observed from expected rates.

[TABULAR DATA FOR TABLE 4 OMITTED]

OVERVIEW OF VARIANCE EXPLAINED

Table 4 shows the predictive power of a variety of potential adjustment scales. The first column gives the R-squared values for models predicting cesarean (a zero-one variable) in individual delivery data.

Using prior cesarean alone explains 19 percent, and using prior cesarean together with nulliparity explains 23 percent of the variance of individual decisions. The simple partition of deliveries into our four major clinical categories (prior cesarean, breech, first birth, and other) explains over 30 percent of the variance in cesareans. Additional clinical details about each delivery (with the equations shown in Table 2) bring the total up to almost 37 percent.

The relative value of these adjustments on average hospital case mix is shown in the last column of Table 4. Differences from the values for individual explanations are due to how much the predictors differ from one hospital to another. Prior cesarean is an important predictor of individual method of delivery, but its prevalence does not differ much across Washington hospitals. By contrast, first births are more important for hospital adjustments because hospitals differ widely on the proportion of first births among deliveries.

IMPACT OF ADJUSTMENT ON HOSPITAL RANKINGS

Figure 1 shows the impact of full clinical adjustment on the cesarean rates of the 80 hospitals delivering babies in Washington in the two years of our data. The raw cesarean rates are on the horizontal axis, and adjusted rates are on the vertical axis. The size of the circles is proportional to the number of deliveries. If adjustment had no effect, the hospitals would line up on the 45° line. They do not do so, showing that adjustment has some effect on rankings. The adjustments (differences from the 45° line) are small except for a few small hospitals, whose very low unadjusted cesarean rates are due in part to their serving mothers with little expectation of cesarean. The observed slope is less than 45° because hospitals with higher cesarean rates tend to have more difficult cases.

DISCUSSION

Our scales of the expected probability of cesarean reflect average obstetrical practice in Washington state during 1989-1990; they are useful because they give us a way to adjust cesarean rates to account for clinical case-mix differences across providers. However, the expected probability of cesarean is a norm, not an ideal rate; it does not say whether or not the procedure was appropriate. The average expected probability for our sample is equal to the actual cesarean rate (over 20 percent of our singleton non-low-birthweight deliveries), which many have argued is too high. For example, reflecting current practice, hospitals get credit for .692 of a cesarean for a typical woman with a prior cesarean, but with best practice, fewer than half of such women might get a cesarean. Conversely, there may be reasons for doing a cesarean in a particular case that are not captured by the clinical variables in our equations.

We developed the scales to assess variation in cesarean rates after adjusting for clinical factors, but such scales might have other uses. It is remarkable that 35 percent of mothers (the non-breech multiparas without serious complications) had predictably less than a 2 percent chance of getting a cesarean. Such information might be useful in deciding who is suitable for alternative birthing arrangements or for delivery in small rural hospitals (with adequate contingency plans). All variables in the scale except for the few concerning difficulties with labor would be known in advance. High cesarean rates on these low-risk mothers might also be used to pinpoint providers with unusual policies.

How should someone wanting to monitor providers prospectively, trying to give feedback to doctors, or trying to make a report card for patients use our results? Adjustment of rates did not greatly alter hospital rankings, but the adjustments are fair, improve face validity, and work surprisingly well in explaining which mothers get cesareans. So they should improve the acceptance of monitoring of rates. Providers mainly object to performance measures when they feel that their patients are sicker than most: adjusting for sickness improves acceptability, whether or not the results fully vindicate the providers.

A central analytic group that can collect data should rerun regressions on their data. The regression coefficients will change in a new data set because of the changed time and place, and because variables such as hypertension might differ if based on different data sources, definitions, or recording cultures (Notzon, Chattingius, Bergsjo, et al. 1994).

Our main message concerns the care needed to screen and specify adjustment variables, and the relative value of possible predictor variables in scales. For each variable that passes the tests, one should weigh the costs of collection against the added predictive power. Data already on-line are cheapest; hospitals could use such data to generate simple reports for each provider delivering there. Even if variables are not yet collected electronically, it might not be hard to design a system that would collect the major predictive variables at low cost.

The power of variables from our merged data shows the value of being opportunistic in putting together data from different sources. Agencies or researchers that have electronic files of both birth certificates and hospital discharge data can merge them to obtain better data. It was surprising how often diagnoses from each source appeared equally valid with respect to predicting cesareans and other outcomes. The observed results are consistent with random undercoding of diagnoses and procedures in each data set.

However, using two data sets can more than double the work. In addition to checking each variable separately, analysts have to decide how to handle inconsistencies in the data. In the case of cesareans, simple adjustments worked almost as well as complex, and neither greatly altered the raw hospital rankings. In a nonadversarial setting, such as "internal" quality improvement, adjustments using one data set should suffice. But in dealing with people who are challenged by the findings, or with picky researchers, making adjustments as precise as possible may keep people from rejecting findings because "the ratings did not control well for X." Such resistance undermined the HCFA mortality data (Berwick and Wald 1990).

Although diagnoses of dystocia are strongly associated with cesareans, they are too subjective to be used in adjustments. We chose instead to adjust for more objective diagnoses associated with dystocia such as maternal weight gain, infant birth weight, and malposition. Reducing the incidence of dystocia with different childbirth management strategies is one key to reducing unnecessary cesareans (Paul and Miller 1995). Other diagnoses may also be associated with provider management. For example, the prevalence of breech might be reduced by greater use of cephalic version prior to delivery (Gifford, Keeler, and Kahn 1995). Adjusting for such variables in reported cesarean rates reduces the incentives to try to affect them, but we have to judge how much providers can realistically do, before dropping them. Even if we do not adjust for dystocia or breech, collecting data on such indications - and on such processes as whether dystocia was defined according to agreed-upon guidelines or whether version was attempted - would help us to understand the causes of variation in cesarean rates.

Some variables kept in the scales are not essential because they are only weakly connected with cesareans (the extreme-age variables, date of admission). Further, one might worry about scales that encourage gaming of response. For example, we adjusted for missing risk data because women with missing risk data in Washington were more likely to get cesareans than the average, and such adjustments improved our retrospective analysis. However, one would not want to give future hospitals extra credit for not collecting or choosing not to code such data.

The central statistical group reporting adjusted rates can either collect electronic files with all of the data on each case or, more simply, collect rates for a few categories and directly standardize them to get an overall rate (based on the overall proportion of women in each category). Initially, providers might be given the choice between providing individual data or rates, with the carrot of a better understanding of their own behavior to induce them to provide individual data. With proper system design, there should be little incremental effort in getting the data electronically, and substantial payoff even if they contain only a few explanatory variables per case.
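The simpler option, direct standardization of category-specific rates, can be sketched as follows. The category shares and the hospital's rates below are hypothetical placeholders, not figures from the Washington data, and the function name is an assumption.

```python
from typing import Dict


def directly_standardized_rate(category_rates: Dict[str, float],
                               reference_shares: Dict[str, float]) -> float:
    """Weight a provider's category-specific cesarean rates by the overall
    (reference) proportion of women in each category."""
    assert abs(sum(reference_shares.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(category_rates[c] * share for c, share in reference_shares.items())


# Hypothetical overall shares of deliveries in the four clinical categories.
overall_shares = {"prior cesarean": 0.11, "breech": 0.06, "first birth": 0.35, "other": 0.48}

# Hypothetical category-specific rates reported by one hospital.
hospital_rates = {"prior cesarean": 0.65, "breech": 0.70, "first birth": 0.15, "other": 0.03}

standardized = directly_standardized_rate(hospital_rates, overall_shares)
print(round(standardized, 4))
```

Because every provider's rates are weighted by the same reference shares, differences in the standardized rate reflect differences in category-specific practice rather than in the mix of categories served.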

The number of maternal subgroups whose rates are reported is closely related to the number of variables going into the case mix used for comparison of observed and expected cesarean rates. Because of cost and custom, hospitals have typically reported only primary cesarean rates, that is, rates for mothers with no prior cesarean, and repeat cesarean rates. If only two rates are to be reported, it is more informative to report repeat and nulliparous cesarean rates than to report the repeat and primary, because the primary rate is so heavily influenced by parity. The rare strong indications for cesareans, such as active herpes or placental problems, could be handled either by adjustment or by dropping such cases from the rate calculation.

For cesarean deliveries, as for any procedure, rates depend on the patient characteristics in the group under study. Depending on the data and resources available, either simple or complex adjustments for patient characteristics can be done. Although better data lead to better adjustments, not all variables related to procedure rates should be used. Proper adjustments may not alter reported results greatly, but they will improve their validity and acceptability.

REFERENCES

ACOG Committee (Committee on Obstetric Practice). 1994. "Vaginal Delivery After a Previous Cesarean Birth," Number 143: October. Washington, DC: The American College of Obstetricians and Gynecologists.

Berwick, D. M., and D. L. Wald. 1990. "Hospital Leaders' Opinions of the HCFA Mortality Data." Journal of the American Medical Association 263 (2): 247-49.

Braveman, P., S. Egerter, F. Edmunston, and M. Verdon. 1995. "Racial/Ethnic Differences in the Likelihood of Cesarean Delivery, California." American Journal of Public Health 85 (5): 625-30.

Burns, L. R., S. E. Geller, and D. R. Wholey. 1995. "The Effect of Physician Factors on the Cesarean Section Decision." Medical Care 33 (4): 365-82.

Gifford, D. S., E. Keeler, and K. Kahn. 1995. "Reductions in Cost and Cesarean Rate by Routine Use of External Cephalic Version: A Decision Analysis." Obstetrics and Gynecology 85 (6): 965-68.

Hueston, W. J. 1994. "Development of a Cesarean Delivery Risk Score." Obstetrics and Gynecology 84 (6): 965-68.

Iezzoni, L. I. (ed.). 1994. Risk Adjustment for Measuring Health Outcomes. Chicago: Health Administration Press.

Jones, L., J. LoGerfo, K. Shy, F. Connell, V. Holt, K. Parrish, and K. McCandless. 1993. "StORQS: Washington's Statewide Obstetrical Review and Quality System: Overview and Provider Evaluation." Quality Review Bulletin 19 (April): 110-18.

Lemeshow, S., and D. W. Hosmer, Jr. 1982. "A Review of Goodness of Fit Statistics for Use in the Development of Logistic Regression Models." American Journal of Epidemiology 115 (1): 92-106.

National Committee for Quality Assurance. 1995. "Technical Report: Report Card Pilot Project." Washington, DC: National Committee for Quality Assurance.

Notzon, F. C., S. Chattingius, P. Bergsjo, S. Cole, S. Taffel, L. Irgens, and A. K. Daltveit. 1994. "Cesarean Section Delivery in the 1980s: International Comparison by Indication." American Journal of Obstetrics and Gynecology 170 (February): 495-504.

Parrish, K. M., V. L. Holt, T. R. Easterling, F. A. Connell, and J. P. LoGerfo. 1994. "Effect of Changes in Maternal Age, Parity, and Birth Weight Distribution on Primary Cesarean Delivery Rates." Journal of the American Medical Association 271 (6): 443-47.

Paul, R. H., and D. A. Miller. 1995. "Cesarean Birth: How to Reduce the Rate." American Journal of Obstetrics and Gynecology 172: 1903-11.

Stafford, R. 1991. "The Impact of Nonclinical Factors on Repeat Cesarean Section." Journal of the American Medical Association 265 (1): 59-63.

Tussing, A. D., and M. A. Wojtowycz. 1992. "The Cesarean Decision in New York State, 1986: Economic and Noneconomic Aspects." Medical Care 30 (6): 529-40.

United States Department of Health and Human Services. 1990. Healthy People 2000: National Health Promotion and Disease Prevention Objectives. Publication No. 9150212. Washington, DC: Government Printing Office.

Williams, R., and R. Wroblewski. 1991. "1984-1988 Maternal and Child Health Data Base: Descriptive Narrative." Community and Organization Research Institute (CORI), University of California, Santa Barbara.

Witter, F. R., L. E. Caulfield, and R. J. Stoltzfus. 1995. "Influence of Maternal Anthropometric Status and Birth Weight on the Risk of Cesarean Delivery." Obstetrics and Gynecology 85 (6): 947-51.

Emmett B. Keeler, Ph.D., Rolla Edward Park, Ph.D., Robert M. Bell, Ph.D., and Joan Keesey, B.A. are with RAND Health Sciences Program, Santa Monica, CA. Deidre Spelliscy Gifford, M.D., M.P.H., at RAND during the work on this project, is now at Brown University School of Medicine, Department of Obstetrics and Gynecology. Address correspondence and requests for reprints to Emmett B. Keeler, Ph.D., RAND, 1700 Main Street, Santa Monica, CA 90407-2138; tel. 310/393-7660, ext. 7239; fax 310/393-4818; e-mail emmett_keeler@rand.org. This article, submitted to Health Services Research on July 1, 1996, was revised and accepted for publication on November 5, 1996.