Choosing models for health care cost analyses: issues of nonlinearity and endogeneity.
Following the American Recovery and Reinvestment Act's support for comparative effectiveness research, concerns regarding conclusions drawn from analyses trying to model health care costs have grown (Weinstein and Skinner 2010). Consideration of the statistical techniques used in modeling is important to the validity of conclusions, causal implications, and usefulness of these cost studies. Problems of skewed data, nonnegative outcomes, and censoring inherent in cost analyses for treatments have been documented in great detail (Manning and Mullahy 2001; Buntin and Zaslavsky 2004; Basu and Manning 2009; Huang 2009), and many methods have been employed, including two-part models, generalized linear models, and maximum likelihood models, to attempt to generate consistent and unbiased cost estimates. Endogeneity issues that challenge researchers attempting to draw causal implications of health care cost studies also are well known (Deb and Trivedi 2006b; Stukel et al. 2007; Terza, Basu, and Rathouz 2008); treatment and insurance modalities are often correlated with both health care use and costs via unobserved patient characteristics.
Existing comparisons of statistical models for estimating health care use and costs have typically focused on skewed data (e.g., Manning 1998; Manning and Mullahy 2001) or endogeneity (Terza 1998), but few have addressed both problems at the same time. O'Malley, Frank, and Normand (2011) and Terza, Basu, and Rathouz (2008) are exceptions, but both of these analyses only include one special case of the control function (CF) approach, two-stage residual inclusion (2SRI). Other CF approaches may have more useful properties. In this study, we compare and contrast a variety of ways for analyzing nonlinear data with endogeneity: 2SLS on costs and the natural log of costs, CF methods employing a variety of residual types and functions of residuals (including the 2SRI approach), and a fully structural model estimated using full information maximum simulated likelihood (FIMSL). Our example analysis focuses on costs in a specific palliative care (PC) application, but these methods could be applied to any nonlinear continuous problem as well. Most studies also focus on single measures of response such as the average of marginal effects or the marginal effect calculated at the means of the covariates, despite the well-known property of nonlinear models that marginal effects can vary substantially for different choices of covariates. In this study, we compare those estimates as well as distributions of incremental effects from the different models we consider. Our goal is to empirically illustrate the impact of model specification on the cost outcome obtained and encourage others to carefully consider which models are most appropriate for answering their research questions.
As is well known, the health care cost distribution is not easy to characterize using parametric forms, and there is little insight in the literature as to what the joint distribution of error terms of treatment and costs might be. Therefore, while it would be relatively easy to design a Monte Carlo experiment that explicitly (or implicitly) favored one of the models under consideration relative to the others, it would not be easy to design one that would be neutral across modeling choices. By using real data, we give up the ability to benchmark on the truth, but we gain an understanding of how the alternative methods would perform in similar real data situations.
To illustrate differences between models, we use data from the Veterans Health Administration (VHA) regarding care costs for veterans who were eligible for PC consultations. PC focuses on symptom control, care coordination, and assistance with decision making for people facing life-limiting illnesses (Morrison et al. 2008). Utilization data were obtained from the VHA Medical SAS Inpatient dataset, and cost data were obtained from the Veterans Affairs (VA) Decision Support System National Data Extracts. We focused on inpatient stays for adults 18 years old and over who were discharged from a Veterans Integrated Service Network 3 acute care facility during fiscal years 2005 and 2006.
Our sample was restricted to those with a primary or secondary International Classification of Diseases-9 code for one of the following illnesses: advanced cancer (metastatic solid tumor, central nervous system malignancies, metastatic melanoma, locally advanced head and neck or pancreatic cancers); advanced human immunodeficiency syndrome/acquired immune deficiency syndrome (HIV/AIDS) with a secondary diagnosis of cirrhosis, cachexia, or cancer; or congestive heart failure (CHF) or chronic obstructive pulmonary disease (COPD) with multiple hospitalizations within a 6-month period or at least one intensive care unit (ICU) admission for CHF or COPD during the study period. Hospitalizations for chemotherapy or that lasted less than 48 hours were excluded, leaving us with a sample of 6,716 inpatient stays (3,389 unique patients) where a life-limiting illness left the patient eligible for a PC consult. This study was approved by the Institutional Review Boards of the Mount Sinai School of Medicine and the VA Medical Centers, where the data originated or where researchers were involved in the study. Further details about these datasets and sample selection are available elsewhere (Penrod et al. 2010).
Main Outcome and Control Variables
Our outcome variable is total direct costs per day during the inpatient admission. The main explanatory variable is whether the patient received a PC consultation. We control for advanced disease (advanced cancer, HIV/AIDS, CHF, COPD), principal admitting diagnosis (cardiovascular, pulmonary, cancer, gastrointestinal disease, genitourinary disease, infections), number of comorbidities, whether the patient died during the study period, number of days from discharge to death, age (<65, 65-74, 75-84, 85+), race (black, white, other), Hispanic ethnicity, and VA enrollment priority group. We also include fixed effects for fiscal year of admission and sites of care.
MODELS AND ANALYSES
In this study, we explore a comprehensive set of analytic strategies, including relatively new and more commonly used methods, that can be applied in studies of health care costs when the regressor of interest is endogenous. Specifically, we compare two-stage least squares (2SLS) on costs and the natural log of costs (log-2SLS), CF methods in the context of generalized linear models employing the Gamma family of distributions, including the specific case of 2SRI, and a parametric model of Gamma distributed costs estimated using FIMSL. We briefly describe the pros and cons of each of these methods, as well as issues affecting model specification. In each analysis, standard errors are adjusted for clustering by patient to account for possibly dependent observations among patients who had more than one hospital stay. Analyses were conducted with Stata 11 (StataCorp 2009).
The structural forms of statistical models with a continuous outcome and an endogenous binary treatment can be represented with the following two equations. First, let the probability of treatment $d_i$ be denoted by
$$\Pr(d_i = 1 \mid z_i, I_i) = g(z_i'\alpha + \delta I_i) \qquad (1)$$
where $z$ denotes observed covariates, $I_i$ denotes latent (unobserved) characteristics, and $g$ is the distribution of the implied error term. Next, let the expected value of the outcome $Y_i$ be
$$E(Y_i \mid x_i, d_i, I_i) = f(x_i'\beta + \gamma d_i + \lambda I_i) \qquad (2)$$
where $x$ denotes a vector of observed characteristics and $f$ denotes the distribution of the implied error term in the outcome equation. Endogeneity of treatment probability ($d_i$) in the outcome equation is introduced via the common unobserved characteristics ($I_i$) in each step as long as $\lambda$ and $\delta$ are each not equal to zero.
Two-stage least squares, CF, and FIMSL models can all be conceptualized in this way. Each uses a different set of assumptions about f and g, but each requires that z include at least one variable, known as the instrumental variable (IV), that is not in x and that does not affect Y except through its effect on d. In other words, the IV is a covariate that is correlated with treatment likelihood but not outcome likelihood. In our case, the IV conceptually measures the propensity for requesting a PC consultation by an admitting physician based on his/her own characteristics and beliefs, and independent of his/her experience with the current patient, but influenced by past experience. Physician likelihood of requesting a PC consultation for a given patient was calculated as the ratio of the number of within-sample patient encounters with a recorded PC consultation to the total number of within-sample patient encounters prior to that patient. For example, if we wanted the likelihood of a physician requesting a consultation for his/her fourth patient, and consultations were recorded for two of his/her last three patients, the likelihood was 0.67. The likelihood for the first and second patients seen by each physician was computed as the grand mean of the likelihood of a PC consultation for all physicians seeing their first and second patient, respectively. This essentially is a heuristic Bayesian approach where we initially shrink to the grand mean for all physicians and then shift to no shrinkage once sufficient experience is recorded. The value of the IV varies within and across physicians; more information about physician practice patterns became available with each successive patient encounter. For observations where physician identity was missing, the grand mean of the likelihood of a PC consultation for all physicians seeing their first patient was substituted for PC consultation likelihood.
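The construction of the instrument described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the input layout (a chronologically ordered list of physician identifier and consultation indicator pairs), and the toy data are hypothetical, and the "grand mean" for first and second encounters is read here as the share of all physicians' first (or second) within-sample encounters that had a PC consultation.

```python
from collections import defaultdict

def physician_pc_propensity(encounters):
    """Instrument value for each encounter.

    encounters: list of (physician_id or None, pc_consult as 0/1),
    assumed to be sorted chronologically (hypothetical input layout).
    """
    # Pass 1: position of each encounter within its physician's sequence.
    seen = defaultdict(int)
    order = []
    for phys, _ in encounters:
        if phys is None:
            order.append(None)
        else:
            seen[phys] += 1
            order.append(seen[phys])

    # Grand mean of PC consultation over all k-th encounters (k = 1, 2).
    def grand_mean(k):
        vals = [pc for (_, pc), o in zip(encounters, order) if o == k]
        return sum(vals) / len(vals) if vals else float("nan")

    gm = {1: grand_mean(1), 2: grand_mean(2)}

    # Pass 2: grand mean for early encounters, prior-encounter ratio afterwards.
    prior_n = defaultdict(int)
    prior_pc = defaultdict(int)
    iv = []
    for (phys, pc), o in zip(encounters, order):
        if phys is None:
            iv.append(gm[1])          # missing physician identity
            continue
        if o <= 2:
            iv.append(gm[o])          # shrink to the grand mean
        else:
            iv.append(prior_pc[phys] / prior_n[phys])
        prior_n[phys] += 1
        prior_pc[phys] += pc
    return iv

# Toy data: physician "A" has four encounters, "B" has two, one is unidentified.
toy = [("A", 1), ("B", 0), ("A", 1), ("A", 0), ("A", 1), (None, 0), ("B", 1)]
iv = physician_pc_propensity(toy)
```

In the toy data, physician A's fourth encounter follows consultations for two of three prior patients, so its instrument value is 2/3, matching the worked example in the text.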
In our 2SLS models, the first-stage F statistic for our IV was 10.95 (p < .001), which is larger than the rule-of-thumb value of 10, and the chi-square value for the Anderson canonical correlation likelihood ratio test was significant (p < .001), suggesting that our IV is sufficiently strong and relevant. Similar measures of historical physician practice patterns have been used as strong IVs with non-VA data (Rassen et al. 2009). Additionally, we believe our IV is exogenous because the schedule of admitting physicians is independent of patient health (Penrod et al. 2010). Concerns about patients "shopping" for providers and the impact of this on the IV's validity in non-VA settings (Rassen et al. 2009) do not apply here, as attending physicians are scheduled up to a year in advance and are assigned to wards on rotating time blocks (William Hung, MD, personal communication). Patients do not choose the attending physician when they are hospitalized. Moreover, the sites included in our study had sufficient numbers of attending physicians to ensure within-site variation in the IV (the number of attending physicians per site ranged from 53 to 101; the number of unique values of the IV per site varied from 124 to 222).
Two-Stage Least Squares
Two-stage least squares is an IV model that extends the linear ordinary least-squares model to account for endogeneity of key regressors. In typical health care cost implementations, it is assumed that treatment modality (in our case, PC consultation) or insurance status is endogenous. Although statistical packages implement 2SLS as a "one step" procedure, it is instructive to think of the two steps involved in the model. First, PC consultation likelihood is modeled using all the exogenous covariates and the IV of physician likelihood of requesting a PC consultation. The predicted PC consultation likelihood is then included in a second linear model with cost as the outcome. 2SLS is often used because it is simple, is straightforward to interpret, and requires only minimal assumptions on the error term's distribution. For severely skewed outcomes such as health care costs, however, 2SLS is usually grossly inefficient and thus can produce substantially misleading estimates in finite samples.
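The two steps can be illustrated on simulated data. The sketch below is not the analysis reported here: the data-generating values, variable names, and the helper `ols` are all hypothetical, and the second-stage standard errors from such a manual procedure would be incorrect (packaged 2SLS routines adjust them), so only point estimates are computed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                      # instrument
u = rng.normal(size=n)                      # unobserved confounder
x = rng.normal(size=n)                      # exogenous covariate
d = (0.8 * z + u + rng.normal(size=n) > 0).astype(float)  # endogenous treatment
y = 1.0 + 2.0 * d + 0.5 * x + u + rng.normal(size=n)      # true effect of d is 2.0

def ols(X, y):
    """Least-squares coefficients for y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)

# Step 1: model treatment with the exogenous covariates and the instrument.
Z = np.column_stack([ones, x, z])
d_hat = Z @ ols(Z, d)

# Step 2: replace the treatment indicator with its first-stage prediction.
effect_2sls = ols(np.column_stack([ones, x, d_hat]), y)[2]

# Naive OLS for comparison; it absorbs the confounder u into the effect.
effect_ols = ols(np.column_stack([ones, x, d]), y)[2]
```

In this simulated setting the naive estimate is biased upward by the shared unobserved factor, while the two-step estimate recovers a value near the true effect.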
Two-Stage Least Squares on Transformed Dependent Variables (log 2SLS)
To address the problems of skewed data involved in 2SLS models of costs, many researchers use natural log transformations of costs. Problems arise when one tries to retransform the outcome back to costs to calculate marginal or incremental effects. Simply taking the antilog of the predicted value will lead to biased estimates (Manning 1998). "Smearing" estimators such as the Duan smearing estimator have been developed to account for this bias, but these are generally only appropriate when error terms are homoskedastic (Duan 1983; Duan et al. 1983). They can be extended so that separate smearing estimators are calculated for each source of heteroskedasticity when sources can be easily identified (Manning 1998). Empirically, it appears that heteroskedasticity is common, but that sources of heteroskedasticity cannot be easily identified, making the use of smearing estimators with 2SLS ill-advised in this case.
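The retransformation problem and the homoskedastic smearing correction can be sketched as follows. The simulated costs and all parameter values are hypothetical; the only technique shown is the Duan (1983) smearing factor, computed as the sample mean of the exponentiated log-scale residuals, under the assumption that the errors are homoskedastic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
x = rng.normal(size=n)
# Hypothetical log-scale model with homoskedastic normal errors (sd = 0.5).
cost = np.exp(7.0 + 0.2 * x + rng.normal(scale=0.5, size=n))

X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, np.log(cost), rcond=None)[0]
resid = np.log(cost) - X @ b

# Taking the antilog of the fitted values alone understates expected cost.
naive_pred = np.exp(X @ b)

# Smearing factor: mean of exponentiated residuals (at least 1 by Jensen's
# inequality, because the residuals average to zero when an intercept is fit).
smear = np.mean(np.exp(resid))
smeared_pred = smear * naive_pred
```

With heteroskedastic errors a single factor of this kind is no longer appropriate, which is the concern raised above.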
The CF approach is a general procedure for addressing regressor endogeneity. The theory, developed by Heckman and Robb (1985) and Newey, Powell, and Vella (1999), shows that there exists a function of the residuals--the CF--from the model that predicts treatment, which produces the correct adjustment for its endogeneity in the outcome equation. The appropriate specification of the CF depends, in general, on the distributions of the treatment and outcome (Newey, Powell, and Vella 1999; Lee 2007). This approach uses some principles of IV regression, but rather than including the predicted probability of treatment in the outcome equation, a flexible function of the residuals from the treatment equation is substituted into the outcome equation.
Two-stage residual inclusion is a special case of the CF approach that has gained popularity in recent years (Lee 2007; Terza, Basu, and Rathouz 2008). In 2SRI, analysts use the response or raw residual from the treatment equation and add it into the outcome equation (e.g., Stuart, Doshi, and Terza 2008; Hadley et al. 2010; Trogdon, Nurmagambetov, and Thompson 2010). Except in certain special cases, however, 2SRI is a misapplication of the CF approach because the theory only demonstrates that some function of the residuals is the appropriate CF, not that the appropriate CF is the residual itself (i.e., a linear function of the residuals).
Basu and Manning (2009) noted that more research needs to be done on the appropriate functional form and specification of residuals in order for CF methods to behave optimally in nonlinear studies. In nonlinear models, not only is there the issue of which function of residuals to choose but also the issue of how to form and interpret residuals. For a binary treatment, residuals can be constructed in at least four ways: response, Pearson, Anscombe, and deviance. These are equivalent in linear but not in nonlinear settings (Gill 2001). Response residuals are the difference between the binary indicator for observed treatment and the predicted probability of treatment. Pearson residuals are constructed by dividing the response residual by the estimated standard deviation of the treatment indicator, that is, the square root of p(1 - p), where p is the predicted treatment probability. Anscombe residuals are calculated using a complex transformation of the observed indicator for treatment, the predicted treatment probability, and the variance of that prediction. The Anscombe residual transformation is chosen to achieve normality of the residuals (Gill 2001). Deviance residuals are often numerically similar to Anscombe residuals but are calculated more simply, by taking the square root of an observation's contribution to the deviance that is minimized to obtain GLM parameter estimates (Pierce and Schafer 1986; Gill 2001).
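Three of these residual types have simple closed forms for a binary treatment and can be sketched directly. The function name and example values are hypothetical. Anscombe residuals are omitted because, for the binomial family, they require an incomplete beta function; predicted probabilities are assumed to lie strictly between 0 and 1.

```python
import numpy as np

def binary_residuals(d, p):
    """Response, Pearson, and deviance residuals for a binary treatment.

    d: observed 0/1 treatment indicators
    p: predicted treatment probabilities, assumed strictly inside (0, 1)
    """
    d = np.asarray(d, dtype=float)
    p = np.asarray(p, dtype=float)
    response = d - p
    # Scale by the estimated standard deviation of a Bernoulli(p) outcome.
    pearson = response / np.sqrt(p * (1.0 - p))
    # Signed square root of each observation's contribution to the deviance.
    loglik = d * np.log(p) + (1.0 - d) * np.log(1.0 - p)
    deviance = np.sign(response) * np.sqrt(-2.0 * loglik)
    return {"response": response, "pearson": pearson, "deviance": deviance}

# A treated and an untreated observation, both with predicted probability 0.8.
res = binary_residuals([1, 0], [0.8, 0.8])
```

The untreated observation with a high predicted probability receives a much larger Pearson residual (in absolute value) than response residual, which illustrates why the choice of residual can matter in the outcome equation.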
O'Malley, Frank, and Normand (2011) found that CFs are robust to the treatment equation's distribution of response residuals if the outcome equation's distribution of response residuals is symmetric. However, the outcome equation's distribution of response residuals will not be symmetric in the case of a skewed outcome such as health care costs, so it is unclear whether the CF model would remain robust to the treatment equation's distribution of response residuals.
In our case, we model treatment with GLM using a binomial distribution and logit link and outcome with GLM using a gamma distribution and log link. The GLM with the gamma distribution is chosen for its increasing popularity in cost studies (e.g., Blough, Madden, and Hornbrook 1999; Manning and Mullahy 2001). For each residual type, we estimated models using first- through third-degree polynomials of residuals as reasonable ways to add nonlinearity and flexibility relative to the simple 2SRI approach.
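A minimal sketch of this two-equation CF procedure follows, using response residuals only. It is an illustration on simulated data, not the Stata implementation used in this study: the fitting routines `fit_logit` and `fit_gamma_log` are hand-written helpers (Newton-Raphson and Fisher scoring, respectively), and every parameter value in the data-generating process is hypothetical.

```python
import numpy as np

def fit_logit(X, d, iters=50):
    """Newton-Raphson fit of a logit model; returns the coefficient vector."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ b)))
        w = p * (1.0 - p)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (d - p))
        b = b + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return b

def fit_gamma_log(X, y, iters=200):
    """Fisher scoring for a gamma GLM with log link.

    For this family/link pair the working weights are constant, so each
    step is an unweighted least-squares fit to the working response.
    """
    b = np.linalg.lstsq(X, np.log(y), rcond=None)[0]  # start from log-OLS
    for _ in range(iters):
        eta = X @ b
        mu = np.exp(eta)
        b_new = np.linalg.lstsq(X, eta + (y - mu) / mu, rcond=None)[0]
        converged = np.max(np.abs(b_new - b)) < 1e-8
        b = b_new
        if converged:
            break
    return b

# Simulated data; u is the unobserved factor shared by treatment and cost.
rng = np.random.default_rng(2)
n = 10_000
x = rng.normal(size=n)
z = rng.normal(size=n)                                # instrument
u = rng.normal(size=n)
d = (z + u + rng.logistic(size=n) > 0).astype(float)
mu = np.exp(7.0 - 0.3 * d + 0.2 * x + 0.5 * u)        # true log-scale effect: -0.3
y = rng.gamma(shape=2.0, scale=mu / 2.0)

ones = np.ones(n)

# Treatment equation: logit of d on covariates and the instrument.
Z = np.column_stack([ones, x, z])
p_hat = 1.0 / (1.0 + np.exp(-(Z @ fit_logit(Z, d))))
r = d - p_hat                                         # response residual

# Outcome equation without any endogeneity correction, for comparison.
naive_effect = fit_gamma_log(np.column_stack([ones, x, d]), y)[2]

# Outcome equation with first- through third-degree residual polynomials.
cf_effects = {}
for degree in (1, 2, 3):
    powers = [r ** k for k in range(1, degree + 1)]
    X_cf = np.column_stack([ones, x, d] + powers)
    cf_effects[degree] = fit_gamma_log(X_cf, y)[2]
```

The degree-1 entry corresponds to 2SRI. Swapping `r` for Pearson or deviance residuals changes only the line that defines it, which is what makes comparing residual types straightforward.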
Full Information Maximum Simulated Likelihood
Nonlinear models for treatment effects also can be estimated directly from the structural forms shown in equations (1) and (2). In the linear case, the solution yields the familiar 2SLS model. In the nonlinear case, while the joint distribution of selection and outcome variables conditional on the common latent factors can be derived, the estimation problem arises because the $I_i$ are unknown. It is not always feasible to write down the unconditional likelihood in closed form, but a simulated log-likelihood function for the data can be defined and maximized as in Deb and Trivedi (2006a, b) and as applied in Buntin et al. (2010). We use quasi-random methods involving Halton sequences to generate the simulated likelihood. These are more efficient than pseudo-random number draws; they cover the distribution of the integral's domain more evenly and have lower variance because successive draws are negatively correlated with each other (Deb and Trivedi 2006a; Haan and Uhlendorff 2006). Generally, they are also more computationally efficient than using quadrature-based numerical integration methods (Haan and Uhlendorff 2006).
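The role of Halton draws can be illustrated with a toy integral. In a simulated likelihood, the quantity averaged over draws of the latent factor would be each observation's conditional likelihood contribution; the sketch below instead averages exp(0.5 I) over a standard normal latent factor I, chosen only because its exact expectation, exp(0.125), is known. The function name and the number of draws are hypothetical choices.

```python
import math
from statistics import NormalDist

def halton(index, base):
    """Radical-inverse value of a 1-based index in the given prime base."""
    result, f = 0.0, 1.0 / base
    i = index
    while i > 0:
        result += f * (i % base)
        i //= base
        f /= base
    return result

# The first base-2 Halton points fill the unit interval evenly.
first_four = [halton(i, 2) for i in range(1, 5)]   # 0.5, 0.25, 0.75, 0.125

# Map uniform Halton points to standard normal draws of the latent factor,
# then average the integrand over the draws.
n_draws = 1023
std_normal = NormalDist()
draws = [std_normal.inv_cdf(halton(i, 2)) for i in range(1, n_draws + 1)]
approx = sum(math.exp(0.5 * v) for v in draws) / n_draws
exact = math.exp(0.125)
```

Because successive points fill the gaps left by earlier ones, the average stabilizes with far fewer draws than independent pseudo-random sampling would typically need.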
To compare the results of the methods described above, we examine several measures of treatment effects. For each method, the goal is to understand the partial effect of a treatment (PC consultation) on health care costs and to use the instrument effectively to make plausible causal inferences. In nonlinear models, partial effects differ across observations. Thus, comparing marginal or incremental effects at the means of a distribution or presenting average marginal or incremental effects can be quite misleading.
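The point that partial effects differ across observations in a nonlinear model can be made concrete with a log-link mean function. In the sketch below, the coefficients and the single covariate are hypothetical; the incremental effect of treatment for each observation is the difference between its predicted cost with and without treatment.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000
x = rng.normal(size=n)

# Hypothetical log-link coefficients: intercept, covariate, treatment.
beta0, beta1, gamma = 7.0, 0.4, -0.3
xb = beta0 + beta1 * x

# Observation-level incremental effect of switching treatment from 0 to 1.
effects = np.exp(xb + gamma) - np.exp(xb)

mean_effect = effects.mean()            # average incremental effect
median_effect = np.median(effects)
# Incremental effect evaluated at the mean of the covariate.
xb_bar = beta0 + beta1 * x.mean()
effect_at_means = np.exp(xb_bar + gamma) - np.exp(xb_bar)
effect_range = (effects.min(), effects.max())
```

Every observation has a negative effect here, yet the magnitudes differ widely, and the effect at the covariate means understates the average effect in absolute value because the exponential mean function is convex.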
For whom the marginal or incremental effect is calculated depends on the research question. Because this paper is focused on overall model comparison, we do not focus on a particular treatment or incremental effect. Instead, we show how models differ in several measures of treatment effects, including the local average treatment effect (LATE), the average treatment effect on the treated (ATET), and the sample distribution of incremental effects. For the LATE and ATET, we also report estimates of empirical standard errors obtained via nonparametric bootstrap replication.
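A nonparametric bootstrap of a summary treatment effect can be sketched as below. Two assumptions go beyond the text: resampling is done by patient, so that all stays for a sampled patient enter together, and the statistic is recomputed from fixed observation-level effects rather than by re-estimating the full model in every replicate, which a complete implementation would do. The function name and toy data are hypothetical.

```python
import numpy as np

def cluster_bootstrap_se(values, cluster_ids, stat, reps=500, seed=0):
    """Bootstrap standard error of stat(values), resampling whole clusters."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    clusters = np.unique(cluster_ids)
    rows_by_cluster = [np.flatnonzero(cluster_ids == c) for c in clusters]
    replicates = []
    for _ in range(reps):
        # Draw clusters with replacement and keep all of their rows.
        picked = rng.integers(0, len(clusters), size=len(clusters))
        rows = np.concatenate([rows_by_cluster[k] for k in picked])
        replicates.append(stat(values[rows]))
    return float(np.std(replicates, ddof=1))

# Toy data: 300 patients with two stays each and a shared patient-level shift.
rng = np.random.default_rng(4)
patient = np.repeat(np.arange(300), 2)
toy_effects = -400.0 + 50.0 * rng.normal(size=300)[patient] + 20.0 * rng.normal(size=600)

se_mean = cluster_bootstrap_se(toy_effects, patient, np.mean)
se_median = cluster_bootstrap_se(toy_effects, patient, np.median)
```

Passing `np.mean` or `np.median` as `stat` yields standard errors for the mean and median effect, the two summaries reported for each model.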
While the importance of the distribution of marginal and incremental effects across sample observations has been acknowledged by others (e.g., Hoderlein, Klemela, and Mammen 2010), few studies report characteristics of these distributions in their results. It is important to note, however, that distributions of incremental effects are sensitive to assumptions made about the parametric form of the data-generating process. Here, we compare incremental effect distributions across models with similar parametric forms for the conditional mean. Thus, we expect differences in results to be due to features of the models other than their parametric forms.
The ATET is the treatment effect for patients who had a PC consultation. The LATE represents the treatment effect for two patient groups: (1) those who received a PC consultation when ordered by a VA physician and (2) those who did not receive a PC consultation when one was not ordered. We are unable to observe "noncompliers" who received a PC consultation outside of the VA. Because we are unable to observe compliance, we calculate a LATE. In some cases, the LATE can equal the ATET; this occurs if no one with a treatment value of 0 receives treatment and if there is complete compliance among those with a treatment value of 1 (Angrist, Imbens, and Rubin 1996). Other treatment effects that could be reported include conditional ATE and conditional ATET. More detail about treatment effect interpretations is available elsewhere (Angrist, Imbens, and Rubin 1996; MaCurdy, Chen, and Hong 2011; O'Malley, Frank, and Normand 2011).
In our study, 671 patients received a PC consultation, and 2,718 patients did not. Most (67 percent) patients were age 65 or older, 25 percent had diagnoses of advanced cancer, 2 percent had advanced HIV/AIDS, 46 percent had advanced CHF, and 51 percent had COPD (Table 1). The mean number of comorbidities at first hospitalization was 2.1 (standard deviation [SD] = 1.3), and 51 percent died during the study period (92 percent of PC patients and 41 percent of non-PC patients). A breakdown of characteristics by PC and non-PC users is available elsewhere (Penrod et al. 2010).
Mean cost per day was $1,232.93 (SD = 746.00), with a median of $1,020.50 and interquartile range of $813.53-$1,429.79 (Figure 1). The mean natural log of costs per day was 7.01 (SD = 0.43), with a median of 6.93 and interquartile range of 6.70-7.27. While costs are severely skewed, the distributions of the natural log of costs and of the residuals from an OLS regression of log costs on exogenous variables are fairly symmetric, although they still fail tests of normality (results available from authors). Note, however, that the CF and FIMSL methods we discuss do not require the natural log of costs to be normally distributed. Thus, while the relative symmetry of the distribution of natural log of costs is a useful observation and may increase model stability, it is not technically required.
The mean and median of the distribution of LATEs of a PC consultation on costs, and the range and variance of the incremental effect distributions, were quite sensitive to model specification, with CF approaches and FIMSL being the most similar to each other. Further details about the model estimates are described below.
2SLS on Costs and on Natural Log of Costs. The mean and median LATE of PC consultation on costs per day was -4,183.73 under the 2SLS cost model (bootstrapped standard error [SE] = 1,899.63; Table 2). Because the incremental effect and slope of the regression model are equal in linear models, the estimated incremental effect is identical for all individuals.
For log-2SLS, the median LATE of PC consultations on costs per day was -1,743.87, and the mean was -2,180.00. The bootstrapped SEs of the median and mean LATE from this model are larger than those obtained from 2SLS (257,453.1; 9,749,947, respectively). Note that estimates from the log-2SLS model were re-transformed into dollars with a homoskedastic nonparametric retransformation (Duan 1983).
[FIGURE 1 OMITTED]
CF Approaches. We calculated CF estimates using response, Pearson, Anscombe, and deviance residuals. We included first- through third-degree polynomials of each residual type to allow flexible nonlinearities. In each case, the LATE distribution was much tighter than those obtained via 2SLS; standard errors were at least an order of magnitude smaller. The models with first-degree response residuals are the 2SRI estimates currently popular in the literature (e.g., Pizer 2009; Hadley et al. 2010). With first-degree response residuals, the mean LATE of PC consultations on costs was -73.07 (SE = 98.45), and the median was -72.47 (SE = 97.23). With first-degree Pearson residuals, the mean LATE was -33.31 (SE = 44.90) and the median LATE was -32.94 (SE = 44.42). With third-degree Pearson residuals, the mean LATE was -70.93 (SE = 87.88) and the median LATE was -70.39 (SE = 86.81). The mean and median LATEs were more substantial in both the third-degree Anscombe and deviance residuals (third-degree Anscombe: mean LATE = -183.26 [SE = 145.60], median LATE = -181.50 [SE = 143.05]; third-degree deviance: mean LATE = -278.35 [SE = 166.42], median LATE = -274.72 [SE = 161.83]). The specifications with Anscombe and deviance residuals had the largest standard errors among the estimates obtained from CF models. As previously noted, Anscombe and deviance residuals are often similar; thus, it is not surprising that results using those residuals are similar.
Full Information Maximum Simulated Likelihood. The mean and median LATEs estimated from FIMSL (-432.11 [SE = 18.47], -426.02 [SE = 18.24], respectively) were most similar to the third-degree polynomial CF estimates. The standard errors of the FIMSL model were, as theoretically expected, the smallest of any model we tested.
Average Treatment Effect on the Treated. The incremental effect of a PC consultation on costs for those who received a consultation (ATET) was sensitive to model specification and followed the same pattern as the LATEs, as expected. The magnitude of the median ATETs was ranked the same as the magnitude of median LATEs (i.e., 2SRI provided smaller estimated treatment effects than CF with third-degree Anscombe or deviance residuals, and FIMSL provided effects most similar to those obtained from CF with third-degree Anscombe or deviance residuals; Table 2). Because ATETs focus on those who did receive treatment, we expected them to be of larger magnitude than the LATEs, which cover more of the sample. For example, FIMSL produced a median LATE of -426.02 and a median ATET of -459.70 (Table 2). However, this trend was only witnessed for log-2SLS, third-degree response, Anscombe and deviance CF models, and FIMSL.
Incremental Effect Distributions. The distribution of LATEs of log-2SLS was much different than the distributions from any of the CF models or FIMSL (Figure 2). The 2SLS and log-2SLS estimates were larger (had larger negative values) than the estimates from other models, and the range of log-2SLS incremental effects was wider than the range of any other model estimates. The distributions of the FIMSL, third-degree Anscombe CF and third-degree deviance CF were clustered relatively closely together with smaller bootstrapped standard errors (Figure 3).
By estimating treatment effects from an empirical dataset using a variety of econometric methods, we obtained considerable insight into the relative performance of methods used to estimate nonlinear models with endogenous variables. Some, such as 2SLS on costs and log costs, produce larger point estimates with considerably larger standard errors as compared with all the other models. The other approaches, namely CFs and FIMSL, respect the binary nature of the treatment and skewness of the outcome's distribution and produce estimates that are closer to each other with standard errors that are also similar to each other. Our recommendations for other researchers performing cost analyses will focus on these two approaches that account for both endogeneity and distributional complexity in potentially useful ways.
[FIGURE 2 OMITTED]
As mentioned above, the estimates obtained from CF methods were different when using first-degree response residuals (2SRI) compared with third-degree Anscombe or deviance residuals in the second equation. This is concerning because the majority of applications of CF approaches in the health services literature have been using response residuals. Response residuals from binary choice models are known to be heteroskedastic and skewed, and residual inclusion (i.e., first-degree polynomials) may not be a sufficiently flexible special case of the CF approach. This calls into question the use of 2SRI when modeling nonlinear outcomes such as health care costs.
[FIGURE 3 OMITTED]
The estimates from the FIMSL model were the most similar to those of the CF third-degree Anscombe and deviance residuals. The clustering of these estimate distributions suggests that we have robust estimates of the effect of PC consultations on costs and these have the most plausible causal implications in our data. In our case, FIMSL provided similar results to models with flexible CFs and with lower variance for estimates. Further examination with other datasets is needed to determine how often FIMSL results mirror those from CF models.
A limitation of IV methods with a single instrument is the inability to test the exclusion restriction. While we believe that physician identity is related to treatment likelihood and not to costs, it is possible that physicians with more favorable attitudes toward PC also have different attitudes about the use of other treatments that could affect cost. If this is the case, the exclusion restriction would be violated for our instrument. It is important to note that the results of each model may change in different directions or magnitude in response to a biased instrument; the impact of instrument bias on the relative performance of these models is an important area for further research.
Recommendations for Health Care Cost Analyses
When reporting results of cost analyses, researchers should begin with an understanding of the assumptions under which estimates are valid. Because it is impossible to test all assumptions, one should consider checking the robustness of these results by obtaining estimates from a variety of models with assumptions that are reasonable for the particular research question. One way to evaluate the robustness of results is through careful examination of the marginal or incremental effect distribution, while keeping in mind that these distributions are sensitive to assumptions made in the underlying models. Through Figures 2 and 3, for example, we saw that the distributions of incremental effects estimated from 2SRI overlap with those from CF with third-degree Pearson residuals but hardly overlap with the distributions of the other CF and FIMSL models.
Furthermore, we stress the importance of accounting for nonlinearity and endogeneity at the same time. Models that account for endogeneity only, such as 2SLS, are of limited use here because they are not designed for nonlinear outcomes. Models that are not sufficiently flexible, such as 2SRI, may not be the optimal choice for cost models.
In this study, we empirically compared several different methods of cost analyses and the degree to which they take into account nonlinearity and endogeneity to develop plausible causal implications of an important health services research question. 2SLS models provide grossly different and inaccurate estimates in the case of cost data as compared with a variety of other models. While most nonlinear methods produce similar mean and median effects, the incremental effect distributions still vary in range, variance, and skewness. We echo others in stating that no single model is a priori optimal (e.g., Manning and Mullahy 2001), but researchers should take care to model their data in several different ways, and they should pay attention to the definition and implications of their chosen treatment effect estimates.
Joint Acknowledgment/Disclosure Statement: This material is based on work supported in part by the Department of Veterans Affairs, Veterans Health Administration, Office of Research and Development, Health Services Research and Development Service (IAD-06-060-2; REA 08-260). This work was also supported in part by a NIH/NIA Claude D. Pepper Older Americans Independence Center (1P30AG28741-01). The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs or the United States government.
Portions of this work were presented at the 2011 AcademyHealth Annual Research Meeting and a VA Health Economics Resource Center cyber seminar.
Angrist, J. D., G. W. Imbens, and D. B. Rubin. 1996. "Identification of Causal Effects Using Instrumental Variables." Journal of the American Statistical Association 91 (434): 444-55.
Basu, A., and W. G. Manning. 2009. "Issues for the Next Generation of Cost Analyses." Medical Care 47 (7 suppl 1): S109-14.
Blough, D. K., C. W. Madden, and M. C. Hornbrook. 1999. "Modeling Risk Using Generalized Linear Models." Journal of Health Economics 18: 153-71.
Buntin, M., and A. M. Zaslavsky. 2004. "Too Much Ado about Two-Part Models and Transformation? Comparing Methods of Modeling Medicare Expenditures." Journal of Health Economics 23: 525-42.
Buntin, M. B., C. H. Colla, P. Deb, N. Sood, and J. J. Escarce. 2010. "Medicare Spending and Outcomes after Postacute Care for Stroke and Hip Fracture." Medical Care 48: 776-84.
Deb, P., and P. K. Trivedi. 2006a. "Maximum Simulated Likelihood Estimation of a Negative Binomial Regression Model with Multinomial Endogenous Treatment." Stata Journal 6 (2): 246-55.
--. 2006b. "Specification and Simulated Likelihood Estimation of a Non-Normal Treatment-Outcome Model with Selection: Application to Health Care Utilization." Econometrics Journal 9: 307-31.
Duan, N. 1983. "Smearing Estimate: A Nonparametric Retransformation Method." Journal of the American Statistical Association 78 (383): 605-10.
Duan, N., W. G. Manning, C. N. Morris, and J. P. Newhouse. 1983. "A Comparison of Alternative Models for the Demand for Medical Care." Journal of Business & Economic Statistics 1 (2): 115-26.
Gill, J. 2001. Generalized Linear Models: A Unified Approach. Sage University Paper Series on Quantitative Applications in the Social Sciences, Series no. 07-134. Thousand Oaks, CA: Sage.
Haan, P., and A. Uhlendorff. 2006. "Estimation of Multinomial Logit Models with Unobserved Heterogeneity Using Maximum Simulated Likelihood." Stata Journal 6 (2): 229-45.
Hadley, J., K. R. Yabroff, M. J. Barrett, D. F. Penson, C. S. Saigal, and A. L. Potosky. 2010. "Comparative Effectiveness of Prostate Cancer Treatments: Evaluating Statistical Adjustments for Confounding in Observational Data." Journal of the National Cancer Institute 102: 1-14.
Heckman, J. J., and R. Robb. 1985. "Alternative Methods for Evaluating the Impact of Interventions: An Overview." Journal of Econometrics 30: 239-67.
Hoderlein, S., J. Klemela, and E. Mammen. 2010. "Analyzing the Random Coefficient Model Nonparametrically." Econometric Theory 26: 804-37.
Huang, Y. 2009. "Cost Analysis with Censored Data." Medical Care 47 (7 suppl 1): S115-9.
Lee, S. 2007. "Endogeneity in Quantile Regression Models: A Control Function Approach." Journal of Econometrics 141: 1131-58.
MaCurdy, T., X. Chen, and H. Hong. 2011. "Flexible Estimation of Treatment Effect Parameters." American Economic Review: Papers & Proceedings 101 (3): 544-51.
Manning, W. G. 1998. "The Logged Dependent Variable, Heteroscedasticity, and the Retransformation Problem." Journal of Health Economics 17: 283-95.
Manning, W. G., and J. Mullahy. 2001. "Estimating Log Models: To Transform or Not to Transform?" Journal of Health Economics 20: 461-94.
Morrison, R. S., J. D. Penrod, J. B. Cassel, M. Caust-Ellenbogen, A. Litke, L. Spragens, and D. E. Meier. 2008. "Cost Savings Associated with US Hospital Palliative Care Consultation Programs." Archives of Internal Medicine 168 (16): 1783-90.
Newey, W. K., J. L. Powell, and F. Vella. 1999. "Nonparametric Estimation of Triangular Simultaneous Equations Models." Econometrica 67 (3): 565-603.
O'Malley, A. J., R. G. Frank, and S.-L. T. Normand. 2011. "Estimating Cost-Offsets of New Medications: Use of New Antipsychotics and Mental Health Costs for Schizophrenia." Statistics in Medicine 30: 1971-88.
Penrod, J. D., P. Deb, C. Dellenbaugh, J. F. Burgess, C. W. Zhu, C. L. Christiansen, C. A. Luhrs, T. Cortez, E. Livote, V. Allen, and R. S. Morrison. 2010. "Hospital-Based Palliative Care Consultation: Effects on Hospital Cost." Journal of Palliative Medicine 13 (8): 973-9.
Pierce, D. A., and D. W. Schafer. 1986. "Residuals in Generalized Linear Models." Journal of the American Statistical Association 81 (396): 977-86.
Pizer, S. D. 2009. "An Intuitive Review of Methods for Observational Studies of Comparative Effectiveness." Health Services and Outcomes Research Methodology 9: 54-68.
Rassen, J. A., M. A. Brookhart, R. J. Glynn, M. A. Mittleman, and S. Schneeweiss. 2009. "Instrumental Variables II: Instrumental Variable Application--In 25 Variations, the Physician Prescribing Preference Generally Was Strong and Reduced Covariate Imbalance." Journal of Clinical Epidemiology 62 (12): 1233-41.
StataCorp. 2009. Stata Statistical Software: Release 11. College Station, TX: StataCorp LP.
Stuart, B. C., J. A. Doshi, and J. V. Terza. 2008. "Assessing the Impact of Drug Use on Hospital Costs." Health Services Research 44 (1): 128-44.
Stukel, T. A., E. S. Fisher, D. E. Wennberg, D. A. Alter, D. J. Gottlieb, and M. J. Vermeulen. 2007. "Analysis of Observational Studies in the Presence of Treatment Selection Bias: Effects of Invasive Cardiac Management on AMI Survival Using Propensity Score and Instrumental Variable Methods." Journal of the American Medical Association 297: 278-85.
Terza, J. V. 1998. "Estimating Count Data Models with Endogenous Switching: Sample Selection and Endogenous Treatment Effects." Journal of Econometrics 84: 129-54.
Terza, J. V., A. Basu, and P. J. Rathouz. 2008. "Two-Stage Residual Inclusion Estimation: Addressing Endogeneity in Health Econometric Modeling." Journal of Health Economics 27: 531-43.
Trogdon, J. G., T. A. Nurmagambetov, and H. F. Thompson. 2010. "The Economic Implications of Influenza Vaccination for Adults with Asthma." American Journal of Preventive Medicine 39 (5): 403-10.
Weinstein, M. C., and J. A. Skinner. 2010. "Comparative Effectiveness and Health Care Spending - Implications for Reform." New England Journal of Medicine 362 (5): 460-5.
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.
Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
Address correspondence to Melissa M. Garrido, Ph.D., GRECC/REAP, James J. Peters VA Medical Center, Bronx, NY, and Brookdale Department of Geriatrics and Palliative Medicine, Mount Sinai School of Medicine, 130 W. Kingsbridge Road, Bronx, NY 10468; e-mail: email@example.com. Partha Deb, Ph.D., is with the Department of Economics, Hunter College and the Graduate Center, City University of New York, New York, NY, and the National Bureau of Economic Research, Cambridge, MA. James F. Burgess, Jr., Ph.D., is with the Center for Organization, Leadership and Management Research, VA Boston Healthcare System and Department of Health Policy and Management, Boston University School of Public Health, Boston, MA. Joan D. Penrod, Ph.D., is with the GRECC/REAP, James J. Peters VA Medical Center and Brookdale Department of Geriatrics and Palliative Medicine, Mount Sinai School of Medicine, Bronx, NY.
Table 1: Characteristics of Sample (N = 3,389 Patients)

Variable                                              N (%) or Mean (SD)
Age
  <65                                                 1,104 (32.6%)
  65-74                                               833 (24.6%)
  75-84                                               1,092 (32.2%)
  85+                                                 360 (10.6%)
Race
  White                                               2,211 (65.2%)
  Black                                               978 (28.9%)
  Other                                               194 (5.7%)
Hispanic ethnicity                                    142 (4.2%)
Married                                               1,129 (33.3%)
VA enrollment priority
  1-6 (lower income, disabled)                        3,026 (89.3%)
  7-8 (higher income, not disabled)                   345 (10.2%)
Advanced disease diagnosis *
  Cancer                                              844 (24.9%)
  HIV/AIDS                                            56 (1.6%)
  COPD                                                1,720 (50.7%)
  CHF                                                 1,562 (46.1%)
Number of comorbidities at initial hospitalization    2.1 (1.3)
Principal reason for initial hospitalization
  Cardiovascular                                      1,043 (30.8%)
  Pulmonary                                           635 (18.7%)
  Cancer                                              683 (20.1%)
  Gastrointestinal disease                            217 (6.4%)
  Genitourinary disease                               167 (4.9%)
  Infections                                          79 (2.3%)
Number of hospitalizations
  1                                                   1,685 (49.7%)
  2                                                   963 (28.4%)
  3                                                   355 (10.5%)
  4 or more                                           386 (11.4%)
Died during study period                              1,722 (50.8%)
Had a palliative care consultation                    671 (19.8%)

* Patients could have more than one advanced disease diagnosis.
CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; HIV/AIDS, human immunodeficiency virus/acquired immunodeficiency syndrome; VA, Veterans Affairs.
Table 2: Effects of Palliative Care Consultation on Costs

Model                               Mean        Bootstrapped SE   Median      Bootstrapped SE

Local average treatment effects (LATE)
2SLS
  Cost                              -4,183.73   1,899.63          -4,183.73   1,899.63
  Natural log of cost *             -2,180.00   9,794,947         -1,743.87   257,453.1
CF with response residuals
  First-degree polynomial (2SRI)    -73.07      98.45             -72.47      97.23
  Second-degree polynomial          -74.54      98.38             -73.92      97.17
  Third-degree polynomial           -118.02     107.24            -117.03     105.86
CF with Pearson residuals
  First-degree polynomial           -33.31      44.90             -32.94      44.42
  Second-degree polynomial          -68.03      77.42             -67.53      76.52
  Third-degree polynomial           -70.93      87.88             -70.39      86.81
CF with Anscombe residuals
  First-degree polynomial           -11.74      108.44            -11.59      107.07
  Second-degree polynomial          -37.23      135.81            -36.86      134.06
  Third-degree polynomial           -183.26     145.60            -181.50     143.05
CF with deviance residuals
  First-degree polynomial           -31.31      138.23            -30.97      136.46
  Second-degree polynomial          -41.12      152.87            -40.69      150.92
  Third-degree polynomial           -278.35     166.42            -274.72     161.83
FIMSL                               -432.11     18.47             -426.02     18.24

Average treatment effects on the treated (ATET)
2SLS
  Cost                              -4,183.73   1,899.63          -4,183.73   1,899.63
  Natural log of cost *             -3,539.11   4.56 x 10^7       -3,408.63   1.41 x 10^7
CF with response residuals
  First-degree polynomial (2SRI)    -70.98      98.83             -71.03      98.47
  Second-degree polynomial          -72.49      98.97             -72.61      98.47
  Third-degree polynomial           -118.53     114.00            -118.61     113.16
CF with Pearson residuals
  First-degree polynomial           -31.42      43.35             -31.46      43.31
  Second-degree polynomial          -65.84      77.47             -65.81      77.31
  Third-degree polynomial           -68.79      88.13             -68.76      87.94
CF with Anscombe residuals
  First-degree polynomial           -10.90      100.75            -10.90      100.56
  Second-degree polynomial          -35.23      128.56            -35.27      128.33
  Third-degree polynomial           -193.53     167.93            -192.98     166.42
CF with deviance residuals
  First-degree polynomial           -29.50      130.55            -29.49      130.20
  Second-degree polynomial          -39.01      144.82            -38.96      144.47
  Third-degree polynomial           -316.98     218.24            -315.28     215.51
FIMSL                               -460.87     22.59             -459.70     22.86

Notes. Analyses controlled for age, race, ethnicity, marital status, principal diagnosis, number of comorbidities, advanced disease diagnosis, death during study period, days from hospital discharge until death, veteran priority status, facility, and year of admission.
* Applied Duan's smearing estimator.
2SLS, two-stage least squares; CF, control function; 2SRI, two-stage residual inclusion; FIMSL, full information maximum simulated likelihood; SE, standard error.
Title Annotation: METHODS ARTICLE
Author: Garrido, Melissa M.; Deb, Partha; Burgess, James F., Jr.; Penrod, Joan D.
Publication: Health Services Research
Date: Dec 1, 2012