A Comparative Study of Severity Indexes

A number of severity-of-illness indexing systems have been devised over the past several years. Initially, the systems were entirely manual, but more recently use of the computer has advanced the sophistication of the systems. There are two basic types of systems. The first type utilizes the ICD-9 codes to make a statement regarding severity, while the second type uses clinical data abstracted from the patient chart. Of the latter type, the two best known are Computerized Severity Index (CSI) by Health Systems International (HSI) and MedisGroups (MGP) by MediQual.

In 1987, the Iowa Health Data Commission (IHDC), prodded by Health Policy Corporation of Iowa (HPCI), developed an interest in employing such a system to adjust Iowa cost data. A second, but not necessarily less important, motive was to compare outcomes in an effort to learn more about quality.

A task force was formed to evaluate the available systems. At about the same time, it became apparent that this sort of system may have internal value for resource utilization data and possibly to assist with a variety of quality management functions. Because it appeared likely that we would eventually be required to use such a system and that we would likely have a choice of which system to use, we decided to try one. When both HSI and MedisGroups offered to let us try their software without charge, we chose to implement both systems to see which functioned best at our hospital.

CSI was installed on a Personal System 2-80 computer, which also runs our utilization management system. MedisGroups was installed on a Personal System 2-70 computer loaned to us by MediQual. Three people were trained on each system. The educational background and number of years of experience in quality review or data retrieval for these staff members are shown in table 1 below. A group of DRGs was selected for study that included several organ systems and medical specialties, that were sufficiently common to provide adequate numbers of charts in a relatively short time, and that represented the typical types of cases at St. Luke's. All charts were abstracted by both teams so that the two resulting databases contained identical cases. The review schedules were done strictly according to the protocols established by the vendor (table 2, page 12).


Abstracting was accomplished for both systems on 902 patient records. The DRGs that contained 10 or more patients are shown in table 3, page 13. The numbers of patients in each severity group are shown in figure 1, page 17. A preponderance of cases fell into severity group 1 on CSI, but cases were much more evenly distributed among the severity groups in MedisGroups. Of interest is that there were only 18 level four patients on CSI and just two on MedisGroups. This is due in large part to the DRGs selected for the study.

Figure 2, page 17, and figure 3, page 18, show average charges and average length of stay for each severity group. In both cases, average charges increase nearly linearly with severity, as does the average length of stay except for severity group four. In severity group four patients, the average charges on CSI did not increase significantly and the length of stay actually fell on both systems. Most likely, this reflects the fact that there are more deaths in this group of patients, although in our study the numbers are too small to be meaningful.

Figure 4, page 18, shows the relationship between admission severity and mortality. Although the numbers are small, the data suggest that, not surprisingly, mortality increases with admission severity. The high mortality rates in severity group four with both systems indicate that they correctly identify the sickest patients.

We also attempted to compare the severity scores from each system for the same patient. Table 4, page 14, and table 5, page 14, show vividly that substantial disagreement exists between the two systems. For instance, the 131 patients in CSI admission severity group two seem nearly randomly distributed among all the MedisGroups severity groups. It is remarkable that one and four, respectively, of the 18 patients in CSI admission severity group four fall into MedisGroups severity groups zero and one.
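A cross-tabulation of this kind can be sketched in a few lines of Python. The paired scores below are hypothetical and are not the study data, and the function names are illustrative only. The scales assumed are the ones described in this article: CSI groups one through four and MedisGroups groups zero through four.

```python
from collections import Counter

CSI_LEVELS = range(1, 5)   # CSI severity groups 1-4
MGP_LEVELS = range(0, 5)   # MedisGroups severity groups 0-4


def cross_tabulate(pairs):
    """Count patients for each (CSI group, MedisGroups group) combination."""
    return Counter(pairs)


def format_table(table):
    """Render the counts with CSI groups as rows and MedisGroups groups as columns."""
    lines = ["CSI\\MGP" + "".join(f"{m:>4}" for m in MGP_LEVELS)]
    for c in CSI_LEVELS:
        lines.append(f"{c:>7}" + "".join(f"{table[(c, m)]:>4}" for m in MGP_LEVELS))
    return "\n".join(lines)


# Hypothetical paired admission scores (CSI, MedisGroups); NOT the study data.
paired_scores = [(1, 0), (1, 1), (2, 0), (2, 1), (2, 3), (4, 0), (4, 1), (4, 4)]
table = cross_tabulate(paired_scores)
print(format_table(table))
```

If the two systems agreed closely, the counts would cluster along one diagonal band of the table; counts scattered across a row are what disagreement looks like.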

Further, for 110 patients (12 percent), the severity scores from the two systems differed by two or more levels (table 6, page 15). These differences are found throughout the scores for both systems and are not confined to any one severity level.
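The count of large disagreements can be sketched as follows. This assumes the difference is taken as the simple numeric gap between the two scores, which the article does not spell out; the paired scores are hypothetical, and only the figures 110 and 902 come from the study.

```python
def count_large_disagreements(pairs, threshold=2):
    """Count patients whose two scores differ by at least `threshold` levels."""
    return sum(1 for csi, mgp in pairs if abs(csi - mgp) >= threshold)


# Hypothetical paired scores (CSI, MedisGroups); NOT the study data.
paired_scores = [(1, 0), (1, 3), (2, 2), (4, 0), (4, 1), (3, 2)]
flagged = count_large_disagreements(paired_scores)

# The percentage reported in the text: 110 of the 902 abstracted records.
reported_share = 110 / 902 * 100
print(flagged, round(reported_share, 1))
```

Because the CSI scale starts at one and the MedisGroups scale at zero, a raw gap of one level may not indicate any real disagreement, which is one reason a threshold of two is the more telling figure.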

We also looked at variation in length of stay (LOS) and charges within severity groups in specific DRGs (tables 7-8, pages 15-16). Even within these groups, substantial variation is obvious. For example, in DRG 89 (pneumonia), patients in MedisGroups severity level 3 had lengths of stay that ranged from 3 to 28 days and charges that ranged from $1,953 to $11,687. Clearly, on an individual case basis, the severity level is far from perfect for either system at predicting LOS or charges.
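A within-group range summary like the one described here might be computed as in the sketch below. The record layout and the values are hypothetical, chosen only to show the grouping step; they are not taken from tables 7 or 8.

```python
from collections import defaultdict


def ranges_by_group(records):
    """Return {(drg, severity): (min_los, max_los, min_charge, max_charge)}."""
    grouped = defaultdict(list)
    for drg, severity, los_days, charges in records:
        grouped[(drg, severity)].append((los_days, charges))
    summary = {}
    for key, rows in grouped.items():
        stays = [los for los, _ in rows]
        costs = [charge for _, charge in rows]
        summary[key] = (min(stays), max(stays), min(costs), max(costs))
    return summary


# Hypothetical records (DRG, severity level, LOS in days, charges in dollars);
# NOT the study data.
records = [
    (89, 3, 4, 2500.0),
    (89, 3, 21, 10200.0),
    (89, 3, 9, 4800.0),
    (89, 2, 5, 3100.0),
    (89, 2, 7, 3900.0),
]
summary = ranges_by_group(records)
for (drg, severity), (lo_los, hi_los, lo_chg, hi_chg) in sorted(summary.items()):
    print(f"DRG {drg}, severity {severity}: LOS {lo_los}-{hi_los} days, "
          f"charges ${lo_chg:,.0f}-${hi_chg:,.0f}")
```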

DRG 359 (hysterectomy) illustrates another point. All 48 patients fell into CSI severity group 1 and 44 of 48 into MedisGroups severity group 0. It would seem, therefore, that adjusting for severity in this DRG does not add much information and may not be necessary.

Regression analysis was performed to see if admission severity did, indeed, have predictive value with regard to charges and length of stay. (Admission severity was selected because it is the only severity score that is directly comparable for the two systems selected for the study.) When average charges and average length of stay are regressed on admission severity, the reduction in variance (r²) is very high (table 9, page 16). However, when charges and length of stay for individual patients are regressed on admission severity, without regard for diagnosis or DRG, the r² values are very low.
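The contrast between the two regressions can be illustrated with a small synthetic example. The sketch below does not use the study data; it assumes, purely for illustration, that charges rise by a fixed amount per severity level while individual patients scatter widely around their group average. For a one-predictor linear regression, r² equals the squared Pearson correlation, which is what the helper computes.

```python
from collections import defaultdict


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. r^2 for a one-predictor linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Synthetic patients, NOT the study data: charges rise by $1,000 per severity
# level, with the same wide patient-to-patient spread inside every level.
spread = [-6000, -3000, 0, 3000, 6000]
severity = [s for s in range(1, 5) for _ in spread]
charges = [8000 + 1000 * s + d for s in range(1, 5) for d in spread]

# Regression on individual patients.
r2_individual = r_squared(severity, charges)

# Regression on the average charge per severity group.
by_group = defaultdict(list)
for s, c in zip(severity, charges):
    by_group[s].append(c)
levels = sorted(by_group)
mean_charges = [sum(by_group[s]) / len(by_group[s]) for s in levels]
r2_averages = r_squared(levels, mean_charges)

print(round(r2_individual, 3), round(r2_averages, 3))
```

In this constructed case the group averages fall exactly on a line, so their r² is 1, while the individual-level r² is well under 0.1. Averaging removes the patient-to-patient scatter, so a high r² on averages says little about how well severity predicts any one patient's charges.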

Regression analysis was also performed for individual DRGs. This portion of the study was limited to those DRGs with at least 30 cases (table 10, page 16). In our study, the reduction in variance is low throughout the selected DRGs and confirms other reports.


Both systems functioned as described by vendors. Vendor support and cooperation were excellent in both cases. The MedisGroups system was more difficult to learn, however, and required about two months to achieve proficiency, compared to just one month with CSI. This finding has practical implications, because it is expected that the abstracting task, being somewhat tedious, will have higher than average turnover. CSI is easier to learn because the system, given the DRG, generates a list of questions to be answered, thus simplifying the task. On the other hand, MedisGroups requires the abstracter to approach each chart with nothing more than the need to search out certain "key clinical findings." After two months, however, abstracting times were fairly similar.

There is a greater concentration of patients in the lowest severity group with CSI, while there is a more even distribution among severity groups with MedisGroups. The more even distribution allows for greater dispersion, thus creating more opportunity for comparison. On the other hand, so few patients fit into severity group 4 that it may be a useless group.

During Data Commission discussions that have taken place over the past two years, there has been a strong desire expressed by the provider community that more than one system be allowed. That idea has been resisted by the payer side, because using different systems may destroy comparability among hospitals and even among individual physicians. The results of this study certainly support the latter contention.

When we looked at variations in severity scores between systems for individual patients, the differences were striking, particularly in that 12 percent of the cases showed differences of at least two severity levels. This variation is accounted for at least partially by the different ways in which the systems approach the notion of severity.

CSI's approach is disease specific. That is, for specific diseases or groups of diseases, the various historical, clinical, laboratory, and radiology findings that would indicate the relative sickness of the patient have been identified and ranked on a four-point scale from mild to severe. This is collected in a large compendium that the abstracter uses in reviewing the chart. Findings that are most severe are used to determine the ultimate severity group to which the patient is assigned. CSI feels strongly that doctors think in terms of individual disease entities and that a judgment about severity cannot be made independently of diagnosis. An example the company often cites is that a fever of 102° in a patient with leukemia on chemotherapy is an entirely different matter than a fever of 102° in an otherwise healthy individual with influenza.
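The disease-specific idea can be sketched very roughly as a lookup keyed by diagnosis and finding, with the patient assigned the level of the most severe finding. This is a simplified reading of the description above and not the actual CSI algorithm; the table name, the function name, and every level in the table are hypothetical placeholders.

```python
# Hypothetical severity rankings (1 = mild ... 4 = severe); the levels below are
# illustrative placeholders, NOT values from the CSI compendium.
FINDING_LEVELS = {
    ("leukemia on chemotherapy", "fever 102"): 4,
    ("influenza", "fever 102"): 2,
    ("influenza", "mild cough"): 1,
}


def severity_group(diagnosis, findings):
    """Assign the group of the most severe finding recorded for this diagnosis."""
    levels = [FINDING_LEVELS[(diagnosis, f)] for f in findings
              if (diagnosis, f) in FINDING_LEVELS]
    # Assumption: a chart with no ranked findings falls into the mildest group.
    return max(levels, default=1)


print(severity_group("leukemia on chemotherapy", ["fever 102"]))
print(severity_group("influenza", ["fever 102", "mild cough"]))
```

The same finding yields a different group depending on the diagnosis, which is the point of the fever example: the key into the table includes the disease, not just the finding.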

MedisGroups, on the other hand, has identified a group of what they call "key clinical findings" (KCFs), which are somewhat generic rather than disease-specific and which, in the MedisGroups methodology, indicate potential for organ failure. They are independent of diagnosis; indeed, the abstracting can be done without any knowledge of the patient's ultimate diagnosis. A high pulse rate, for example, is an undesirable finding whether the patient has heart disease, pneumonia, or anemia.

While the differences between the two systems seem subtle in many respects, certain practical implications are apparent. For example, a patient with cancer admitted for terminal care may be a CSI severity group three or four but be a MedisGroups zero or one.

Realizing that there is reluctance to compare hospitals and physicians, it seems obvious that, if comparisons are to be made, they should be made in such a way that adds to knowledge rather than worsens confusion. Even after now having considerable experience with both systems, we make no judgment about which is better or more appropriate. We have, however, come to believe that the idea of all users employing the same system is a rational one if the data are to be used for comparison purposes.

Another important caveat that needs to be clearly articulated is that substantial variation exists within severity groups and within DRGs. As table 7 illustrates, some fairly ill patients will respond quickly, thus having a fairly short LOS and consuming few financial resources, while, at the other extreme, some patients who don't seem very ill on admission will have a stormy course with long LOS and high cost. As a result, to employ these statistics on an individual basis for comparison among hospitals or doctors could be misleading at best and downright damaging at worst. It is fear of this kind of inappropriate use of statistics by certain factions that explains the reluctance on the part of the provider community to participate actively in this process.

By the same token, table 8, page 16, illustrates that, for some DRGs, severity adjustment really doesn't add much. Even though these may be relatively easy charts to abstract, they nevertheless require the use of time that might be used more productively elsewhere. Given the labor intensity of both systems, some thought needs to be given to which cases will produce the most helpful information. Data overload is prevalent in today's environment, but good information is in short supply and the potential for unnecessary work is great with these systems.

The low r² values are also notable in that it becomes obvious that severity of illness, at least as defined by CSI and MedisGroups, is only a small determinant of the ultimate LOS and the cost of hospitalization. Whether this is due to difficulties in defining severity of illness or to practice patterns is not addressed by this study and will obviously need to be the subject of much future research. Again, however, the limited knowledge in this area should preclude too much emphasis from being placed on these kinds of statistics until more is learned.

Finally, it should be pointed out that this study has largely addressed the practical implications of implementing such a system and the vagaries of the numbers derived from it. All work in the study has used admission severity as a point of comparison. It is at this point that similarities between the two systems end.

CSI generates two more severity scores--one on discharge and a maximum severity score that employs the most severe indicators of severity from throughout the hospital stay. MedisGroups, on the other hand, generates a judgment about the outcome of hospitalization in terms of whether morbidity or mortality was part of the outcome. This judgment is made by the computer on the basis of an internally generated second severity score, which is then compared to the admission severity score. Technically, MedisGroups does employ a second severity score, but only internally and only as a part of its outcome methodology. No judgment is made about the efficacy of either method, only that comparison is not possible.

The usefulness of these systems for internal quality purposes and the usefulness or validity of MedisGroups outcomes were not systematically studied, and both aspects represent major differences between the systems. MedisGroups Procedure Review monitoring is interesting, but our experience to this point is limited. The CSI algorithm is public while the MedisGroups algorithm is proprietary. In the final analysis, however, MediQual was willing to share the logic fully with an implementation task force, an action that removed much of the mystery and suspicion surrounding this issue.


Both systems work well and are capably supported and well-documented, but they are expensive and labor intensive. CSI was easier to learn to use, but, in time, both systems required about the same time for abstracting and data entry. There are substantial differences between the severity scores generated by each system for the same patient, and there is substantial variation within severity groups and DRGs in length of stay and charges. Severity of illness explains only a small part of LOS and charges. Both systems have features that we did not study and that may be relatively more or less attractive for different reasons in different institutions.


[1] Brewster, A., and others. "MEDISGRPS: A Clinically Based Approach to Classifying Hospital Patients at Admission." Inquiry 22(4):377-87, Winter 1985.

[2] Horn, S. "Measuring Severity: How Sick Is Sick? How Well Is Well?" Healthcare Financial Management 40(10):21,24-32, Oct. 1986.

[3] Horn, S., and others. "Reliability and Validity of the Severity of Illness Index." Medical Care 24(2):159-78, Feb. 1986.

[4] Iezzoni, L., and Moskowitz, M. "A Clinical Assessment of MedisGroups." JAMA 260(21):3159-63, Dec. 2, 1988.

[5] Jacobs, C., and others. "Severity of Illness in the Cost/Quality Equation." Internist 28(8):16-29,31, Sept. 1987.

[6] Knaus, W., and others. "APACHE II: A Severity of Disease Classification System." Critical Care Medicine 13(10):818-29, Oct. 1985.

Stephen E. Vanourny, MD, is Senior Vice President of Medical Staff Services/Medical Director at St. Luke's Hospital, Cedar Rapids, Iowa. Shirley Holtey, RN, BA, is Director of Quality and Risk Management at the hospital. The authors acknowledge and thank the following persons for their assistance with the severity project summarized in this article: Kay Detweiler, Jack Fritts, Roberta Haley, Debbie Mauck, Diane Moscrip, Freeda Muhl, Leigh Swaney, Karen Vossberg, and Jo Beth Wiese.
COPYRIGHT 1990 American College of Physician Executives
No portion of this article can be reproduced without the express written permission from the copyright holder.

Article Details
Title Annotation:comparison of computerized cost indexes
Author:Holtey, Shirley
Publication:Physician Executive
Date:May 1, 1990