Classification and low back pain: a review of the literature and critical analysis of selected systems.

Key words: Classification, Diagnosis, Low back pain.

The concept of classification has recently gained interest among researchers and clinicians involved in the care of patients with low back pain (LBP). This interest is due, in part, to the fact that LBP is most often not attributable to pathologies known to cause pain. Patients with LBP of unknown origin represent 85% or more of all patients treated for their LBP by primary care practitioners.[1-6] Patients with LBP of unknown etiology also have been reported to represent many heterogeneous subgroups.[1,7] Interest in classification stems from the notion that this very large group of patients would likely be treated more effectively if valid criteria could be established to assign these patients to homogeneous subgroups.

In 1995, an international forum was held to discuss the management of patients with LBP by primary care practitioners.[7] Among the attendees were 38 LBP researchers. Physicians, chiropractors, and physical therapists were asked to prioritize an agenda for research in the area of primary care for patients with LBP. The item given the highest priority by the group dealt with the concept of classification. The forum summarized this priority by posing the question: "Can different varieties, natural courses, or subgroups of LBP be identified and, if they can, what criteria can be used to differentiate among them?"
Physicians referring patients for physical therapy typically assign a diagnosis (a form of classification) to most patients with LBP. Based on historical data, an examination, and other diagnostic tests, the physician uses some form of decision-making process to establish a diagnosis. The diagnostic label may indicate the presence of an impairment such as LBP or the presence of a pathological condition such as a herniated disk. The diagnosis may reflect an abnormal physiologic process such as myofascial pain syndrome. The usefulness of these types of labels in many cases is limited. Traditional medical diagnostic labels seldom guide physical therapist decisions related to the prognosis or treatment of patients with LBP.[8] In this article, I examine the usefulness of diagnostic labels and other forms of classification for patients with LBP. Many classification systems have been proposed for patients with LBP, and I will attempt to review most of these approaches. My primary purpose is to review those classification systems that were designed for the majority of patients with LBP. A MEDLINE search for the period 1985 to May 1997 was conducted using the key words "classification and low back pain" in combination. The reference lists of the relevant articles found in the MEDLINE search also were reviewed. Criteria for selection of classification systems for review were the following: (1) the system had to be published in English, (2) the authors had to provide sufficient descriptive detail of the structure of the system to allow for a summary description of the system, and (3) the system had to have a relatively broad focus that addresses the majority of patients with LBP. Systems that dealt only with very specifically defined subgroups of patients with LBP were not critically reviewed. 
For example, LBP classification systems designed only for patients with psychosocial disorders[9-15] or pathologies such as spinal stenosis or spondylolisthesis[16,17] were not reviewed. Systems applied to very large groups such as patients with chronic musculoskeletal pain also were not reviewed.[18,19]

Following the MEDLINE search, the classification systems selected for review were examined to identify those systems that were most relevant for physical therapists. Four classification systems were judged to be most relevant to physical therapists and were thoroughly reviewed using a critical appraisal approach recommended by Buchbinder and colleagues.[20,21] The 4 systems selected were proposed by Bernard and Kirkaldy-Willis,[22] Delitto and colleagues,[23,24] McKenzie,[25] and the Quebec Task Force on Spinal Disorders (QTF)[26] (Tab. 1). These 4 classification systems were judged to be most appropriate for critical evaluation because the systems are thoroughly described in the literature[23-25] and in continuing education courses,[25] they are reported to be used in clinical practice,[23-25] or they use diagnostic terms that are familiar to physical therapists.[22,26]

Table 1. Description of 4 Common Classification Systems Reviewed in the Critical Appraisal

Bernard and Kirkaldy-Willis[22]
  Professional or discipline orientation of system developer: Orthopedic surgery
  Type: Status index
  Method of development(a): Judgment approach
  Purpose: To determine the pathology causing the problem
  Setting: Not specified
  Domain of interest: All patients with LBP
  Patients excluded: None
  Categories: 23 categories
    Group A: Herniated nucleus pulposus; Lateral stenosis; Central stenosis; Spondylolisthesis; Segmental instability
    Group B: Sacroiliac joint syndrome; Posterior joint syndrome; Maigne syndrome; Muscle syndromes (6)
    Group C: Chronic pain syndrome; Pseudarthrosis; Nonspecific; Postfusion stenosis; Ankylosing spondylitis; Infection; Tumor; Arachnoiditis; Lateral femoral nerve entrapment

Delitto and Colleagues[23,24]
  Professional or discipline orientation of system developer: Physical therapy
  Type: Clinical guideline index
  Method of development(a): Judgment approach
  Purpose: To determine the appropriate treatment
  Setting: Not specified
  Domain of interest: All patients with LBP
  Patients excluded: None
  Categories: Three levels of classification. Not all categories have been described.
    For stage 1: Extension; Flexion; Lateral shift (2); Immobilization (4); Traction (5); Mobilization (5)

McKenzie[25]
  Professional or discipline orientation of system developer: Physical therapy
  Type: Clinical guideline index
  Method of development(a): Judgment approach
  Purpose: To determine the appropriate treatment
  Setting: Not specified
  Domain of interest: Most patients with LBP
  Patients excluded: Patients with severe sciatica and neurological deficits and patients whose symptoms cannot be reduced or centralized
  Categories: 13 categories
    Postural syndrome
    4 dysfunction syndromes (flexion, extension, side-gliding, adherent nerve root)
    7 derangement syndromes
    Hip joint or sacroiliac joint problem

Quebec Task Force[26]
  Professional or discipline orientation of system developer: Many medical and nonmedical disciplines
  Type: Mixed index
  Method of development(a): Judgment approach
  Purpose: For clinical decision making, establishing a prognosis, quality control, research
  Setting: Occupational health
  Domain of interest: All patients with LBP
  Patients excluded: None
  Categories: 11 categories with 2 axes
    Pain without radiation
    Pain + radiation proximal
    Pain + radiation distal
    Pain + radiation + neurological signs
      (<7, 7-49, >49 days) (working, idle)
    Presumptive root compression + image
    Root compression + image
    Spinal stenosis
    Postsurgical <6 mo
    Postsurgical >6 mo
    Chronic pain syndrome
    Other (W or I)(c)

(a) A statistical approach to developing a classification system relies primarily on statistical procedures to guide decisions about how to group patients. A judgment approach relies primarily on the clinical experience of the developer or on commonly accepted clinical knowledge to assign patients to groups.
(b) LBP=low back pain.
(c) W=working, I=idle.

Prior to examining the 4 classification systems selected for critical appraisal, I will review some background material. I will present arguments as to why classification systems should enhance the care of patients with LBP. I will review the terminology proposed by Buchbinder et al[20] to standardize the descriptions of the classification systems discussed in this article. Some of the work of Feinstein[27] relating to the types of classification systems and how they are derived will be reviewed. I will use Feinstein's work to discuss other classification systems not selected for critical review. Classification systems described by Moffroid et al,[28] Coste et al,[29,30] Marras et al,[31] Binkley et al,[32] Mooney,[33] and Sikorski[34] will be discussed and are summarized in Table 2.

[TABULAR DATA 2 NOT REPRODUCIBLE IN ASCII]

Why Classify?

Perhaps the most compelling argument for developing and using classification systems is that our current system for grouping patients appears to be inadequate.[8] The most common classification used by physicians and physical therapists is the International Classification of Diseases (ICD).[35] The ICD is a taxonomy of diagnostic labels used by many practitioners to standardize the nomenclature for patient diagnoses for statistical and administrative purposes.[35] Because the ICD does not describe the procedures used to apply diagnostic labels, the reliability and validity of assigning ICD codes are quite low.[36] The ICD, therefore, would appear to have very limited use for making judgments about treatment, prognosis, or the presence of pathology.
The ICD-9, for example, lists 66 codes for use on patients with LBP.[37] This large number of codes would appear to be excessive and impractical for routine clinical use.

The use of clearly described classification systems may enhance the effectiveness of treatment. Data suggest that patients treated with an approach based on an assigned classification do better than patients whose treatment is not based on their pretreatment classification.[38-41] Although these studies should be considered to be preliminary, they suggest that patients classified using a system designed to guide treatment may be treated more effectively than patients treated without regard to classification.

Some researchers[42,43] contend that randomized clinical trials (RCTs) could be better conducted if patients with idiopathic LBP were placed into homogeneous groups prior to treatment. Most RCTs have lumped apparently heterogeneous patients with either acute or chronic LBP into one group prior to randomly assigning the patients for treatment.[44-46] Because most RCTs have considered patients with LBP as belonging to a homogeneous group, these studies probably have not measured a treatment effect that might be expected from a truly homogeneous sample of patients. Not all researchers, however, agree that the identification of homogeneous subgroups of patients with LBP for RCTs is necessary. Faas[47] argued, for example, that no evidence exists to support the argument for classification prior to randomly assigning patients to exercise therapy groups in an RCT. Based on the research priorities established by the International Forum for Primary Care Research on Low Back Pain, other researchers[7] apparently do not agree with the assertions of Faas. In my view, the LBP research community appears to be strongly in favor of developing classification systems for use in RCTs.

Terminology for Classification Systems

The terms proposed by Buchbinder and colleagues[20] serve to operationally define the different parts of classification systems and can assist the user in understanding the organizational framework of classification systems. In this article, therefore, I will use the terms defined by Buchbinder and colleagues when describing the various classification systems designed for patients with LBP.

Buchbinder et al[20] proposed a set of terms originally described by Feinstein[27,48] to describe the various parts of a classification system (Fig. 1): (1) domain of interest, (2) categories, (3) criteria, and (4) definitions. The domain describes the type of patients the classification system is designed to classify. The QTF classification system, for example, was designed to classify patients with work-related disorders of the spine (Fig. 2).[26] The domain is subdivided into 2 or more categories. McKenzie's postural and flexion dysfunction syndromes are examples of categories of patients with LBP in the McKenzie system (Fig. 3).[25] The criteria are the procedures used to make decisions about the category to which a patient should be assigned. For example, the criteria used to place a patient into category 4 of the QTF system are pain radiating into a lower limb and neurological signs. The definitions describe the examination findings that must be present for each criterion. For example, the definition of neurological signs in the QTF system is one or more of the following: focal muscular weakness; asymmetric reflexes; dermatomal sensory loss; or loss of intestinal, bladder, or sexual function.

[Figures 1 to 3 ILLUSTRATIONS OMITTED]

Some classification systems further classify patients based on additional domains. For example, the QTF system uses 2 additional domains (time since onset of symptoms and work status) to further classify patients in categories 1 to 4 and 10 and 11 (Fig. 2).[26] The role of these additional domains is to guide clinical decisions related to prognosis. Feinstein[48] used the term "axis" to describe these additional domains within a classification system. Axes are essentially separate and distinct classification systems within a larger classification system. Axes have their own domain, categories, criteria, and definitions. Figure 2 illustrates the use of axes in the QTF system.

The Different Types of Classification Systems

Feinstein, in his book Clinimetrics,[48] identified the various uses for what he described as clinimetric indexes. Clinimetric indexes are rating scales and other expressions that are used to measure symptoms, physical signs, and other phenomena in clinical medicine. Classification systems are one type of clinimetric index. Feinstein[48] described 3 major types of clinimetric indexes that are relevant to classification systems used for patients with LBP. These are the status index, the prognostic index, and the clinical guideline index.

Status indexes are likely the most common type of classification system used for patients with LBP. Classification systems that are considered to be status indexes are used to define patient problems. The most common type of status index is the diagnostic index. The ICD-9 classification system is a form of status index and is the most common classification system used by clinicians. The pathology-based LBP classification systems of Bernard and Kirkaldy-Willis[22] and Kirkaldy-Willis and Hill[49] also are commonly cited examples of status indexes.

The second type of index described by Feinstein[48] is the prognostic index. Classification systems designed to be prognostic indexes are used to predict the future status of the patient. Most prognostic indexes for patients with LBP are designed to aid the clinician in making predictions about the likelihood of a poor outcome.[15,50,51]

The third type of classification system is the clinical guideline index.
This type of index is designed primarily for providing instructions about treatment. Clinical guideline indexes can be thought of as being designed to manage patient problems. Two clinical guideline indexes described in the physical therapy literature are the classification systems described by Delitto and colleagues[23,24] and McKenzie.[25]

Some indexes are designed for multiple uses and are called "mixed indexes." The QTF system, for example, was designed to aid in making clinical decisions, establishing a prognosis, and evaluating the quality of care for patients with LBP.[26] The QTF system, therefore, can be considered a mixed index.

Methods Used to Derive Classification Systems for Patients With LBP

According to Feinstein,[48] classification systems have been developed using 2 approaches. These 2 approaches are the statistical approach and the judgment approach.

The Statistical Approach

Feinstein[48] suggested that a statistical approach may be the ideal way to develop a classification system. According to Feinstein, if statistical procedures can be used to group patients with similar attributes and demonstrate that patients in different groups do not have overlapping attributes, then the classification system has promise for clinical use. Feinstein contended, however, that the developer of a classification system also must demonstrate the clinical utility of the classification system. According to Feinstein, it must be demonstrated that clinically useful inferences can be made based on the patient groupings if the classification system is to have clinical utility.
The statistical approach relies on one or a combination of statistical procedures designed to identify variables that can be used to distinguish various subgroups of patients.[52-54] The statistical approach, for example, has been used extensively in LBP research to identify homogeneous groups of patients at varying risk for a poor outcome[15,51] and with varying levels of psychological involvement.[10-15] Other researchers[28,29] have used a statistical approach to identify subgroups with similar levels of severity for a variety of physical impairments.

Moffroid and colleagues[28] used a statistical approach to develop a classification system. They used physical impairment measures obtained from the National Institute of Occupational Safety and Health (NIOSH) Low Back Atlas[55] to identify homogeneous groups of patients. In the study of Moffroid et al, 115 patients with LBP underwent 53 tests of mobility, alignment, and muscle force production, as described in the NIOSH Low Back Atlas. Of these 53 tests, data from a total of 24 tests were used in the data analysis.

Moffroid et al[28] used a cluster analysis to determine patient grouping based on the 24 impairment variables. Cluster analysis is used to determine whether individuals are similar enough on various attributes to be divided into groups.[56] Cluster analysis is a multivariate statistical approach designed to maximize between-group variance and minimize within-group variance on the variables of interest. The system of Moffroid et al is strengthened by the use of cluster analysis. The authors found that the patients' impairment measurements varied among the 4 groups (Fig. 4). The "very unfit" group, for example, tended to have less hip and abdominal muscle force, less mobility at the hip, and less lumbar spine motion than the other groups had, whereas the "flexible" group tended to have more hip motion than the other groups had.
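
The variance criterion described above can be made concrete with a small worked example. The following is a minimal sketch in Python using k-means, one common clustering procedure. It is an illustration under stated assumptions only: this article does not say which clustering algorithm Moffroid et al used, so k-means is a stand-in, and the function names, the 2 impairment variables, and the 6 patient records are all hypothetical.

```python
from statistics import mean


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Variable-by-variable mean of a non-empty list of points."""
    return tuple(mean(column) for column in zip(*points))


def k_means(points, initial_centroids, max_iter=100):
    """Assign each point to its nearest centroid, recompute the centroids,
    and repeat until the assignments stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(point, centroids[k]))
            for point in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a group empties out
                centroids[k] = centroid(members)
    return labels, centroids


def sums_of_squares(points, labels, centroids):
    """Return (within-group, between-group) sums of squares."""
    grand = centroid(points)
    within = sum(squared_distance(p, centroids[label])
                 for p, label in zip(points, labels))
    between = sum(labels.count(k) * squared_distance(c, grand)
                  for k, c in enumerate(centroids))
    return within, between


# Hypothetical data: (hip flexion in degrees, trunk flexor force in kg)
patients = [(70, 20), (72, 22), (68, 19), (110, 40), (112, 42), (108, 41)]
labels, centroids = k_means(patients,
                            initial_centroids=[patients[0], patients[3]])
within, between = sums_of_squares(patients, labels, centroids)
print(labels)           # group assignment for each patient
print(within, between)  # a useful partition: small within, large between
```

In this toy data set the between-group sum of squares is far larger than the within-group sum, which is the pattern a cluster analysis seeks. With real impairment data, variables measured on different scales would normally be standardized first so that no single variable dominates the distance calculation.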
[Figure 4 ILLUSTRATION OMITTED]

The data of Moffroid and colleagues[28] lend some insight into the usefulness of physical impairment measures for identifying homogeneous subgroups of patients with LBP. Physical impairment measures can be used in isolation to classify patients with LBP into various subgroups. Other potentially relevant data, however, were not used. For example, data related to chronicity, pathology, and pain behavior--factors commonly accounted for in other classification systems--were not used by Moffroid and colleagues. Because factors such as chronicity and pain behavior influence outcome,[51,57] the usefulness of this classification system for clinical practice appears to be limited. Moffroid and colleagues did not provide data to indicate how their classification system might be used to guide clinical decisions. A summary description of the system proposed by Moffroid and colleagues is presented in Table 2.

Coste and colleagues[29,30,58] conducted a series of studies designed to create a classification system for patients with LBP who either do or do not have evidence of psychological impairment. They used an approach similar to that used by Moffroid and colleagues,[28] a statistical approach to divide patients with LBP into homogeneous categories. Coste and colleagues, like Moffroid et al, relied entirely on statistical procedures for group assignments.

Coste and colleagues[29,30] collected demographic and physical examination data on 330 patients referred for treatment of LBP. Patients reporting pain below the area of the gluteal folds and patients with neurological involvement were not admitted to the studies. The DSM-III criteria were used to identify patients with evidence of psychiatric disease.[59,60] Of the 330 patients admitted to the studies, 136 patients were found to have evidence of psychiatric disease. The authors divided the sample into those subjects with no evidence of psychiatric disease (purely organic LBP) and those subjects diagnosed with a psychiatric illness in addition to their LBP.

For the group of patients with purely organic LBP, Coste et al[29] collected a large amount of demographic and pain behavior data (23 variables) and physical examination data (10 variables). They used the data in a cluster analysis to determine whether homogeneous categories of patients could be identified. Figure 5 summarizes the 7 different categories of patients identified in the cluster analysis. The authors did not report why they assigned names to only 5 of the 7 categories.

[Figure 5 ILLUSTRATION OMITTED]

In one study, to classify the patients into categories, Coste and colleagues[30] used data obtained on 136 patients determined to have a psychiatric disorder. The authors used the data obtained on 19 variables related to the patients' pain complaints and 10 examination variables to perform a cluster analysis. The cluster analysis revealed 3 categories of patients (Fig. 6).

[Figure 6 ILLUSTRATION OMITTED]

The data by Coste and colleagues[29,30] clearly indicate that patients with LBP can be classified into several different groups with homogeneous characteristics. The authors, however, did not present data or theoretical arguments for how these homogeneous categories might guide clinical decision making. For example, it is not clear how these patient categories might be used to guide decisions related to treatment selection or prognosis. More research is needed to determine how the categories proposed by Coste and colleagues might influence decisions made in clinical practice. Table 2 provides a brief description of the 2 systems proposed by Coste and colleagues.

Marras and colleagues[31] developed a classification system based on the QTF system and then determined whether trunk motion measures could be used to predict the category to which the patients were assigned (Fig. 7). The authors used a device designed to measure the 3-dimensional motion characteristics of the trunk.
An electrogoniometer attached to a harness was strapped to the patients' trunk, and the speed and acceleration of the trunk in 3 dimensions were quantified.[61] The authors hypothesized that with pathology of the lumbar spine, predictable asymmetric motions of the lumbar spine would be found, especially during bending movements requiring precision (eg, forward bending while in 15° of rotation to the right). That is, the authors hypothesized that a patient's movement patterns would be affected in predictable ways when a patient moved in precisely defined planes of movement.

[Figure 7 ILLUSTRATION OMITTED]

Marras and colleagues[31] tested their hypothesis by determining whether patterns of limitations would be found for each of the categories described in their classification system. A total of 171 patients with LBP of greater than 7 weeks' duration were admitted to the study. The patients were approximately equally distributed among the 10 categories. Several different statistical approaches were used to determine how accurately the correct category could be predicted for each patient. The statistical approach that gave the best prediction rate was an approach called Modified Classification Using Splines (MCUS).[62] The MCUS approach to classification reportedly allows for greater interaction among variables and is similar to the neural networks approach.[68] The sensitivity and specificity of the predictions using the MCUS were generally high, with sensitivity ranging from 91% to 99% and specificity ranging from 55% to 88%. Patients were correctly classified 70% of the time.

The data of Marras and colleagues[31] indicate that trunk motion (primarily speed and acceleration) measurements obtained in different planes show promise for use in classification systems. The authors suggested that their data support the use of lumbar motion measures for prioritizing requests for diagnostic tests and for diagnosis.
They also suggested that the data obtained with their device could be used for making decisions related to prognosis and outcomes of care, but no data were provided. The authors indicated that their results should be considered preliminary. The method requires further refinement and testing on larger numbers of patients. Table 2 briefly summarizes the system of Marras and colleagues.

The Judgment Approach

The second approach to developing a classification system was described by Feinstein[48] as the judgment approach. Feinstein asserted that if no statistical data exist to guide the development of a classification system, the system developer must rely on 3 forms of judgment. Decisions are made based on (1) traditional custom, (2) conventional wisdom, and (3) personal experience.

The traditional custom method requires the system developer to identify the variables in the literature that have been suggested to be the most important. For example, most published opinions by experts suggest that distribution of pain is an important variable to assess when examining patients with LBP. It might be argued, based on traditional custom, that a classification system should include the assessment of pain distribution.

For the conventional wisdom method, the system developer would rely on common, but unpublished, beliefs of the clinical community to guide decisions about what variables should be included. The system developer might simply survey local clinicians informally to identify variables. For example, if the clinical community believed strongly in the use of screening tests designed to identify patients believed to be malingerers, then a system developer may choose to include those screening tests in a classification system.

For the personal experience method, the system developers would rely on their past clinical experiences to guide decisions about the structure of a classification system.
Delitto and colleagues,[23,24] for example, relied on their clinical experience to guide decisions related to the type and number of categories in their classification system. Several examples of classification systems based on the judgment approach have appeared in the literature. Binkley and colleagues[32] reviewed the literature related to classification systems and concluded that there was a need to determine whether consensus could be developed for a classification system. They stated that a classification system should (1) facilitate communication among clinicians, (2) guide clinical decisions, and (3) create homogeneous subgroups for effectiveness studies. Binkley and colleagues[32] surveyed physical therapist experts to develop their classification system (Tab. 2). The authors used a modification of the Delphi technique, a method designed to develop consensus among a group of experts.[64] Twenty-four physical therapist experts who met criteria of proficiency were subjects in the study. The majority of experts (70%) were from Canada. The experts completed 2 rounds of surveys designed to assess the extent of agreement on a group of 25 LBP categories (usually a diagnostic label) and criteria (usually a sign, symptom, or diagnostic test result) identified from the literature. The experts rated the degree of importance for each category and criterion. In order for a category or criterion to be judged as relevant for the classification system that Binkley and colleagues were developing, 75% of the experts had to rate the item 3 or higher on a 5-point scale, with 1 being "unrelated" and 5 being "essential" for the classification system. Nineteen categories, each with a varying number of criteria, were judged by the group of experts to be included in the classification system (Fig. 8). [Figure 8 ILLUSTRATION OMITTED] The method used by Binkley and colleagues[32] to develop their system departs from methods used to develop other LBP classification systems. 
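
The consensus rule described above for the study by Binkley and colleagues (an item is retained when 75% of the experts rate it 3 or higher on the 5-point scale) can be written out as a short calculation. The sketch below is illustrative only: it assumes the rule means "at least 75%," the function name and parameter names are hypothetical, and the ratings are invented rather than taken from the study.

```python
def meets_consensus(ratings, min_rating=3, min_percent=75):
    """Return True if at least `min_percent` of the ratings are
    `min_rating` or higher.

    Ratings are integers on a 5-point scale
    (1 = "unrelated", 5 = "essential").
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in range(1, 6) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    agreeing = sum(1 for r in ratings if r >= min_rating)
    # Integer arithmetic avoids rounding problems at the exact threshold.
    return agreeing * 100 >= min_percent * len(ratings)


# Invented ratings from a 24-expert panel for two candidate items
item_a = [5] * 10 + [4] * 5 + [3] * 3 + [2] * 6   # 18 of 24 rate 3 or higher
item_b = [5] * 9 + [3] * 8 + [2] * 4 + [1] * 3    # 17 of 24 rate 3 or higher
print(meets_consensus(item_a))
print(meets_consensus(item_b))
```

With 24 experts, the threshold works out to 18 experts rating an item 3 or higher, so the first invented item would be retained and the second would not.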
The authors reviewed the literature and identified the most commonly described categories and the criteria associated with each category. They then chose to survey a relatively small, but well-defined, group of physical therapist experts. As they suggested, their classification system appears to be incomplete. Another problem the authors identified is that the experts likely varied in the way they defined the terminology used in the system, and these differences may have contributed error to the study. More work is needed to determine whether more categories are needed for the classification system described by Binkley et al. As Binkley and colleagues identified, classification systems must be exhaustive; that is, they must include all relevant categories to be clinically useful. The classification system also must be studied for reliability to determine whether other clinicians interpret the system similarly. Perhaps most importantly, Binkley et al[32] found that experts from around the world agreed on a large number of categories and criteria despite potential differences in practice patterns, terminology, and culture. This study was an important first step in identifying what appear to be homogeneous clusters of patients with different pathological bases for their LBP. Mooney,[33] an orthopedic surgeon, used a judgment approach to develop a classification system based on the assumption that the majority of cases of LBP are due to disk pathology (Tab. 2). The purpose of Mooney's system is to guide treatment, although the treatments suggested by Mooney were not fully defined. The system has 9 categories (Fig. 9). Mooney did not describe whether the system he proposed should be used for all patients with LBP or only for some patients. [Figure 9 ILLUSTRATION OMITTED] The usefulness of Mooney's system appears to be very limited for physical therapists because it is based entirely on symptom duration and distribution. 
All of the other classification systems described in this article used many other variables. The domain was not clearly defined, and the system does not appear to account for patients with serious pathology or patients with pathology unrelated to the disk. The assumption that disk pathology is responsible for almost all cases of LBP appears to restrict the use of this system to an unclearly defined and relatively small group of patients.

Sikorski[34] proposed a classification system designed to guide physical therapy. There are 8 categories that group patients based on similarities in pain duration and pain behavior (Fig. 10). Sikorski argued that the categories are based on symptoms, which could then be linked to various pathologies. For example, patients in the "chronic anterior element" category were thought to have disk pathology. Sikorski provided no data to support the notion that the various categories in the classification system represented homogeneous pathologies. The medical history and examination procedures used in Sikorski's system were not adequately defined. Therapists, therefore, may not be able to reliably classify patients into the various categories he proposed.

[Figure 10 ILLUSTRATION OMITTED]

Further refinement of the system described by Sikorski[34] appears to be needed. The history-taking, examination, and treatment procedures need clarification. When the system is well defined, reliability of the classifications should be examined. Sikorski's system is briefly summarized in Table 2.

Most of the classification systems that are in clinical use were developed using the judgment approach and not primarily a statistical approach. Clinicians, therefore, might be tempted to avoid using existing classification systems because they were not developed using approaches grounded in sound measurement science.
Buchbinder and colleagues[20,21] and Feinstein[48] have suggested, however, that what ultimately determines the usefulness of a classification system is how well the classification system functions given the purpose for which it was designed.

Introduction to Classification Systems Selected for Critical Appraisal

The Classification System of Bernard and Kirkaldy-Willis

The classification system proposed by Kirkaldy-Willis and Hill[49] and later modified by Bernard and Kirkaldy-Willis[22] is a status index and a classic example of a pathology-based system (Tab. 1). This system is shown in Figure 11. In their original article in 1979, Kirkaldy-Willis and Hill briefly described what they believed to be the medical history, physical examination, and radiological examination findings for each of 5 syndromes.[49] No data were reported to support the descriptions.

[Figure 11 ILLUSTRATION OMITTED]

Based on a retrospective review of 1,293 cases, Bernard and Kirkaldy-Willis[22] revised the diagnostic classification originally proposed by Kirkaldy-Willis and Hill.[49] The classification developed by Bernard and Kirkaldy-Willis has 23 categories divided into 3 groups (Tab. 1).[22] Group A consists of 5 categories described by the authors as "well-recognized syndromes." Group B consists of 9 categories described as "less-recognized syndromes." Group C has 9 categories described as the "remaining syndromes." One of the most important procedures used by Kirkaldy-Willis and Hill[49] for establishing a diagnosis was the use of the results of radiological examinations. Classification is particularly dependent on the results of routine plain films in addition to other radiological testing. (See the article by Beattie and Meyers in this issue for a discussion of the validity of some radiological tests for making inferences about pathology.)
The Classification System of Delitto and Colleagues The classification system proposed by Delitto and colleagues[23,24] is a clinical guideline index designed to guide treatment for patients with LBP (Tab. 1). The system requires the therapist to collect historical and disability questionnaire data to aid in determining whether the patient's condition is amenable to physical therapy intervention or requires care of another practitioner. Examination procedures are designed to assess the effect of movements on symptom behavior and to assess the alignment of various body structures. Figure 12 illustrates the structure of the classification system of Delitto and colleagues. Figure 12 illustrates only the domain and categories of the classification system of Delitto and colleagues. The criteria were too extensive to place in the figure. The reader is referred to the original work for a more thorough description of the system. [Figure 12 ILLUSTRATION OMITTED] The classification system of Delitto and colleagues[23,24] has 3 levels involving different types of clinical decisions (Fig. 13). The first level requires the therapist to use various instruments to decide whether the patient (1) can be managed independently by a physical therapist, (2) cannot be managed by a physical therapist, or (3) can be managed by a physical therapist in consultation with another practitioner. The second level of clinical decision making requires the therapist to stage the patient into 1 of 3 groups (stage I, stage II, or stage III) based on the presence and severity of various functional limitations and disabilities, work status information, and scores on a disability scale. When making decisions at the second level, therapists use only historical and disability data obtained from the patient. The examination is not done until the therapist is prepared to make clinical decisions at the third level. 
[Figure 13 ILLUSTRATION OMITTED]

The third level of clinical decision making involves the assignment of the patient, after being assigned to a stage, to one of the syndromes (categories) described for each stage. The examination procedures for stage I were described in a recent article.[23] A more elaborate description of the stage I categories and treatments as well as examination and treatment information for stages II and III appear in a recently published book chapter.[24] The categories described in the recently published book chapter for stage I syndromes are slightly different from those described in the article. For example, the book chapter described 3 different extension syndrome categories, whereas the article described only 1 extension syndrome. Apparently, the classification system of Delitto and colleagues is undergoing a process of continued development. The categories for stage II and stage III have yet to be described in the peer-reviewed literature.

The Classification System of McKenzie

The McKenzie system is a clinical guideline index designed for most, but not all, patients with LBP (Tab. 1).[25] The structure of the McKenzie system is shown in Figure 3. The medical history consists of questions related to symptom onset and symptom behavior associated with several different postures. The examination requires the therapist to observe the patient's posture and the alignment of several bony landmarks. Trunk movements are observed for limitations and frontal-plane deviations, and the patient is questioned about the effect of the movements on symptom location and intensity. The therapist is also required to complete a neurological examination and to examine the patient's hip and sacroiliac joints.
McKenzie's classification system requires the clinician to classify the patient's problem into 1 of 13 categories.[25] The most commonly discussed categories are the postural syndrome, the 4 dysfunction syndromes, and the 7 derangement syndromes. In addition, a category exists for those patients classified as having a hip or sacroiliac joint problem. The dysfunction syndrome is further subdivided into flexion dysfunction, extension dysfunction, side-gliding dysfunction, and adherent nerve root dysfunction. The derangement syndrome is subdivided into 7 derangement syndromes that are numbered consecutively from 1 to 7. McKenzie described these various syndromes in the way that he did apparently because he believed each syndrome requires a different treatment. McKenzie also suggested that patients may be classified as having a sacroiliac joint problem or a hip problem, but he did not describe the examination procedures or treatments for these conditions. McKenzie[25] indicated that some patients may have a more serious problem not amenable to conservative treatment, but he argued that these patients typically are identified by the referring physician and are not referred for physical therapy. Patients with "constant severe sciatica with neurological deficit," patients thought to have a serious pathology, and patients whose symptoms cannot be centralized during the examination are labeled as unclassifiable.[25] The Quebec Task Force Classification System The QTF was a group of experts in various fields brought together by the Quebec Worker's Health and Safety Commission.[65] The QTF report should be of great interest to physical therapists. The commission was formed, in part, because of the large increase in the number of physical therapy treatments for LBP in Quebec in the years prior to the formation of the QTF. 
Because the QTF was interested in many aspects of the care of patients with LBP, the classification system proposed by this group was designed with many different purposes in mind (Tab. 1). The classification system designed by the QTF was intended to "help in making a clinical decision, establishing a prognosis, evaluating the quality of care and conducting scientific research."[26(pS16)] The QTF classification system is therefore an example of a mixed index.[26] Figure 2 depicts the structure of the QTF classification system. The developers of the QTF classification system argued that because the majority of patients with LBP have a disorder with an unidentified etiology, a classification system should be designed based primarily on pain data.[26] They also argued that only in the minority of cases can the origin of the pain be identified (ie, the pathology causing the disability can be determined). The classification system, therefore, is composed of data collected from a variety of sources, including (1) a combination of signs and symptoms (pain and neurological examination data), (2) radiological data (designed to identify pathology), (3) response to treatment (postsurgical status and failure to respond to conservative treatment), (4) work status (working, not working), and (5) symptom duration (<7 days, 7 days to 7 weeks, >7 weeks). The work status and symptom duration data were used to form 2 additional axes of classification. Those patients classified into categories 1 through 4 were further classified based on duration of symptoms and work status. Patients are classified into different categories depending on whether their symptoms have been present for less than 7 days, 7 days to 7 weeks, or longer than 7 weeks.
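As an illustration only, the 2 additional axes described above can be sketched in Python. The names (DurationBand, WorkStatus, duration_band, additional_axes) are hypothetical and do not come from the QTF report. The handling of the boundary values (exactly 7 days and exactly 7 weeks) is an assumption, because the bands as quoted above do not state where those values fall.

```python
from enum import Enum


class DurationBand(Enum):
    """Symptom-duration bands as listed in the text above."""
    UNDER_7_DAYS = "<7 days"
    FROM_7_DAYS_TO_7_WEEKS = "7 days to 7 weeks"
    OVER_7_WEEKS = ">7 weeks"


class WorkStatus(Enum):
    """Work-status axis as listed in the text above."""
    WORKING = "working"
    NOT_WORKING = "not working"


def duration_band(days_since_onset: int) -> DurationBand:
    """Map days since symptom onset to a duration band.

    Assumption: exactly 7 days and exactly 49 days (7 weeks) both fall
    in the middle band; the source does not specify the boundaries.
    """
    if days_since_onset < 0:
        raise ValueError("days_since_onset must be non-negative")
    if days_since_onset < 7:
        return DurationBand.UNDER_7_DAYS
    if days_since_onset <= 49:
        return DurationBand.FROM_7_DAYS_TO_7_WEEKS
    return DurationBand.OVER_7_WEEKS


def additional_axes(days_since_onset: int, is_working: bool):
    """Return the (duration band, work status) pair for one patient."""
    status = WorkStatus.WORKING if is_working else WorkStatus.NOT_WORKING
    return duration_band(days_since_onset), status


if __name__ == "__main__":
    band, status = additional_axes(days_since_onset=21, is_working=False)
    print(band.value, "/", status.value)
```

Keeping the 2 axes separate from the category number mirrors the multiaxial structure described above: the same category can be combined with any duration band and either work status.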
For the work status axis, patients are classified as either working or idle, which the developers of the QTF system defined as being absent from work, unemployed, or inactive.[26] For patients classified into category 11, an axis of classification based on work status was added. These separate axes were added because the developers of the QTF system believed, based on data collected on patients by the Quebec Worker's Compensation Board, that prognosis is influenced by both symptom duration and work status.[57] The QTF system does not require the use of other impairment data (eg, spinal flexion or extension range of motion [ROM]) or the patient's report of pain due to movement (eg, centralization or peripheralization). Combining data on signs and symptoms, radiological tests, symptom duration, and work status would appear to result in a rather complex classification system. For example, a patient in category 4 (pain with radiation to a lower limb and neurological signs) could be essentially identical, in pathology and signs and symptoms, to a patient in category 6 (compression of a spinal nerve root confirmed by a radiographic test such as magnetic resonance imaging). The developers of the QTF system apparently believed that the addition of a radiological test confirming the presence of a compressed nerve root required a separate category. From the perspective of prognosis and physical therapy treatment, patients in these 2 categories may not differ. From the spine surgeon's perspective, the patient with a radiologically confirmed nerve root compression may be considered a candidate for surgery, whereas the patient with identical signs and symptoms but no radiologically confirmed nerve root compression will likely not be a surgical candidate. The QTF classification system was designed to account for those patients who may be candidates for surgery. Treatments for each category are not defined in the QTF system.
Instead, the QTF reviewed the literature related to treatment efficacy and made general recommendations about treatment approaches.[66]

An Approach for Critically Appraising Existing Classification Systems

Buchbinder and colleagues[20,21] developed an approach for appraising classification systems. This approach to critical appraisal consists of 7 concepts: (1) appropriateness of purpose, (2) content validity, (3) face validity, (4) feasibility, (5) construct validity, (6) reliability, and (7) generalizability. The authors adapted their approach for examining classification systems from the psychological literature[67] and from work done to construct health status measures.[48,68-70] A summary of the approach to critical appraisal developed by Buchbinder et al[21] is presented in Table 3. The table lists the items used to judge each of the 7 concepts. Some concepts have only one item (eg, purpose), whereas other concepts have several items (eg, content validity) to judge whether the classification system adequately meets the concept. Each item is written in the form of a question and is generally self-explanatory, although some items require elaboration.

Table 3.
Critical Appraisal of Classification Systems Described by Buchbinder and Colleagues[20,21]

Purpose
  Are the purpose, population, and setting clearly specified?
Content validity
  Are the domain and all specific exclusions from this domain clearly specified?
  Are all relevant categories included?
  Is the breakdown of categories appropriate, considering the purpose?
  Are the categories mutually exclusive?
  Was the method of development appropriate?
  If multiaxial, are criteria of content validity satisfied for each additional axis?
Face validity
  Is the nomenclature used to label the categories satisfactory?
  Are the criteria for determining inclusion into each category clearly specified?
  If yes, do these criteria appear reasonable?
  Have the criteria been demonstrated to have validity?
  Have the criteria been demonstrated to have reliability?
  Are the definitions of criteria clearly specified?
  If multiaxial, are criteria of face validity satisfied for each additional axis?
Feasibility
  Is the classification simple to understand?
  Is the classification easy to perform?
  Does it rely on clinical examination alone?
  Are special skills, tools, or training required?
  How long does it take to perform?
Construct validity
  Does it discriminate between entities that are thought to be different in a way appropriate for the purpose?
  Does it perform satisfactorily when compared with other classification systems that classify the same domain?
Reliability
  Does the classification system provide consistent results?
Generalizability
  Has it been used in other studies or settings?

Content validity deals with whether the instrument of interest includes everything needed to describe the concept of interest (ie, the thing being measured).[71] For the concept of content validity, one item poses the following question: "Was the method of development appropriate?" Buchbinder and colleagues[20,21] suggested that classification systems should undergo a development process similar to health status measures.[48,72,73] The categories in a classification system, in their view, should be chosen based on the opinions of a committee of experts, not on the opinion of an individual. According to their system, a formal group consensus technique should be used to identify the categories. In addition, they contended that a review of the literature should be used to supplement the classification system and that statistical techniques should be used in the process of development. For the concept of face validity, an item states, "Is the nomenclature used to label the categories satisfactory?" Some categories in a classification system imply the presence of a specific pathology. For example, in the classification system of Bernard and Kirkaldy-Willis,[22] there is a "piriformis syndrome" category.
Feinstein,[74] who is a physician, suggested that diagnostic labels are appropriate only for entities that can be verified with valid diagnostic tests. Buchbinder and colleagues[21] agreed and suggested that categories implying the presence of unverifiable pathology should not be used in classification systems. Diagnostic tests for piriformis syndrome have not been studied for validity. The use of the term "piriformis syndrome," therefore, would appear to be inappropriate. An item under the concept of construct validity asks, "Does it discriminate between entities that are thought to be different in a way appropriate for the purpose?" A construct is a conceptual idea that might be used to explain a phenomenon.[48] Construct validation may demonstrate that a proposed construct actually exists or that a new classification system differs from an existing one.[48] When determining whether a classification system discriminates between entities that are thought to be different, hypotheses should be tested. To test hypotheses, data need to be collected and examined for relationships. For example, if a classification system were designed to identify an effective treatment for patients, a study demonstrating that the treatment was more effective than other treatments would need to be done. This would be a study of prescriptive validity.[75] The critical appraisal approach proposed by Buchbinder and colleagues[21] is used in this article to critique 4 of the more commonly discussed classification systems for patients with LBP: the systems proposed by Bernard and Kirkaldy-Willis,[22] Delitto and colleagues,[23,24] McKenzie,[25] and the QTF.[26] A summary description of the 4 classification systems is presented in Table 1. Readers should note that I was the only person who reviewed the classification systems. The reliability of judgments made using the critical appraisal approach was not assessed for this article. 
For a more thorough description of the critical appraisal, the reader is referred to the article by Buchbinder et al.[21] Purpose The purpose is well-defined for the 4 classification systems (Tab. 1). Ultimately, each classification must be judged in the context of the purpose for which it was designed. A summary of judgments related to the purpose of the 4 classification systems is given in Table 4.
Table 4.
Critical Appraisal of Purpose and Content Validity

                                              Bernard and
                                              Kirkaldy-     Delitto and                      Quebec
Concepts and Items                            Willis[22]    Colleagues[23,24]  McKenzie[25]  Task Force[26]

Purpose
  Are the purpose, population, and setting    Yes           Yes                Yes           Yes
    clearly specified?
Content validity
  Are domain of interest and all specific     Yes           Yes                Yes           Yes
    exclusions specified?
  Are all relevant categories included?       Yes           Unknown            No            No
  Are the categories mutually exclusive?      Yes           Unknown            Yes           No
  Are categories appropriate, given the       No            Unknown            Yes           Yes
    purpose?
  Was method of development appropriate?      No            Yes                No            Yes
  If multiaxial, are criteria of content      N/A(a)        N/A                N/A           Yes
    validity satisfied for each axis?
(a) N/A = not applicable.

Content Validity

Domain of interest and inclusion of relevant categories. The 4 classification systems differ with respect to the method of development and inclusivity of the system. All classification systems clearly defined the domain of interest, although only the system of Bernard and Kirkaldy-Willis[22] appeared to include all relevant categories based on the purpose. The QTF system used an additional axis to classify patients as either working or not working. The QTF did not report why they chose not to use the axis related to work status for all categories, as work status has been shown to influence prognosis.[15,51] Atlas and colleagues[76] concurred that work status should be assessed in other categories of the QTF system. The systems developed by Bernard and Kirkaldy-Willis[22] and the QTF[26] both include a category that accounts for patients who do not meet the criteria of the other categories in the systems. The system of Bernard and Kirkaldy-Willis has a category called "nonspecific," and the QTF system has a category called "other." These categories permit the placement of patients into a category when they do not meet the criteria of any other category in the classification system. For example, in the QTF system, patients with evidence of cancer, visceral disease, compression fractures, or other diseases or conditions requiring non-therapist care are placed in category 11, the "other diseases" category. When a classification system accounts for all possible patient types, the system is said to be exhaustive. The McKenzie system[25] does not appear to be exhaustive in nature. McKenzie's categories do not appear to account for those patients who stay the same or are worsened by the examination procedures designed to alter a patient's pain.[23] How these types of patients might be classified is unclear.
Also unclear is how patients with commonly accepted pathology-based diagnoses (eg, segmental instability, spinal stenosis) might be classified. For example, McKenzie's methods would appear to be of questionable usefulness for patients with segmental instability. Patients with instability would appear to require treatment designed to restrict motion of the involved segment.[77] In my view, the exercises advocated by McKenzie would not assist in stabilizing hypermobile segments of the spine. McKenzie[25] implied that patients with serious pathology are typically identified by the referring physician and are therefore not referred for treatment. McKenzie and Donelson stated in a later text that patients with inflammatory disorders, fractures, and other pathologies not amenable to treatment with the McKenzie approach are "quickly recognized when tested appropriately."[78(p1006)] A clear method for screening patients not suitable for treatment with the McKenzie approach appears to be lacking. McKenzie's system does not have a category for patients suspected of having serious pathology. The other systems have such a category. McKenzie's system, therefore, does not appear to have an exhaustive number of categories to account for patients with LBP who might be seen by physical therapists. The system of Delitto and colleagues[23,24] was rated as "unknown" for the item that asks whether all relevant categories are included because the system has not been thoroughly described in the peer-reviewed literature. Mutual exclusiveness. Another item used to judge content validity asks whether the categories are mutually exclusive. Theoretically, a patient should only be able to be classified into one category. If a patient fits more than one category, the decision rules for the system should indicate how the patient should be classified. 
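The 2 properties discussed above, exhaustiveness and mutual exclusiveness, can be checked mechanically once inclusion criteria are written as explicit rules. The following Python sketch is an illustration under stated assumptions: the 3 categories, their criteria, and the patient fields are hypothetical and are not taken from any of the 4 systems reviewed here. A patient who meets no category shows that the rule set is not exhaustive, and a patient who meets more than one category shows that it is not mutually exclusive unless a decision rule resolves the conflict.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, bool]
Criterion = Callable[[Patient], bool]

# Hypothetical inclusion criteria, for illustration only.
CATEGORIES: Dict[str, Criterion] = {
    "pain without radiation": lambda p: p["back_pain"] and not p["leg_pain"],
    "pain with radiation": lambda p: p["back_pain"] and p["leg_pain"],
    "chronic pain syndrome": lambda p: p["chronic_pain_syndrome"],
}


def matching_categories(patient: Patient,
                        categories: Dict[str, Criterion]) -> List[str]:
    """Return the names of every category whose criterion the patient meets."""
    return [name for name, meets in categories.items() if meets(patient)]


def audit(patients: List[Patient],
          categories: Dict[str, Criterion]) -> dict:
    """Report patients who fit no category or more than one category."""
    unclassified: List[int] = []
    ambiguous: List[Tuple[int, List[str]]] = []
    for index, patient in enumerate(patients):
        matches = matching_categories(patient, categories)
        if not matches:
            unclassified.append(index)          # evidence against exhaustiveness
        elif len(matches) > 1:
            ambiguous.append((index, matches))  # evidence against mutual exclusiveness
    return {
        "exhaustive": not unclassified,
        "mutually_exclusive": not ambiguous,
        "unclassified": unclassified,
        "ambiguous": ambiguous,
    }


# Hypothetical patients, for illustration only.
PATIENTS: List[Patient] = [
    {"back_pain": True, "leg_pain": False, "chronic_pain_syndrome": False},
    {"back_pain": True, "leg_pain": True, "chronic_pain_syndrome": True},
    {"back_pain": False, "leg_pain": False, "chronic_pain_syndrome": False},
]

if __name__ == "__main__":
    print(audit(PATIENTS, CATEGORIES))
```

An audit of this kind can only show that a rule set fails on the patients supplied; it cannot show that the rule set is exhaustive or mutually exclusive for all patients.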
McKenzie[25] implies in his text that if a patient is classified as having 2 syndromes, the focus of treatment is initially directed to the more serious syndrome (eg, a derangement 3 syndrome would be addressed first in a patient classified as having both a dysfunction syndrome and a derangement 3 syndrome). The QTF system was designed to be a hierarchical scale.[26] That is, patients are classified into category 1 unless they fulfill criteria for category 2 and so on. Classification systems designed to be hierarchical tend to have mutually exclusive categories. The QTF system, however, was judged to have categories that are not mutually exclusive because patients could potentially fall into more than one category. For example, patients who are classified as having chronic pain syndrome (category 10) and who also report pain radiating to a distal lower extremity (category 3) could be assigned to either category 10 or category 3. The QTF system does not provide the instructions necessary to determine which category a patient should be assigned to when more than one category appears to be applicable. Again, because the system of Delitto and colleagues[23,24] has not been described completely in the peer-reviewed literature, I could only rate it as "unknown" for 3 of the items under content validity. Delitto and colleagues appear to have developed categories to address most known patient groups, including patients with spinal stenosis and patients with segmental instability. The system of Delitto and colleagues can be used to classify patients with serious disease or problems not amenable to physical therapy. Whether the categories Delitto and colleagues have chosen are mutually exclusive and adequately capture the critical categories of patients with LBP will have to be determined after the entire system has been described in the literature and data are provided. Appropriateness of categories. 
Categories must be judged for appropriateness from the context of the purpose of the classification system. The system of Bernard and Kirkaldy-Willis[22] is inappropriate, in my opinion, because of the extensive use of nonverifiable pathology-based categories. I judged both the McKenzie system[25] and the QTF system[26] to be appropriate for this item. Method of development. I believe the method of development was inappropriate for 2 systems: the Bernard and Kirkaldy-Willis system[22] and the McKenzie system.[25] These systems apparently were developed based on the clinical experience of the developers. The system of Delitto and colleagues[23,24] and the QTF system[26] were developed based on a process of obtaining some form of expert consensus and through use of a literature review. These 2 systems, therefore, were judged to meet the criteria for development. Delitto and colleagues relied on the input of approximately a dozen clinicians, including physical therapists, physicians, and chiropractors, when developing the medical history and examination portions of their classification system. Delitto and colleagues, however, also relied on personal experience to develop decision rules for classifying patients into various treatment categories. The McKenzie system and the Bernard and Kirkaldy-Willis system appear to be based primarily on the clinical experiences of the developers and therefore, in my view, have not undergone an appropriate method of development. A summary of the content validity judgments for the 4 classification systems is given in Table 4. Face Validity Face validity is judged from a variety of different perspectives. Most of the items (see Tab. 5) relate to the criteria used to place patients into the various categories. One item addresses the category labels. 
The nomenclature used to label categories was judged to be unsatisfactory only for the Bernard and Kirkaldy-Willis system.[22] Because the Bernard and Kirkaldy-Willis system relies on pathology-based diagnostic labels, users must deduce the presence of a pathology when using the system. Because many of these category labels have not been studied for validity, I contend that the terms are unsatisfactory.
Table 5.
Critical Appraisal of Face Validity and Feasibility

                                              Bernard and
                                              Kirkaldy-     Delitto and                      Quebec
Concepts and Items                            Willis[22]    Colleagues[23,24]  McKenzie[25]  Task Force[26]

Face validity
  Is nomenclature used to label categories    No            Yes                Yes           Yes
    satisfactory?(a)
  Are criteria for inclusion into categories  No            Unknown            Yes           Yes
    specified?
  If yes, are the criteria reasonable?(b)     N/A(e)        N/A                Yes           Yes
  Do the criteria have demonstrated           No            No                 No            No
    validity?
  Do the criteria have demonstrated           No            No                 No            No
    reliability?
  Are the definitions of the criteria         No            No                 No            No
    clearly specified?
  If multiaxial, are criteria of face         N/A           N/A                N/A           No
    validity satisfied for each additional
    axis?
Feasibility
  Is classification simple to understand?(c)  Yes           Unknown            Yes           Yes
  Is classification easy to perform?(d)       Yes           Unknown            Yes           Yes
  Does it rely only on clinical examination?  No            Yes                Yes           No
  Are special skills, tools, or training      Yes           Unknown            No            Yes
    required?
  How long does it take to perform?           Unknown       Unknown            45 min        15 min
(a) This item asks whether the category labels require the clinician to infer the presence of a specific pathology that cannot be verified. (b) This item asks whether any data exist to support the reliability or validity of the criteria. (c) This item asks whether the system is relatively easy to comprehend for a clinician who treats patients with low back pain. (d) This item asks whether the steps required when applying the system are reasonably simple and clearly described. (e) N/A = not applicable.

I believe that all 4 classification systems have unsatisfactory data supporting the reliability and validity of the criteria. Some data exist to support the reliability of some of the criteria in the classification systems of Delitto and colleagues[79-81] and McKenzie,[83] but many of the criteria in the 4 classification systems have not been studied for reliability. The definitions of the criteria are not clearly specified in any of the 4 classification systems. Delitto and colleagues,[23] for example, did not define how to interpret performance on the side-bending test, a critical examination procedure used in stage I. McKenzie[25] did not clearly define how to differentiate between an accentuated, normal, and reduced lumbar lordosis. Bernard and Kirkaldy-Willis[22] did not define procedures for the majority of categories in their classification system. The developers of the QTF system[26] did not define the procedures used to determine when a patient should be assigned to the "other diagnoses" category. A patient, for example, may have pain in the area of the lumbar spine without radiation (category 1) but may also have a tumor in the lumbar spine (category 11). The QTF system does not define the procedures used to classify patients who may have characteristics consistent with more than one category. A summary of the face validity judgments for the 4 classification systems is given in Table 5.
Feasibility Three of the 4 classification systems, in my opinion, met most of the feasibility criteria. I scored the system of Delitto and colleagues[23,24] as "unknown" on 3 of the items because the classification system has not been described completely. Whether advanced training is needed to use the system of Delitto et al is not known. Because the Bernard and Kirkaldy-Willis system[22] and the QTF system[26] rely on the use of radiological data for classification, I judged them both as requiring special skills. The system of McKenzie[25] does not appear to require advanced training, although reliability studies[82,83] suggest that the system is unreliable for clinicians who are inexperienced in the use of the McKenzie system and for clinicians with some advanced training in the McKenzie system. A summary of the feasibility judgments for the 4 classification systems is presented in Table 5. Construct Validity The items listed under construct validity were assessed by examining the results of published research. The systems of Delitto and colleagues,[23,24] McKenzie,[25] and the QTF[26] partially met the criteria for the construct validity item that deals with whether a classification system discriminates between entities thought to be different in a way appropriate for the purpose. To support the notion that the system discriminates among categories, research had to have been done to demonstrate that patients assigned to different categories were meaningfully different from each other. Approaches that have been studied using a cluster analysis, for example, may provide data to suggest that the system is able to discriminate among patients in different categories. What cluster analysis does not do, however, is indicate whether the differences among categories are clinically meaningful. The final item under construct validity deals with studies that have compared the utility of classification systems. 
No studies were found that made head-to-head comparisons of classification systems. Several studies have examined aspects of the construct validity of the systems of Delitto et al,[23,24] McKenzie,[25] and the QTF.[26] These studies are reviewed in the sections that follow.

The system of Delitto and colleagues. Delitto and colleagues[38,39] provided data to support the notion that treatment designed for one category of patients was more effective than treatment that was not matched to a category. Delitto and colleagues conducted 2 studies[38,39] designed to examine the treatment effectiveness of one of the many categories they described. They examined the effectiveness of treatment for patients classified into the extension-mobilization category, apparently a stage I syndrome in their classification system. How this extension-mobilization category relates to the 6 categories listed under stage I in Figure 12 is not clear. The 2 studies suggested that patients classified into the extension-mobilization category responded better, in the short term (approximately 1 week), to treatment designed for these patients than to treatment that was not matched to the classification category. No other studies have been conducted on the system described by Delitto and colleagues. As Delitto and colleagues[38,39] noted, limitations to these 2 studies exist. They examined only one category of patients, so the results cannot be generalized to the remainder of the classification system. All examiners received training from one of the authors, so the results may not be generalizable to other therapists. The studies examined small numbers of patients, so the results may not be generalizable to other patients. In addition, only the short-term (approximately 1 week) outcomes of care were measured, so the long-term results of this form of treatment are unknown.
Most of the treatments for the various categories described by Delitto and colleagues[23,24] have not been studied for efficacy or effectiveness. For example, no data exist to support the use of autotraction as a treatment for one of the categories in the system proposed by Delitto and colleagues. More work is needed by Delitto and colleagues and by other researchers to further determine the usefulness of this system in clinical practice. The system of McKenzie. Several studies[84-86] have been done to support the centralization phenomenon as a useful construct for discriminating among patients with different conditions using the McKenzie system.[25] Although data exist to support the construct of the centralization phenomenon, it is not clear how much impact these studies have on the usefulness of the McKenzie system. Several studies[87-90] have examined the treatment efficacy of the McKenzie approach.[25] Nwuga and Nwuga[88] alternately assigned 62 female patients with acute LBP to 1 of 2 groups: a group treated using the McKenzie approach[25] or a group treated using an approach proposed by Williams.[91] Williams[91] advocated the use of exercises designed to decrease lumbar lordosis. McKenzie[25] advocated the use of exercise and
postures based on examination findings in many cases to increase
lordosis. Nwuga and Nwuga determined which treatment was more effective
at decreasing pain and increasing spinal ROM. They found that the
patients treated with the McKenzie approach had greater improvements in
ROM and pain intensity as compared with the patients treated with the
Williams approach. Nwuga and Nwuga did not have a control group. In
addition, only one therapist applied the treatments to the patients,
limiting the generalizability of the study.
Ponte et al[87] also compared the efficacy of the McKenzie approach[25] and the Williams approach.[91] A total of 22 patients with acute LBP were admitted to the study. The physicians referring the patients were responsible for assigning patients to a group treated with the McKenzie approach or a group treated with the Williams approach. A physical therapist assigned to each group applied all treatments for the patients in the group. The dependent variables were pain intensity, ROM of the spine, and the straight leg raise. The patients treated with the McKenzie approach showed greater gains in ROM and greater decreases in pain intensity compared with the patients treated with the Williams approach. The usefulness of the study by Ponte et al[87] is limited by several factors. The method of assigning patients to the treatment groups was biased: patients were not assigned randomly; instead, the referring physicians assigned them to the treatment groups. The sample size was small, with only 10 patients in one group and 12 patients in the other group, and only one therapist participated for each group of patients. Ponte et al[87] and Nwuga and Nwuga[88] used measures of impairment as the dependent variables, not an unusual approach for studies published more than a decade ago. Most experts now recommend that measures of disability, health status, and work status be used as the dependent measures of choice in efficacy studies. Impairment measures may not accurately reflect important changes in a patient's condition. Stankovic and Johnell conducted 2 studies[89,90] designed to examine the long-term effects of McKenzie treatment versus a patient education treatment approach for patients with acute LBP. In the first study,[89] the authors randomly assigned 100 patients to either a group treated with the McKenzie approach[25] or a group that received education. 
The patients treated with the McKenzie approach were treated an average of 5 times (range=2-20). The patients in the education group were instructed one time for approximately 45 minutes in the anatomy and function of the spine. The authors found that patients treated with the McKenzie approach returned to work faster (mean sick leave of 11.9 days) than those receiving education (mean sick leave of 21.6 days), but they provided no data to suggest that both groups were equally disabled prior to the start of the study. All of the subjects in both groups reportedly returned to work by 11 weeks after the start of the study. In a 1-year follow-up, 95 of the 100 patients were surveyed, and the 2 groups were found to be no different in their recreational activity levels. The second study by Stankovic and Johnell[90] was a follow-up to their first study. The authors contacted 89 of the 95 patients who completed the first study to determine the long-term effects of treatment. Patients treated with the McKenzie approach[25] were reported to have fewer recurrences of LBP during the preceding 4 years, but, as the authors point out, the reliability of recall data over a 4-year period should be questioned. Patients treated with the McKenzie approach also had fewer episodes of missed work due to LBP compared with the education group. The 2 studies by Stankovic and Johnell[89,90] suggest that the McKenzie treatment approach[25] may have some long-term beneficial effects, although the second study probably contained some bias because the patients were asked to recall events that may have occurred up to 4 years previously. The authors also did not control for many factors that could influence the rate of injury in the 2 groups. For example, the 2 groups may have differed in their work demands, which could have influenced the rate of recurrence. The QTF system. 
Atlas and colleagues[76] examined a variety of issues related to the construct validity of the QTF system.[26] The authors determined whether there were differences between QTF system categories for a variety of patient characteristics, including duration of pain and disability level. The QTF implied that, for categories 1 through 4 and 6, an ordering effect existed such that the disability reported by patients classified into category 2 would be higher than that reported by patients in category 1 and so on.[26] Atlas and colleagues also determined whether the category to which a patient was assigned was associated with the likelihood of surgical versus nonsurgical treatment. The developers of the QTF system suggested that patients classified into categories 6 and 7 were most likely to have surgical treatment because of the positive radiological tests. Atlas and colleagues also assessed the prognoses for patients assigned to different categories to determine whether category assignment was associated with prognosis. The QTF proposed that the QTF system could be used for making judgments related to prognosis. The QTF implied that prognosis becomes worse with assignment to higher categories (eg, patients in category 2 have a worse prognosis than patients in category 1) and that prognosis worsens as the duration of symptoms becomes longer. In the study by Atlas and colleagues,[76] the patients (N = 516) reportedly were diagnosed with sciatica or spinal stenosis and had to have had at least 2 weeks of unsuccessful conservative treatment (not defined by the authors) within 2 months of their first visit to a surgeon. Only patients who were assigned to categories 1 through 4, 6, and 7 of the QTF system were admitted to the study. The authors were primarily interested in studying patients who were considered to be candidates for surgery. Baseline data on demographic information, symptom characteristics, and disability were collected. 
Questionnaires were sent to patients 1 year following the initial visit to determine the level of disability, work status, and type of treatment received. Atlas et al[76] found an ordering effect for the likelihood of surgical treatment. Patients classified into category 6 were more likely to have surgical treatment than patients in categories 1 through 4. Approximately the same proportion of patients in categories 2 through 4 were treated surgically as were treated conservatively, suggesting the QTF system does not predict treatment decisions for those patients. The frequency of symptoms was found to increase from categories 2 to 6, but functional status, as measured by a modified Roland and Morris Scale[92] and the SF-36,[93] did not worsen from categories 2 through 6. The QTF system did not appear to indicate a worsening functional status with higher categories. There was no relationship between symptom duration and disability level or symptom severity at baseline measurement. The symptom duration axis of the QTF system was not a useful discriminator of functional status or symptom severity. When examining the issue of prognosis, the patients treated without surgery in the study of Atlas et al[76] showed a greater change in disability scores as the categories of the QTF system increased from category 2 to category 6. In addition, the percentage of patients with sciatica who were treated without surgery and who were asymptomatic at the end of 1 year increased from category 2 to category 6. That is, prognosis did not appear to become worse with assignment to categories 1 through 4 and 6. These data appear to refute the developers' claims that the QTF system can be used to estimate prognosis, especially for patients treated non-surgically. 
For patients treated without surgery, increasing symptom duration was associated with less improvement in symptoms and disability level, supporting the developers' contention that symptom chronicity can be used to predict prognosis. Work status also was shown to be a predictor of improvements in disability. Patients who were working at the time of the study improved more (indicated by larger change scores for the modified Roland and Morris Scale[92] and the SF-36[93]) than patients who were not working. Atlas and colleagues[76] collected data to both support and refute the construct validity of the QTF system. They found that symptom severity increased from classification categories 2 to 6, although functional status was similar among the categories. Patients classified into category 6 were more likely to be treated surgically than were patients in categories 1 through 4. Approximately half of the patients in categories 2 through 4 were treated with surgery, which suggests that the QTF system does not predict which patients receive surgery. The QTF system categories do not aid in determining a prognosis for patients treated without surgery as the originators have suggested. Patients with higher classifications (eg, 4 or 6) actually showed more improvement than patients classified into categories 2 or 3. Symptom duration and work status, however, influenced outcome, supporting the notion that symptom duration and work status are important for classification systems designed for making decisions related to prognosis. Table 6 summarizes the construct validity judgments for the 4 systems. Table 6. Critical Appraisal of the Construct Validity, Reliability, and Generalizability
Concepts and Items                    Bernard and           Delitto and          McKenzie[25]   Quebec Task
                                      Kirkaldy-Willis[22]   Colleagues[23,24]                   Force[26]

Construct validity
  Does it discriminate between        No                    Partially            Partially      Partially
  entities thought to be different
  in a way appropriate for the
  purpose?(a)
  Does it perform satisfactorily(b)   Unknown               Unknown              Unknown        Unknown
  compared with other systems with
  similar purposes?
Reliability
  Are the intratester and             Unknown               Unknown              No             Unknown
  intertester reliability
  satisfactory?
Generalizability
  Has it been used in other studies   No                    Partially            Yes            Yes
  and settings?
(a) This item asks whether there are any data to suggest the classification system can be used for its intended purpose. (b) Satisfactory, in this context, relates to whether data support the use of a classification system for clinical decision making. Reliability Reliability is another concept that requires data from the literature for making judgments. No studies were found that examined the reliability of classifications made using the systems developed by Bernard and Kirkaldy-Willis,[22] Delitto and colleagues,[23,24] or the QTF.[26] Errors can be common when using classification systems.[82,83] The reliability of classifications made using the systems described by the QTF, Bernard and Kirkaldy-Willis, and Delitto and colleagues, therefore, need to be examined. Two studies[82,83] have examined the reliability of classifications based on use of the McKenzie approach. Kilby and colleagues[83] developed an algorithm based on the McKenzie system and tested the reliability of assessments made based on the algorithm. The authors required one therapist to examine each of 41 patients while a second therapist observed the examinations of the first therapist. The therapists apparently were able to classify only 28 of the 41 patients admitted to the study. The therapists agreed on which syndrome was present in 58% of the patients. The internal validity of the study by Kilby and colleagues[83] is limited for several reasons. The design restricted the patient-therapist interaction to only one therapist. The other therapist who made judgments about syndrome type only observed the first therapist's examinations. Because interaction between the patient and the therapist was restricted for the second examiner, the authors artificially controlled for a major source of error. In addition, the number of patients and therapists participating in the study was small, which further limits the usefulness of the study. 
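Intertester reliability of categorical judgments such as syndrome type is usually summarized with the percentage of agreement and with the kappa coefficient, which adjusts the observed agreement for the agreement expected by chance. The following is a minimal Python sketch of both statistics for 2 examiners, offered only as an illustration. The function names and the example ratings are hypothetical and are not data from any study cited in this article.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of patients given the same category by both examiners."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both examiners must rate the same non-empty set of patients.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each examiner's marginal proportions,
    # summed over every category used by either examiner.
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both examiners use a single category.")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of 10 patients by 2 examiners (illustration only).
examiner_1 = ["derangement"] * 6 + ["dysfunction"] * 3 + ["postural"]
examiner_2 = ["derangement", "derangement", "derangement", "derangement",
              "dysfunction", "postural", "dysfunction", "dysfunction",
              "derangement", "postural"]

agreement = percent_agreement(examiner_1, examiner_2)
kappa = cohen_kappa(examiner_1, examiner_2)
print(f"percentage agreement = {agreement:.0%}, kappa = {kappa:.2f}")
```

In this invented example the examiners agree on 7 of 10 patients, yet kappa is noticeably lower than the raw agreement because much of that agreement would be expected by chance when one category dominates. This is why a percentage of agreement reported alone can overstate reliability.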
Riddle and Rothstein[82] examined the intertester reliability of classifications made on 363 patients with LBP referred to 1 of 8 clinics. Therapists (N=49) were given written summaries of the McKenzie system that were based on McKenzie's book.[25] Randomly paired therapists examined each patient independently. The kappa coefficient and percentage of agreement were used to describe reliability. Therapists agreed 39% of the time (K=.26) on which syndrome was present. Therapists with postgraduate training in the McKenzie system agreed on the type of syndrome 27% of the time (K=.15). These data suggest that classifications made using the McKenzie system are unreliable. Modifications of the criteria and definitions appear to be needed to enhance the reliability of classifications. Table 6 summarizes the reliability judgments for the 4 systems. Generalizability Generalizability is the final concept in the critical appraisal approach. To assess generalizability, I reviewed the literature to determine whether the classification systems had been used in other studies and settings. No other studies were found that examined the usefulness of the Bernard and Kirkaldy-Willis system.[22] The system of Delitto and colleagues[23,24] has been examined in other settings, but these studies were conducted by the system developers.[38,39] The generalizability of the system of Delitto and colleagues has yet to be demonstrated. One group of independent investigators has examined the QTF system.[76] The McKenzie system[25] has been studied by several groups and appears to have the strongest evidence for generalizability of the 4 classification systems that were reviewed. Table 6 summarizes the generalizability judgments for the 4 systems. Summary of Critical Appraisal The critical appraisal of the 4 classification systems demonstrates that each classification system has strengths and weaknesses. 
The 4 classification systems have a clearly defined purpose, and the population of interest and setting are either clearly defined or implied. In the area of content validity, the system of Delitto and colleagues[23,24] appears to hold promise, but much is unknown because the system has yet to be fully described in the peer-reviewed literature. The McKenzie system[25] demonstrated some problems in the area of content validity, primarily because of the issue of exhaustiveness. The QTF system[26] does not have mutually exclusive categories, and the work status and symptom duration axes are missing for some categories. The face validity is generally weak for all systems because of the lack of data supporting the reliability and validity of the criteria used to form the categories. Buchbinder et al[21] reported similar findings for classification systems of the neck and upper limb. The Bernard and Kirkaldy-Willis system[22] was especially weak in the area of face validity. With the exception of the system of Delitto and colleagues,[23,24] all systems scored fairly high for the concept of feasibility. More description of the system by Delitto et al is needed to make judgments related to feasibility. Construct validity, reliability, and generalizability are concepts that require published data for making judgments. All 4 of the classification systems that were appraised scored poorly for these concepts. Little published data exist to support the construct validity, reliability, and generalizability of any of the 4 classification systems that were assessed using the critical appraisal approach. Conclusions This article has reviewed many of the classification systems for patients with LBP that have been described in the literature. The review provided descriptions of the structure of each classification system as well as a critique of literature that examined the usefulness of the classification systems. 
Diagrammatic representations of the classification systems were presented and illustrate that similarities and differences exist among the classification systems. A major thrust of the article was to critically appraise 4 of the more common classification systems using the method described by Buchbinder and colleagues.[20,21] The intent of the critical appraisal was not to recommend one classification system over another. Because classification systems differ in purpose and in structure and because no single classification system is clearly dominant over the others, it is premature to suggest that one classification system is best. The critical appraisal highlights the limitations of 4 of the more commonly cited classification systems designed for patients with LBP. Classification systems in current use do not meet many of the measurement standards commonly used to develop health status measures and other instruments. Clinical utility, however, is not determined solely by whether a classification system was developed using sound measurement principles. Only well-designed clinical research can determine whether a classification system enhances the effectiveness of care. Future research in the areas of classification and LBP, therefore, should be done from 2 perspectives. First, future research should examine the construct validity and reliability of existing classification systems. The classification systems described in this article clearly require further study to determine their suitability for clinical practice. Second, research efforts should focus on developing new classification systems that fulfill basic measurement principles elucidated in the study by Buchbinder et al[21] and elaborated on in this article. 
Because classification is a form of measurement, developers of classification systems also should conform to the American Physical Therapy Association's Standards for Tests and Measurements in Physical Therapy Practice.[75] Acknowledgments I thank Dr Rachelle Buchbinder and Jill Binkley for their helpful comments on an earlier draft of this article. References [1] Deyo RA, Phillips WR. Low back pain: a primary care challenge. Spine. 1996;21:2826-2832. [2] White AA, Gordon SL. Synopsis: workshop on idiopathic low back pain. Spine. 1982;7:141-149. [3] Hart GL, Deyo RA, Cherkin DC. Physician office visits for low back pain: frequency, clinical evaluation, and treatment patterns from a national survey. Spine. 1995;20:11-19. [4] Deyo RA, Cherkin DC, Douglas C, Volinn E. Cost, controversy, crisis: low back pain and the health of the public. Annu Rev Public Health. 1991;12:141-156. [5] Frymoyer JW. Epidemiology. In: Frymoyer JW, ed. New Perspectives in Low Back Pain. Park Ridge, Ill: American Academy of Orthopaedic Surgeons; 1988:19-33. [6] Waddell G. Understanding the patient with backache. In: Jayson MI, ed. The Lumbar Spine and Back Pain. Edinburgh, Scotland: Churchill Livingstone; 1992:469-485. [7] Borkan JM, Cherkin DC. An agenda for primary care research on low back pain. Spine. 1996;24:2880-2884. [8] Sahrmann SA. Diagnosis by the physical therapist--a prerequisite for treatment: a special communication. Phys Ther. 1988;68:1703-1706. [9] McCreary C, Naliboff B, Cohen M. A comparison of clinically and empirically derived MMPI groupings in low back pain patients. J Clin Psychol. 1989;45:560-570. [10] McNeill T, Sinkora G, Leavitt F. Psychologic classification of low-back pain patients: a prognostic tool. 
Spine. 1986;11:955-959. [11] Krishnan KR, France RD, Pelton S, et al. Chronic pain and depression, I: classification of depression in chronic low back pain patients. Pain. 1985;22:279-287. [12] Klapow JC, Slater MA, Patterson TL, et al. Psychosocial factors discriminate multidimensional clinical groups of chronic low back pain patients. Pain. 1995;62:340-355. [13] Main CJ, Wood PR, Hollis S, et al. The distress and risk assessment method: a simple patient classification to identify distress and evaluate the risk of poor outcome. Spine. 1992;17:42-51. [14] Talo S, Rytokoski U, Puuka P. Patient classification, a key to evaluate pain treatment: a psychological study in chronic low back pain patients. Spine. 1992;17:998-1011. [15] Krause N, Ragland DR. Occupational disability due to low back pain: a new interdisciplinary classification based on a phase model of disability. Spine. 1994;19:1011-1020. [16] Wiltse LL, Rothman SG. Lumbar and lumbosacral spondylolisthesis: classification, diagnosis, and natural history. In: Wiesel SW, Weinstein JN, Herkowitz H, et al, eds. The Lumbar Spine. Vol 2. 2nd ed. Philadelphia, Pa: WB Saunders Co; 1996:621-654. [17] Van Akkerveeken P. Classification and treatment of spinal stenosis. In: Wiesel SW, Weinstein JN, Herkowitz H, et al, eds. The Lumbar Spine. Vol 2. 2nd ed. Philadelphia, Pa: WB Saunders Co; 1996:724-737. [18] Turk DC, Rudy TE. The robustness of an empirically derived taxonomy of chronic pain patients. Pain. 1990;43:27-35. [19] Sanders SH. Cross-validation of the Back Pain Classification Scale with chronic, intractable pain patients. Pain. 22;3:271-277. [20] Buchbinder R, Goel V, Bombardier C. Working Paper #14: A Methodological Framework for the Critical Appraisal of Classification Systems. Toronto, Ontario, Canada: Institute for Work & Health; 1994. [21] Buchbinder R, Goel V, Bombardier C, Hogg-Johnson S. Classification systems of soft tissue disorders of the neck and upper limb: Do they satisfy methodological guidelines? 
J Clin Epidemiol. 1996;49:141-149. [22] Bernard TN, Kirkaldy-Willis WH. Recognizing specific characteristics of nonspecific low back pain. Clin Orthop. 1987;217:266-280. [23] Delitto A, Erhard RE, Bowling RW. A treatment-based classification approach to low back syndrome: identifying and staging patients for conservative treatment. Phys Ther. 1995;75:470-489. [24] Bowling RW, Truschel DW, Delitto A, Erhard RE. Conservative management of low back pain with physical therapy. In: Erdil M, Dickerson DB, eds. Cumulative Trauma Disorders: Prevention, Evaluation, and Treatment. New York, NY: Van Nostrand Reinhold; 1997:499-594. [25] McKenzie RA. The Lumbar Spine: Mechanical Diagnosis and Therapy. Waikanae, New Zealand: Spinal Publications; 1981. [26] Spitzer WO. Diagnosis of the problem (the problem of diagnosis). In: Scientific Approach to the Assessment and Measurement of Activity-Related Spinal Disorders: A Monograph for Clinicians--Report of the Quebec Task Force on Spinal Disorders. Spine. 1987; 12(suppl):S16-S21. [27] Feinstein AR. Clinical biostatistics, XIII: on homogeneity, taxonomy, and nosography. Clin Pharmacol Ther. 1972;13:114-129. [28] Moffroid MT, Haugh LD, Henry SM, Short B. Distinguishable groups of musculoskeletal low back pain patients and asymptomatic control subjects based on physical measures of the NIOSH Low Back Atlas. Spine. 1994;19:1350-1358. [29] Coste J, Paolaggi JB, Spira A. Classification of nonspecific low back pain, II: clinical diversity of organic forms. Spine. 1992;17:1038-1042. [30] Coste J, Paolaggi JB, Spira A. Classification of nonspecific low back pain, I: psychological involvement in low back pain. Spine. 1992;17: 1028-1037. [31] Marras WS, Parnianpour M, Ferguson SA, et al. The classification of anatomic and symptom-based low back disorders using motion measure models. Spine. 1995;20:2531-2546. [32] Binkley JM, Finch E, Hall J, et al. 
Diagnostic classification of patients with low back pain: report on a survey of physical therapy experts. Phys Ther. 1993;73:138-150. [33] Mooney V. The classification of low back pain. Ann Med. 1989;21: 321-325. [34] Sikorski JM. A rationalized approach to physiotherapy for low-back pain. Spine. 1985;10:571-579. [35] Manual of the International Classification of Diseases, Injuries, and Causes of Death. Vol 1. 9th rev ed. Geneva, Switzerland: World Health Organization; 1977. [36] Buchbinder R, Goel V, Bombardier C. Lack of concordance between the ICD-9 classification of soft tissue disorder of the neck and upper limb and chart review diagnosis: one steel mill's experience. Am J Ind Med. 1996;29:171-182. [37] Cherkin DC, Deyo RA, Volinn E, Lowser JD. Use of the International Classification of Diseases (ICD-9-CM) to identify hospitalization for mechanical low back problems in administrative databases. Spine. 1992;17:817-825. [38] Delitto A, Cibulka MT, Erhard RE, et al. Evidence for an extension/ mobilization category in acute low back pain: a prescriptive validity pilot study. Phys Ther. 1993;73:216-228. [39] Erhard RE, Delitto A, Cibulka MT. Relative effectiveness of an extension program and a combined program of manipulation and flexion and extension exercise in patients with acute low back syndrome. Phys Ther. 1994;74:1093-1100. [40] Sinaki M, Lutness MP, Ilstrup DM, et al. Lumbar spondylolisthesis: retrospective comparison and three-year follow-up of two conservative treatment programs, Arch Phys Med Rehabil. 1989;70:594-598. [41] Stankovic R, Johnell O. Conservative treatment of acute low back pain: a 5-year follow-up study of two methods of treatment. Spine. 1995;29:469-472. [42] Deyo RA, Phillips WR. Low back pain: a primary care challenge. Spine. 1996;21:2826-2832. [43] Deyo RA. Practice variations, treatment fads, rising disability: Do we need a new clinical research paradigm? Spine. 1993;18:2153-2162. [44] Triano JJ, McGregor M, Hondras MA, Brennan PC. 
Manipulative therapy versus education programs in chronic low back pain. Spine. 1995;20:948-955. [45] Dettori JR, Bullock SH, Sutlive TG, et al. The effects of spinal flexion and extension exercises and their associated postures in patients with acute low back pain. Spine. 1995;20:2303-2312. [46] Malmivaara A, Hakkinen U, Aro T, et al. The treatment of acute low back pain: bedrest, exercises, or ordinary activity? N Engl J Med. 1995;332:351-355. [47] Faas A. Exercises: Which ones are worth trying, for which patients, and when? Spine. 1995;21:2874-2879. [48] Feinstein AR. Clinimetrics. New Haven, Conn: Yale University Press; 1987. [49] Kirkaldy-Willis WH, Hill RJ. A more precise diagnosis for low back pain. Spine. 1979;4:102-109. [50] Engel CC, Von Korff M, Katon WJ. Back pain in primary care: predictors of high health-care costs. Pain. 1996;65:197-204. [51] Dionne CE, Koepsell TD, Von Korff M, et al. Predicting long-term functional limitations among back pain patients in primary care settings. J Clin Epidemiol. 1997;50:31-43. [52] Altman R, Asch E, Bloch D, et al. Development of criteria for the classification and reporting of osteoarthritis: classification of osteoarthritis of the knee. Arthritis Rheum. 1986;29:1039-1049. [53] Cook EF, Goldman L. Empiric comparison of multivariate analytic techniques: advantages and disadvantages of recursive partitioning analysis. J Chronic Dis. 1984;37:721-731. [54] Breiman L, Friedman JH, Olshen RA, et al. Classification and Regression Trees. Belmont, Calif: Wadsworth International Group; 1984. [55] Moffroid MT, Haugh LD, Hodous T. Sensitivity and Specificity of the NIOSH Low Back Atlas. 
Washington, DC: US Dept of Health and Human Services, Public Health Service, Centers for Disease Control, National Institute for Occupational Safety and Health; May 1992. NIOSH Final Report RFP 200-89-2917. [56] Aldenderfer MS, Blashfield RK. Cluster Analysis. London, England: Sage Publications Ltd; 1984. [57] Spitzer WO. Magnitude of the problem. In: Scientific Approach to the Assessment and Measurement of Activity-Related Spinal Disorders: A Monograph for Clinicians--Report of the Quebec Task Force on Spinal Disorders. Spine. 1987;12(suppl):S12-S15. [58] Coste J, Spira A, Ducimetiere P, Paologgi JB. Clinical and psychological diversity of nonspecific low-back pain: a new approach towards the classification of clinical subgroups. J Clin Epidemiol. 1991;44: 1233-1245. [59] Diagnostic and Statistical Manual of Mental Disorders. 3rd ed. Washington, DC: American Psychiatric Association; 1980. [60] Robins LN, Helzer JE, Croughan J, Ratcliff KS. National Institute of Mental Health diagnostic interview schedule. Arch Gen Psychiatry. 1981;38:381-389. [61] Marras WS, Lavender SA, Luergans SE. et al. Biomechanical risk factors for occupationally related low back disorders. Ergonomics. 1995;38:377-410. [62] Bose S. Classification using splines. Computational Statistics and Data Analysis. In press. [63] Bounds DG, Lloyd PJ, Mathew BG. A comparison of neural network and other pattern recognition approaches to the diagnosis of low back disorders. Neural Networks. 1990;3:583-591. [64] Levine A. A model for health projections using knowledgeable informants. World Health Stat Q. 1984;37:306-317. [65] Spitzer WO. Scientific Approach to the Assessment and Measurement of Activity-Related Spinal Disorders: A Monograph for Clinicians--Report of the Quebec Task Force on Spinal Disorders. Spine. 1987;12 (suppl) :S1-S59. [66] Spitzer WO. Treatment of activity-related spinal disorders. 
In: Scientific Approach to the Assessment and Measurement of Activity-Related Spinal Disorders: A Monograph for Clinicians--Report of the Quebec Task Force on Spinal Disorders. Spine. 1987;12(suppl): S22-S30. [67] Nunnally JC. Psychometric Theory. 2nd ed. New York, NY: McGraw-Hill Inc; 1978. [68] Bombardier C, Tugwell PX. Methodological considerations in functional assessment. J Rheumatol. 1987;14:6-10. [69] Bergner M. Health status measures: an overview and guide for selection. Annu Rev Public Health. 1987;8:191-210. [70] Kirshner B, Guyatt GH. A methodological framework for assessing health indices. J Chronic Dis. 1985;38:27-36. [71] Kerlinger FN. Foundations of Behavioral Research. 2nd ed. New York, NY: Holt, Rinehart and Winston Inc; 1974. [72] Kopec JA, Esdaile JM, Abrahamowicz M, et al. The Quebec Back Pain Disability Scale: measurement properties. Spine. 1995;20:341-352. [73] Kopec JA, Esdaile JM, Abrahamowicz M, et al. The Quebec Back Pain Disability Scale: conceptualization and development. J Clin Epidemiol. 1996;49:151-161. [74] Feinstein AR. Clinical epidemiology, part 2: the identification of rates of disease. Ann Intern Med. 1968;69:1037-1061. [75] Task Force on Standards for Measurement in Physical Therapy. Standards for tests and measurements in physical therapy practice. Phys Ther. 1991;71:589-622. [76] Atlas SJ, Deyo RA, Patrick DL, et al. The Quebec Task Force classification for spinal disorders and the severity, treatment, and outcomes of sciatica and lumbar spinal stenosis. Spine. 1996;24: 2885-2892. [77] Frymoyer JW, Pope MH, Wilder DG. Segmental instability. In: Wiesel SW, Weinstein JN, Herkowitz H, et al, eds. The Lumbar Spine. Vol 2.2nd ed. Philadelphia, Pa: WB Saunders Co; 1996:783-795. [78] McKenzie R, Donelson R. Mechanical diagnosis and therapy for low back pain: toward a better understanding. In: Wiesel SW, Weinstein JN, Herkowitz H, et al, eds. The Lumbar Spine. Vol 2. 2nd ed. Philadelphia, Pa: WB Saunders Co; 1996:998-1011. 
[79] Delitto A, Shulman AD, Rose SJ, et al. Reliability of a physical examination to classify patients with low back syndrome. Physical Therapy Practice. 1992;1:1-9. [80] NIOSH Low Back Atlas of Standardized Tests and Measurements. Washington, DC: US Dept of Health and Human Services, Public Health Service, Centers for Disease Control, National Institute for Occupational Safety and Health; December 1988. [81] Cibulka MT, Delitto A, Koldehoff R. Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain: an experimental study. Phys Ther. 1988;68:1359-1363. [82] Riddle DL, Rothstein JM. Intertester reliability of McKenzie's classifications of the syndrome types present in patients with low back pain. Spine. 1993;18:1333-1344. [83] Kilby J, Stigant M, Roberts A. The reliability of back pain assessment by physiotherapists, using a "McKenzie algorithm." Physiotherapy Canada. 1990;76:579-583. [84] Donelson R, Silva G, Murphy K. Centralization phenomenon: its usefulness in evaluating and treating referred pain. Spine. 1990;15:211-213. [85] Donelson R, Grant W, Kamps C, Medcalf R. Pain response to sagittal end-range spinal motion: a prospective, randomized, multicenter trial. Spine. 1991;16:S206-S212. [86] Long AL. The centralization phenomenon: its usefulness as a predictor of outcome in conservative treatment of chronic low back pain (a pilot study). Spine. 1995;20:2513-2521. [87] Ponte DJ, Jensen GJ, Kent BE. A preliminary report on the use of the McKenzie protocol versus Williams protocol in the treatment of low back pain. J Orthop Sports Phys Ther. 1984;6:130-139. [88] Nwuga G, Nwuga V. Relative therapeutic efficacy of the Williams and McKenzie protocols in back pain management. Physiotherapy Practice. 1985;1:99-105. [89] Stankovic R, Johnell O. Conservative treatment of acute low-back pain--a prospective randomized trial: McKenzie method of treatment versus patient education in "mini back school." Spine. 1990;15:120-123. [90] Stankovic R, Johnell O. Conservative treatment of acute low-back pain: a 5-year follow-up study of two methods of treatment. Spine. 1995;20:469-472. [91] Williams PC. Examination and conservative treatment for disc lesions of the lower spine. Clin Orthop. 1955;5:28-40. [92] Patrick DL, Deyo RA, Atlas SJ, et al. Assessing health-related quality of life in patients with sciatica. Spine. 1995;20:1899-1909. [93] Ware JE, Sherbourne CD. The MOS 36-Item Short-Form Survey: conceptual framework and item selection. Med Care. 1992;30:473-483.
DL Riddle, PhD, PT, is Associate Professor, Department of Physical Therapy, Virginia Commonwealth University, 1200 East Broad, Richmond, VA 23298-0224 (USA) (driddle@hsc.vcu.edu).