Examining Diagnostic Tests: An Evidence-Based Perspective.


Physical therapy has a rich history of dialogue concerning the meaning of diagnosis within the profession.[1,2] The Guide to Physical Therapist Practice (2nd ed)[3] (the Guide) identifies diagnosis as 1 of 5 interrelated elements of patient management (Fig. 1). The Guide describes diagnosis as composed of 2 aspects: first, the process of evaluating data obtained from the examination, and, second, the end result of such a process.[3] As noted by Delitto and Snyder-Mackler[4] in 1995, debate has focused mostly on clarifying the role and function of the end result of the diagnostic process, with little attention devoted to the first aspect of the Guide's definition: the process of diagnosis. Since the time this observation was made, the paucity of discourse on the diagnostic process has persisted. The Guide identifies diagnosis as a keystone in the process of maximizing patient outcomes, representing the culmination of the examination and evaluation process and directing subsequent decisions related to prognosis and interventions. In view of the role that diagnosis is given in the Guide and the identification of numerous priorities related to diagnosis within the Clinical Research Agenda for Physical Therapy of the American Physical Therapy Association (APTA),[5] we believe that the need for further discussion of the diagnostic process is of paramount professional importance. The purpose of this article is to describe the diagnostic process in physical therapy from an evidence-based perspective. Issues relevant to the appraisal of evidence regarding diagnostic tests are presented, and the integration of evidence into clinical practice is discussed.

Figure 1. Five interrelated elements of patient management. (Reprinted with permission of the American Physical Therapy Association from the Guide to Physical Therapist Practice [2nd ed].[3])

EVALUATION

A dynamic process in which the physical therapist makes clinical judgments based on data gathered during the examination. This process also may identify possible problems that require consultation with or referral to another provider.

DIAGNOSIS

Both the process and the end result of evaluating examination data, which the physical therapist organizes into defined clusters, syndromes, or categories to help determine the prognosis (including the plan of care) and the most appropriate intervention strategies.

PROGNOSIS (Including Plan of Care)

Determination of the level of optimal improvement that may be attained through intervention and the amount of time required to reach that level. The plan of care specifies the interventions to be used and their timing and frequency.

INTERVENTION

Purposeful and skilled interaction of the physical therapist with the patient/client and, if appropriate, with other individuals involved in care of the patient/client, using various physical therapy methods and techniques to produce changes in the condition that are consistent with the diagnosis and prognosis. The physical therapist conducts a reexamination to determine changes in patient/client status and to modify or redirect intervention. The decision to reexamine may be based on new clinical findings or on lack of patient/client progress. The process of reexamination also may identify the need for consultation with or referral to another provider.

OUTCOMES

Results of patient/client management, which include the impact of physical therapy interventions in the following domains: pathology/pathophysiology (disease, disorder, or condition); impairments, functional limitations, and disabilities; risk reduction/prevention; health, wellness, and fitness; societal resources; and patient/client satisfaction.

EXAMINATION

The process of obtaining a history, performing a systems review, and selecting and administering tests and measures to gather data about the patient/client. The initial examination is a comprehensive screening and specific testing process that leads to a diagnostic classification. The examination process also may identify possible problems that require consultation with or referral to another provider.

The Diagnostic Process in Physical Therapy

As explicated by the Guide, diagnosis requires gathering of data through examination. During the initial examination, data are obtained through the history, systems review, and selected tests and measures.[3] Therefore, questions of history and the screening procedures performed during the review of systems are also considered diagnostic tests, along with the various tests performed and measurements obtained. Throughout the examination, data are gathered to evaluate and to form clinical judgments. The result of this diagnostic process is a label, or classification, designed to specifically direct treatment. Individual pieces of data are collected for different purposes during the process.[6,7] Some data are collected to focus the examination on a region of the body or to identify a particular pathology (eg, screening tests). Other data are gathered for the purpose of selecting an intervention (eg, tests used for classification). In determining the accuracy of a diagnostic test, the intended purpose of the test should be considered.[8]

Although, according to the Guide, the end result of the diagnostic process should most often be a classification grouping based largely on impairments and functional limitations instead of pathoanatomy, individual tests may be used to focus the examination or detect conditions not appropriate for physical therapy management. Tests used in this manner need to demonstrate accuracy for identifying the underlying pathoanatomy. An example of a test used for this purpose is the ankle-arm index,[9,10] a ratio of ankle to arm systolic blood pressure, as a method of screening for atherosclerotic diseases. Some studies have shown that low ankle-arm index values are indicative of various atherosclerotic diseases,[9,10] and such a finding during an examination may indicate the need for referral of the patient to a physician. Another example would occur during the examination of an elderly patient with symptoms in both the lumbar and hip regions. The therapist may want to determine whether the hip symptoms indicate degenerative changes of the hip or whether the symptoms are referred from the lumbar region. Various tests and measures might be considered helpful in making this determination; however, the measurements with the highest diagnostic accuracy for detecting degenerative changes of the hip have been shown to be hip medial (internal) rotation range of motion of less than 15 degrees and hip flexion range of motion of less than 115 degrees.[11] The occurrence of these impairments during the examination, therefore, could provide useful diagnostic information, indicating a need to focus the examination on the hip region.
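
The two screening examples in the preceding paragraph reduce to a ratio and a pair of threshold checks. The following Python sketch is illustrative only. The function names are hypothetical, the hip range-of-motion limits (15 and 115 degrees) are the values quoted above,[11] and the ankle-arm index cutoff is left as a caller-supplied argument because no cutoff value is stated here.

```python
def ankle_arm_index(ankle_systolic_mmhg, arm_systolic_mmhg):
    """Ratio of ankle to arm systolic blood pressure."""
    if arm_systolic_mmhg <= 0:
        raise ValueError("arm systolic pressure must be positive")
    return ankle_systolic_mmhg / arm_systolic_mmhg


def low_ankle_arm_index(index, cutoff):
    """True when the index falls below a caller-chosen cutoff.

    The cutoff is a hypothetical parameter; the text gives no value.
    """
    return index < cutoff


def hip_rom_findings(medial_rotation_deg, flexion_deg):
    """Flag the two hip range-of-motion findings quoted in the text."""
    return {
        "medial_rotation_below_15": medial_rotation_deg < 15,
        "flexion_below_115": flexion_deg < 115,
    }
```

The two hip findings are returned separately rather than combined, because the text does not say whether one or both must be present before the examination is refocused on the hip.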

Some diagnostic tests are performed by physical therapists because the results, singularly or in combination with other findings, are believed to indicate that a particular intervention will be most effective in maximizing the patient's outcome. Tests used in this manner form the foundation of classification systems and need to demonstrate accuracy for identifying which interventions might be useful. For example, the observation of frontal-plane displacement of the shoulders relative to the pelvis (ie, lumbar lateral shift) in a patient with low back pain (LBP) is frequently cited as an important examination finding.[12-16] This finding has been considered by some to be diagnostic of a lumbar disk herniation[16,17]; however, the diagnostic accuracy of a lateral shift for detecting the presence of a disk herniation is poor.[14] Other measures, such as the straight-leg-raise test, serve as more accurate diagnostic tests for the presence of a lumbar disk herniation.[18,19] Despite the lack of accuracy for diagnosing a disk herniation, a lateral shift may be meaningful, not based on its ability to indicate a specific pathoanatomical origin, but because it may indicate which intervention (ie, correction of the lateral shift) will be most useful in reducing pain and disability.[15,20] Although it lacks accuracy for detecting a disk herniation, the presence of a lateral shift may still have diagnostic value if it can be demonstrated that patients judged to have a lateral shift who are treated with correction of the shift have outcomes superior to those of patients treated with alternative approaches. No studies to date have investigated this hypothesis.

In summary, both clinicians and researchers need to consider the purpose for which a diagnostic test is performed. Tests may serve to focus and refine the examination, or they may be used for classification with the goal of selecting effective interventions. The same test may have the potential to serve both purposes, whereas some tests may be useful for one purpose or neither purpose. This distinction is important in considering how to use the diagnostic process in an evidence-based manner. The purpose of a test has important implications for examining the evidence in support of the use of the test and applying the test to clinical practice.

Evidence-Based Practice and the Diagnostic Process

Recently, the term "evidence-based practice" has entered the lexicon of physical therapists, as it has for most medical professionals. Evidence-based practice has been defined by proponents as "the conscientious and judicious use of current best evidence in making decisions about the care of individual patients."[21(p71)] Implicit in this definition is the need for a method of determining what constitutes the "best" evidence and how to apply evidence in clinical practice. Substantial effort has gone into the development and dissemination of methods for grading evidence as it relates to treatment effectiveness. Several hierarchical schemes have been promulgated for the purpose of ranking evidence from studies concerning treatment outcomes.[22-24] Although the schemes have some variations, all emphasize the importance of factors such as random assignment to treatment groups, completeness of follow-up, and blinding of examiners and patients in determining the quality of evidence. Although principles for evaluating the quality of an article on treatment outcomes are relatively well known, some authors[8] contend that the question being asked should determine the nature of the evidence to be sought. Therefore, when seeking to answer a diagnostic question, the rules governing the evaluation of studies regarding treatment outcomes are no longer applicable. Rules for judging evidence offered by a study of a diagnostic test have been elucidated; however, they tend to be less widely known and frequently remain unheeded by researchers designing and reporting studies in this area.[25-28] Knowledge of the issues that are important for determining the strength of evidence offered by studies of diagnostic tests is important if the professional dialogue on the diagnostic process in physical therapy is to move forward within a context of evidence-based practice.

Central to the concept of evidence-based practice is the integration of evidence into the management of patients. Integration cannot be reduced to a dichotomy (eg, "use the test or don't use the test") but instead involves a complex interaction between the strength of the evidence offered through use of a test and the unique presentation of an individual patient. Diagnostic tests cannot simply be deemed good or bad. The same test may provide important information for certain patients under certain conditions, but not for others. For example, testing vibration perception is useful for diagnosing a lack of protective sensation and an increased risk of ulceration in the feet of patients with diabetes.[29] However, vibration perception deficits are of more limited diagnostic value in the examination of a patient suspected of having lumbar spinal stenosis.[30]

We will next examine further 2 aspects of evidence-based practice as they apply to the diagnostic process. First, we will discuss 2 of the most important considerations for the evaluation of the strength of evidence related to diagnostic tests: study design and data analysis.[25-27] Second, we will examine the integration of the evidence into the diagnostic process.

Evaluating the Evidence--Study Design

The strength of evidence provided by any study will be substantially affected, and potentially limited, by the study's design. The optimal study design is the one that most effectively reduces susceptibility to bias (ie, a deviation of the results from the truth in a consistent direction).[27,31] For studies investigating treatment outcomes, the design best accomplishing this objective is recognized as the randomized clinical trial. However, if the research question is one of diagnosis, the randomized trial is no longer the most desirable design. The optimal design for examining a diagnostic test, in the opinion of experts, is "a prospective, blind comparison of the test and the reference test in a consecutive series of patients from a relevant clinical population."[26(p1062)] That is, a study investigating a diagnostic test should utilize a prospective cohort design in which all subjects are evaluated using the diagnostic test or tests and a reference standard representing the definitive, or best, criteria for the condition of interest. When performed in this manner, the results of the test and the reference standard can be summarized in a 2 x 2 table, as depicted in Table 1.
Table 1.
Contingency Table Created by Comparing the Results of the
Diagnostic Test and the Reference Standard

                  Reference                    Reference
                  Standard                     Standard
                  Positive                     Negative

Diagnostic test   True positive results (A)    False positive results (B)
  positive

Diagnostic test   False negative results (C)   True negative results (D)
  negative
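
As an illustration of how the cells of Table 1 are filled, the Python sketch below tallies paired results, one diagnostic test result and one reference standard result per subject, into cells A through D. The function name and the six-subject data set are hypothetical and are not drawn from any study cited here.

```python
from collections import Counter


def contingency_table(test_results, reference_results):
    """Tally paired boolean results into the cells A-D of Table 1."""
    if len(test_results) != len(reference_results):
        raise ValueError("each subject needs both a test and a reference result")
    counts = Counter(zip(test_results, reference_results))
    return {
        "A": counts[(True, True)],    # test positive, reference positive
        "B": counts[(True, False)],   # test positive, reference negative
        "C": counts[(False, True)],   # test negative, reference positive
        "D": counts[(False, False)],  # test negative, reference negative
    }


# Hypothetical results for six subjects (not data from any study).
test = [True, True, False, False, True, False]
reference = [True, False, True, False, True, False]
table = contingency_table(test, reference)
```

The length check mirrors the design requirement stated above: every subject must be evaluated with both the diagnostic test and the reference standard before the table can be built.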


Issues beyond the basic design of a study are important for determining the extent to which the potential for bias has been minimized in a study and for determining the strength of the evidence. For studies of diagnostic tests, the most important issues are the reference standard, the diagnostic test, and the population studied. The most important considerations for each issue are summarized in Table 2 and are described below.
Table 2.
Potential Pitfalls for the 3 Most Important Variables Related to the
Design of a Study of a Diagnostic Test

Study Variable       Potential Pitfalls

Reference standard   Insufficiently definitive of the condition of
                       interest
                     Not consistent with the intended purpose
                       of the diagnostic test
                     Not applied consistently in all subjects
                       (verification bias)
                     Not independent of the diagnostic test
                       (incorporation bias)
                     Judged by an examiner who is not blinded
                       to the diagnostic test result and clinical
                       condition of the subjects (review bias)
Diagnostic test      Intended purpose of the test not clearly
                       defined
                     Lack of clarity in the description of the test
                       performance
                     Lack of clarity in the description of the test
                       interpretation
                     Judged by an examiner who is not blinded
                       to the results of the reference standard
                       (review bias)
Study population     Study subjects not representative of the
                       population on whom the test is used
                       clinically (spectrum bias)


The Reference Standard

In a study of a diagnostic test, the test of interest is compared with a reference standard. The reference standard is the criterion that best defines the condition of interest.[32] For example, if a test is performed to determine the presence of a meniscal tear in the knee, the most appropriate reference standard would be observation of the meniscus with arthroscopy. The reference standard should have demonstrated validity that justifies its use as a criterion measurement.[33] If the reference standard is determined to lack validity, little meaningful information can be derived from the comparison.[34] The validity of the reference standard may be compromised by several factors.

First, the reference standard should possess acceptable measurement characteristics, as defined by the APTA's standards for tests and measurements.[33,35] For example, the Ashworth scale has become a commonly used measure of "muscle spasticity."[36] Despite several studies questioning the reliability and construct validity of measurements obtained with the scale in either its original or modified form,[37-40] the Ashworth scale continues to be used as a reference standard.[41] If a reference standard is not reproducible, or lacks a strong conceptual basis for its use, it should not be used as the criterion against which to judge the adequacy of another test.[26]

The reference standard should also be consistent with the intended purpose of the diagnostic test. The majority of reference standards used to study diagnostic tests have been measures of pathoanatomy.[42] If a pathoanatomical reference standard is consistent with the test's purpose, this could serve as a valid measure for comparison. If a diagnostic test is used to select interventions with the goal of maximizing outcomes, a measure of pathoanatomy is unlikely to serve as an appropriate reference standard. As defined by the Guide,[3] outcomes are measures of functional limitations, disability, patient satisfaction, and prevention; therefore, diagnostic tests used to select interventions should be tested against a reference standard related to one of these measures.

An investigation by Burke et al[43] of the Phalen test, which is commonly used for patients with suspected carpal tunnel syndrome (CTS), provides an example of selecting reference standards consistent with the purpose of the diagnostic test. The Phalen test could be examined as a screening test to detect compression of the median nerve or as a test indicating the need for specific interventions (eg, wrist splinting).[43] To reflect these different purposes, Burke et al[43] compared the Phalen test against 2 reference standards: results of a nerve conduction velocity study and patient-reported improvement after a 2-week course of wrist splinting. The nerve conduction study, in which the distal motor and sensory latencies of the median nerve were measured, served as the reference standard for an examination of the Phalen test's ability to detect nerve compression. Patient-reported improvement after 2 weeks served as a reference standard for the accuracy of the Phalen test as an indication of whether wrist splinting was useful as an intervention.

If the reference standard is not consistent with the purpose of the diagnostic test, the results become difficult to interpret. For example, 2 recent studies examined various tests for sacroiliac (SI) region dysfunction in patients receiving physical therapy.[44,45] In both studies, the reference standard was the presence of LBP, judged as positive (patient consulting for LBP) or negative (patient consulting for an upper-extremity condition). By using this reference standard, the researchers examined the accuracy of the SI region tests in distinguishing between individuals with and without LBP. It does not appear, however, that this reference standard is consistent with the purpose of these tests. In the literature, SI region tests are proposed to distinguish patients thought to have SI region dysfunction from those with LBP related to other syndromes[46-48] or to determine whether a patient is likely to respond to a particular intervention designed for SI region dysfunction (eg, SI region manipulation).[12,49] The results of studies using a reference standard of the presence of LBP when the issue is whether there is SI region dysfunction are difficult to interpret because this standard is inconsistent with the purposes for which the tests are commonly used. Determining the usefulness of the tests based on the results of these studies may lead to erroneous conclusions.

Improper use of reference standards in a study may compromise the validity of the research. The reference standard should be applied consistently to all subjects.[25,26,32] If the reference standard is expensive or difficult to obtain, it may not be performed on subjects with a low probability of having the condition. Verification (or workup) bias occurs when not all subjects are assessed by use of the reference standard in the same way.[27,50] A common example of verification bias is demonstrated by a study of diagnostic accuracy of tests for posterior cruciate ligament (PCL) integrity.[51] The reference standard was magnetic resonance imaging (MRI), an appropriate pathoanatomical reference standard for PCL integrity. A group of individuals with no history of knee injury were included in the study. These individuals were assumed to have an intact PCL without MRI verification.[51] Another example comes from a study of a screening examination using goniometry for detecting cerebral palsy in preterm infants.[52] Goniometric measurements were taken at the hip, knee, and ankle. If the range of motion measurements fell outside a normal range, the child was believed to be at an increased risk of having cerebral palsy.[52] Infants with a high suspicion of cerebral palsy were referred to a neurologist whose evaluation then served as the reference standard. Only 97 of 721 infants were referred, and a less rigorous reference standard consisting of chart reviews was used for the remaining subjects.[52] The impact of verification bias is related to the likelihood that an individual not assessed with the reference standard could have the condition. It may be unlikely that an individual with no history of knee injury would have a compromised PCL. The uncertain adequacy of a chart review for identifying cerebral palsy may leave the second study more susceptible to verification bias. Verification bias can lead to an overestimation of diagnostic accuracy.[26,53]

The reference standard should also be independent of the diagnostic test. Incorporation bias occurs when the reference standard includes the diagnostic test being studied.[54] An example comes from a study of single-leg hop tests for diagnosing anterior cruciate ligament (ACL) integrity.[55] The authors evaluated 50 subjects with a chronic ACL-deficient knee and 60 subjects with no prior knee injury. All subjects performed the hop tests. The reference standard was defined as the mean (± 2 standard deviations) of the absolute value of the right-to-left difference in time to complete the test in the subjects without knee injury. The authors then applied this standard to the results of all 110 subjects and found high levels of diagnostic accuracy for the tests in distinguishing the 2 groups of subjects.[55] This result is not surprising given that the interpretation of the reference standard was based on the test results of the subjects without knee injury. Incorporation bias is also likely to inflate the accuracy of a diagnostic test.[26]

The reference standard should be judged by an individual who does not know the diagnostic test results and the overall clinical presentation of the subject.[26,53,56] If blinding is not maintained, judgments of the reference standard may be influenced by expectations based on knowledge of the test results or by some other clinical information.[56] Review bias may occur if either the reference standard or the diagnostic test is judged by an individual with knowledge of the other result.[53]

The Diagnostic Test

Practitioners and researchers should be able to describe diagnostic tests in sufficient detail to permit replication of the tests by other therapists. We contend that test descriptions should cover 3 aspects: the intended use, physical performance, and scoring criteria. The intended clinical use of a test is an important consideration, although this aspect of the test description is often overlooked by researchers and practitioners.[25] As indicated previously, a diagnostic test may be used for a variety of purposes. If researchers do not clarify the intended purpose of a test under study, it is difficult to assess the appropriateness of the reference standard. When clinicians do not consider the purpose of diagnostic tests used in practice, they are susceptible to viewing tests as either good or bad, without recognition that a test may be useful for one purpose, but inappropriate for another purpose. For example, the KT-1000 knee arthrometer(*) possesses a high degree of diagnostic accuracy for distinguishing between individuals with and without ACL deficiency,[57,58] but it has not been shown to be useful for assisting in the selection of an intervention (surgical versus nonsurgical).[59]

The manner in which a test is performed should be detailed. A study's results can be generalized to a clinical setting only if a test is performed as it was performed in the study. For example, Katz and Fingeroth[60] compared various tests for ACL integrity against a reference standard of observation of the ligament during arthroscopy. The Lachman test demonstrated very good diagnostic accuracy for ACL integrity; however, the test was performed with the subjects under anesthesia. If the results were accepted without consideration of the manner in which the test was performed, a clinician may have unrealistic expectations of the usefulness of test results when applying the test to patients who are not under anesthesia. This is illustrated by a study of the Lachman test performed by physical therapists in a clinical setting that led to lower levels of diagnostic accuracy.[61]

The description of a diagnostic test should include the criteria used to determine positive and negative results. Many tests used in physical therapy, though well known, may have varied or unclear grading criteria. Testing for centralization in patients with LBP is an example. There is general agreement that centralization is an important diagnostic finding,[62-64] but no such consensus exists on precisely what constitutes centralization. Some therapists use definitions strictly based on movement of symptoms from distal to proximal,[16,65] whereas other therapists define centralization to include diminishment of pain during testing.[63] Such disagreements are not unique to judgments of centralization, and it is crucial for authors to clarify how they defined positive and negative results. It is also important to indicate whether the test cannot be performed or the results are indeterminate for any subjects. Because these occurrences could influence the clinical use of a test, they should be reported and explained.[53,66] Measurements obtained with a test also are susceptible to review bias, as previously explained. Review bias can be avoided if the measurements and judgments are done by individuals who are blinded to the reference standard. Diagnostic accuracy may be overestimated if blinding is not maintained.[26]

The Study Population

Subjects included in a study of a diagnostic test should consist of individuals who would be likely to undergo the test in clinical practice.[26,53] This also means that individuals who are positive on the reference standard should reflect a continuum of severity, from mild to severe, whereas those who are negative with respect to the reference standard should have conditions commonly confused with the condition of interest and should not be a group of control subjects without impairments or disabilities.[34] Many of the tests already cited in this perspective have used groups of subjects without impairments or disabilities who were chosen out of convenience. When subjects without any symptoms, impairments, or disabilities are tested, this does not reflect the way most tests are applied clinically, where distinctions between individuals with similar symptoms are required. Any test should at least be expected to demonstrate greater diagnostic accuracy when attempting to distinguish between individuals without symptoms and those with severe conditions.[56] Spectrum (or selection) bias may occur when study subjects are not representative of the population on whom the test is typically applied in practice.[26] Spectrum bias, in our opinion, can profoundly affect the results of a study.[26]

The best method of ensuring a representative sample and avoiding spectrum bias is to utilize a prospective cohort design with a consecutive group of subjects from a clinical population. Use of a case-control design with retrospective selection of subjects for inclusion makes a study susceptible to spectrum bias.[53] This type of design occurs when a group of subjects with the condition of interest and a group of comparison subjects are assembled for examination. Even if the use of subjects without known impairments or disabilities is avoided, case-control designs can distort the typical mix of subjects seen in a clinical setting by artificially controlling the prevalence and presentation of the condition of interest, potentially affecting the accuracy and utility of a diagnostic test.[28,54,67,68]

A comparison of studies examining the diagnostic accuracy of the Phalen test for detecting median nerve compression in the carpal tunnel provides an example of the impact of spectrum bias (Tab. 3). The study by Burke et al[43] and 2 other studies[69,70] compared the Phalen test against a reference standard involving nerve conduction velocity studies. Similar criteria for judging the reference standard were used in the 3 studies. The description of how the diagnostic test was performed and the grading criteria were nearly identical in 2 studies, but they were not reported in the third study. The greatest difference among the studies was the subjects. In 2 studies,[43,70] there were cohorts of subjects with symptoms consistent with CTS. In the third study,[69] subjects included those with symptoms consistent with CTS, a few with known diagnoses other than CTS but with a similar presentation (eg, diabetic peripheral neuropathy), and 25 subjects (50 hands tested) without symptoms consistent with CTS. Inclusion of people without symptoms creates a spectrum bias by assembling a population unrepresentative of the clinical population in which the test is typically used. As would be anticipated, the study most subject to spectrum bias also demonstrated the highest level of diagnostic accuracy for the Phalen test (Tab. 3).
Table 3.
Comparison of Studies Examining the Accuracy of the Phalen Test for
Diagnosing Compression of the Median Nerve Within the Carpal Tunnel(a)

                     Kuhlman and
                     Hennessey[70]             Burke et al[43]

Reference standard   Nerve conduction study    Nerve conduction study
                       (any one of the           (any one of
                       following):               the following):
                     1. Median motor onset     1. Minimum median
                        distal latency ≥1.0       sensory distal
                        ms longer than ulnar      latency measured at
                        motor onset distal        14 cm of 4.1 ms and
                        latency                   a minimum motor
                                                  distal latency at 8
                                                  cm of 4.4 ms
                     2. Median sensory peak    2. Median sensory
                        distal latency to         distal latency >0.5
                        the thumb ≥0.5 ms         ms longer than ulnar
                        longer than radial        sensory distal
                        sensory peak distal       latency
                        latency to the thumb
                     3. Median sensory peak
                        distal latency to
                        the long finger ≥0.5
                        ms longer than ulnar
                        sensory peak distal
                        latency to the small
                        finger
Diagnostic test      The subject actively      Not described
  performance          places the wrists in
                       complete, but
                       unforced, flexion for
                       60 s
Diagnostic test      If numbness or            Not described
  grading              paresthesia is
                       produced or
                       exaggerated in the
                       hand, the test is
                       positive
Study population     180 consecutive           186 subjects (290
                       subjects (228 hands)      hands) referred for
                       referred for              splinting with a
                       electrodiagnostic         history consistent
                       consultation with         with CTS
                       suspected CTS
Sensitivity (%)      51 (43, 60)               51 (44, 58)
  (95% CI)
Specificity (%)      76 (66, 83)               54 (35, 71)
  (95% CI)
Overall accuracy     60                        52
  (%)

                     Gellman et al[69]

Reference standard   Nerve conduction study (any one of the
                       following):
                     1. Minimum median sensory distal
                        latency of >3.5 ms, or of
                        1 ms more than the opposite
                        side
                     2. Minimum median motor distal
                        latency of 4.5 ms, or of 1 ms more
                        than the opposite side
Diagnostic test      The subject actively places the wrist
  performance          in complete, but unforced, flexion
                       for 60 s
Diagnostic test      If numbness and tingling are produced
  grading              or exaggerated in the median nerve
                       distribution of the hand, the test
                       is positive
Study population     106 hands with symptoms consistent
                       with CTS, 16 hands with symptoms
                       commonly confused with CTS, 50
                       asymptomatic hands
Sensitivity (%)      71 (59, 81)
  (95% CI)
Specificity (%)      80 (67, 89)
  (95% CI)
Overall accuracy     72
  (%)

(a) Accuracy represents the percentage of correct results on the
diagnostic test when compared with the reference standard. CTS=carpal
tunnel syndrome, CI=confidence interval.


Evaluating the Evidence--Data Analysis

The basic layout for the data analysis in a study of a diagnostic test is depicted in Table 1. The result for each subject fits into only 1 of the 4 categories based on a comparison of the results of the diagnostic test and the diagnosis based on the reference standard. Results in categories "a" (true positive) and "d" (true negative) represent correct test results, whereas categories "b" (false positive) and "c" (false negative) contain erroneous results. From this basic layout, several statistics can be calculated (Tab. 4).[56]
Table 4.
Statistics Commonly Used to Examine Diagnostic Tests

Statistic             Formula                   Description

Overall accuracy      (a + d)/(a + b + c + d)   The proportion of test
                                                  results that are
                                                  correct
Positive predictive   a/(a + b)                 Given a positive test
  value                                           result, the
                                                  probability that the
                                                  individual has the
                                                  condition
Negative predictive   d/(c + d)                 Given a negative test
  value                                           result, the
                                                  probability that the
                                                  individual does not
                                                  have the condition
Sensitivity           a/(a + c)                 Given that the
                                                  individual has the
                                                  condition, the
                                                  probability that the
                                                  test will be positive
Specificity           d/(b + d)                 Given that the
                                                  individual does not
                                                  have the condition,
                                                  the probability that
                                                  the test will be
                                                  negative
Positive likelihood   sensitivity/(1 -          Given a positive test
  ratio                 specificity)              result, the increase
                                                  in odds favoring the
                                                  condition
Negative likelihood   (1 - sensitivity)/        Given a negative test
  ratio                 specificity               result, the decrease
                                                  in odds favoring the
                                                  condition


The overall accuracy of a test can be determined by dividing the number of correct results by the total number of tests conducted.[56] A perfect test would have an overall accuracy of 100%; however, no test used in clinical practice can be expected to demonstrate this level of accuracy, and the goal is to characterize the nature of the errors.[71] The overall accuracy of a test does not distinguish between false positive and false negative results and therefore has limited usefulness.[72]
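
The calculations described for the 2 x 2 layout can be sketched in a few lines of Python. This is only an illustration: the function name and the cell counts are hypothetical and do not come from any study cited here, and the positive predictive value is taken as a/(a + b), consistent with its description as the proportion of positive results that are correct.

```python
def diagnostic_stats(a, b, c, d):
    """Compute common accuracy statistics from 2 x 2 cell counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return {
        "overall_accuracy": (a + d) / (a + b + c + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # The ratios are undefined when specificity is exactly 1 or 0;
        # that edge case is not handled in this sketch.
        "positive_lr": sensitivity / (1 - specificity),
        "negative_lr": (1 - sensitivity) / specificity,
    }


# Hypothetical counts, chosen only for illustration
stats = diagnostic_stats(a=40, b=10, c=20, d=30)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Printing sensitivity and specificity alongside overall accuracy makes the limitation noted above visible: two tests can share the same overall accuracy while differing sharply in their mix of false positive and false negative results.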

Sensitivity, Specificity, and Predictive Values


Sensitivity and specificity values are calculated vertically from the 2 x 2 table and represent the proportion of correct test results among individuals with and without the condition, respectively. Sensitivity (or true positive rate) is the proportion of subjects with the condition who have a positive test result. Specificity (or true negative rate) is the proportion of subjects without the condition who have a negative test result.[42]

Predictive values are calculated horizontally from the 2 x 2 table and represent the proportion of positive or negative test results that are correct. The positive predictive value is the proportion of subjects with a positive test result who actually have the condition. The negative predictive value is the proportion of subjects with a negative test result who do not have the condition.[73]

Predictive values might appear to be more useful for applying the results of a study because these values relate to the way these tests are used in clinical decision making: given a test result (positive or negative), what is the probability that the result is correct? Sensitivity and specificity values work in the opposite direction: given the condition is present or absent, what is the probability that the correct test result will be obtained? Despite their apparent usefulness, predictive values can be deceptive because they are highly dependent on the prevalence of the condition of interest in the sample. Positive predictive values will be lower and negative predictive values will be higher in samples with a low prevalence of the condition. If prevalence is high, the trends reverse.[74]
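
The dependence of predictive values on prevalence can be illustrated with a short sketch. The sensitivity and specificity of 0.80 and the prevalence levels are hypothetical values chosen for illustration, and the function name is an assumption; the arithmetic simply fills the 4 cells of the 2 x 2 table as proportions of the sample.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics held fixed while prevalence varies
for prevalence in (0.05, 0.25, 0.50, 0.75):
    ppv, npv = predictive_values(0.80, 0.80, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held constant, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is the pattern described above.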

Sensitivity and specificity values remain fairly consistent across different prevalence levels.[42] A comparison of 2 studies examining the diagnostic accuracy of weakness of the extensor hallucis longus muscle for detecting L5 radiculopathy illustrates this point. Lauder et al[75] studied consecutive patients referred to physical medicine physicians with a suspicion of lumbar radiculopathy (Tab. 5). The reference standard was electromyographic findings, and, based on this standard, the prevalence of L5 radiculopathy was 11% (10/94). Kortelainen et al[76] studied patients referred for surgery with symptoms of sciatica (Tab. 6). Based on a reference standard of surgical observation of the nerve root, the prevalence of L5 radiculopathy was 57% (229/403). The sensitivity and specificity values remained fairly consistent. The predictive values, however, varied greatly between studies due to disparate prevalence rates, with the study with higher prevalence of radiculopathy showing a higher positive predictive value.
Table 5.
Accuracy of Weakness of the Extensor Hallucis Longus Muscle for
Diagnosing L5 Radiculopathy in the Study by Lauder et al[75],(a)

             L5 Radiculopathy   L5 Radiculopathy
                 Present          Not Present

Weakness     6 (A)              38 (B)             Positive predictive
  positive                                           value: (6/44)=
                                                     .14, 95% CI: .06,
                                                     .27
Weakness     4 (C)              46 (D)             Negative predictive
  negative                                           value: (46/50)=
                                                     .92, 95% CI: .81,
                                                     .97
             Sensitivity:       Specificity:
               (6/10)=.60         (46/84)=.55
             95% CI: .31, .83   95% CI: .44, .65

(a) Prevalence of L5 radiculopathy in this study was 11%. CI=confidence
interval.
Table 6.
Accuracy of Weakness of the Extensor Hallucis Longus Muscle
for Diagnosing L5 Radiculopathy in the Study by Kortelainen
et al[76](a)

             L5 Radiculopathy   L5 Radiculopathy
                 Present          Not Present

Weakness     126 (A)            54 (B)             Positive predictive
  positive                                           value: (126/180)=
                                                     .70, 95% CI: .63,
                                                     .76
Weakness     103 (C)            120 (D)            Negative predictive
  negative                                           value: (120/223)=
                                                     .54, 95% CI: .47,
                                                     .60
             Sensitivity:       Specificity:
               (126/229)=.55      (120/174)=.69
             95% CI: .49, .61   95% CI: .62, .75

(a) Prevalence of L5 radiculopathy in this study was 57%.
CI=confidence interval.
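
The method used to obtain the confidence intervals in Tables 5 and 6 is not stated. As an assumption, the sketch below uses the Wilson score interval, one commonly used method for proportions; the function name is hypothetical. Applied to the sensitivity reported for the study by Lauder et al (6 of 10), it gives limits of roughly .31 and .83.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Sensitivity in the study by Lauder et al: 6 of 10 subjects
low, high = wilson_interval(6, 10)
print(f"sensitivity 95% CI: {low:.2f}, {high:.2f}")
```

The width of this interval reflects the small number of subjects with the condition (10), a reminder that a point estimate alone can overstate what a study has shown.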


Sensitivity and specificity values provide useful information for interpreting the results of diagnostic tests. Sensitivity represents the ability of the test to recognize the condition when present. A highly sensitive test has relatively few false negative results. High test sensitivity, therefore, attests to the value of a negative test result.[77,78] Sackett et al[42] have advocated using the acronym "SnNout" (if sensitivity [Sn] is high, a negative [N] result is useful for ruling out [out] the condition). High sensitivity indicates that a test can be used for excluding, or ruling out, a condition when it is negative, but does not address the value of a positive test. Specificity indicates the ability to use a test to recognize when the condition is absent. A highly specific test has relatively few false positive results, and therefore speaks to the value of a positive test.[77,78] The acronym applicable in this case is "SpPin" (if specificity [Sp] is high, a positive [P] result is useful for ruling in [in] the condition).[42]

Unfortunately, few tests possess both high sensitivity and specificity. Knowledge of the sensitivity and specificity of a test can help clinicians refine clinical decision making by allowing them to weigh the relative value of positive or negative results. A recent study[79] examining the diagnostic accuracy of clinical tests for detecting subacromial impingement syndrome provides an example. Six tests were compared against a reference standard of MRI of the supraspinatus tendon. No test had high levels of both sensitivity and specificity (Tab. 7). The Hawkin test was the most sensitive, and the drop arm test was most specific.[79] The high sensitivity (92%) indicates that a negative Hawkin test is useful for ruling out subacromial impingement. The low specificity (25%), however, signifies that a positive Hawkin test has little meaning. The drop arm test was very specific (97%), indicating that a positive test is useful for confirming subacromial impingement. The sensitivity of the drop arm test was poor (8%), revealing a high number of false negative results and attesting to the lack of meaning of a negative result.
Table 7.
Accuracy Statistics of Clinical Tests for Diagnosing
Subacromial Impingement Syndrome(a)

Test                   Sensitivity (%)   Specificity (%)

Hawkin                 92 (84, 96)       25 (14, 42)
Neer                   89 (80, 94)       31 (17, 46)
Horizontal adduction   82 (73, 89)       28 (15, 43)
Speed                  69 (58, 77)       56 (40, 71)
Yergason               37 (28, 48)       86 (70, 94)
Painful arc            33 (23, 43)       81 (63, 90)
Drop arm                8 (4, 16)        97 (85, 100)

Test                   Positive LR        Negative LR

Hawkin                 1.2 (1.0, 1.5)     0.32 (0.12, 0.76)
Neer                   1.3 (1.0, 1.6)     0.37 (0.18, 0.86)
Horizontal adduction   1.1 (0.90, 1.4)    0.65 (0.32, 1.4)
Speed                  1.5 (1.0, 2.3)     0.57 (0.37, 0.87)
Yergason               2.7 (1.1, 6.0)     0.73 (0.59, 0.91)
Painful arc            1.7 (0.76, 3.3)    0.84 (0.68, 1.1)
Drop arm               2.8 (0.35, 21.7)   0.95 (0.87, 1.3)

(a) Numbers in parentheses represent 95% confidence intervals,
which were estimated from the data presented in the study.[79]
LR=likelihood ratio.


Likelihood Ratios

Sensitivity and specificity values provide useful information; however, they have several shortcomings. These values work in the opposite direction of clinical decision making. Clinicians have knowledge of the test result and want to infer the probability that the result is correct. Sensitivity and specificity values infer the probability of a correct test, given the result of the reference standard. Sensitivity and specificity values can be used as independent estimates of the usefulness of negative and positive test results, but this information cannot be combined and analyzed simultaneously. The actual performance of a diagnostic test is not only related to sensitivity and specificity values, but also dependent on the pretest probability that the condition is present. Useful tests should produce large shifts in probability once the result of the test is known.[77,80,81] Sensitivity and specificity values cannot be used to quantify the shift in probability of the condition given a certain test result.

The best statistics for summarizing the usefulness of a diagnostic test are likelihood ratios.[82,83] Likelihood ratios (LRs) overcome the difficulties cited by reflecting a combination of the information contained in sensitivity and specificity values into a ratio that can be used to quantify shifts in probability once the diagnostic test results are known.[84] The positive LR is calculated as sensitivity/(1 - specificity) and indicates the increase in odds favoring the condition given a positive test result. The negative LR is calculated as (1 - sensitivity)/ specificity and indicates the change in odds favoring the condition given a negative test result.[27] An LR of 1 indicates that the test result does nothing to change the odds favoring the condition, whereas an LR greater than 1 increases the odds of the condition, and an LR less than 1 diminishes the odds of the condition. Table 8 provides a guide for interpreting the strength of an LR.[83]
Table 8.
A Guide to Interpretation of Likelihood Ratio (LR) Values(a)

Positive   Negative
LR         LR         Interpretation

>10        <0.1       Generate large and often conclusive
                        shifts in probability
5-10       0.1-0.2    Generate moderate shifts in probability
2-5        0.2-0.5    Generate small, but sometimes important,
                        shifts in probability
1-2        0.5-1      Alter probability to a small, and rarely
                        important, degree

(a) Adapted from Jaeschke et al.[83]
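
Because an LR acts on odds rather than on probability, quantifying the shift requires converting the pretest probability to odds, multiplying by the LR, and converting back. The sketch below shows that arithmetic; the function name and the pretest probability of 0.50 are hypothetical and chosen only for illustration.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pretest probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


pretest = 0.50  # hypothetical pretest probability
for lr in (10, 5, 2, 1, 0.5, 0.2, 0.1):
    post = posttest_probability(pretest, lr)
    print(f"LR={lr:<4}  posttest probability={post:.2f}")
```

An LR of 1 leaves the probability unchanged, and the size of the shift grows as the LR moves farther from 1 in either direction, in line with the categories in Table 8.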


A positive LR indicates the shift in odds favoring the condition when the test is positive. It is desirable, therefore, to have a large positive LR. Tests with a large positive LR generally have high specificity because both values attest to the usefulness of a positive test. In the study by Calis et al,[79] for example, the drop arm test had the highest specificity (97%) for determining the presence of subacromial impingement syndrome and also the largest positive LR (2.8) (Tab. 7). Because the negative LR indicates the change in odds favoring the condition given a negative result, a small negative LR will indicate a test that is useful for ruling out a condition when negative. Small negative LR values correspond to high sensitivity, as illustrated by the subacromial impingement syndrome tests. The highest sensitivity and smallest negative LR were found for the Hawkin test. A comparison of the horizontal adduction and Speed tests indicates the importance of combining sensitivity and specificity values. The sensitivity of the Speed test (69%) was less than that of the horizontal adduction test (82%). However, because the Speed test was substantially more specific than the horizontal adduction test (56% versus 28%), the negative LR was smaller for the Speed test (0.57 versus 0.65).
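
The two LR formulas and the bands of Table 8 can be sketched in a few lines of code. This is an illustration only: the function names are hypothetical, and the inputs are the rounded percentages quoted for the Speed and horizontal adduction tests, so the computed negative LRs (about 0.55 and 0.64) differ slightly from the published values of 0.57 and 0.65, which were presumably derived from unrounded data.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) from proportions between 0 and 1.

    Assumes specificity is neither 0 nor 1, otherwise a division by zero occurs.
    """
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


def interpret_lr(lr):
    """Map an LR onto the interpretation bands of Table 8."""
    # A negative LR (< 1) is put on the same scale by taking its reciprocal:
    # 0.1 -> 10, 0.2 -> 5, 0.5 -> 2, matching the paired columns of Table 8.
    if lr == 0:
        strength = float("inf")
    else:
        strength = lr if lr >= 1 else 1 / lr
    if strength > 10:
        return "large, often conclusive shift"
    if strength >= 5:
        return "moderate shift"
    if strength >= 2:
        return "small, sometimes important shift"
    return "small, rarely important shift"


# Rounded sensitivity and specificity values quoted in the text.
speed_pos, speed_neg = likelihood_ratios(0.69, 0.56)
adduction_pos, adduction_neg = likelihood_ratios(0.82, 0.28)

print(f"Speed test negative LR: {speed_neg:.2f} ({interpret_lr(speed_neg)})")
print(f"Horizontal adduction negative LR: {adduction_neg:.2f} "
      f"({interpret_lr(adduction_neg)})")
```

Despite its lower sensitivity, the Speed test yields the smaller negative LR, which is the point made in the comparison above.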

Diagnostic tests measured on a continuous scale are frequently transformed into multilevel ordinal outcomes based on cutoff scores. When this is the case, LR values can be calculated for each level of the test.[42] Riddle and Stratford[85] illustrated this process using the Berg Balance Test. Different test results were used as cutoff scores, and the LR values were calculated for each level. A more detailed explanation of the process can be obtained from the article by Riddle and Stratford.[85]

Evaluating the Evidence--Additional Considerations

Confidence Intervals


As is true of all statistics, sensitivity, specificity, and LR values are taken from a sample and represent an estimate of the true value that could be found in the population.[84] The confidence interval (CI) attests to the precision of this estimate. A 95% CI is the most common and indicates a range of values within which the population value would lie with 95% certainty.[86] If the CI is wide and contains values that are not clinically important, the usefulness of the measure may be questionable. That is, if another estimate were taken from a different sample, the statistic calculated might be substantially different. In the study by Calis et al,[79] for example, the drop arm test had the largest positive LR among the tests for subacromial impingement (2.8), but the 95% CI was wide (0.35-21.7), indicating that the positive LR estimated from this sample of 120 patients was not very precise. Formulas for calculating CI ranges for diagnostic statistics have been published.[84,86,87] As is apparent in Table 7, the recommended formulas do not result in a symmetrical CI about the statistical estimate.[88] The asymmetry is more pronounced as the sensitivity and specificity values move farther from 50% in either direction.[86] The width of the CI will also be related to the sample size and the amount of variability in the test being studied. Reporting of a CI with any diagnostic statistic is recommended to permit an assessment of the precision of any estimate of diagnostic accuracy.[86,89]
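
The asymmetry of an LR confidence interval can be seen in a short sketch. The article cites published formulas without reproducing them, so the method below is an assumption: it is the commonly used log method, in which the interval is built symmetrically on the logarithm of the LR and then transformed back. The function name is hypothetical, and the counts are those reported in Table 10 for position sense testing.

```python
import math


def lr_with_ci(a, b, c, d, z=1.96):
    """Positive and negative LR with log-method CIs from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    Assumes no cell is zero (a zero cell would cause a division by zero).
    """
    results = {}
    # Each entry: (numerator count, its column total,
    #              denominator count, its column total)
    cells = {
        "positive": (a, a + c, b, b + d),  # sensitivity / (1 - specificity)
        "negative": (c, a + c, d, b + d),  # (1 - sensitivity) / specificity
    }
    for name, (x1, n1, x2, n2) in cells.items():
        lr = (x1 / n1) / (x2 / n2)
        # Standard error of ln(LR) under the log method.
        se_log = math.sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
        lower = math.exp(math.log(lr) - z * se_log)
        upper = math.exp(math.log(lr) + z * se_log)
        results[name] = (lr, lower, upper)
    return results


# Counts from Table 10 (position sense test versus monofilament test).
ci = lr_with_ci(a=34, b=2, c=135, d=126)
for name, (lr, lower, upper) in ci.items():
    print(f"{name} LR = {lr:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

Under these assumptions the positive LR interval comes out at roughly 3.2 to 52.6, close to the 3.1 to 52.2 reported in Table 10, and it extends much farther above the point estimate than below it.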

The Chi-Square Statistic

Studies of diagnostic tests comparing categorical results of a test and a reference standard are frequently analyzed with a chi-square statistic and accompanying significance level. The chi-square statistic tests the hypothesis that the test results and reference standard have no association, but it does not indicate the strength or direction of any relationship that exists.[90] Chi-square statistics and associated probability values cannot assist in the process of probability revision based on test results in individual patients and, therefore, cannot be considered evidence-based statistics.[91]

Conclusions based strictly on chi-square analyses can be misleading without information on sensitivity, specificity, and LR values. The study by Burke et al[43] on diagnostic tests for patients with suspected CTS illustrates this concern. One diagnostic test examined by the authors was the patient self-report of hand swelling, graded as present (positive) or absent (negative), against a reference standard of response to 2 weeks of splinting. The reference standard was graded as "positive response to splinting" or "no response to splinting" based on patient self-report.[43] The authors chose to analyze the data using a chi-square test only and found a statistically significant result (P=.028) (Tab. 9). The authors concluded, "These data suggest that the complaint of subjective swelling in the hand or wrist may be one of the most important findings from the history and clinical examination for determining which patients will, in fact, respond to conservative treatment (splinting)."[43] The sensitivity, specificity, and LR values calculated from the data do not support this conclusion. The sensitivity (33.3%) and specificity (49.8%) were low, resulting in a positive LR of 0.66 and negative LR of 1.34 (Tab. 9). Both LR values are close to 1, with the negative LR slightly greater than 1 and the positive LR slightly less than 1, indicating that the weak relationship between a complaint of swelling and response to splinting is in an inverse direction (ie, a negative complaint of swelling is associated with an increased likelihood of response to splinting). Because evidence-based statistics were not reported, we believe that the authors overinterpreted the utility of the test. This example illustrates the necessity of reporting sensitivity, specificity, and LR values to permit an appropriate assessment of a diagnostic test and interpretation for individual patient decision making.
Table 9.
Comparison of the Results of the Patients' Complaints of Hand
Swelling and Results of a 2-Week Period of Splinting in a Group of
Patients With Suspected Carpal Tunnel Syndrome[43](a)

                      Positive               Negative
                      Response               Response
                      to Splinting           to Splinting

Complaint of          17 (cell A)            120 (cell B)
  swelling positive

Complaint of          34 (cell C)            119 (cell D)
  swelling negative

                      Sensitivity (%)=33.3   Specificity (%)=49.8
                      95% CI: 20.4, 46.3     95% CI: 43.5, 56.1

[chi square]=4.80, P=.028
Positive likelihood ratio=0.66, 95% CI: 0.44, 1.0
Negative likelihood ratio=1.34, 95% CI: 1.06, 1.69

(a) The chi-square test shows statistical significance, but the
likelihood ratio values indicate a lack of accuracy for the complaint
of hand swelling in diagnosing a positive response to splinting.
CI=confidence interval.
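
The contrast between the chi-square result and the LR values in Table 9 can be reproduced from the four cell counts. The sketch below is illustrative: the function name is hypothetical, and it assumes the authors used the uncorrected Pearson chi-square, which is the version that matches the reported value of 4.80.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and P value for a 2x2 table."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Cell counts from Table 9 (complaint of swelling versus response to splinting).
a, b, c, d = 17, 120, 34, 119

chi2, p_value = chi_square_2x2(a, b, c, d)
sensitivity = a / (a + c)
specificity = d / (b + d)
positive_lr = sensitivity / (1 - specificity)
negative_lr = (1 - sensitivity) / specificity

print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
print(f"positive LR = {positive_lr:.2f}, negative LR = {negative_lr:.2f}")
```

The same four numbers give a statistically significant chi-square and two LR values near 1, which shows why the significance test alone says little about diagnostic usefulness.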


The Role of Reliability

In order to provide useful information, a test should yield reliable results in the clinical setting. That is, performance of the test on different occasions should yield the same result if the status of the patient being examined has not changed. Traditionally, reliability has been emphasized as a precursor to validity, a preliminary step that should be completed prior to initiating any study of validity. The numerous studies examining diagnostic test reliability without any assessment of validity attest to this mind-set. The peril in this approach is that it may lead to the dismissal of potentially useful tests based on an inability to reach an arbitrary threshold of reliability. This could be due to properties of the statistics used to measure reliability.

The kappa statistic is the reliability coefficient typically used in studies of agreement between examiners for categorical data.[92] The kappa statistic is appropriate for this purpose because it is a chance-corrected measure of agreement; however, it can be subject to deflation based on the prevalence of the condition being measured.[35,93] For example, Spitznagel and Helzer[94] noted that, if 2 raters of equal ability each performed a test and each rater was known to have 80% sensitivity and 98% specificity when his or her results were compared with a reference standard, the kappa statistic between the raters would be .67 if the errors made by the raters relative to the reference standard were independent. If the same raters, with the same level of accuracy, repeated the test in a second population with a prevalence of only 5%, the kappa value would fall to .52.[94] This is an example of the difficulty in interpreting kappa values when prevalence is extremely high or low. Many conditions of interest in physical therapy are rare, and kappa statistics used in these instances may be artificially lowered.
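
The dependence of kappa on prevalence can be illustrated with a small model. The sketch below is an assumption-laden reconstruction, not the calculation of Spitznagel and Helzer: it treats the 2 raters as having identical sensitivity and specificity and making errors independently given the patient's true status. The function name is hypothetical. The passage above does not state the prevalence behind the .67 figure; under this model a prevalence of 20% reproduces it, but that value is an inference.

```python
def expected_kappa(prevalence, sensitivity, specificity):
    """Kappa expected between two equally accurate raters with independent errors."""
    # Probability that both raters score the test positive, or both negative.
    both_positive = (prevalence * sensitivity ** 2
                     + (1 - prevalence) * (1 - specificity) ** 2)
    both_negative = (prevalence * (1 - sensitivity) ** 2
                     + (1 - prevalence) * specificity ** 2)
    observed_agreement = both_positive + both_negative

    # Each rater's overall probability of scoring the test positive.
    positive_rate = (prevalence * sensitivity
                     + (1 - prevalence) * (1 - specificity))
    chance_agreement = positive_rate ** 2 + (1 - positive_rate) ** 2

    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


for prevalence in (0.20, 0.05):
    kappa = expected_kappa(prevalence, sensitivity=0.80, specificity=0.98)
    print(f"prevalence {prevalence:.0%}: kappa = {kappa:.2f}")
```

The raters are equally accurate in both populations, yet kappa falls as the condition becomes rarer because chance agreement on negative results rises.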

In addition, although arbitrary scales exist for categorizing kappa values as poor, fair, good, and so on,[92] the threshold level making a test "reliable enough" is not known. For example, Smieja et al[95] examined the reliability and diagnostic accuracy of tests used in the identification of patients with diabetes who lacked sufficient protective sensation of the feet. A total of 304 patients were examined, 200 of whom were also examined by a second rater to measure reliability. The reference standard was a Semmes-Weinstein monofilament examination. One diagnostic test that was examined was position sense assessed at the interphalangeal joint of the great toe for a 10-degree change. The kappa value between raters for judgments of position sense was only fair by most standards ([Kappa]=.28). The results (Tab. 10), however, show that the position sense test provided useful information when it was positive (specificity=98%, positive LR=12.8).[95] If the reliability assessment had been performed separate from the study of validity, it is possible that the position sense test would have been discarded from further consideration due to a lack of reliability, and the potential diagnostic value of a positive result may not have been uncovered.
Table 10.
Accuracy of Position Sense Testing for Diagnosing a Lack of Protective
Sensation in the Feet of Patients With Diabetes[95],(a)

                      Monofilament           Monofilament
                      Test Positive          Test Negative

Position sense test    34 (cell A)             2 (cell B)
  positive

Position sense test   135 (cell C)           126 (cell D)
  negative

                      Sensitivity (%)=20.1   Specificity (%)=98.4
                      95% CI: 14.1, 26.2     95% CI: 96.3, 100

Positive likelihood ratio=12.8, 95% CI: 3.1, 52.2
Negative likelihood ratio=0.81, 95% CI: 0.75, 0.88

(a) CI=confidence interval.


Reliability data certainly convey meaningful information; however, we believe that their usefulness is best appreciated when considered in conjunction with data examining diagnostic accuracy or utility. Reliability assessments conducted as independent preliminary studies can lead to the premature exclusion of useful tests or the promotion of highly reliable, but diagnostically meaningless, tests. To encourage complete examination of a diagnostic test, reliability data should be considered a complement to, not a precursor of, an assessment of diagnostic value. An important role of reliability data in the context of assessing the strength of evidence provided by a diagnostic test is that it may provide an explanation for inadequate accuracy or utility.[56,77] When a measurement is found to have little diagnostic meaning and poor reliability, the test's diagnostic ability may be improved if the test is performed in a manner that leads to more reliable measurements.

Applying the Evidence--Practicing Evidence-Based Practice

Although it may not be viewed in this manner by all therapists, the diagnostic process is essentially an exercise in probability revision (Fig. 2).[96] Prior to performing a test, a therapist has some idea of the likelihood that the patient has the condition of interest. The likelihood may be most readily expressed in qualitative terms such as "highly likely," "very unlikely," and so forth. These terms, however, can be made more quantitative by speaking in terms of probabilities. For instance, if a condition is thought to be highly likely, this may translate in the therapist's mind to a probability of 75% or 80%. The condition of interest may be a question of screening (Does the patient's problem involve a certain anatomical structure or region?) or of classification (Is the patient going to respond to a certain treatment?). The therapist can also have in mind a treatment threshold level of certainty at which he or she will be "sure enough" and ready to act.[81] For example, a therapist may feel that he or she must be at least 80% certain that a patient has lumbar spinal stenosis before initiating a program of flexion exercises. Treatment thresholds may not be explicitly stated, but we believe that all therapists reach a point when the examination and evaluation process stops and intervention begins. This threshold should take into consideration the costs associated with being wrong versus the benefits of being correct.[97,98] For example, a high threshold is required when ruling out metastatic disease as a source of LBP. Conversely, if the question concerned the application of a treatment with minimal cost and low potential for side effects, the threshold would be lower. For example, the application of patellar taping for a patient with patellofemoral joint pain is a low-cost intervention with few side effects. A therapist may feel it necessary to be only 50% certain that the treatment will be effective in order to initiate the treatment.

[ILLUSTRATION OMITTED]

The patient's values should also be considered in establishing treatment thresholds and determining when to implement an intervention.[99] As an example, during the examination of a patient who had a stroke over 1 year previously, a therapist may test the modality of light touch by alternately touching both of the patient's hands and checking for any difference in feeling. If the light touch test is positive (ie, there is a difference in feeling), there is evidence to suggest that the patient has a higher probability of improving function of the hemiplegic upper extremity with an intervention involving forced-use therapy.[100] This intervention, however, requires the patient to immobilize the healthy upper extremity for up to 12 hours per day and attend daily therapy sessions lasting for 6 hours.[100] Some patients may not value the potential increased function of the extremity highly enough to tolerate the required treatment intensity unless the probability of improving function is very high.

The amount of data required to move beyond the treatment threshold is partly determined by the pretest probability that the condition of interest is present. The pretest probability is an important consideration for examining the diagnostic process because it determines how much data will be required to reach a treatment threshold. If the pretest probability that a condition is present is very high, perhaps 80%, one negative test result is unlikely to lower the probability sufficiently to permit its exclusion from further consideration, and additional testing will likely be required to reach a threshold at which the diagnosis would be sufficiently ruled out.[101] Likewise, if the pretest probability is low, a single positive finding will probably not be adequate to elevate the probability beyond the threshold to rule in the condition. That is, if the therapist is fairly certain regarding a diagnosis and an unexpected finding occurs, further data are probably required before a treatment threshold can be reached. Pretest probabilities can come from a variety of sources, including epidemiological data on prevalence rates for certain conditions, information already obtained on the patient from the examination, and clinical experience with similar presentations.[72] Regardless of the source, an often overlooked step in examining the diagnostic process is recognizing and quantifying the level of certainty in a diagnosis prior to the performance of a test.

The information provided by the results of a diagnostic test will alter the pretest probability to some extent, resulting in a revised posttest probability that the condition of interest is present. The magnitude of the revision is based, as has been noted, on data derived from comparisons of the diagnostic test with a reference standard. Likelihood ratios quantify the direction and magnitude of change in the pretest probability based on the test result and, therefore, provide the best information needed to select the test or tests that will most efficiently move from the uncertainty associated with the pretest probability to the threshold for action.[32,102] To illustrate the process, we will use an example of a question that may arise during the examination of a 67-year-old patient with symptoms in both the low back/buttock and anterior hip/groin that worsen when the patient is walking: Are the patient's symptoms coming from the lumbar spine (eg, lumbar spinal stenosis)?

What is a reasonable pretest probability of lumbar spinal stenosis for this patient? Based on the patient's age and symptoms, epidemiological data[103,104] as well as clinical experience suggest that the probability is fairly high, perhaps 50%. What test should be performed to rule in this diagnosis? Examining the results from several studies[30,105,106] (Tab. 11), the best test appears to be asking the patient whether symptoms are absent when sitting (positive LR=6.6). It is not uncommon that information from the history exceeds that obtained from the systems review or the tests and measurements with regard to diagnostic accuracy. If the test is positive, what should the posttest probability of lumbar spinal stenosis be? Two methods can be used to make this determination. The simpler, but less precise, method uses a nomogram (Fig. 3).[107] A straightedge is anchored along the left-hand side of the nomogram at the point corresponding to the pretest probability. The posttest probability is determined by running the straightedge from this point through the appropriate LR value. The point of intersection of the straightedge with the right-hand side of the nomogram represents the posttest probability.[42]

[ILLUSTRATION OMITTED]
Table 11.
Accuracy of Diagnostic Tests for Lumbar Spinal Stenosis(a)

Test                              Sensitivity    Specificity

Factors from the history
  Symptoms become worse with      71 (57, 85)    30 (14, 46)
    walking[30]
  Ranks standing or walking as    89 (76, 100)   33 (12, 55)
    worse than sitting with
    regard to symptoms[105]
  Able to walk better when        63 (42, 85)    67 (40, 93)
    holding on to a shopping
    cart[105]
  Absence of pain when            46 (30, 62)    93 (84, 100)
    seated[30]
Factors from the examination
  No pain with lumbar             79 (67, 91)    44 (27, 61)
    flexion[30]
  Absent Achilles reflex[30]      46 (31, 61)    78 (64, 92)
  Able to walk farther with the   58 (36, 80)    91 (74, 100)
    spine flexed vs
    extended[106]

Test                              Positive LR        Negative LR

Factors from the history
  Symptoms become worse with      1.0 (0.80, 1.3)    0.96 (0.50, 1.75)
    walking[30]
  Ranks standing or walking as    1.3 (0.93, 1.9)    0.35 (0.10, 1.2)
    worse than sitting with
    regard to symptoms[105]
  Able to walk better when        1.9 (0.79, 4.5)    0.55 (0.27, 1.1)
    holding on to a shopping
    cart[105]
  Absence of pain when            6.6 (2.4, 18.0)    0.58 (0.43, 0.77)
    seated[30]
Factors from the examination
  No pain with lumbar             1.4 (1.1, 1.9)     0.48 (0.25, 0.92)
    flexion[30]
  Absent Achilles reflex[30]      2.0 (1.1, 3.6)     0.69 (0.51, 0.95)
  Able to walk farther with the   6.4 (0.95, 42.9)   0.46 (0.27, 0.81)
    spine flexed vs
    extended[106]

(a) Numbers in parentheses represent 95% confidence intervals
calculated from the data presented in the references.


An alternative method for quantifying posttest probability utilizes a 3-step calculation process described by Sackett et al[42] and outlined below:

1. Convert the pretest probability (50%) to odds using the formula:

Pretest odds = pretest probability/(1 - pretest probability)

In this example, the pretest odds would be: .50/(1 - .50) = 1:1.

2. Multiply the odds by the appropriate LR value (in this case, the positive LR) using the formula:

Pretest odds x LR=posttest odds

In this example, the posttest odds would be: 1:1 x 6.6=6.6:1.

3. Convert the posttest odds back to probability using the formula:

Posttest odds/(posttest odds + 1) = posttest probability

In this example, the posttest probability would be: 6.6/(6.6 + 1)=87%.
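
The 3-step calculation can be sketched as a single function. The function name is hypothetical and the code is only an illustration of the arithmetic described above; it assumes a pretest probability strictly between 0 and 1.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Revise a pretest probability with an LR using the 3-step odds method."""
    # Step 1: convert the pretest probability to odds.
    pretest_odds = pretest_probability / (1 - pretest_probability)
    # Step 2: multiply the pretest odds by the LR for the observed result.
    posttest_odds = pretest_odds * likelihood_ratio
    # Step 3: convert the posttest odds back to a probability.
    return posttest_odds / (posttest_odds + 1)


# Pretest probability of 50% and a positive "absence of pain when seated"
# finding (positive LR = 6.6, Table 11).
result = posttest_probability(0.50, 6.6)
print(f"posttest probability = {result:.0%}")
```

Because the function takes any LR, the same call works for a negative result by passing the negative LR instead.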

Knowledge of the positive LR values permitted the selection of the test that produced the greatest shift in probability favoring the condition. Had another test been selected with a smaller positive LR, the results would not have been as conclusive. For example, if the therapist had opted to assess pain with lumbar flexion and the test were positive (ie, no pain), the posttest probability would increase to only 58%. Without knowledge of the relative unimportance of this finding, the therapist might over-interpret the value of the positive result.

The importance of the pretest probability is also highlighted by this example. If the patient in question had the same symptoms but was younger, perhaps 45 years of age, the pretest probability of lumbar spinal stenosis would be lower. If the pretest probability was estimated at 20% (pretest odds=0.25:1) and the question of the absence of pain when seated was positive, the posttest probability would increase to 62%. It is likely that, in the mind of the therapist, further confirmation would be needed to reach the action threshold for diagnosing the patient with lumbar spinal stenosis. Based on the data shown in Table 11, comparing walking tolerance with the spine flexed versus extended would be the best option (positive LR=6.4). If this test were positive, the probability would increase from 62% to 91%, likely exceeding the action threshold.
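
The sequential use of 2 tests can be sketched by carrying the odds forward from one result to the next. The function name is hypothetical. Multiplying LRs in sequence treats the test results as independent of one another given the patient's true status; that is an assumption of this sketch (and of the worked example), and it may not hold when 2 tests measure closely related phenomena.

```python
def sequential_posttest(pretest_probability, likelihood_ratios):
    """Apply several LRs in turn, returning the probability after each test."""
    odds = pretest_probability / (1 - pretest_probability)
    probabilities = []
    for lr in likelihood_ratios:
        # The posttest odds of one test become the pretest odds of the next.
        odds *= lr
        probabilities.append(odds / (odds + 1))
    return probabilities


# 45-year-old patient: pretest probability of 20%, then 2 positive findings
# (absence of pain when seated, LR = 6.6; walks farther flexed, LR = 6.4).
after_first, after_second = sequential_posttest(0.20, [6.6, 6.4])
print(f"after first test:  {after_first:.0%}")
print(f"after second test: {after_second:.0%}")
```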

When the pretest probability is low, the therapist may instead seek information to rule out stenosis and then proceed with confirming an alternative hypothesis.[4] In this circumstance, the test with the smallest negative LR would be desirable because a negative result would most effectively exclude the condition. Examining Table 11, it is again apparent that a question from the history will be more effective for this purpose than other factors. The patient is asked to rank sitting, standing, and walking from "best" to "worst" with regard to symptoms. If the test is negative (ie, pain during standing or walking is not ranked as "worst"), the negative LR associated with the finding is 0.33 and the probability of stenosis drops to 8%. Table 11 also illustrates the impact of the phrasing of the question. If the patient is asked simply whether or not symptoms become worse when walking, the result is useless, with positive and negative LR values of about 1.0. If the patient instead is asked about improvement in walking when holding on to a shopping cart, the specificity and positive LR increase, but the negative LR (0.55) still produces only a small shift in probability. If the goal is ruling out lumbar spinal stenosis, having the patient rank pain during sitting, standing, and walking has the potential to provide the strongest evidence.
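
Choosing a rule-out test amounts to finding the smallest negative LR. The sketch below uses the negative LR point estimates from Table 11; the dictionary keys are shortened labels and the variable names are hypothetical. Table 11 lists 0.35 for the ranking question, whereas the text quotes 0.33; with a pretest probability of 20%, either value gives a posttest probability of about 8%.

```python
# Negative LR point estimates from Table 11 (labels shortened).
NEGATIVE_LR = {
    "Symptoms become worse with walking": 0.96,
    "Ranks standing or walking as worse than sitting": 0.35,
    "Able to walk better holding a shopping cart": 0.55,
    "Absence of pain when seated": 0.58,
    "No pain with lumbar flexion": 0.48,
    "Absent Achilles reflex": 0.69,
    "Able to walk farther with spine flexed vs extended": 0.46,
}


def posttest_probability(pretest_probability, likelihood_ratio):
    """Odds-based probability revision (same 3 steps as in the text)."""
    odds = pretest_probability / (1 - pretest_probability) * likelihood_ratio
    return odds / (odds + 1)


# The best rule-out test is the one with the smallest negative LR.
best_test = min(NEGATIVE_LR, key=NEGATIVE_LR.get)
after_negative = posttest_probability(0.20, NEGATIVE_LR[best_test])

print(f"best rule-out test: {best_test}")
print(f"probability after a negative result: {after_negative:.0%}")
```

This ranking ignores the width of each confidence interval, which would also need to be weighed before relying on any single test.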

Likelihood ratios provide the most powerful tool for demonstrating the importance of a particular test within the diagnostic process in a quantified manner. Because LR values can be calculated for both positive and negative results, the importance of each can be examined. This is necessary because few tests provide useful information in both capacities, and understanding the relative strength of evidence provided by a negative or positive result helps to refine test interpretation. For these reasons and for other reasons discussed, researchers examining diagnostic tests should calculate, or provide sufficient data to permit the calculation of, LR values.[83] Therapists should focus on LR values in determining which tests are most effective for ruling in or ruling out conditions of interest.

Applying the Evidence--The Consequences of Not Practicing Evidence-Based Diagnosis

Diagnostic tests play a critical role in the management of patients in physical therapy. The results of individual tests are evaluated during the examination process, determining which hypotheses should be ruled in or out, ultimately leading to a decision to use a certain intervention that is believed to provide optimal outcomes for the patient. The ability to judge evidence for diagnostic tests, select the most appropriate test for an individual patient, and interpret the results will need to become familiar skills if physical therapy diagnosis is to become a more evidence-based process.

Many aspects of physical therapist practice, including diagnosis, have been criticized for excessive allegiance to expert opinion and uncritical acceptance of standards that are not based on evidence.[108,109] Systems of integrating diagnosis and intervention in common usage by physical therapists too frequently owe their popularity to tradition instead of sound data attesting to their usefulness. For example, neurodevelopmental treatment (NDT) is an approach to the management of patients with movement disorders in which the therapist examines factors such as movement patterns and postural reactions and then selects interventions to reduce abnormal movements and improve function.[110,111] Even though NDT appears to be the method most commonly used by physical therapists for managing children with cerebral palsy,[112] little research has been performed to examine the evidence for examination techniques used within the system or the manner in which the tests are evaluated to determine appropriate interventions.[111]

Without any validation of the diagnostic decision making underlying intervention choices, it is not surprising that clinical trials comparing patients treated with an NDT-based approach versus other interventions have not demonstrated improved outcomes with the use of the NDT system.[113-117] A similar situation exists for the most common treatment approach for patients with LBP, the McKenzie system.[118] The McKenzie system uses a variety of examination techniques, the results of which are used to place patients into categories and to determine interventions. Little work has been done to examine the diagnostic process used by the McKenzie system, and the reliability of the classifications is questionable.[119] A recent clinical trial comparing outcomes for the McKenzie system with chiropractic care and a patient education pamphlet resulted in essentially no differences among the treatment approaches.[120]

Reliance on patient management systems that are not evidence-based, in our view, has negative consequences not only for practitioners, but also for the profession of physical therapy as a whole. Both the McKenzie system and NDT have been used in clinical trials as representative of "physical therapy" interventions for patients with LBP or cerebral palsy, respectively.[117,120] The negative results of these trials have led to the conclusion that physical therapy may not have a role in the management of these conditions. It should not be surprising, however, that systems whose diagnostic procedures are not evidence-based do not result in improved patient outcomes. If diagnostic decisions had been made on the basis of tests with evidence attesting to their ability to focus the examination and determine the most effective interventions, the results might have been more positive. The McKenzie system and NDT serve only to illustrate a more fundamental problem. Without evidence-based diagnosis, interventions will continue to be based on observation that may not even be systematic, pathoanatomical theories, ritual, and opinion. Studies examining the outcomes of such interventions will continue, in our opinion, to offer discouraging results. The solution is not only to explore new and innovative interventions, but to refine the process by which interventions are linked to examination findings by studying evidence-based diagnosis.

Conclusions

The process of diagnosis is an essential task for physical therapists because it serves as the link between examination findings and interventions. To be able to examine diagnosis from an evidence-based perspective, we argue that therapists need to be familiar with the standards defining the "current best evidence" and how the evidence can be used for "making decisions about the care of individual patients."[21(p71)] The standards relate to several aspects of the study design and data analysis.

An important first step is to define the purpose for which a diagnostic test is used. The purpose should be reflected in the choice of a reference standard (measurement) against which the results are compared. Both the diagnostic test and the reference standard should be applied consistently in all subjects and judged by blinded examiners.

The study sample should be representative of the type of patients on whom the test is typically used in the clinical setting. The best statistics for application in individual patient decision making are LRs because they can be used to quantify probability revision based on positive or negative test results. The application of evidence into patient management requires an understanding of probability and the shifts in probability caused by a certain test result. Systems of patient management that link diagnostic tests with interventions may produce less favorable results when the diagnostic process within the system is not evidence-based. More studies are needed to examine commonly used diagnostic methods in physical therapy. The evidence provided by past and future studies should be applied to the management of patients in order to make the practice of physical therapy more evidence-based.
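The probability revision mentioned above can be illustrated with a short Python sketch. It assumes the standard odds form of Bayes's theorem (convert the pretest probability to odds, multiply by the LR, and convert back to a probability); the function name and the numbers in the usage note are hypothetical and serve only as an illustration.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Revise a pretest probability using a likelihood ratio.

    Both argument names are hypothetical. The pretest probability must
    lie strictly between 0 and 1, and the LR must be nonnegative and finite.
    """
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be between 0 and 1 (exclusive)")
    if likelihood_ratio < 0 or likelihood_ratio == float("inf"):
        raise ValueError("likelihood ratio must be nonnegative and finite")

    # Step 1: probability -> odds
    pretest_odds = pretest_probability / (1 - pretest_probability)
    # Step 2: apply the evidence carried by the test result
    post_test_odds = pretest_odds * likelihood_ratio
    # Step 3: odds -> probability
    return post_test_odds / (1 + post_test_odds)
```

For example, with an assumed pretest probability of 0.50, a positive result from a test with a positive LR of 4.5 raises the probability to roughly 0.82, whereas a negative result from a test with a negative LR of 0.125 lowers it to roughly 0.11.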

References

[1] Rose SJ. Physical therapy diagnosis: role and function. Phys Ther. 1989;69:535-537.

[2] Sahrmann SA. Diagnosis by the physical therapist: a prerequisite for treatment. Phys Ther. 1988;68:1703-1706.

[3] Guide to Physical Therapist Practice. 2nd ed. Phys Ther. 2001;81:43.

[4] Delitto A, Snyder-Mackler L. The diagnostic process: examples in orthopedic physical therapy. Phys Ther. 1995;75:203-211.

[5] Clinical Research Agenda for Physical Therapy. Phys Ther. 2000;80:499-513.

[6] Schwartz JS. Evaluating diagnostic tests: what is done, what needs to be done? J Gen Intern Med. 1986;1:266-276.

[7] Deyo RA, Haselkorn J, Hoffman R, Kent DL. Designing studies of diagnostic tests for low back pain or radiculopathy. Spine. 1994;19(suppl 18):2057S-2065S.

[8] Sackett DL, Wennberg JE. Choosing the best research design for each question: it's time to stop squabbling over the "best" methods. BMJ. 1997;315:1636.

[9] Shinozaki T, Hasegawa T, Yano E. Ankle-arm index as an indicator of atherosclerosis: its application as a screening method. J Clin Epidemiol. 1998;51:1263-1269.

[10] Newman AB, Siscovick DS, Manolio TA, et al. Ankle-arm index as a marker of atherosclerosis in the Cardiovascular Health Study. Circulation. 1993;99:837-845.

[11] Altman R, Alarcon G, Appelrouth D, et al. The American College of Rheumatology criteria for the classification and reporting of osteoarthritis of the hip. Arthritis Rheum. 1991;34:505-515.

[12] Delitto A, Erhard RE, Bowling RW. A treatment-based classification approach to low back syndrome: identifying and staging patients for conservative management. Phys Ther. 1995;75:470-489.

[13] Khuffash B, Porter RW. Cross leg pain and trunk list. Spine. 1989;14:602-603.

[14] Porter RW, Miller CG. Back pain and trunk list. Spine. 1986;11:596-600.

[15] McKenzie RA. Manual correction of sciatic scoliosis. NZ Med J. 1972;76:194-199.

[16] McKenzie RA. The Lumbar Spine: Mechanical Diagnosis and Therapy. Waikanae, New Zealand: Spinal Publications Ltd; 1989.

[17] Charnley J. Orthopaedic signs in the diagnosis of disc protrusion. Lancet. 1951;1:186-192.

[18] Deyo RA, Rainville J, Kent DL. What can the history and physical examination tell us about low back pain? JAMA. 1992;268:760-765.

[19] van den Hoogen HMM, Koes BW, van Eijk JTM, Bouter LM. On the accuracy of history, physical examination, and erythrocyte sedimentation rate in diagnosing low back pain in general practice. Spine. 1995;20:318-327.

[20] Fritz JM. Use of a classification approach to the treatment of 3 patients with low back syndrome. Phys Ther. 1998;78:766-777.

[21] Sackett DL, Rosenberg WM, Gray JA, et al. Evidence based medicine: what it is and what it isn't. BMJ. 1996;312:71-72.

[22] van Tulder MW, Assendelft WJ, Koes BW, Bouter LM. Method guidelines for systematic reviews in the Cochrane Collaboration Back Review Group for Spinal Disorders. Spine. 1997;22:2323-2330.

[23] Dickersin K, Scherer R, Lefebvre C. Identifying relevant studies for systematic reviews. BMJ. 1994;309:1286-1291.

[24] Guyatt GH, Sackett DL, Cook DJ. Users' guide to the medical literature, II: how to use an article about therapy or prevention, A: are the results of the study valid? JAMA. 1993;270:2598-2601.

[25] Mulrow CD, Linn WD, Gaul MK, Pugh JA. Assessing quality of diagnostic test evaluation. J Gen Intern Med. 1989;4:288-295.

[26] Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA. 1999;282:1061-1066.

[27] Irwig L, Tosteson ANA, Gatsonis C, et al. Guidelines for meta-analyses evaluating diagnostic tests. Ann Intern Med. 1994;120:667-676.

[28] Begg CB. Methodologic standards for diagnostic test assessment studies. J Gen Intern Med. 1988;3:518-520.

[29] Armstrong DG, Lavery LA, Vela SA, et al. Choosing a practical screening instrument to identify patients at risk for diabetic foot ulceration. Arch Intern Med. 1998;153:289-292.

[30] Katz JN, Dalgas M, Stucki G, et al. Degenerative lumbar spinal stenosis: diagnostic value of the history and physical examination. Arthritis Rheum. 1995;38:1236-1241.

[31] Geddes JR, Harrison PJ. Closing the gap between research and practice. Br J Psychiatry. 1997;171:220-225.

[32] Jaeschke RZ, Meade MO, Guyatt GH, et al. How to use diagnostic test articles in the intensive care unit: diagnosing weanability using f/Vt. Crit Care Med. 1997;25:1514-1521.

[33] Task Force on Standards for Measurement in Physical Therapy. Standards for tests and measurements in physical therapy practice. Phys Ther. 1991;71:589-622.

[34] Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature, III: how to use an article about a diagnostic test, A: are the results of the study valid? JAMA. 1994;271:389-391.

[35] Rothstein JM, Echternach JL. Primer on Measurement: An Introductory Guide to Measurement Issues. Alexandria, Va: American Physical Therapy Association; 1993:67-73.

[36] Ashworth B. Preliminary trial of carisoprodol in multiple sclerosis. Practitioner. 1964;192:540-542.

[37] Pandyan AD, Johnson GR, Price CI, et al. A review of the properties and limitations of the Ashworth and modified Ashworth Scales as measures of spasticity. Clin Rehabil. 1999;13:373-383.

[38] Haas BM, Bergstrom E, Jamous A, Bennie A. The inter-rater reliability of the original and of the modified Ashworth scale for the assessment of spasticity in patients with spinal cord injury. Spinal Cord. 1996;34:560-564.

[39] Allison SC, Abraham LD, Petersen CL. Reliability of the Modified Ashworth Scale in the assessment of plantarflexor muscle spasticity in patients with traumatic brain injury. Int J Rehabil Res. 1996;19:67-78.

[40] Katz RT, Rovai GP, Bait C, Rymer WZ. Objective quantification of spastic hypertonia: correlation with clinical findings. Arch Phys Med Rehabil. 1992;73:339-347.

[41] Fowler EG, Nwigwe AI, Ho TW. Sensitivity of the pendulum test for assessing spasticity in persons with cerebral palsy. Dev Med Child Neurol. 2000;42:182-189.

[42] Sackett DL, Haynes RB, Guyatt GH, Tugwell P. Clinical Epidemiology: A Basic Science for Clinical Medicine. 2nd ed. Boston, Mass: Little, Brown and Co Inc; 1992.

[43] Burke DT, Burke MA, Bell R, et al. Subjective swelling: a new sign for carpal tunnel syndrome. Am J Phys Med Rehabil. 1999;78:504-508.

[44] Cibulka MT, Koldehoff R. Clinical usefulness of a cluster of sacroiliac joint tests in patients with and without low back pain. J Orthop Sports Phys Ther. 1999;29:83-92.

[45] Levangie PK. Four clinical tests of sacroiliac joint dysfunction: the association of test results with innominate torsion among patients with and without low back pain. Phys Ther. 1999;79:1043-1057.

[46] Dreyfuss P, Michaelsen M, Pauza K, et al. The value of medical history and physical examination in diagnosing sacroiliac joint pain. Spine. 1996;21:2594-2602.

[47] Maigne J-Y, Aivaliklis A, Pfefer F. Results of sacroiliac joint double block and value of sacroiliac pain provocation tests in 54 patients with low back pain. Spine. 1996;21:1889-1892.

[48] Slipman CW, Sterenfeld EB, Chou LH, et al. The predictive value of provocative sacroiliac joint stress maneuvers in the diagnosis of sacroiliac joint syndrome. Arch Phys Med Rehabil. 1998;79:288-292.

[49] Cibulka MT, Delitto A, Koldehoff RM. Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain: an experimental study. Phys Ther. 1988;68:1359-1363.

[50] Panzer RJ, Suchman AL, Griner PF. Workup bias in prediction research. Med Decis Making. 1987;7:115-119.

[51] Rubenstein RA, Shelbourne KD, McCarroll JR, et al. The accuracy of the clinical examination in the setting of posterior cruciate ligament injuries. Am J Sports Med. 1994;22:550-557.

[52] Pinto-Martin JA, Torre C, Zhao H. Nurse screening of low-birth-weight infants for cerebral palsy using goniometry. Nurs Res. 1997;46:284-287.

[53] Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research: getting better but still not good. JAMA. 1995;274:645-651.

[54] Ransohoff DF, Feinstein AR. Problems of spectrum bias in evaluating the efficacy of diagnostic tests. N Engl J Med. 1978;299:926-930.

[55] Itoh H, Kurosaka M, Yoshiya S, et al. Evaluation of functional deficits determined by four different hop tests in patients with anterior cruciate ligament deficiency. Knee Surg Sports Traumatol Arthrosc. 1998;6:241-245.

[56] Greenhalgh T. How to read a paper: papers that report diagnostic or screening tests. BMJ. 1997;315:540-543.

[57] Ganko A, Engebretsen L, Ozer H. The rolimeter: a new arthrometer compared with the KT-1000. Knee Surg Sports Traumatol Arthrosc. 2000;8:36-39.

[58] Liu SH, Osti L, Henry M, Bocchi L. The diagnosis of acute complete tears of the anterior cruciate ligament: comparison of MRI, arthrometry and clinical examination. J Bone Joint Surg Br. 1995;77:586-588.

[59] Snyder-Mackler L, Fitzgerald GK, Bartolozzi AR, Ciccotti MG. The relationship between passive joint laxity and functional outcome after anterior cruciate ligament injury. Am J Sports Med. 1997;25:191-195.

[60] Katz JW, Fingeroth RJ. The diagnostic accuracy of ruptures of the anterior cruciate ligament comparing the Lachman test, the anterior drawer sign, and pivot shift test in acute and chronic knee injuries. Am J Sports Med. 1986;14:88-91.

[61] Cooperman JM, Riddle DL, Rothstein JM. Reliability and validity of judgments of the integrity of the anterior cruciate ligament of the knee using the Lachman's test. Phys Ther. 1990;70:225-233.

[62] Long AL. The centralization phenomenon: its usefulness as a predictor of outcome in conservative treatment of chronic low back pain (a pilot study). Spine. 1995;20:2513-2521.

[63] Karas R, McIntosh G, Hall H, et al. The relationship between nonorganic signs and centralization of symptoms in the prediction of return to work for patients with low back pain. Phys Ther. 1997;77:354-360.

[64] Werneke M, Hart DL, Cook D. A descriptive study of the centralization phenomenon: a prospective analysis. Spine. 1999;24:676-683.

[65] Fritz JM, Delitto A, Vignovic M, Busse RG. Inter-rater reliability of judgments of the centralization phenomenon and status change during movement testing in patients with low back pain. Arch Phys Med Rehabil. 2000;81:57-61.

[66] Sheps SB, Schechter MT. The assessment of diagnostic tests: a survey of current medical research. JAMA. 1984;252:2418-2422.

[67] Egglin TK, Feinstein AR. Context bias: a problem in diagnostic radiology. JAMA. 1996;276:1752-1755.

[68] Begg CB. Biases in the assessment of diagnostic tests. Stat Med. 1987;6:411-423.

[69] Gellman H, Gelberman RH, Tan AM, Botte MJ. Carpal tunnel syndrome: an evaluation of the provocative diagnostic tests. J Bone Joint Surg Am. 1986;68:735-737.

[70] Kuhlman KA, Hennessey WJ. Sensitivity and specificity of carpal tunnel syndrome signs. Am J Phys Med Rehabil. 1997;76:451-457.

[71] Kassirer JP. Our stubborn quest for diagnostic certainty: a cause of excessive testing. N Engl J Med. 1989;320:1489-1491.

[72] Bernstein J. Decision analysis. J Bone Joint Surg Am. 1997;79:1404-1414.

[73] Griner PF, Mayewski RJ, Mushlin AI, Greenland P. Selection and interpretation of diagnostic tests and procedures: principles and applications. Ann Intern Med. 1981;94:557-592.

[74] Hagen MD. Test characteristics: how good is that test? Med Decis Making. 1995;22:213-233.

[75] Lauder TD, Dillingham TR, Andary M, et al. Effect of history and exam in predicting electrodiagnostic outcome among patients with suspected lumbosacral radiculopathy. Am J Phys Med Rehabil. 2000;79:60-68.

[76] Kortelainen P, Puranen J, Koivisto E, Lahde S. Symptoms and signs of sciatica and their relation to the localization of the lumbar disc herniation. Spine. 1985;10:88-92.

[77] Sackett DL. A primer on the precision and accuracy of the clinical examination. JAMA. 1992;267:2638-2644.

[78] Schulzer M. Diagnostic tests: a statistical review. Muscle Nerve. 1994;17:815-819.

[79] Calis M, Akgun K, Birante M, et al. Diagnostic value of clinical diagnostic tests in subacromial impingement syndrome. Ann Rheum Dis. 2000;59:44-47.

[80] Dujardin B, Van den Ende J, Van Gompel A, et al. Likelihood ratios: a real improvement for clinical decision making? Eur J Epidemiol. 1994;10:29-36.

[81] Lurie JD, Sox HC. Principles of medical decision making: spine update. Spine. 1999;24:493-498.

[82] Boyko EJ. Ruling out or ruling in disease with the most sensitive or specific diagnostic test. Med Decis Making. 1994;14:175-179.

[83] Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature, III: how to use an article about a diagnostic test, B: What are the results and will they help me in caring for my patients? JAMA. 1994;271:703-707.

[84] Simel DL, Samsa GP, Matchar DB. Likelihood ratios with confidence: sample size estimation for diagnostic test results. J Clin Epidemiol. 1991;44:763-770.

[85] Riddle DL, Stratford PW. Interpreting validity indexes for diagnostic tests: an illustration using the Berg Balance Test. Phys Ther. 1999;79:939-948.

[86] Altman DG, Machin D, Bryant TN, Gardner MJ. Statistics With Confidence. 2nd ed. London, England: BMJ Books; 2000.

[87] Sackett DL, Strauss SE, Richardson WS, et al. Evidence-based Medicine: How to Practice and Teach EBM. 2nd ed. Edinburgh, Scotland: Churchill Livingstone; 2000:233-243.

[88] Newcombe RG. Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med. 1998;17:857-872.

[89] Harper R, Reeves B. Reporting of precision of estimates for diagnostic accuracy: a review. BMJ. 1999;318:1322-1333.

[90] Glass GV, Hopkins KD. Statistical Methods in Education and Psychology. 3rd ed. Boston, Mass: Allyn & Bacon; 1995.

[91] Goodman SN. Toward evidence-based medical statistics, 1: the P-value fallacy. Ann Intern Med. 1999;130:995-1004.

[92] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.

[93] Grove WM, Andreasen NC, McDonald-Scott P, et al. Reliability studies of psychiatric diagnosis: theory and practice. Arch Gen Psychiat. 1981;38:408-413.

[94] Spitznagel EL, Heizer JE. A proposed solution to the base rate problem in the kappa statistic. Arch Gen Psychiat. 1985;42:725-728.

[95] Smieja M, Hunt DL, Edelman D, et al. Clinical examination for the detection of protective sensation in the feet of diabetic patients. J Gen Intern Med. 1999;14:418-424.

[96] Sox HC Jr. Probability theory in the use of diagnostic tests: an introduction to critical study of the literature. Ann Intern Med. 1986;104:60-66.

[97] Pauker SG, Kassirer JP. The threshold approach to clinical decision making. N Engl J Med. 1980;302:1109-1117.

[98] Pauker SG, Kassirer JP. Therapeutic decision-making: a cost benefit analysis. N Engl J Med. 1975;293:229-234.

[99] McNeil BJ, Pauker SG. The patient's role in assessing the value of diagnostic tests. Radiology. 1979;132:605-610.

[100] van der Lee JH, Wagenaar RC, Lankhorst GJ. Forced use of the upper extremity in chronic stroke patients: results from a single-blind randomized clinical trial. Stroke. 1999;30:2369-2375.

[101] Ross JM, Sox HC. If at first you don't succeed: clinical problem-solving. N Engl J Med. 1995;333:1557-1560.

[102] Hayden SR, Brown MD. Likelihood ratio: a powerful tool for incorporating the results of a diagnostic test into clinical decisionmaking. Ann Emerg Med. 1999;33:575-580.

[103] Fritz JM, Delitto A, Welch WC, Erhard RE. Lumbar spinal stenosis: a review of current concepts in evaluation, management, and outcome measurements. Arch Phys Med Rehabil. 1998;79:700-708.

[104] Herno A, Partanen K, Talaslahti T, et al. Long-term clinical and magnetic resonance imaging follow-up assessment of patients with lumbar spinal stenosis after laminectomy. Spine. 1999;24:1533-1537.

[105] Fritz JM, Erhard RE, Delitto A, et al. Preliminary results of the use of a two-stage treadmill test as a clinical diagnostic tool in the differential diagnosis of lumbar spinal stenosis. J Spinal Dis. 1997;10:410-416.

[106] Dong G, Porter RW. Walking and cycling tests in neurogenic claudication. Spine. 1989;14:965-969.

[107] Fagan TJ. Nomogram for Bayes's theorem. N Engl J Med. 1975;293:257.

[108] Rothstein JM. Editor's note: questions for the disciples. Phys Ther. 1994;74:694-696.

[109] Fritz JM, Delitto A, Erhard RE, Roman M. An examination of the selective tissue tension scheme, with evidence for the concept of a capsular pattern of the knee. Phys Ther. 1998;78:1046-1061.

[110] Keshner EA. Reevaluating the theoretical method underlying the neurodevelopmental theory: a literature review. Phys Ther. 1981;61:1035-1040.

[111] DeGangi GA, Royeen CB. Current practice among neurodevelopmental treatment association members. Am J Occup Ther. 1994;48:803-808.

[112] Bly L. A historical and current view of the basis of NDT. Pediatric Physical Therapy. 1991;3:131-135.

[113] Fetters L, Kluzik J. The effects of neurodevelopmental treatment versus practice on the reaching of children with spastic cerebral palsy. Phys Ther. 1996;76:346-358.

[114] Wagenaar RC, Meijer OC, van Wieringen PC. The functional recovery of stroke: a comparison between neuro-developmental treatment and the Brunnstrom method. Scand J Rehabil Med. 1990;22:1-8.

[115] Law M, Russell D, Pollock N, et al. A comparison of intensive neurodevelopmental therapy plus casting and a regular occupational therapy program for children with cerebral palsy. Dev Med Child Neurol. 1997;39:664-670.

[116] Law M, Cadman D, Rosenbaum P, et al. Neurodevelopmental therapy and upper-extremity inhibitive casting for children with cerebral palsy. Dev Med Child Neurol. 1991;33:379-387.

[117] Palmer FB, Shapiro BK, Watchel RC, et al. The effects of physical therapy on cerebral palsy: a controlled trial in infants with spastic diplegia. N Engl J Med. 1988;318:803-808.

[118] Battie MC, Cherkin DC, Dunn R, et al. Managing low back pain: attitudes and treatment preferences of physical therapists. Phys Ther. 1994;74:219-226.

[119] Riddle DL, Rothstein JM. Intertester reliability of McKenzie's classifications of the syndrome types present in patients with low back pain. Spine. 1993;18:1333-1344.

[120] Cherkin DC, Deyo RA, Battie M, et al. A comparison of physical therapy, chiropractic manipulation, and provision of an educational booklet for the treatment of patients with low back pain. N Engl J Med. 1998;339:1021-1029.

(*) MEDmetric Corp, 7542 Trade St, San Diego, CA 92121.

JM Fritz, PT, PhD, ATC, is Assistant Professor, Department of Physical Therapy, University of Pittsburgh, 6035 Forbes Tower, Pittsburgh, PA 15260 (USA) (jfritz@pitt.edu). Address all correspondence to Dr Fritz.

RS Wainner, PT, PhD, OCS, ECS, is Physical Therapy Research Coordinator, Wilford Hall Medical Center, Lackland Air Force Base, San Antonio, Tex.

Both authors provided concept/project design and writing.

The opinions or assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the Department of the Air Force, the Department of the Army, the Department of the Navy, or the Department of Defense.
COPYRIGHT 2001 American Physical Therapy Association, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2001, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author: Wainner, Robert S
Publication: Physical Therapy
Article Type: Statistical Data Included
Geographic Code: 1USA
Date: Sep 1, 2001
Words: 14783