Making Decisions Based on Group Designs and Meta-analysis

The need to apply formal decision-making theories in clinical practice is becoming more apparent. One of the major decisions with which clinicians are faced constantly is when to modify their practice on the basis of findings reported in the research literature. Ideally, such decisions should be based on both the credibility of the research and its applicability to the clinician's situation. This article briefly summarizes four types of research design and provides guidelines for assessing the credibility of studies based on them. In addition, this article provides a brief introduction to meta-analysis, a technique for assessing the collective results of multiple studies.

Source of Information

Before we can decide what information to include in our decision-making processes [...] summarize the knowledge we have gained from our experiences with a number of similar patients. Typically, however, our generalizations are derived in an informal, unsystematic manner, which may be quite biased.

Another potential source of information regarding treatment efficacy is the research literature. Based on Michels's definition of research, [1] the research literature should contain convincing answers to important clinical questions about the patients we manage, including questions regarding the effects of treatment. In many instances, the information reported in the literature is based on data obtained systematically from groups of subjects in studies designed to minimize bias and eliminate alternative explanations for the changes attributed to a particular treatment. Thus, this latter source of information should be convincing, and if clinicians judge the information to be important, it should serve to guide clinical decisions.

Credibility and Hypothesis Testing

Having identified our sources of information, how do we assess their credibility? In other words, how do we decide whether the evidence they present is convincing? Stated very simply, those who embrace the traditional view of science check for violations of the rules of science in the information they have obtained. Because the rules of science are embedded in the principles and methods of hypothesis testing and research design, we will now focus on those two topics.
Given the often-encountered difficulty of making decisions, it is important to note the following two points about the traditional hypothesis-testing approach to science: 1) For a given experimental question, only two hypotheses are allowed and only one of the two hypotheses is actually tested, and 2) the rules for deciding whether to reject a hypothesis are specified explicitly by the method. Undeniably, these two features reduce the complexity of the decision-maker's task regarding a particular research question. First of all, the information-processing load is reduced, because there are only two alternatives between which to decide. Second, it is usually quite easy to check for violations of the rules of hypothesis testing in making a decision about the statistical significance of a particular treatment effect. We simply determine whether 1) the appropriate statistic was used (ie, the assumptions underlying its use were met), 2) the statistic was calculated accurately, 3) a reasonable alpha level was used, 4) the correct probability value was reported, and 5) the decision rule was applied appropriately.

Unfortunately, however, the decisions we as clinicians must make concerning the credibility of particular research findings and the applicability of those findings to our own patients are not quite so simple. Why not? Because our clinical decisions must take into account much more information than the outcome of hypothesis testing in a single study. Specifically, we must check for violations of the rules of science concerning the validity of a particular study, and we must also consider the results of other studies.
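To make the explicit decision rule concrete, the following is a minimal sketch, not a procedure taken from the cited sources. It assumes a two-group comparison, uses an exact permutation test in place of a t-test only so that the example needs nothing beyond the Python standard library, and assumes the common convention of rejecting the null hypothesis when the probability value is below alpha. The function names and the scores are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way the pooled scores could have been split into two groups.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        # Small tolerance guards against floating-point rounding in the comparison.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject the null hypothesis only if p < alpha."""
    if p_value < alpha:
        return "reject null hypothesis"
    return "fail to reject null hypothesis"


# Hypothetical outcome scores; these are illustrative numbers, not study data.
treated = [2.1, 2.8, 3.0, 2.5, 3.3]
control = [1.0, 1.4, 0.9, 1.7, 1.2]
p = permutation_p_value(treated, control)
print(round(p, 4), decide(p))
```

The point of the sketch is that only one hypothesis (no difference between groups) is tested, and the decision follows mechanically once alpha is fixed in advance. Whether the statistic was appropriate for the data remains a separate check.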
Research Design

The rules of science concerning the general issue of validity are manifest in the principles and methods of research design, which have been elaborated by both clinical epidemiologists [2,3] and social scientists. [4] One of the major goals of research design is to maximize the validity of research studies by minimizing the possibility of making erroneous inferences about the causes of treatment effects. The primary method for demonstrating the presence of causal relationships is the true experiment, and adherence to the criteria for a true experiment in the conduct of research produces the most convincing evidence for cause-effect relations. The three fundamental criteria for a true experiment are 1) the use of at least two groups of subjects, 2) random assignment of subjects to the groups, and 3) the controlled administration by the investigator of a treatment. If any of the three basic criteria is missing from the study, then there are major threats to the validity of the study and the evidence provided by the study is less convincing. [4]

At first glance, deciding whether a study is valid seems to be a fairly easy task because there are only three criteria to consider. On closer examination, we realize that the decision is complicated by the following two realities: 1) Rarely are all three criteria present in a single clinical study, and 2) we are reluctant to disregard information totally just because some features of a true experiment are missing. As a result of these two realities, we must deal with degrees of validity, relative "convincingness" of information, and somewhat "flexible" decision rules. Consequently, there is more information to process and more uncertainty in making decisions concerning the validity and credibility of studies than in making decisions about statistical significance.
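The three criteria can be written down as a simple checklist. The sketch below is only an illustration of that idea; the class and function names are hypothetical, and the output is a list of missing criteria rather than a verdict, because the text above stresses that validity is a matter of degree.

```python
from dataclasses import dataclass


@dataclass
class StudyDesign:
    """Hypothetical summary of the design features of a single study."""
    n_groups: int
    random_assignment: bool
    investigator_controls_treatment: bool


def missing_criteria(design):
    """Return the true-experiment criteria that the design does not meet."""
    missing = []
    if design.n_groups < 2:
        missing.append("at least two groups of subjects")
    if not design.random_assignment:
        missing.append("random assignment of subjects to groups")
    if not design.investigator_controls_treatment:
        missing.append("investigator-controlled administration of treatment")
    return missing


def is_true_experiment(design):
    """A design is a true experiment only if no criterion is missing."""
    return not missing_criteria(design)


# Two groups, randomized, treatment administered by the investigator.
two_group_randomized = StudyDesign(2, True, True)
# Two self-selected groups, treatment not controlled by the investigator.
two_group_self_selected = StudyDesign(2, False, False)

print(is_true_experiment(two_group_randomized))
print(missing_criteria(two_group_self_selected))
```

Each missing criterion points to a competing explanation that the reader must rule out by other means, such as logical argument or background information.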
Increases in the information-processing load and increases in uncertainty generally create difficulties for the decision maker. Fortunately, experts in the area of research design have provided us with some guidelines for assessing the credibility of results obtained both from true experiments and from studies based on nontraditional designs that do not meet all of the criteria of a true experiment.

General Types of Designs

Before discussing some of the guidelines for interpretation, we want to clarify the distinction between experimental and nonexperimental designs. Based on a very simple classification scheme, there are two types of research design: 1) experimental and 2) nonexperimental. Given this classification scheme, it should be clear that the true experiment is the "gold standard" against which other designs are measured. [4] We already have noted the three defining characteristics of the true experiment, which can be used for comparison with the characteristics of alternative, nonexperimental designs. In addition, we need to consider the logic underlying the design of the true experiment. The fundamental principle of the true experiment is that before and during a study there are no differences between the groups included in the study except for the differences in the treatment provided to each group by the investigator. Consequently, differences between the groups at the end of the study are said to have been caused by the treatment. Other plausible causes for the differences between groups are ruled out by adherence to the criteria for the true experiment. Failure to comply with the criteria leaves competing explanations untested.

Typically, each of the nonexperimental designs lacks at least one of the defining characteristics of the true experiment. Most commonly, nonexperimental designs are used when the investigator must use preexisting, self-selected groups rather than randomized groups and when the investigator is unable to control application of the treatment to subjects within the groups. [4]

Having noted the major differences between experimental and nonexperimental designs, we now will discuss the methods used to compensate for the deficiencies in nonexperimental designs. A few methodological techniques, such as increasing the number of groups, increasing the number of different treatments, increasing the number of dependent variables, and increasing the number of times each subject is measured, can be used to help eliminate competing plausible explanations for demonstrated effects. The strength of the evidence from nonexperimental studies, however, also rests heavily on factors external to the study, specifically, strong theory; solid, logical explanations to rule out competing hypotheses; and substantial background information. Thus, evidence from a study based on a nonexperimental design will be convincing to the extent that the sum total of information both internal and external to the study can account for alternative explanations for the supposed treatment effect.

Specific Types of Designs

At this point, it may be useful to consider four specific types of designs. For ease of comparison, we will cite a limited number of main features, advantages, and disadvantages for each type of design. The four types of designs we will describe are 1) randomized clinical trials (RCTs), 2) prospective cohort designs, 3) retrospective case-control designs, and 4) interrupted time-series designs (Appendix). Only the RCT is a true experimental design; the other three are nonexperimental designs. Guidelines for implementation of the first three designs have been elaborated primarily by clinical epidemiologists, [2,3] whereas details regarding the use of interrupted time-series designs have been developed by social scientists. [4]

Randomized Clinical Trial

The RCT is the clinical analog of the true experiment. [3] To reiterate briefly, the main features of the RCT are the use of controlled randomization [...] unethical to withhold treatment, the investigator may not have access to the entire population for random selection of subjects, sample bias may be present despite the use of randomization techniques, and nonequivalent attrition of subjects from groups may result in post hoc sample bias.

Prospective Cohort Design

The first of the nonexperimental designs is the prospective cohort design. [3] The main features of this design type are the use of static, usually self-selected, nonequivalent groups of subjects; the formation of groups before outcome; the application of treatment that is not controlled by the investigator; and the use of subjects selected by reference to their baseline state but assigned to groups before outcome occurs. The primary advantages of the prospective cohort design are that it does not require the investigator to withhold treatment, it provides good assurance that treatment preceded outcome, it can be helpful in understanding coexisting problems and effects, and it controls for many threats to internal validity. The essential disadvantages of this type of design are that the investigator does not have control over the assignment of subjects to treatment conditions; the absence of randomization makes it very susceptible to one of the greatest threats to internal validity, namely, selection bias; it does not control for some of the major threats to internal validity, including history, maturation, and interactions with selection bias; and it requires strong theory, solid logic, and substantive background information to compensate for the lack of randomization and control.

Retrospective Case-Control Design

The retrospective case-control design is used for studies that essentially proceed backward in time.
[3] Subjects are selected on the basis of outcome measures, the investigator does not control either selection of subjects or the application of treatment, and the investigator must rely heavily on archival data. The advantages of retrospective designs include economy of resources (ie, both time and money), no necessity to withhold treatment, and provision of an opportunity to use those ubiquitous computerized databases. The disadvantages of retrospective studies are that they are susceptible to many biases in all aspects of the study, including differential susceptibility of subjects to treatment and nonequivalent application of treatments, and that the archives on which they rely heavily are inflexible and may not contain the data needed.

Interrupted Time-Series Design

Interrupted time-series designs are characterized by the use of either a single group or single individual; the use of multiple baseline and postintervention measures; treatment controlled by the investigator and presented at specific times in the series of measurements; and heavy reliance on logic, common sense, and background data. [3] The advantages of interrupted time-series designs are that they can be used to assess maturational trends and to some extent the effect of intervention, they can be used to assess the stability of the dependent variable, they can be tailored to each subject, and a treatment withdrawal phase can be used to strengthen the evidence.
Some of the disadvantages of interrupted time-series designs are that it may not be possible to produce treatment effects instantaneously; many measurements are required on each subject (usually a minimum of 50 for statistical analysis); and, perhaps most importantly, the longitudinal nature of the study increases threats to internal validity such as history, maturation, and cyclical trends, thereby decreasing credibility (especially if the number of observations is limited).

Consider the following to help clarify the differences in implementation of the various designs. Suppose you were interested in obtaining information about the effectiveness of electrical stimulation for producing muscle hypertrophy. If you were conducting an RCT, you would operationally define your dependent variable, that is, some measure of muscle hypertrophy; randomly select subjects from a population; assign subjects randomly to experimental and control groups; treat only subjects in the experimental group according to a specific protocol; measure the subjects on the dependent variable both before and after treatment; and, finally, test for statistically significant differences between the groups.

A prospective cohort study actually could be very similar to an RCT, except that you would have no control over either the random assignment of subjects to treatment and control groups or the application of treatment. Although you might not be able to measure subjects prior to the beginning of treatment, you would be able to obtain measurement of the dependent variable prior to completion of treatment. In any event, you would be sure that the treatment preceded (although not necessarily caused) any observed effect.

In a retrospective study, the subjects already would have been separated into treatment and no-treatment cohorts by some nonrandom process, treated (or not treated) with electrical stimulation, and measured on the dependent variable. You probably would not have direct access to the subjects, but only to whatever records existed of the method used for determining whether a subject received treatment, the treatment regimen, and measures of the dependent variable.

By contrast, in a simple interrupted time-series study, subjects may have been assigned to treatment and no-treatment conditions, but you would provide treatment. You would begin treatment, however, only after completing a series of pretreatment measurements. In addition, you would need to test the subjects several times, both during the course of treatment and ideally after cessation of treatment.

Synopsis

How does all of this information about the different types of designs help us with the problem of assessing the credibility of results obtained from studies based on different types of designs? As you may have noticed, we have already compressed a great deal of information into a few short paragraphs. Perhaps if the information is reduced even more, a few helpful guidelines will emerge.
The RCT provides the strongest evidence regarding cause-effect relationships, but it is often either difficult to implement or unethical to use in a clinical setting. Prospective cohort designs, compared with RCTs, provide weaker evidence about causation, especially in the absence of strong, logical argument and a sound theoretical basis, but they present fewer ethical problems. Retrospective case-control designs provide yet weaker evidence about causation, especially in the absence of strong theory and extraordinary archival data, but repeated demonstrations of an effect can serve to strengthen the evidence. Interrupted time-series designs also provide weaker evidence for causation than RCTs, but their limited reliance on archival data and their use of many measurements improve the credibility of the evidence generated.

Although it is logically impossible to demonstrate the existence of cause-effect relationships without conducting a true experiment, both the use of logical argument to account for specific threats to internal validity and the use of multiplicity, that is, multiple nonequivalent groups, multiple measurements, and multiple variables, help strengthen the evidence generated from studies based on nonexperimental research designs. [3] The bottom line is that we can extract credible information from nonexperimental studies, but we must be more careful about the inferences we make and the decisions we make concerning the applicability of the information to our own patients.
Information from Multiple Studies

Obviously, there is much more information available about the relative merits of various designs. The main point is that there are some rules for making decisions about the credibility of the results from individual studies. Unfortunately, it would probably be a mistake to make a decision to change our approach to treatment of a particular type of patient on the basis of one study. A reasonable alternative is to base our decisions on the outcome of multiple studies.

Traditionally, the results of multiple studies in a defined area of research have been summarized in narrative review articles. One of the main practical difficulties with narrative reviews is that when many studies are available, it becomes very difficult for the reviewer to "process" all of the information. Consequently, narrative reviews may be distorted by a variety of biases. An alternative approach is to use quantitative methods, such as meta-analysis, to arrive at a conclusion regarding a treatment effect based on an aggregation of studies. The proponents of meta-analysis [5,6] are attempting to use quantitative rigor to help make decisions about groups of studies in the same way we use statistics to help make decisions about groups of patients.

The first task of the meta-analyst is to search very thoroughly, even in remote file drawers, for all of the studies related to the review topic. The next task is to devise a scheme for representing the information from each study in a way that permits comparison across studies.
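One widely used scheme for putting studies on a common footing is the standardized mean difference between groups. The sketch below is offered under stated assumptions and is not necessarily the procedure recommended in the cited sources: it assumes each study reports group means, standard deviations, and sample sizes; it uses a common large-sample approximation for the variance of the effect size; and it combines studies with fixed-effect inverse-variance weights. All names and numbers are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class StudySummary:
    """Hypothetical summary statistics extracted from one primary study."""
    mean_treated: float
    sd_treated: float
    n_treated: int
    mean_control: float
    sd_control: float
    n_control: int


def standardized_mean_difference(s):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_var = (
        (s.n_treated - 1) * s.sd_treated ** 2
        + (s.n_control - 1) * s.sd_control ** 2
    ) / (s.n_treated + s.n_control - 2)
    return (s.mean_treated - s.mean_control) / sqrt(pooled_var)


def approx_variance(d, s):
    """Large-sample approximation to the sampling variance of d (an assumption)."""
    n1, n2 = s.n_treated, s.n_control
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def pool(studies):
    """Return per-study effect sizes, the weighted mean, and a homogeneity statistic."""
    ds = [standardized_mean_difference(s) for s in studies]
    weights = [1 / approx_variance(d, s) for d, s in zip(ds, studies)]
    pooled = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    # Weighted squared deviations from the pooled value; large values suggest
    # that the studies disagree with one another.
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, ds))
    return ds, pooled, q


# Illustrative numbers only; they do not come from any real study.
studies = [
    StudySummary(12.0, 2.0, 10, 10.0, 2.0, 10),
    StudySummary(11.0, 2.0, 10, 10.0, 2.0, 10),
]
effect_sizes, pooled_effect, q_statistic = pool(studies)
print(effect_sizes, round(pooled_effect, 3), round(q_statistic, 3))
```

Expressing every study in the same unit is what allows outcomes measured on different scales to be compared, and the homogeneity statistic addresses the question of whether the studies' results differ from one another.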
Typically, studies provide at least one quantitative measure of outcome and other types of information about procedures, methods, and theoretical assumptions. The outcome measures from the individual studies included in the review (eg, measures of effect size and statistical significance) become the "dependent variables" in the meta-analysis; the pieces of information contained in each study about methods, procedures, and theoretical assumptions translate to the "independent variables" in the meta-analysis. Finally, the meta-analyst uses a variety of quantitative techniques for estimating the results of the aggregated data.

Given that meta-analysis is a relatively new mode of inquiry, an example demonstrating the analogy between a meta-analytic study and a primary experimental study may be useful. Suppose you were interested in obtaining information about the effectiveness of electrical stimulation for producing muscle hypertrophy. If you were conducting a primary experimental study, you would follow the steps outlined earlier for the RCT, that is, operationally defining the dependent variable, randomly selecting subjects from a population, assigning subjects randomly to experimental and control groups, treating only subjects in the experimental group, and finally testing for statistically significant differences between the groups. You would report the results of the study in terms of the means and standard deviations of the groups, the value of a statistic (such as a t-test or F-ratio value), the level of statistical significance (ie, probability value), and an estimate of the effect size (ie, standardized mean difference between groups). On the basis of this information, you could decide whether the treatment was effective in your study.

If, on the other hand, you were conducting a meta-analytic study, you would begin not by selecting subjects but by collecting reports of all previous studies that examined the effect of electrical stimulation on muscle hypertrophy. Next, rather than treating and testing subjects as you would in a primary study, you would extract from your collection of primary studies information about the results of the primary studies (eg, significance level) to use as a dependent variable. In addition, you would extract from the primary studies information about the methodological, conceptual, and statistical validity of those studies to use as independent variables in your meta-analytic study. Finally, rather than using a statistical test to decide whether experimental and control groups differed on the dependent variable, in meta-analysis you would apply statistical tests to allow you to decide whether the results of the various studies differ from one another.

Although many of the details of meta-analysis have been omitted in this rather concise description, we hope there is sufficient information for you to appreciate the potential usefulness of this relatively new technique for helping us make better decisions about which treatments to apply to our patients.

Caveats

Before we conclude, we would like to make three additional points. First, although evidence from a particular study may be convincing because the investigators adhered to the rules of science, the information is not necessarily useful. The information will be useful only if clinicians judge the information to be important. Second, the information presented in this article will be useful only if you accept the notion that the traditional approach to science is applicable to the questions we must answer.
Third, the debate continues about whether information derived from group studies is applicable to individual patients.

We did not raise these points to discourage ourselves from doing research or from using research findings to modify our treatment approaches. Rather, we wanted to give ourselves an opportunity to think about whether we are using the best approaches. Perhaps the ultimate irony is that we look to research findings to help us with our clinical decisions, but we also continually need to make decisions about the way we conduct research.

References

[1] Michels E: Design of Research and Analysis of Data in the Clinic: Introduction to Factorial Design and Analysis of Variance. Alexandria, VA, American Physical Therapy Association, 1983
[2] Feinstein AR: Clinical Epidemiology: The Architecture of Clinical Research. Philadelphia, PA, W B Saunders Co, 1985
[3] Sackett DL, Haynes RB, Tugwell P: Clinical Epidemiology: A Basic Science for Clinical Medicine. Boston, MA, Little, Brown & Co Inc, 1985
[4] Cook TD, Campbell DT: Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston, MA, Houghton Mifflin Co, 1979
[5] Rosenthal R: Meta-Analytic Procedures for Social Research. Beverly Hills, CA, Sage Publications Inc, 1984
[6] Strube MJ, Gardner W, Hartmann DP: Limitations, liabilities, and obstacles in reviews of the literature: The current status of meta-analysis. Clin Psych Rev 5:63-78, 1985

B Norton, MHS, PT, is Instructor and Coordinator, Applied Kinesiology Laboratories, Program in Physical Therapy, Washington University Medical School, 660 S Euclid Ave, PO Box 8083, St Louis, MO 63110 (USA). M Strube, PhD, is Associate Professor, Department of Psychology, Washington University, St Louis, MO 63130.