# Reply to David Papineau.

David Papineau takes issue with some arguments I have advanced (Howson and Urbach [1989]; Urbach [1985]) concerning the role of randomization in clinical trials and in experiments analogous to clinical trials, where the aim is to establish whether, or to what extent, a particular intervention has a particular causal effect. He and I agree that randomization is not an absolutely essential feature of such trials, without which no knowledge of the causes in question is possible, as its advocates often claim. However, Papineau sees randomization as a powerful aid to the inductive inference to causes; and if he were right, randomized trials would undoubtedly possess very important advantages over other kinds of experiment.

Papineau considers the issue in the context of an investigation into the effect, if any, of a medical treatment on a person's recovery from a particular disease. He envisages a sample of people who have the disease being selected and divided into two groups: the treatment is then administered to the individuals in one of the groups and withheld from those in the other. The initial sample, in Papineau's view (and here we agree too), need not be randomly selected from a population. However, Papineau regards randomization, that is, the random assignment of people within the sample either to receive the treatment or not, as being particularly important when it comes to drawing inferences as to the efficacy of the treatment.

In order to examine Papineau's claim concerning randomization, let me transpose his example to an artificial but, I think, helpful analogue. The analogous task is to investigate the causal effect of some intervention, T, on the propensity of coins to land heads when tossed in the normal way. An experiment is performed on a sample of coins selected either at random from the coins of the realm or chosen in some other way. The coins in the sample are then divided into two groups, either by a randomizing process or in some deliberate, non-random fashion, and all of the coins that end up in one of the groups are subjected to the treatment T. The trial now consists in tossing each of the coins a large number of times and recording for each group the frequency of heads among the heads-or-tails outcomes. With sufficiently many tosses, we shall be in a position to infer with considerable confidence $\mathrm{Prob}(H \mid T)$ and $\mathrm{Prob}(H \mid {\sim}T)$ for the collections of coins in the treatment and non-treatment groups, respectively.
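A minimal sketch of such a trial, offered only as an illustration: the per-coin propensities, the number of tosses, the seed and the function name `heads_frequency` are all hypothetical choices, not part of the original example. It shows how the pooled frequency of heads in each group settles near the average propensity of the coins in that group when every coin is tossed equally often.

```python
import random


def heads_frequency(biases, tosses_per_coin, rng):
    """Pooled relative frequency of heads when every coin in a group
    is tossed the same number of times."""
    heads = 0
    for p in biases:
        # Each toss lands heads with probability p.
        heads += sum(rng.random() < p for _ in range(tosses_per_coin))
    return heads / (len(biases) * tosses_per_coin)


rng = random.Random(0)  # fixed seed so that the run is repeatable

# Hypothetical propensities to land heads (after treatment, where applied).
treatment_group = [0.5, 0.8, 0.8]
control_group = [0.5, 0.5, 0.8]

freq_t = heads_frequency(treatment_group, 20_000, rng)
freq_c = heads_frequency(control_group, 20_000, rng)
print(f"estimated Prob(H | T)  = {freq_t:.3f}")
print(f"estimated Prob(H | ~T) = {freq_c:.3f}")
```

Nothing in the sketch says why the two frequencies differ; it only estimates them, which is all the trial itself delivers.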

Following Papineau's analysis in Section 2 of his paper, suppose we find that $\mathrm{Prob}(H \mid T) > \mathrm{Prob}(H \mid {\sim}T)$. As he observes, we should not automatically infer that T has a causal effect on H; the discrepancy in probabilities may simply be due, for example, to an excess in the treatment group of coins put into circulation by a firm of forgers whose coins are all, or mostly, biased towards heads. Suppose that such coins can be identified by chemical examination (let us say they all have property Y) and that the experimental results when re-analysed reveal the following equalities: $\mathrm{Prob}(H \mid T \text{ and } Y) = \mathrm{Prob}(H \mid {\sim}T \text{ and } Y)$; and $\mathrm{Prob}(H \mid T \text{ and } {\sim}Y) = \mathrm{Prob}(H \mid {\sim}T \text{ and } {\sim}Y)$. Papineau claims that if a result such as this were found, we could then conclude that T has no causal effect on the occurrence of H. But that conclusion would not necessarily be correct. Take the first equality: this could arise through the Y-type coins in the treatment group being less biased towards heads than those in the control group and by the T-treatment exactly compensating for this imbalance. The second equality is also consistent with T having a causal influence: the non-Y-type coins in the treatment group might, for example, be more tails-biased than those in the control group, with the causally effective T-treatment making up for the discrepancy. Table 1 illustrates these possibilities.

This is perhaps a lesser objection and does not go to the heart of Papineau's thesis, which arises as follows: he points out that if it should turn out that $\mathrm{Prob}(H \mid T \text{ and } Y) > \mathrm{Prob}(H \mid {\sim}T \text{ and } Y)$ and $\mathrm{Prob}(H \mid T \text{ and } {\sim}Y) > \mathrm{Prob}(H \mid {\sim}T \text{ and } {\sim}Y)$, we still could not automatically conclude that T has any causal influence on H, for it could be that some other confounding cause was responsible for the greater probability of heads in the treatment group. We could test some of the possible confounding causes, but it hardly seems feasible to investigate them all, and so we might never discover whether the higher probability of heads within the test group was due to the treatment or not.
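As an illustration, the figures in Table 1 can be checked with a few lines of Python. The sketch assumes that each coin is tossed equally often, so that a group's probability of heads is the plain average of its coins' probabilities. The helper name `prob_heads`, and the untreated bias of 0.7 and treatment effect of 0.1 in the final lines, are hypothetical values chosen only for this example; they do not come from the paper.

```python
from fractions import Fraction

HALF, FOUR_FIFTHS = Fraction(1, 2), Fraction(4, 5)

# Table 1: (has property Y?, probability of heads) for each coin.
treatment = [(False, HALF), (True, FOUR_FIFTHS), (True, FOUR_FIFTHS)]
control = [(False, HALF), (False, HALF), (True, FOUR_FIFTHS)]


def prob_heads(group, y=None):
    """Mean probability of heads over a group, optionally restricted
    to coins with (y=True) or without (y=False) property Y."""
    probs = [p for has_y, p in group if y is None or has_y == y]
    return sum(probs) / len(probs)


overall_t, overall_c = prob_heads(treatment), prob_heads(control)
print(overall_t, overall_c)  # 7/10 3/5

# Within each stratum the two groups show the same probability of heads.
assert prob_heads(treatment, y=True) == prob_heads(control, y=True)
assert prob_heads(treatment, y=False) == prob_heads(control, y=False)

# Hypothetical compensation scenario: the treated Y coins are naturally
# less heads-biased (0.7), and T adds exactly the missing 0.1.
untreated_y_in_treatment = Fraction(7, 10)
effect_of_t = Fraction(1, 10)
assert untreated_y_in_treatment + effect_of_t == prob_heads(control, y=True)
```

The group-level probabilities differ (7/10 against 3/5) while the probabilities within the Y and non-Y strata coincide. The last lines show one hypothetical way in which that coincidence could mask a real effect of T.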

Papineau, however, claims that the problem would be resolved and a valid inference as to causality made, provided the experiment was randomized. For if the probability of heads occurring is greater when the treatment is present than when it is not, then, so Papineau claims, the personal probability that T causally influences H must be 1. Further on in his paper, Papineau expresses himself even more strongly by claiming that the inference from probabilities to causes, when the probabilities are determined in randomized groups, is 'quite infallible' and a 'sure-fire guide to causal conclusions'.

TABLE 1

| Treatment group: coin | Y present? | Prob (heads) | Control group: coin | Y present? | Prob (heads) |
|---|---|---|---|---|---|
| 1 | No | 0.5 | 1 | No | 0.5 |
| 2 | Yes | 0.8 | 2 | No | 0.5 |
| 3 | Yes | 0.8 | 3 | Yes | 0.8 |

But how could this be so? The experimental groups thrown up in a randomized trial might be identical to the ones created by deliberate selection. And we know that the treatment group could well contain more heads-biased coins than the control group; thus, the experimental observation that the probability of heads is greater amongst the former group of coins than amongst the latter is compatible with the treatment having no causal effect whatever on the appearance of heads, even though the experiment was randomized. Papineau is of course aware of this kind of objection and he illustrates the difficulty himself with a case where a test group contains much younger subjects than the control group, despite randomization, and where this alone accounts for the different recovery rates within the two groups. But Papineau insists that the fact that randomized allocation may throw up very unbalanced groups does not impugn the 'sure-fire' effectiveness of randomization.
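The point that randomization can itself throw up unbalanced groups can be made concrete by enumeration. The following sketch assumes, purely for illustration, that the six coins of Table 1 are allocated at random, three to each group, that T has no effect at all, and that each coin is tossed equally often. The variable names are invented for the example.

```python
from fractions import Fraction
from itertools import combinations

# The six coins of Table 1: three without property Y, three with it.
biases = [Fraction(1, 2)] * 3 + [Fraction(4, 5)] * 3

# Every possible treatment group of three coins; a fair randomizing
# process makes each of these allocations equally likely.
allocations = list(combinations(range(6), 3))

favour_treatment = 0
for treated in allocations:
    control = [i for i in range(6) if i not in treated]
    # T is assumed to do nothing, so the biases are left unchanged.
    mean_t = sum(biases[i] for i in treated) / 3
    mean_c = sum(biases[i] for i in control) / 3
    if mean_t > mean_c:
        favour_treatment += 1

print(len(allocations), favour_treatment)  # 20 10
```

Under these assumptions, 10 of the 20 equally likely allocations give the treatment group the higher probability of heads, although T does nothing.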

The reason why Papineau takes this view lies in his distinction between two kinds of objective probability that might be inferred from a clinical trial: the probability of an effect in each of the experimental groups, and the corresponding probabilities for the whole population. The terms $\mathrm{Prob}(H \mid T)$ and $\mathrm{Prob}(H \mid {\sim}T)$, which figured in my discussion of the coin example, denote sample probabilities; the population probabilities, which Papineau calls 'objective population probabilities', I shall denote $\mathrm{PROB}(H \mid T)$ and $\mathrm{PROB}(H \mid {\sim}T)$. Such population probabilities are, in my opinion, not easy to conceptualize when we are dealing with the responses of types of patient to medical interventions, but this is very simple in my example. For we can arbitrarily specify some population of coins, and define the population probability for coins treated with T to land heads as the limiting relative frequency of heads in a trial conducted in the manner described earlier, except that now the whole population constitutes the test group; and assuming that the T-treatment has no permanent effect on the coins, the population probability of heads for untreated coins would be the corresponding limiting relative frequency in a similar trial, conducted under relevantly similar conditions, where the same, entire population now formed the control group.
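One way to write this definition down, offered as a sketch and not as a formalism given in the text: suppose the population contains $N$ coins, each tossed $n$ times, and let $h_i^{T}(n)$ and $h_i^{{\sim}T}(n)$ be the number of heads shown by coin $i$ in its first $n$ tosses with and without the treatment. The symbols $N$, $n$ and $h_i$ are assumptions introduced here, as is the stipulation that every coin is tossed equally often.

$$
\mathrm{PROB}(H \mid T) = \lim_{n \to \infty} \frac{1}{Nn} \sum_{i=1}^{N} h_i^{T}(n),
\qquad
\mathrm{PROB}(H \mid {\sim}T) = \lim_{n \to \infty} \frac{1}{Nn} \sum_{i=1}^{N} h_i^{{\sim}T}(n)
$$

On this reading both quantities are fixed by the chosen population and the tossing conditions alone, with no reference to how any sample from that population is divided into groups.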

Papineau argues that the inference to causes in a clinical trial is necessarily a two-stage process: the first involves inferring the separate objective population probabilities of recovery from the corresponding recovery rates within each of the groups; the second stage concerns the inference to a causal explanation from those probabilities. (The sample probabilities, which seem to be the focus of Papineau's earlier discussion, play no role here.) According to Papineau, it is the second inference which is 'quite infallible', while the first is all too fallible, and in the case he cites, it would, he says, be a mistake to equate probabilities with the recovery rates within the test and control groups, because these groups are quite unrepresentative of the wider population. Hence, the test treatment should not, on the evidence of the different recovery rates, be regarded as causally effective.

I do not believe that Papineau's way of dealing with the problem is at all satisfactory. For consider a trial that is performed in the normal way, using test and control groups formed from a sample drawn from the population, with a view to estimating the population probabilities $\mathrm{PROB}(H \mid T)$ and $\mathrm{PROB}(H \mid {\sim}T)$. As Papineau states, if $\mathrm{PROB}(H \mid T) > \mathrm{PROB}(H \mid {\sim}T)$, the experimental treatment, T, must be the cause of the difference, and it is indeed the case that the 'inference, from [these population] probabilities to causes, is quite infallible'. But I believe he is mistaken in saying that the inference is valid 'in virtue of the randomization of the treatment in the experiment': on the contrary, it is true simply by definition of the notion of a probabilistic cause and holds quite independently of how the experiment was conducted, or, indeed, of whether there ever was such an experiment.

We might also note that the first of Papineau's inferences--from recovery rates in each of the experimental groups to population probabilities--is not facilitated by randomization either. For, as Papineau agrees, the original sample might be very unrepresentative of the population, and no subsequent randomization can repair that obstacle to a correct inference. Hence Papineau's analysis of a clinical trial through his two stages of inference, far from vindicating randomization, finds no role for it at all.

PETER URBACH London School of Economics

REFERENCES

HOWSON, C. and URBACH, P. [1989]: Scientific Reasoning: The Bayesian Approach. La Salle, IL: Open Court Publishing Co. (second edition, 1993).

URBACH, P. [1985]: 'Randomization and the Design of Experiments'. Philosophy of Science, 52, pp. 256-73.

URBACH, P. [1993]: 'The Value of Randomization and Control in Clinical Trials'. Statistics in Medicine, 12, pp. 1421-31.


Title annotation: response to article by David Papineau in this issue, p. 437

Author: Urbach, Peter

Publication: The British Journal for the Philosophy of Science

Date: Jun 1, 1994
