A tale of two effects.

1. Introduction

In recent years, there has been a philosophical cottage industry producing arguments that our concept of causation is not univocal: that there are in fact two (or more) concepts of causation, corresponding to distinct species of causal relation. Papers written in this tradition have borne titles like "Two Concepts of Cause" (Sober 1985) and "Two Concepts of Causation" (Hall forthcoming). With due apologies to Charles Dickens, I hereby make my own contribution to this genre.

In a series of articles and books, Elliott Sober and Ellery Eells have argued for a distinction between token-level and type-level causal relations (see especially Sober 1985 and Eells 1991 (introduction and chap. 6)). Type-level causal relations are described by causal generalizations such as "smoking causes lung cancer," while token-level causal relations involve particular events: "the loud party next door caused Jennifer's headache." Type-level causal relations can be analyzed using the sort of probabilistic theory presented in Cartwright 1979 and developed in Eells and Sober 1983 and Eells 1991 (chaps. 1-5). Token-level causal relations, by contrast, require analyses that are more sensitive to the spatiotemporal details of causal processes (see, for example, Eells 1991, chap. 6).

In two recent papers, Ned Hall (2000, forthcoming) has introduced a distinction between what he calls "dependence" and "production." "Dependence" is to be analyzed in terms of counterfactual dependence. Hall does not analyze "production," although his informal discussion suggests that it may bear some resemblance to Salmon's notion of a causal process (see Salmon 1984) and also to Lewis's notion of influence (Lewis 2000). One important distinction between these two relations is that production is transitive, whereas dependence is not. These two relations come apart, for example, in cases of "double prevention." Vandals steal a stop sign, and four hours later there is an accident at that intersection. The accident depends upon the act of larceny--had the latter not occurred, the former would not have occurred. But the accident was not produced by the theft--the peculiar features of the accident cannot be traced back to the vandals' action.

In this paper, I will articulate and defend a distinction between component effect along a causal route and net effect. When an event or factor C exerts an effect on another, E, it may do so along several causal routes--paths or chains of factors connecting C with E. C has an effect on E along a particular causal route when C makes a difference for E in virtue of being connected to E along the route in question. C has a net effect on E if it makes an overall difference to E taking into account all of the routes that connect them. These are not intended as precise explications, but only gesture at the distinction that I will illustrate, articulate, and apply. The expression "C has an ... effect on E" is intended to be liberal; it can apply in cases where C raises or lowers the probability of E, where C increases or decreases the magnitude of E, or where C affects the time or manner of E's occurrence. In such cases, C may be causally relevant to E, even though we may not feel comfortable saying that C causes E outright. The locution "has an ... effect on" is meant to invoke this looser notion of causal relevance.

The distinction I have in mind has a close analog in the causal modeling literature (see, for example, Pearl 2000 and references cited therein). My notion of net effect is closely related to the notions of total effect and causal effect in causal modeling, (1) and my notion of component effect is closely related to the notions of direct and indirect effect. (2) We will return to some of these parallels in section 7.1 below. So I claim no striking originality for my proposed distinction. Nonetheless, it is a distinction that has been systematically ignored by most philosophers, despite the fact that it can be readily formulated within almost any theory of causation.

The distinction between net and component effects is not the same as either of the distinctions described in the opening paragraphs. First, my distinction is not the same as the type-token distinction, since the two distinctions cross-classify: there can be token-level net effects and token-level component effects, type-level net effects and type-level component effects. Second, unlike Hall's production, the relation of component effect along a causal route is not transitive. The notion of type-level net effect bears some resemblance to Eells and Sober's type-level causation; in particular, I think that the former notion can be given a passably good explication in terms of probabilities. Likewise, the notion of token-level net effect resembles Hall's dependence, and can be given a passably good analysis in terms of counterfactuals. (I leave it open whether a suitably formulated probabilistic theory could give a good account of token-level net effects, or whether a suitably formulated counterfactual theory could give a good account of type-level net effects. More generally, it is not my intention to evaluate the relative merits of these two approaches to causation.) This leads me to a third respect in which my proposed distinction differs from the other two: however one analyzes net effect, it should be possible to give a closely related analysis of component effect. Unlike the two relations postulated by Eells and Sober, or the two postulated by Hall, net and component effects do not need to be given radically divergent analyses. The two relations, although distinct, are cut from the same cloth.

I will begin by presenting two examples to motivate my distinction. Then I will offer suggestions for how the two concepts can be analyzed. I will not presuppose any one theoretical framework, but rather provide a recipe that can be readily adapted within different approaches to causation. (3) Finally, I will argue that this distinction can be used to resolve a number of specific difficulties within the theory of causation.

2. Illustrations

2.1 The Birth Control Pills

My first illustration will be a familiar one from the probabilistic causation literature, due to Germund Hesslow (1976). One of the most dangerous potential side effects of taking birth control pills is thrombosis--the formation of blood clots within arteries. This suggests that consumption of birth control pills causes thrombosis. On the other hand, consumption of birth control pills prevents pregnancy, which is itself a potential cause of thrombosis. In fact, among women who are under 35 and do not smoke or have other major risk factors for thrombosis, who are also fertile, sexually active, and not employing any other means of contraception, consumption of birth control pills lowers the overall probability of thrombosis. Hesslow presents this as a counterexample to probabilistic theories of causation: birth control pills cause thrombosis even though consumption of birth control pills lowers the probability of thrombosis.

The structure of the case is represented schematically in figure 1. Consumption of birth control pills has a strong negative effect upon pregnancy: it is an effective preventer of pregnancy. Pregnancy, in turn, is a powerful cause of thrombosis. Independently, birth control pills have a weaker, positive effect on thrombosis. Presumably, this effect is mediated by some intermediary not included in figure 1. In each case, the strength of the effect is represented by the thickness of the arrow, and the direction of the effect by the accompanying sign. I am assuming that the probabilistic nature of the example results from genuine indeterminism. In particular, the probabilities do not arise from mixing together different subpopulations of women exhibiting different causal structures (for example, one subpopulation consisting of women who will avoid thrombosis come what may, another of those destined to become pregnant no matter how fastidious they are in using contraceptives, and so on).

[FIGURE 1 OMITTED]

Before continuing with our discussion of Hesslow's example, let me pause to make some remarks about my terminology and use of diagrammatic representation. The diagram is to be understood roughly in the sense of the causal graphs used by Spirtes, Glymour, and Scheines (1993) and Pearl (2000). The left to right direction corresponds loosely to temporal order. The terms `Pills', `Pregnancy', and `Thrombosis' in the diagram represent variables. `Pregnancy', for example, has the two possible values {pregnancy occurs, pregnancy does not occur}. Figure 1 depicts two causal routes from the consumption of birth control pills to thrombosis, one of which includes pregnancy, the other of which bypasses pregnancy. In saying that the first route includes pregnancy, I mean that this route includes the effect that the consumption of oral contraceptives has upon pregnancy; I do not mean that this route is operative only when pregnancy actually occurs. Similarly, the second route excludes pregnancy in the sense that it involves an effect of birth control pills upon thrombosis that does not result from the influence of birth control pills upon pregnancy.

In Hesslow's example, there is a sense in which it is right to say both that the consumption of birth control pills causes thrombosis and that consumption of birth control pills prevents thrombosis. This can be said without contradiction by invoking the distinction between net and component effects. There are two causal routes from the consumption of birth control pills to thrombosis. Along one of these routes--the one including pregnancy--birth control pills have a strong negative component effect on thrombosis. Along the other route, the component effect is positive--birth control pills tend to cause thrombosis. The net effect of birth control pills on thrombosis is a function of these two component effects; since the preventive route from birth control pills to thrombosis is stronger than the causative route, the net effect of birth control pills on thrombosis is negative. Thus when we say that birth control pills cause thrombosis, we can be understood as reporting the existence of a causal route along which the component effect is positive. When we say that the consumption of birth control pills prevents thrombosis, we could be understood as reporting the existence of a causal route along which the component effect is negative, or we could be understood as claiming that the net effect of birth control pills upon thrombosis is negative.

Different conversational contexts can make one or the other of the two causal concepts salient. Suppose that Carla--who is a fertile, sexually active, nonsmoker under the age of 35--does not use any form of contraception. Her excuse for avoiding birth control pills is her fear of thrombosis. It would be entirely appropriate to point to the net preventive effect of birth control pills on thrombosis in order to expose the fallacy in her reasoning. On the other hand, if a company that manufactures birth control pills wishes to make a safer product, they should be interested in the presence of a causal route wherein pills cause thrombosis. In particular, the fact that this route bypasses pregnancy suggests the possibility of eliminating or reducing the impact of this causal route without undermining the pill's primary function. Indeed, our story has a happy ending: birth control pills are now considerably safer than they were when Hesslow's example was first presented. (4) There is a simple reason why Carla should be concerned with net effects and the manufacturing company with component effects: Carla has control only over whether she takes the pill, whereas the manufacturer has the ability to exercise control over the ways in which the pill influences thrombosis.

Hesslow's example was originally formulated in terms of type-level causal relations. Moreover, this example arose in the context of a discussion of probabilistic causation, implying that the underlying relations are thought of as indeterministic. Thus, my distinction between net and component effects can be drawn among type-level probabilistic causal relations.

2.2 The Economic Impact of Railroads

The impact of the railroad upon the economy of nineteenth-century America is the stuff of legend, and also the stuff of history. Prior to the mid 1960s, most historians accepted some version of what Robert Fogel (1964) dubbed the "axiom of indispensability," that "railroads were indispensable to American economic growth during the 19th Century" (Fogel 1964, vii). Krooss (1959, 439) refers to the railroads as "the principal single determinant of the levels of investment, national income, and employment in the 19th Century"; and Bolino (1961, 173) claims that "[b]esides stimulating investment and creating a demand for goods and factors, the railroad also provided a transportation service which was essential to the development of capitalism in America." It was little short of shocking, then, when Fogel issued a provocative challenge to the indispensability thesis. Fogel did not deny that transport by rail was considerably more efficient than transport by more traditional methods such as wagon and boat, but rather challenged the thesis that these forms of transport could not have sustained the economic boom of the late nineteenth century. Fogel attempted to estimate the "social savings" that resulted from this increase in efficiency in the year 1890. Fogel's estimate, while not inconsiderable, was small when considered as a percentage of gross national product. Indeed, it was on the same order of magnitude as one year's economic growth, implying that if the railroad had never existed, the American economy would have been set back by only a year or so.

For purposes of illustration, I will focus here on one small part of Fogel's argument. Fogel reasoned that if the railroad had never been built, there would have been a great deal of additional investment in alternative forms of transportation. New canals would have been built, roads would have been upgraded; Fogel examines a number of plans that were explicitly drawn for such projects. More speculatively, if the railroads had never been built, much of the financial and intellectual capital that was in fact invested in improvements in rail technology would have been diverted into the development of the internal combustion engine (which existed in primitive form as early as the 1820s). These improvements would have helped to close the gap between the actual cost of transporting goods in 1890, and the hypothetical cost in the absence of railroads.

Fogel's model is represented schematically in figure 2. The variable `Railroad' takes the values {railroad constructed prior to 1890, no railroad construction prior to 1890}; the variable `Canals, Roads, etc.' takes as values possible stages of development of alternative modes of transport as of 1890; and the values of `Transport costs' are the possible costs of transporting goods in 1890 (measured, say, in 1890 dollars per ton-mile). Once again, there are two distinct causal routes from railroad construction to transport costs. Railroad construction obviously allowed for the transportation of goods along rail lines: along this route, railroad construction had a strong negative effect on transport costs. But the construction of the railroad also diverted money from other projects that would themselves have reduced transport costs: along this route, railroad construction increased transport costs. The net effect of railroad construction on transport cost was still negative, but not as great as one would be led to believe if one focused only on the first of these causal routes. Note that the effect, `transport costs', is a quantitative variable. In this context, to say that railroad construction had a negative effect is to say that it reduced the cost of transporting goods.

[FIGURE 2 OMITTED]

Many of Fogel's critics challenged not only his numbers, but the relevance of the "what if?" questions that he sought to answer. Stanley Lebergott, a relatively sympathetic critic of Fogel, characterized this fundamental difference in perspective in the following way:
   An older tradition was shaped by the historian's desire to see economic
   change "as it was," by his recognition that every happening was
   indispensable to the actual course of history. The newer focus, on history
   "as it was not," reflects a belief that in economic affairs, as in those of
   the heart, desire will find its object--or a sufficiently close substitute.
   It is difficult for the economist to view any historic innovation as
   revolutionary or to equate technological novelty with economic importance.
   (Lebergott 1966, 000)


These historiographical schools disagree about what it means to ask about the economic impact of a technological innovation such as the railroad. Members of the "older tradition" would attempt to answer by citing the total value of goods transported on the railroads, and on the efficiency of the rail system of (say) 1890 in comparison with previously existing transport systems. These, they would argue, correspond to the actual economic impact of the railroad. These historians would thus judge the economic impact of the railroad in terms of the component effect of railroad construction on transport costs along the route that bypasses its effect on the development of rival transportation systems. By contrast, the "new economic history" would examine the quantity of goods transported and transportation costs in comparison with transport systems that would have arisen in the absence of the railroad. This corresponds to evaluating the net effect of railroad construction on transport costs along both routes. Thus, Lebergott's distinction between the two different ways of evaluating economic impact can be understood in terms of my distinction between net and component effects.
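The two ways of measuring economic impact can be made concrete with a toy deterministic model in the spirit of figure 2. The following is only a rough sketch: every number, scale, and function name in it is invented for illustration, and none of it reproduces Fogel's actual estimates or equations. It assumes a linear cost function in which rail service and alternative transport each subtract from a baseline cost.

```python
# Toy deterministic model of figure 2. All numbers are hypothetical
# and are NOT Fogel's estimates.

def alternatives_level(railroad: bool) -> float:
    """Development of canals, roads, etc. by 1890 (arbitrary 0-1 scale).
    Railroad construction diverts investment away from alternatives."""
    return 0.2 if railroad else 0.8

def transport_cost(railroad: bool, alternatives: float) -> float:
    """Cost of transporting goods in 1890 (hypothetical units)."""
    baseline = 10.0
    rail_saving = 4.0 if railroad else 0.0
    alternatives_saving = 3.0 * alternatives
    return baseline - rail_saving - alternatives_saving

def net_effect() -> float:
    """Compare the actual world with the world that would have arisen
    without railroads, letting alternatives respond (the "new" view)."""
    actual = transport_cost(True, alternatives_level(True))
    no_rail = transport_cost(False, alternatives_level(False))
    return actual - no_rail

def component_effect_direct_route() -> float:
    """Vary the railroad while keeping alternatives at their actual
    level (the "older tradition" view)."""
    actual_alternatives = alternatives_level(True)
    return (transport_cost(True, actual_alternatives)
            - transport_cost(False, actual_alternatives))

print(round(net_effect(), 2))                     # smaller reduction in cost
print(round(component_effect_direct_route(), 2))  # larger reduction in cost
```

Under these made-up figures both quantities are negative, but the component effect along the direct route is larger in magnitude than the net effect, which is the pattern the two historiographical schools are disagreeing about.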

Note that this example does not involve type-level causal claims, but relations between particular historical occurrences. Moreover, while the underlying economic processes might be indeterministic, none of the discussion presupposes this. Indeed, Fogel employs deterministic functional models in his analysis, and his argument is broadly counterfactual in character. I hope that these two examples serve to motivate and give an intuitive feel for the distinction between net effects and component effects along causal routes. I hope, moreover, that they make plausible my claim that the distinction applies at both the type and token levels, and that it does not presuppose any one theoretical approach to the analysis of causation.

3. Analyzing Net and Component Effects

3.1 Net Effects

Many theories of causation have essentially the same form:
   CF C has an effect on E iff E depends upon C--that is, iff E varies as C is
   varied while holding fixed other appropriate factors.


Since I wish to talk about the distinction between net and component effects without presupposing any one theoretical perspective, I have attempted to employ theory-neutral terminology: "has an effect on," "depends upon," "varies," "holding fixed." Interpreting these terms in different ways yields different theories of causation.

Probabilistic theories of causation of the form developed by Cartwright (1979) and Eells (1991) emerge from CF employing the following scheme of translation. "Depends" means probabilistically depends. "Varying" C refers to the comparison of probabilities conditional upon C, and conditional upon ~C or upon specific alternatives C', C'', .... Thus E varies as C varies if the probability of E, conditional upon C, is different from the probability of E conditional upon the various rivals to C. Other factors F are "held fixed" by conditioning on them. The phrase "has an effect on" encompasses a number of different relations that can be defined within the probabilistic framework. Typically, C is deemed a (positive, promoting, or contributing) cause of E if the conditional probability of E given C is higher than the conditional probability given ~C, while C is said to prevent, inhibit, or be a negative cause of E if the conditional probability of E given C is lower than the conditional probability given ~C. I will also say that C has a positive or negative effect upon E in these two cases. C may also have an effect on E without unambiguously falling into either of these categories. This may happen, for instance, if C raises the probability of E while holding other factors fixed in a certain combination, while C lowers the probability of E while holding other factors fixed in a different combination. In such a case, Eells (1991) says that C is a mixed cause of E. It might also happen that C raises the probability of E relative to some alternative C', but that C lowers the probability of E relative to another alternative C''--see Hitchcock 1993 for discussion of this sort of case.

Counterfactual theories of causation of the sort developed by David Lewis (1973, 2000) also conform to the schema CF. To translate into the counterfactual mode, "depends" must be understood as "counterfactually depends." "Varying C" involves comparing the actual world, in which the event C occurs, with other possible worlds in which C is absent or in which some alternative C' occurs in place of C. In Lewis 2000, these alternatives are called "alterations" of C. (5) "E varies as C is varied" means that E does not occur, or some alternative E' to E occurs in some of these other possible worlds. More precisely, in Lewis 1973, E must not occur in any of the closest possible worlds where C does not occur; if this relationship obtains, E is said to "causally depend" upon C. In Lewis 2000, there must be a "substantial range" of alterations C', C" .... of C, and alterations E', E" .... of E such that in all of the closest possible worlds where C' occurs, E' occurs, and likewise for C" and E" and the other alterations of C and E. If this relationship obtains, Lewis says that C "influences" E. I will use the term "counterfactual dependence" to encompass both causal dependence and influence. In principle, other factors F may be "held fixed" in one of two ways: we may explicitly restrict attention to possible worlds where F also occurs (say by explicitly incorporating F into the antecedent of the relevant counterfactual), or we may employ standards of similarity among possible worlds (such as those described in Lewis 1979) that happen to leave F unchanged when we move to the relevant closest possible worlds. In practice, Lewis himself employs only the second method for holding other relevant factors "fixed." (6) Note that Lewis identifies causation with the ancestral of counterfactual dependence and not with counterfactual dependence itself. We will return to this point in section 6.4 below.

CF requires that other appropriate factors be "held fixed." Which factors are the appropriate ones to hold fixed? Detailed accounts are provided in Eells 1991 (chaps. 2-4) for probabilistic theories, and in Lewis 1979 for his counterfactual theory. I will here give only two illustrations. Suppose that C and E are both effects of some common cause D, but that neither has any direct effect on the other. Then, as we vary C, we do not want this variation to "backtrack" through D and hence to E. We can avoid this by holding D fixed (inter alia), hence "screening off" C from E. (7) By contrast, suppose that D is causally intermediate between C and E: C causes E by causing D, which in turn causes E. Suppose, moreover, that this is the only way in which C affects E. Then, we do not want to hold D fixed: that would screen C off from E, preventing variation in C from being passed on to E. Thus, it is appropriate to hold fixed common causes of C and E, but inappropriate to hold fixed factors that are intermediate between C and E. Note that Lewis (1979) believes that it is possible to specify the factors that are to be held fixed--or rather the metric of similarity among possible worlds--in purely acausal terms, while Cartwright (1979) and Eells (1991) deny this. If C and E are particular events, and we are evaluating the token causal claim that C caused E, then we must hold the appropriate factors fixed at their actual values. If D did occur, we hold fixed that D occurred; if D did not occur, we hold fixed that D did not occur; and if D is a quantitative or multi-valued variable, we hold fixed the value of D that actually obtained. By contrast, if C and E are generic factors or event-types, then it makes no sense to hold D fixed at its actual value. Instead, we can hold it fixed at different possible values; call an assignment of values to the factors that are to be held fixed a background condition. In order for C to have an effect on E, we require only that there be some background condition such that E varies as C is varied while holding that background condition fixed. In order to say that C causes E or that C prevents E, we may require that the appropriate pattern of dependence occur across a relatively wide range of background conditions.

I will leave readers with the task of translating CF into other theories of causation, such as necessity-sufficiency accounts, manipulability accounts, or other accounts formulated in terms of probabilities or counterfactuals. I hope that the two translations provided help to make clear what is intended by the various terms employed in CF. The advantage of CF is that it is theory-neutral--it does not presuppose any one theory of causation--while maintaining sufficient contact with prominent theories of causation to ward off charges of hand-waving. I do not wish to deny that there are important differences between probabilistic, counterfactual, and other theories of causation; but there are important similarities as well, and it is these similarities that I wish to bring to the fore. Nor do I wish to deny that there are difficulties with theories of causation formulated along the lines of CF; in particular, there are problems involving causal overdetermination, and problems involving token-level indeterministic causation. (8) These problems may force the addition of epicycles to CF; nonetheless, CF lies at the core of many serviceable accounts of causation.

CF is the appropriate form for a theory of net effect. Let us state this explicitly by re-writing CF slightly:
   Net C has a net effect on E iff E depends upon C--that is, iff E varies as
   C is varied while holding fixed other appropriate factors (including common
   causes of C and E, but excluding factors intermediate between C and E).


It follows that most extant philosophical theories of causation--such as the probabilistic theory developed by Cartwright (1979) and Eells (1991) and the counterfactual theory developed by Lewis (1973, 2000)--provide good theories of net effect. There are numerous difficulties involving the details of these theories, but they have the right general form for theories of net effect.

I should be careful to specify that it is Lewis's theories of causal dependence and of influence--rather than his accounts of causation per se--that provide good theories of net effect. Recall that Lewis identifies causation with the ancestral of causal dependence (in 1973) and of influence (in 2000). I think that these accounts are best understood as (mistaken) attempts to capture the concept of component effect; I will return to this point in sections 6.2 and 6.4 below.

To say that C has a net effect on E is not to say anything about the type of net effect that C has on E: C might be a (positive, promoting, contributing) cause of E, C might prevent (or be a negative or inhibiting cause of) E, or C might be a mixed cause of E; E might causally depend upon C, C might influence E. How we classify the net effect of C on E will depend upon how E varies as we vary C. If removing C or replacing it with an alternative C' results in a lower probability for E or in a lower value for the relevant variable, then we will say that C is a net (positive) cause of E, or that C has a positive net effect on E. If, by contrast, removing C or replacing it with an alternative C' results in a higher probability for E or in a higher value for the relevant variable, then we will say that C is a net preventer of E, or that C has a negative net effect on E. In Hesslow's example, consumption of birth control pills has a negative net effect on thrombosis (among women who are fertile, sexually active, and so on): taking birth control pills lowers the probability of thrombosis. In Fogel's model, the construction of the railroads had a negative net effect on transportation costs--transportation costs would have been higher if the railroads had not been constructed--albeit not as large a negative effect as many had thought.

3.2 Component Effects

Consider now the general case where there are multiple causal routes from C to E, as shown in figure 3. In this scenario, there are three causal routes between C and E, via the three intermediate factors [D.sub.1], [D.sub.2], and [D.sub.3]. In general, there is a causal route from C to E via [D.sub.i] if C has a net effect on [D.sub.i] and [D.sub.i] has a net effect on E. (By contrast, there would not be a causal route from C to E via [D.sub.i] if C and [D.sub.i] are merely correlated, say in virtue of sharing a common cause.) There may be more than one causal route from C to E via [D.sub.i] if the net effect of C on [D.sub.i] or of [D.sub.i] on E is itself mediated by more than one causal route. For purposes of discussion, however, we will assume that there are only the three causal routes shown. The existence of a causal route from C to E via [D.sub.i] does not entail that C has a component effect on E along this route. We will return to several of these issues below.

[FIGURE 3 OMITTED]

According to Net, if we wish to evaluate the net effect of C on E, we should not hold any of the intermediate factors [D.sub.1], [D.sub.2], [D.sub.3] fixed. Suppose, however, that we wish to evaluate the component effect of C on E along one of these routes, say the route that runs through [D.sub.1]. What is needed is a way of isolating this component effect, of eliminating the contribution along the other routes. This can be accomplished by holding fixed intermediate factors along the other routes from C to E, in this case [D.sub.2] and [D.sub.3]. By holding these factors fixed, variation in C is allowed to propagate to E only along one route, the one that runs through [D.sub.1]. This idea is readily translated into the probabilistic and counterfactual frameworks described above. Within the probabilistic framework, the intermediates [D.sub.2] and [D.sub.3] can be held fixed by conditionalizing on them. Within the counterfactual framework, they can be held fixed either by making suitable adjustments in the similarity metric or, more simply, by explicitly incorporating them into the antecedent of the relevant counterfactuals. This second approach is carried out in detail in Hitchcock 2001.
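The recipe of holding the other intermediates fixed can be sketched in a small deterministic model of figure 3. This is only an illustration under stated assumptions: the linear equations, coefficients, and function names below are hypothetical, and the counterfactual comparison is implemented in the simplest way, by plugging the fixed values of [D.sub.2] and [D.sub.3] directly into the equation for E.

```python
# Hypothetical linear equations for figure 3; the coefficients are
# invented purely for illustration.

def d1(c: float) -> float:
    return 2.0 * c

def d2(c: float) -> float:
    return -1.0 * c

def d3(c: float) -> float:
    return 0.5 * c

def e(d1_value: float, d2_value: float, d3_value: float) -> float:
    return d1_value + 3.0 * d2_value + 4.0 * d3_value

def net_effect(c_actual: float, c_alt: float) -> float:
    """Vary C and let all three intermediates respond."""
    return (e(d1(c_actual), d2(c_actual), d3(c_actual))
            - e(d1(c_alt), d2(c_alt), d3(c_alt)))

def component_effect_via_d1(c_actual: float, c_alt: float) -> float:
    """Vary C while holding D2 and D3 fixed at their actual values,
    so that variation in C reaches E only through D1."""
    d2_fixed = d2(c_actual)
    d3_fixed = d3(c_actual)
    return (e(d1(c_actual), d2_fixed, d3_fixed)
            - e(d1(c_alt), d2_fixed, d3_fixed))

print(net_effect(1.0, 0.0))               # 1.0
print(component_effect_via_d1(1.0, 0.0))  # 2.0
```

With these coefficients the component effect through [D.sub.1] is larger than the net effect, because the route through [D.sub.2] partly offsets it.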

Collecting these ideas together, we get:
   Comp C has a component effect on E along a particular causal route iff E
   depends upon C along this route--iff E varies as C is varied while holding
   fixed other appropriate factors (including factors that are intermediate
   between C and E along other routes).


As with Net above, the expressions "depends," "varies," and "holding fixed" will be understood in different ways within different theoretical approaches. Note that a component effect of C on E is not a property of C and E alone, but rather of a specific causal route from C to E. If there is only one causal route from C to E, then the component effect of C on E along this route will be equal to the net effect of C on E. As with net effects, we may talk of different kinds of component effects: C may have a positive component effect along one route, a negative component effect along another, and so on.

3.3 The Illustrations Again

Let us apply these analyses to our two illustrations. In Hesslow's example, women who take birth control pills are less likely to suffer from thrombosis than women who do not (assuming they are fertile, sexually active, nonsmokers, and so on). According to Net, then, consumption of birth control pills has a net effect on thrombosis (among the relevant class of women); in particular, it is a net preventer of thrombosis. This is the sense (or at least one sense) in which it is appropriate to say that consumption of oral contraceptives prevents blood clots.

By contrast, if you hold fixed whether a woman becomes pregnant, consumption of birth control pills increases her chances of suffering from thrombosis. That is, a woman is more likely to suffer from thrombosis if she consumes birth control pills and becomes pregnant than if she does not consume birth control pills and becomes pregnant; and she is more likely to suffer from thrombosis if she consumes birth control pills and does not become pregnant than if she does not consume birth control pills and does not become pregnant. It is for this reason, according to Comp, that consumption of birth control pills has a positive component effect on thrombosis, along the route that does not include pregnancy. It is for this reason that we are justified in saying that the consumption of birth control pills causes thrombosis.
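The contrast between the two verdicts can be made concrete with a small numerical sketch. The probabilities below are invented purely for illustration; they are not drawn from Hesslow or from any empirical study. The sketch also assumes that all other relevant background factors are already held fixed, so that the conditional probabilities can be read causally. The names P_PREGNANT, P_THROMBOSIS, net_probability, and component_difference are hypothetical.

```python
# Hypothetical probabilities, chosen only to illustrate the structure
# of Hesslow's example. They are not empirical estimates.

# P(pregnant | pills?) -- pills strongly lower the chance of pregnancy.
P_PREGNANT = {True: 0.02, False: 0.50}

# P(thrombosis | pills?, pregnant?) -- keyed by (pills, pregnant).
P_THROMBOSIS = {
    (True, True): 0.12,
    (True, False): 0.03,
    (False, True): 0.10,
    (False, False): 0.01,
}


def net_probability(pills: bool) -> float:
    """P(thrombosis | pills), with pregnancy left free to vary (Net)."""
    p_preg = P_PREGNANT[pills]
    return (p_preg * P_THROMBOSIS[(pills, True)]
            + (1 - p_preg) * P_THROMBOSIS[(pills, False)])


def component_difference(pregnant: bool) -> float:
    """Change in P(thrombosis) due to pills, holding pregnancy fixed (Comp)."""
    return P_THROMBOSIS[(True, pregnant)] - P_THROMBOSIS[(False, pregnant)]


# Negative: pills lower the overall probability of thrombosis.
net_difference = net_probability(True) - net_probability(False)

# Positive in both cases: pills raise the probability once pregnancy is fixed.
component_if_pregnant = component_difference(True)
component_if_not_pregnant = component_difference(False)
```

With these made-up numbers, the net difference is negative while both component differences are positive, which is the pattern that Net and Comp jointly describe: one and the same factor is a net preventer and yet has a positive component effect along the route that excludes pregnancy.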

This analysis is very different from those of Eells (1991, 223-25) and Cartwright (1989, chap. 3). Both claim to account for the dual effect of birth control pills on thrombosis by showing how pills can have the two types of effect in different subpopulations. (The authors differ in important ways about how these subpopulations are to be characterized.) I claim, by contrast, that there is a sense in which consumption of oral contraceptives can both cause and prevent thrombosis in the same population. (9)

Notice that in figure 1, we did not include any intermediary factor along the causal route that excludes pregnancy. According to Comp, we can identify this causal route, and evaluate the nature of the component effect of birth control pills along this route, without needing to know what further factors mediate this effect. In order to determine this component effect, it suffices to hold fixed intermediate factors that lie along other causal routes from oral contraceptives to thrombosis. (10) On the other hand, if we wish to evaluate the component effect of birth control pills upon thrombosis along the causal route that includes pregnancy, we will need to hold fixed an intermediate factor that lies along the causal route that bypasses pregnancy (such intermediates are omitted from figure 1). One might think that it follows, from the facts that pill consumption prevents pregnancy and that pregnancy causes thrombosis, that oral contraceptives have a preventive effect on thrombosis along this route. This implication does not hold in general, as we shall see in section 6.4 below. Nonetheless, it is plausible enough in this example that oral contraceptives do indeed have a negative component effect along this route.

In Fogel's model, if you hold fixed the actual state of development of canals, roads, and other alternative modes of transport, eliminating the railroad would make a very large difference in the cost of transporting goods. It is for this reason, according to Comp, that the construction of the railroad had a very large component effect on transport costs, and that historians are justified in claiming that the railroad had a very large impact on the U.S. economy. On the other hand, if we do not hold fixed the level of development of alternative means of transport, then the counterfactual assumption that the railroad was never constructed would lead to a smaller difference in transportation costs. Thus the net effect of railroad construction on transportation costs is substantially less than the component effect.

4. Further Complications

In this section, I elaborate on this analysis of net and component effects by touching on a number of complications involved in this distinction.

4.1 A Generalization

The concepts of net effect and component effect along a causal route are really special cases of a more general concept. Let S = {[r.sub.1], ..., [r.sub.n]} be the set of all causal routes from C to E. The net effect of C on E may be thought of as the effect of C on E for the entire set S; the component effect of C on E along [r.sub.i] may be thought of as the effect of C on E for the singleton set {[r.sub.i]} [subset or equal to] S. The generalized concept is that of the effect of C on E for an arbitrary subset S' [subset or equal to] S of routes. This effect would be evaluated by holding fixed intermediates along the routes in S\S'. (11) For example, in figure 3, we could hold [D.sub.3] fixed to evaluate the effect of C on E for the two other routes combined. In examples where there are more than two causal routes from C to E, this generalized concept will often prove useful. In what follows, however, I will keep matters simple by discussing only examples involving two (or fewer) routes, and retain the dichotomy between net and component effects.
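The generalized concept can be sketched for the three-route structure of figure 3. The sketch below assumes, purely for illustration, that the relationships are linear: each intermediate is a multiple of C, and E is a weighted sum of the intermediates. Nothing in the text requires linearity, and the coefficients A and B, like the function names outcome and effect, are hypothetical.

```python
# Assumed linear model of figure 3: C -> D_i -> E for i = 1, 2, 3.
# All coefficients are invented for illustration.
A = {1: 2.0, 2: -1.0, 3: 0.5}   # D_i = A[i] * C
B = {1: 1.0, 2: 3.0, 3: 4.0}    # E = sum of B[i] * D_i


def outcome(c, held_fixed=None):
    """Value of E when C = c; intermediates listed in held_fixed keep that value."""
    held_fixed = held_fixed or {}
    d = {i: held_fixed.get(i, A[i] * c) for i in A}
    return sum(B[i] * d[i] for i in A)


def effect(routes, c0=0.0, c1=1.0):
    """Effect of changing C from c0 to c1 for the subset `routes` of S.

    Intermediates on routes outside the subset (S minus S') are held
    fixed at the values they take when C = c0.
    """
    off = {i: A[i] * c0 for i in A if i not in routes}
    return outcome(c1, off) - outcome(c0, off)


net = effect({1, 2, 3})         # entire set S: nothing held fixed
along_route_1 = effect({1})     # singleton: D_2 and D_3 held fixed
routes_1_and_2 = effect({1, 2}) # D_3 held fixed, as in the example above
```

In this linear sketch the effect for a subset of routes is just the sum of the effects along its member routes. That additivity is an artifact of the linearity assumption; where the factors interact, the result can depend on the values at which the other intermediates are held fixed.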

4.2 Reduction

Some readers may have a worry about the recipe for analyzing component effects described in section 3.2. In order to evaluate the component effect of C on E along some particular route, we must hold fixed causal intermediates along other routes. But "causal intermediate" and "causal route" are themselves causal notions. Does this undermine the project of giving a reductive analysis of component effect?

Defenders of probabilistic theories of causation such as Cartwright (1979) and Eells (1991) should not have this worry: their accounts of net effect already appeal to causal notions. This prevents them from providing reductive analyses along the lines of CF, but not from providing illuminating theories about the relationship between causation and probabilities. In the same spirit, we could take the concepts of "causal intermediate," "causal route," and even "component effect along a causal route" as primitives, and treat Comp as an interesting constraint on the relationship between these entities and probabilities. Indeed, as I will argue in section 6.5 below, the distinction between net and component effects helps us to become clearer about just what needs to be taken as primitive within a probabilistic theory of causation.

By contrast, Lewis and his followers have maintained that it is possible to provide a reductive counterfactual analysis of causation. Whether this is possible hinges upon the success of Lewis's analysis of "non-backtracking" counterfactuals, presented in Lewis 1979. This account is sketchy, and I am not sanguine about its prospects for success. For the sake of argument, however, let us grant that it is possible to give a reductive analysis of counterfactuals. Then it will be possible to give a reductive analysis of component effects in terms of counterfactuals, so long as it is possible to specify what further factors need to be held fixed in purely counterfactual terms. This project is pursued in some detail in Hitchcock 2001; a brief summary is provided in section 7.2 below.

4.3 Interactions

If C has an effect on E, it may nonetheless be difficult to characterize the nature of this effect in any simple manner. One reason for this is the existence of causal interactions. Consider Hesslow's example: suppose we want to know whether the consumption of birth control pills has a net positive effect on thrombosis. According to CF, we must hold fixed other relevant factors. Research has shown that among these relevant factors are the age and smoking status of the woman who is considering oral contraception. Assume (as is likely) that consumption of oral contraceptives has a negative net effect on thrombosis among nonsmokers under the age of 35, while having a positive net effect for women over 35 who smoke. Oral contraceptives interact with these other causes of thrombosis. Then, at the type-level, we cannot say unambiguously that oral contraceptives have a positive or negative net effect on thrombosis. We must stick with the less informative claim that birth control pills have a net effect on thrombosis, or else we must restrict our claims about type-level net effects to specific subpopulations.

A related problem arises for type-level component effects. In order to evaluate the type-level component effect of pill consumption on thrombosis, through the route that excludes pregnancy, we must hold fixed whether a woman becomes pregnant (in addition to all of the further factors discussed above). There are two ways in which we can do this: we can hold fixed that pregnancy occurs, or that it does not occur. As the example was presented, consumption of oral contraceptives has the same component effect in both of these cases, at least qualitatively: consumption of oral contraceptives increased the probability of thrombosis in each case. This need not have been the case: it could have been that birth control pills lower the probability of thrombosis among women who fail to become pregnant, but increase the probability of thrombosis among those who do become pregnant. In this case, pill consumption would have a component effect upon thrombosis, but we could not easily classify the component effect as positive or negative. Even in the case as presented, the size of the effect may be different in the two cases: it may be that pill consumption dramatically increases the probability of thrombosis among women who do not become pregnant, but makes only a slight difference for women who do become pregnant. In such a case, no blanket claim could be made about the size of the component effect.

These problems do not arise at the token level. If we wish to evaluate whether Carla's pill consumption had (say) a positive component effect upon thrombosis, we need only to hold the appropriate factors fixed at their actual values. If Carla in fact avoided pregnancy, it only matters how the pills affect her probability of thrombosis when we hold fixed that she did not become pregnant. It does not matter whether the pills would have had a different effect had she become pregnant.

5. Causal Discovery

A causal relationship between C and E can be uncovered in two distinct ways. First, there might be a statistical association between C and E. In particular, an association that is present when C is assigned randomly (as is done in clinical trials), or when confounding factors are controlled for, can provide strong evidence for a causal relationship between C and E. Second, the discovery that C has a certain mechanism of operation might persuade us that it causes E. It often happens that one of these types of evidence is available in the absence of the other. Aspirin relieves headaches, and lithium relieves symptoms of manic depression. Both of these causal relationships were established decades ago on the basis of clinical trials, even though the mechanisms whereby these drugs achieve their effects are only now becoming known. By contrast, consider the effect of moderate alcohol consumption on cardiovascular disease; here we have knowledge of some of the mechanisms whereby alcohol consumption has an effect on heart disease, but we do not have reliable statistical evidence for a causal relationship. It is known, for example, that moderate alcohol consumption increases levels of HDL cholesterol, which lowers the risk of cardiovascular disease. On the other hand, inferences drawn from epidemiological data on alcohol consumption and cardiovascular disease are highly suspect due to the difficulty of controlling for other relevant factors such as diet and socioeconomic status.

These two modes of causal discovery tend to align with the two different types of effect. More specifically, a statistical association between two factors, by itself, tends only to support conclusions about the net effect of one upon the other, while mechanisms of operation by themselves tend only to support conclusions about component effects. Let us examine the former claim first. It does not deny that careful statistical analysis can reveal the presence of component effects. For example, a clinical study of oral contraceptives that controls for whether pregnancy occurs might reveal that birth control pills have a positive component effect on thrombosis. But the standard protocol for clinical trials, in which subjects are randomly assigned to treatment and control groups, would not result in pregnancy being controlled for: we would expect pregnancy to occur at a much higher rate within the control group. In order to take the extra step of controlling for pregnancy, we would have to at least suspect that oral contraceptives have an effect on thrombosis via pregnancy. In the absence of any information about the mechanisms whereby C might affect E, we would not know which factors to control for in order to discover the distinct component effects of C on E. For example, absent any knowledge of the mechanisms whereby lithium affects manic depression, we would be unable to hold fixed the relevant causal intermediates in order to evaluate the various component effects of lithium on bipolar disorder. This point is related to the worry about reduction raised in section 4.2 above.

In the case of alcohol consumption and heart disease, we do have knowledge about some of the mechanisms whereby the former might affect the latter. Moderate alcohol consumption raises levels of HDL cholesterol, which protects against cardiovascular disease. If these effects are fairly robust--in particular, if they continue to hold in the presence of other factors that are effects of alcohol consumption--then we have some reason to think that moderate alcohol consumption has a negative component effect on heart disease along this route. That is, if we were to hold fixed all the other intermediate effects of alcohol, we would expect that alcohol consumption would raise HDL cholesterol levels, which would in turn reduce the risk of heart disease. Unfortunately, we also know that moderate alcohol consumption can increase homocysteine levels, which increases the risk of cardiovascular disease. There are no doubt many other routes whereby alcohol may have an effect upon heart disease: some forms of alcohol (especially beer) are rich in vitamin B6, which not only reduces homocysteine levels (so alcohol consumption affects homocysteine levels via two different routes) but also appears to have an independent negative effect on heart disease; alcohol is high in calories, so consumption of alcohol may lead to obesity, or may displace other elements of one's diet; alcohol consumption has psychological and social effects that may affect the risk of cardiovascular disease; and so on. For those of us who wish to know whether we should feel guilty or virtuous when enjoying a glass of wine with dinner, all of this information is bewildering: for we still do not know what the net effect of alcohol on heart disease is.

The discussion so far has been concerned with the evaluation of type-level causal claims. Token-level claims are typically evaluated after the fact; they are, as it were, the subject of history. In historical analysis, it is often easier to assess claims about component effects than claims about net effects. This point is nicely illustrated by Fogel's model of the effect of railroad construction on transport costs. In order to determine the component effect of the railroad on transport costs (through the route that excludes improvements in alternative modes of transport), we must hold fixed the actual state of the nation's network of roads and canals, and ask about the cost of transport if the railroad were not in existence. While it is no simple task, these transport costs can be estimated using information about those goods that were transported by more traditional means and information about transport costs prior to the construction of the railroads. In order to determine the net effect of the railroad on trade, however, we would need to know the amount of investment that would have gone into roads and canals had the railroad been absent, the extent of the expanded transport system, the nature of improvements on the internal combustion engine, and the impact of all of these on transport costs. It is little wonder, then, that Fogel ventured only a qualitative judgment, and that more traditional historians opposed this study of history "as it was not."

6. Problem Solving with Net and Component Effects

In this section, I will argue that the distinction between net and component effects allows us to solve a number of outstanding problems in the theory of causation. Some of these problems arise within specific theories of causation; when appropriate I will adopt the terminology and general framework of the theory in question.

6.1 Avoiding Spurious Counterexamples

Theories of causation are often evaluated by comparing their verdicts with those of intuition in a variety of test cases. A good theory of causation ought to render the verdict that C causes E just in case our intuition judges this claim to be true. By now it should be clear that there is a flaw in this methodology. The claim that C causes E is ambiguous: it might mean that C has a (positive) net effect on E, or it might mean that C has a (positive) component effect on E. A perfectly good theory of net effect will appear to fail if we unconsciously disambiguate the causal claim as a claim about a component effect (and vice versa). For instance, recall that Hesslow's example was originally intended as a counterexample to probabilistic theories of causation. As traditionally formulated, these theories have been suitable as theories of net effect. Hesslow presented his example in such a way as to render salient a judgment about a particular component effect. It is hardly surprising, then, that Hesslow's example has the appearance of a counterexample. But no theory of causation can be expected to pass this kind of test. In cases like Hesslow's where a net preventer has a positive component effect along some salient causal route, it will always be possible to make theory and intuition come apart if our theory captures one causal notion while intuition latches on to the other. In evaluating a theory of causation we should not expect our theory to yield one univocal answer that agrees with a univocal answer from intuition. Rather, we should expect our theory to be able to capture the diverse causal relations that are present in the case.

6.2 Preemption

In this section, I will discuss the problem of causal preemption. Lewis (2000) distinguishes between three different types of preemption, which he calls "early cutting," "late cutting," and "trumping"; I will treat only the first of these. Suppose that Assassin fires at Victim, who dies. Had Assassin not fired, Backup certainly would have, and Victim would have died anyway. This is depicted schematically in figure 4. This example would cause problems for a theory that identified causation with counterfactual dependence. Assassin's shot caused Victim to die, but Victim's death does not depend counterfactually upon Assassin's shot: if Assassin had not shot, Victim still would have died.

[FIGURE 4 OMITTED]

Lewis's solution to this problem is to find an event that is intermediate between Assassin's shot and Victim's death, and to show that there is a chain of counterfactual dependence from the former to the latter. Consider, for example, the presence of the bullet on its way to Victim with a certain momentum; call this event i (for "intermediate"). i depends counterfactually upon Assassin's shot; if Assassin had not shot, i would not have occurred. Moreover, Victim's death depends counterfactually upon i. This part is a little bit trickier. If i had not occurred, Assassin still would have shot: the counterfactual does not "backtrack" from i. So, if i had not occurred, Assassin would still have shot, and Backup would not have; since there would then have been no bullet speeding toward Victim, she would not have died. Since Lewis identifies token causation with the ancestral of counterfactual dependence, rather than with counterfactual dependence per se, this suffices to show that Assassin's shot caused Victim's death. Lewis's identification brings on its own problems, however, as we shall see in section 6.4 below.

I propose to analyze this case (and similar cases) in a different way. Assassin's shot has a positive component effect on Victim's death, and it is for this reason that we judge it to be a cause of death. This positive component effect can be seen by holding fixed Backup's failure to shoot: given that Backup did not shoot, if Assassin had not shot, Victim would not have died. This counterfactual is very intuitive--much more so than the one that reveals the chain of counterfactual dependence from Assassin's shot to Victim's death in Lewis's treatment.
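The two counterfactuals at issue can be checked mechanically in a toy deterministic model of the case. The model is an assumption made for illustration: it supposes that Backup shoots exactly when Assassin does not, and that Victim dies exactly when at least one of them shoots. The function names are hypothetical.

```python
def backup_shoots(assassin_shoots: bool) -> bool:
    """Assumed: Backup fires exactly when Assassin does not."""
    return not assassin_shoots


def victim_dies(assassin_shoots: bool, backup_shoots_value: bool) -> bool:
    """Assumed: Victim dies if at least one shot is fired."""
    return assassin_shoots or backup_shoots_value


def net_dependence() -> bool:
    """Does the death depend on Assassin's shot when Backup is free to vary?"""
    actual = victim_dies(True, backup_shoots(True))
    counterfactual = victim_dies(False, backup_shoots(False))
    return actual != counterfactual


def component_dependence() -> bool:
    """Does the death depend on Assassin's shot, holding fixed Backup's
    actual behavior (Backup did not shoot)?"""
    backup_actual = backup_shoots(True)  # Backup's actual failure to shoot
    actual = victim_dies(True, backup_actual)
    counterfactual = victim_dies(False, backup_actual)
    return actual != counterfactual
```

In this sketch net_dependence() comes out false and component_dependence() comes out true: Victim's death does not depend on Assassin's shot when Backup is left free to respond, but it does depend on the shot once Backup's failure to shoot is held fixed.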

6.3 Pragmatics

This analysis of preemption is subject to an important objection. If Comp and Net are correct, then Assassin's shot has a component effect on Victim's death, but no net effect. Is the latter verdict correct? In particular, if Assassin's shot has a component effect on Victim's death, but no net effect, shouldn't we feel some tension in the claim that Assassin's shot caused Victim's death? In Hesslow's example, where birth control pills have a positive component effect on thrombosis but a negative net effect, we feel genuinely torn between saying that oral contraceptives cause thrombosis and saying that they prevent thrombosis. Indeed, this tension was taken to be symptomatic of conflicting net and component effects. Shouldn't the absence of such a tension in the case of the two assassins give us pause?

I do not think Net is incorrect in ruling that Assassin's shot has no net effect on Victim's death. Since "net effect" is not a part of ordinary English, it is difficult to marshal intuitive support for Net. We would, however, assent to a claim such as the following: "Assassin's choice to shoot Victim himself made no difference to the final outcome." The relevant notion of "making a difference" certainly seems to be a causal one, so I see no reason not to read "made no difference" as "had no net effect."

Nonetheless, a problem remains. The case of the two assassins does present a counterexample to the following thesis: whenever C has (lacks) a positive (negative) net effect on E, or has (lacks) a positive (negative) component effect on E along some route, then we will feel some intuitive pressure to accept (deny) the claim that C causes (prevents) E. If my account is to make sense of our causal intuitions, it will need to be supplemented with a story about why particular net or component effects get a grip on our intuition while others do not. This is essentially a problem in the pragmatics of causal discourse. We make and accept unqualified causal claims--claims that do not specify whether net or component effects are being discussed. In some cases, such as Hesslow's, these claims seem to be genuinely ambiguous. In other cases we are able to disambiguate the claim. When we hear that Assassin's shot caused Victim's death, we somehow interpret this as a claim about the component effect of Assassin's shot. Why do we do this?

I offer three suggestions. First, according to Net and Comp, we evaluate causal claims by hypothetically varying some factors while others are held fixed. Whether we are evaluating a claim about a net effect or about a particular component effect will depend upon what we hold fixed. I suggest that when we hear a causal claim, we tend to interpret it in such a way that it is psychologically easy to assess. In particular, we will interpret it in such a way that it is easy to hold fixed the appropriate factors for evaluating the claim. In the case of the two assassins, it is easy to hold fixed Backup's failure to shoot. It is easy to see what would have happened if Assassin had not shot, while Backup refrained from shooting. In order to do this, we do not need to know exactly how Assassin's shot is related to Backup's action. It does not matter whether Backup really would have fired had Assassin refrained, whether Backup was paying attention, whether Backup had a clear line of sight, and so on. We construe the claim "Assassin's shot caused Victim to die" as a claim about component effects, because it is so easy for us to see that this claim is true. More generally, when evaluating a token causal claim that is made after a particular sequence of events has occurred, it is natural to hold fixed salient events that actually occurred. Thus, our token causal claims are often naturally construed as claims about component effects. This is related to the point, made in section 5 above, that claims about component effects typically yield to historical analysis more easily than do claims about net effects.

My second suggestion is connected to the first. Certain causal routes are highly "portable," while others are not. Causal routes fail to be portable when they result from idiosyncratic features of the example at hand. When we are told that C causes E we tend to interpret this as a claim about the effect of C on E along the portable causal routes. If one particular route is highly portable, then we will interpret the claim as asserting that C has a component effect on E along that route. If all of the routes are equally portable, then we will interpret the claim as asserting that C has a net effect on E. In the case of the two assassins, the presence of a backup assassin with perfect aim, ready to shoot, is idiosyncratic. In most situations when assassins shoot at victims, or indeed when guns are fired in general, there is no backup shooter waiting in the wings. Thus, when asked whether Assassin's shot caused Victim to die, we tend to disregard Backup and evaluate the component effect of Assassin's shot along the remaining route. It is only natural that when asking about the effect of Assassin's shot, we should be interested in component effects of the sort that are most likely to be present in other cases in which guns are fired. In Hesslow's example, by contrast, it is to be expected that women who take birth control pills thereby affect their chances of becoming pregnant (for that, of course, is the very point of using them). The presence of multiple routes from pill consumption to thrombosis is not a contrived feature of a particular example, but is to be expected on most occasions when oral contraceptives are used. Thus, it is much more natural, in Hesslow's example, to be thinking in terms of the net effect of birth control pills along both causal routes.

Finally, when E is some specific event that has already occurred, we are more inclined to accept the token causal claim "C caused E" if E is unexpected, and hence in need of causal explanation. People normally survive their strolls in the park; hence when Victim does not, we are naturally receptive to information about causes of her demise. Likewise, if Carla develops thrombosis, we are more inclined to accept the claim that her consumption of birth control pills caused her thrombosis than that it did not. By contrast, if Carla avoids thrombosis, we may not find this occurrence to be in need of explanation. Hence, we will be more ambivalent about saying that her contraceptive use caused her good health.

Let me reiterate that these suggestions, rough as they are, concern the pragmatics of causal discourse, and not the metaphysics of causation. The question is not when causation simpliciter is to be identified with having a net effect, and when with having a component effect along a causal route. That question has a false presupposition: there is no such thing as causation simpliciter.

6.4 Transitivity

The treatment of (early cutting) preemption in terms of component effects has a decided advantage over Lewis's treatment: the former does not commit us to the transitivity of causation in general. Consider, for example, the following prima facie counterexample to transitivity, due to Michael McDermott (1995). Terrorist, who is right-handed, suffers a serious dog bite on his right hand. As a result of this, he presses a detonator button with his left hand, causing a bomb to explode. In this example, the dog bite caused Terrorist to push the button with his left hand: if the dog had not bitten him, he would have pushed it with his right hand instead. Terrorist's pushing the button with his left hand caused the bomb to explode: if he had not pushed it with his left hand, he would not have pushed it at all (his right hand being incapacitated) so the bomb would not have gone off. Intuitively, however, the dog bite was not a cause of the explosion. (12) Because of this and related counterexamples, many philosophers have challenged the thesis that causation is transitive.

Let us see what Net and Comp say about this example. According to Net, the dog bite has a net effect on Terrorist's push, and Terrorist's push has a net effect on the explosion. But the dog bite has no net effect on the explosion: the explosion would have occurred just as it did if the dog had not bitten Terrorist's right hand. This shows that net effect is not transitive in general. When Lewis identifies causation with the transitive closure of counterfactual dependence, I think that it is plausible to interpret this as an attempt to capture the notion of component effect. I will argue, however, that component effects are not transitive either. In McDermott's example, the dog bite has a component effect upon Terrorist's push, which has a component effect upon the explosion. (We assume that there is only one causal route connecting each of these pairs, so that the net and component effects coincide.) It follows that there is a causal route from the dog bite to the explosion. But it does not follow that the dog bite has a component effect along this route. In order for the dog bite to have such a component effect, the explosion must counterfactually depend upon the dog bite while holding fixed factors that lie along other routes.

What are these factors that must be held fixed? Here, an analogy with our earlier preemption example will be helpful. If Assassin had not shot, Backup would have and Victim would still have died; if the dog bite had not occurred, Terrorist would have pushed the button with his right hand and the bomb would still have exploded. The dog bite preempted Terrorist's right-handed push, just as Assassin's shot preempted Backup's shot. Backup's failure to shoot was a causal intermediary along a secondary route from Assassin's shot to Victim's death, and it is this event that we must hold fixed when evaluating the component effect of Assassin's shot upon Victim's death. This suggests that in McDermott's example, we should hold fixed Terrorist's failure to push the detonator with his right hand. But when we hold this event fixed, the explosion still does not depend on the dog bite. That is, if the dog bite had not occurred, and Terrorist (nonetheless) did not push the button with his right hand, then he would have pushed the button with his left hand, (13) and the explosion would have occurred as before. According to Comp, then, the dog bite does not have a component effect upon the explosion. Despite their superficial similarities, there is an important difference in the structure of the two cases.
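The structural difference between the two cases can be seen in a toy deterministic model of McDermott's example. The equations are one possible reading of the story, assumed here for illustration: Terrorist pushes with his right hand exactly when he has not been bitten, and pushes with his left hand if he has been bitten or if (for whatever reason) he does not push with his right. The function run and its parameters are hypothetical.

```python
from typing import Optional


def run(dog_bite: bool,
        fix_right: Optional[bool] = None,
        fix_left: Optional[bool] = None) -> dict:
    """Evaluate the assumed model; fix_* arguments hold a variable fixed."""
    # Assumed: right-handed push occurs exactly when there is no bite.
    right = (not dog_bite) if fix_right is None else fix_right
    # Assumed: left-handed push occurs if bitten or if the right hand is not used.
    left = (dog_bite or not right) if fix_left is None else fix_left
    return {"right": right, "left": left, "explosion": left or right}


actual = run(True)  # bite occurs: no right push, left push, explosion

# The bite makes a difference to the left-handed push.
bite_affects_left = run(True)["left"] != run(False)["left"]

# The left-handed push makes a difference to the explosion
# (the right hand stays out of action, since the bite is not undone).
left_affects_explosion = (run(True, fix_left=True)["explosion"]
                          != run(True, fix_left=False)["explosion"])

# Net: the explosion does not depend on the bite.
bite_net = run(True)["explosion"] != run(False)["explosion"]

# Component: hold fixed the actual absence of a right-handed push.
bite_component = (run(True, fix_right=actual["right"])["explosion"]
                  != run(False, fix_right=actual["right"])["explosion"])
```

Under these assumptions the first two checks come out true and the last two come out false. The bite makes a difference to the push and the push makes a difference to the explosion, yet the explosion depends on the bite neither outright nor with the right-handed push held fixed. In the assassin sketch, by contrast, holding the secondary route fixed restores the dependence.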

This shows that component effects are not transitive in general. The dog bite has a component effect on the button push, and the button push has a component effect upon the explosion, but the dog bite does not have a component effect upon the explosion. In this regard, my concept of component effect is different from Hall's concept of "production." This example also shows that one event (C) may fail to have a net effect on another (E) for a variety of reasons. First, C and E may have nothing to do with one another; they may not be connected by any causal routes. Second, C may have component effects on E along two or more causal routes, where these component effects cancel each other. This is what happens in the case of the backup assassin. Third, there may be one or more causal routes between C and E, but C may fail to have a component effect on E along any of them. This is what happens in McDermott's example. The reason that the dog bite has no component effect on the explosion is that the component effects along the segments of the causal route do not align properly. Varying whether the dog bite occurs leads to differences in Terrorist's push, and varying whether the push occurs affects the occurrence of the explosion, but the variations in the push that are produced by variations in the dog bite are not the same variations as the ones that lead to differences in the explosion.

The distinction between net and component effects shows us how there can be causation without counterfactual dependence in genuine cases of preemption: in such cases, one event can still have a component effect upon another. This treatment of preemption does not commit us to the transitivity of causation, leaving us free to accept McDermott's example and its cousins as genuine counterexamples to the transitivity of causation. Note, however, that my account is compatible with the claim that the relation of being connected by a causal route is transitive. In particular, there is a causal route from the dog bite to the explosion. Perhaps this is what philosophers really have in mind when they advocate the transitivity of causation.

6.5 What Must be Held Fixed?

According to probabilistic theories of causation, C is a (net) cause of E if and only if C raises the probability of E while other appropriate factors are held fixed. Exactly which factors need to be held fixed is a complex matter: Eells 1991 (chaps. 2-4) is by far the most detailed discussion. Eells (following Cartwright) characterizes the factors that need to be held fixed in causal terms. In particular, we want to hold fixed factors that are themselves causally relevant to E, for which C is not causally relevant. This means that such probabilistic theories of causation cannot provide reductive analyses of causation in terms of probabilities, but the hope is that they will impose nontrivial probabilistic constraints upon causal relations.

Now that we have a distinction between net and component effects, it is worth asking whether the causal relations that are presupposed by probabilistic theories of causation are the same as those that are explicated by those theories. In this section, I wish to argue that they are not. Whereas probabilistic theories of causation (as traditionally formulated) are best understood as theories of net effects, the factors that need to be held fixed must be specified in terms of component effects. There is a sense, then, in which probabilistic theories of causation do provide reductive analyses of net effects: they provide analyses in terms of probabilities and component effects.

Consider the causal structure represented in figure 5. A has a positive component and net effect on C (since there is only one causal route, the two are equivalent). By contrast, A has two component effects on E, one positive, the other negative. Suppose that these two component effects cancel, so that A has no net effect on E. In this structure, C has no net effect on E. Is this the verdict of probabilistic theories of causation? Is there no probabilistic correlation between C and E when the relevant factors are held fixed? If we are required to hold fixed factors that have net effects on E (that are not in turn affected by C), then we are required to hold B fixed, but not A. This will result in a spurious correlation between C and E. It is apparent, then, that we want to hold A fixed in this scenario. This can be accomplished by requiring that we hold fixed factors that have a component effect on E along some route, such that C has no component effect on them. (14)

[FIGURE 5 OMITTED]
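A small numerical illustration may help. Since the figure is not reproduced here, the structure below is an assumption: A influences C, A influences B, and E depends on A and B but not on C. The probabilities are invented, chosen only so that A's two component effects on E cancel exactly; `prob_e` and the other names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Assumed structure: A -> C, A -> B, A -> E, B -> E. C has no arrow into E.
P_A = F(1, 2)
P_B_GIVEN_A = {0: F(1, 10), 1: F(9, 10)}
P_C_GIVEN_A = {0: F(1, 5), 1: F(4, 5)}
# A raises the probability of E directly; B (which A promotes) lowers it.
P_E_GIVEN_AB = {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(9, 10), (1, 1): F(2, 5)}

NAMES = ("a", "b", "c", "e")


def bern(p, value):
    """Probability that a binary variable with P(1) = p takes the given value."""
    return p if value == 1 else 1 - p


# Joint distribution over (a, b, c, e) implied by the assumed structure.
JOINT = {
    (a, b, c, e): bern(P_A, a)
    * bern(P_B_GIVEN_A[a], b)
    * bern(P_C_GIVEN_A[a], c)
    * bern(P_E_GIVEN_AB[(a, b)], e)
    for a, b, c, e in product((0, 1), repeat=4)
}


def prob_e(**held_fixed):
    """P(E = 1) conditional on the listed variables taking the listed values."""
    def matches(world):
        return all(world[NAMES.index(n)] == v for n, v in held_fixed.items())

    total = sum(p for world, p in JOINT.items() if matches(world))
    with_e = sum(p for world, p in JOINT.items() if matches(world) and world[3] == 1)
    return with_e / total


# A's two component effects cancel: no net effect of A on E.
print(prob_e(a=1), prob_e(a=0))
# Holding only B fixed, C and E are correlated although C does nothing to E.
print(prob_e(b=0, c=1), prob_e(b=0, c=0))
# Holding A fixed as well removes the correlation.
print(prob_e(a=1, b=0, c=1), prob_e(a=1, b=0, c=0))
```

With these numbers, C and E are uncorrelated when nothing is held fixed, become correlated once B alone is held fixed, and are uncorrelated again when A is held fixed too.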

Thus, the distinction between net and component effects enables us to provide a much clearer account of the factors that are to be held fixed in a probabilistic theory of causation: in particular, we see that these factors must be specified in terms of component effects. Moreover, the distinction also helps us to appreciate that the rigid distinction between reductive and circular analyses of causation is too crude. In probabilistic theories of causation, component effect is the primitive concept, and net effect the defined concept. The analysis is not fully reductive, since net causes are not defined in purely acausal terms. But neither is it viciously circular, since it does allow us to define new causal concepts that are not presupposed by the theory.

7. Causal Graphs and Component Effects

It has become common in the recent causal modeling literature to represent systems of causal relations using directed graphs (see, for example, Spirtes, Glymour, and Scheines 1993 and Pearl 2000). Let V = {X, Y, Z, ...} be a set of variables. A directed graph G on V is a set of ordered pairs or "arrows" from members of V to other members of V. If there is an arrow from the variable X to Y, then X is a direct cause or causal parent of Y. I have been employing this mode of representation in the figures above. For example, figure 1 (Hesslow's example) is a directed graph over the variables {consumption of birth control pills, pregnancy, thrombosis}; consumption of birth control pills is a direct cause of both pregnancy and thrombosis, and pregnancy is a direct cause of thrombosis. A chain of arrows pointing in the same direction is a directed path; thus, in figure 1 there is a directed path from pill consumption to pregnancy to thrombosis. If there is a directed path from X to Y, then X is an ancestor of Y, and Y a descendant of X. Directed graphs provide powerful heuristic tools, facilitating many types of causal inference.
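These graph-theoretic definitions are simple enough to write out directly. The following is a minimal sketch, using the arrows of figure 1 as described above; the variable labels and the function names are illustrative choices, not part of any particular causal modeling package.

```python
# A directed graph as a set of ordered pairs ("arrows"), following figure 1.
ARROWS = {
    ("pills", "pregnancy"),
    ("pills", "thrombosis"),
    ("pregnancy", "thrombosis"),
}


def parents(arrows, node):
    """Direct causes (causal parents) of node: sources of arrows into it."""
    return {source for source, target in arrows if target == node}


def descendants(arrows, node):
    """All variables reachable from node along a directed path."""
    found = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for source, target in arrows:
            if source == current and target not in found:
                found.add(target)
                frontier.append(target)
    return found


def ancestors(arrows, node):
    """All variables from which node is reachable along a directed path."""
    nodes = {n for arrow in arrows for n in arrow}
    return {n for n in nodes if node in descendants(arrows, n)}


print(sorted(parents(ARROWS, "thrombosis")))
print(sorted(descendants(ARROWS, "pills")))
print(sorted(ancestors(ARROWS, "thrombosis")))
```

Note that the graph records only which variables are direct causes of which; it says nothing about whether an arrow represents a positive or a negative influence.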

7.1 Direct Causes and Component Effects

While the use of directed graphs to represent causal relationships has attracted the attention of a number of philosophers of science, it has received surprisingly little attention from metaphysicians and other philosophers interested in causation. One reason for this neglect is that the notions "direct cause" and "directed path" do not correspond to any of the concepts that have been defined within the more traditional philosophical approaches to causation. As we noted in section 3 above, most philosophical theories of causation are apt only for the concept of net causation. Armed only with the concept of net causation, it is impossible to interpret the concepts of direct causation and directed path. Rather, these graph-theoretic notions must be understood in terms of component effects along causal routes. Once we have these concepts in play, it is easy to understand just what it is that the graphs represent. The first part of the translation is easy: a directed path from one variable to another in a causal graph represents what I have been calling a causal route, and the variables that lie along the directed path represent intermediates along the causal route.

A directed graph G is related to the probability distributions over the variable set V via the Markov Condition (see for example Spirtes, Glymour, and Scheines 1993, 53-54):

MC If PA(X) is the set of all the parents of X, and Y is not a descendant of X, then:

P(X|Y & PA(X)) = P(X|PA(X)). (15)

That is, conditional on the values of variables that are direct causes of X, the value of Y does not affect the probability that X takes on any value unless Y is a descendant of X. A graph that does not satisfy the Markov condition does not adequately represent the causal relations among the variables in V. Consider figure 6, and suppose that P(Z|X & Y) ≠ P(Z|Y). Then the graph will violate the Markov condition unless X is a direct cause of Z. That is, X will be a direct cause of Z if Z depends upon X while holding fixed the value of Y (for at least one value of Y). (16) But this is precisely the condition that must hold in order for X to have a component effect on Z along a route that bypasses Y. Thus, if X is a direct cause of Z, X will have a component effect on Z. More precisely, X is a direct cause of Z relative to the variable set V just in case X has a component effect on Z through a route that does not include any other variable in V.

[FIGURE 6 OMITTED]
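The probabilistic test just described can be sketched for binary variables. The figure is not reproduced here, so the sketch assumes a graph in which X is a parent of Y, Y is a parent of Z, and X may or may not also be a parent of Z; all numbers and function names are invented for illustration.

```python
from fractions import Fraction as F
from itertools import product


def joint(p_x, p_y_given_x, p_z_given_xy):
    """Joint distribution over binary (x, y, z) built from conditional tables."""
    def bern(p, value):
        return p if value == 1 else 1 - p

    return {
        (x, y, z): bern(p_x, x)
        * bern(p_y_given_x[x], y)
        * bern(p_z_given_xy[(x, y)], z)
        for x, y, z in product((0, 1), repeat=3)
    }


def prob_z(table, x, y):
    """P(Z = 1 | X = x & Y = y)."""
    return table[(x, y, 1)] / (table[(x, y, 0)] + table[(x, y, 1)])


def z_depends_on_x_given_y(table):
    """True if, for at least one value of Y, P(Z | X & Y) differs across X."""
    return any(prob_z(table, 0, y) != prob_z(table, 1, y) for y in (0, 1))


P_Y_GIVEN_X = {0: F(1, 4), 1: F(3, 4)}

# Chain only: Z's probability is fixed by Y alone.
chain = joint(F(1, 2), P_Y_GIVEN_X,
              {(0, 0): F(1, 5), (1, 0): F(1, 5), (0, 1): F(7, 10), (1, 1): F(7, 10)})

# Chain plus a route from X to Z that bypasses Y.
bypass = joint(F(1, 2), P_Y_GIVEN_X,
               {(0, 0): F(1, 5), (1, 0): F(2, 5), (0, 1): F(7, 10), (1, 1): F(7, 10)})

print(z_depends_on_x_given_y(chain))
print(z_depends_on_x_given_y(bypass))
```

In the second distribution the dependence shows up only when Y = 0, which is enough: the condition requires dependence for at least one value of Y.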

Note that the relation "X is a direct cause of Z" cannot simply be identified with "X has a component effect on Z along some causal route": X can have a component effect on Z along a causal route without being a direct cause of Z. Suppose we move to a new variable set V' that includes a variable W mediating the route from X to Z--see figure 7. Then X will no longer be a direct cause of Z; but X will continue to have a component effect on Z along the route X-W-Z. It will still be the case that the value of Z depends upon the value of X, given the value of Y. In this sense, the notion of component effect along a route is less sensitive to the choice of a variable set V than is the notion of a direct cause. (17)

[FIGURE 7 OMITTED]

Note that while a directed path from X to Z indicates the presence of a causal route, it does not, by itself, indicate the presence of a component effect along this route. In figure 7, there is a directed path from X to W to Z, but X need not have a component effect along this route. The pattern of dependence may exhibit the sort of misalignment that we saw in McDermott's dog-bite example, in which case Z will not depend upon X while Y is being held fixed. In general, one cannot read off the presence of either net or component effects from graphical structure alone, (18) but only from the graphical structure in conjunction with relevant probabilities.

7.2 Reduction Again

In the preceding section, I showed how the concept of component effect along a causal route helps us to understand what is being represented in a causal graph. In this section, I will use the apparatus of causal graphs to fulfill a promissory note issued in section 4.2 above. Let us suppose, with Lewis and his followers, that non-backtracking counterfactuals can be reductively analyzed in non-causal terms. Suppose, moreover, that we are interested in developing a reductive analysis, not only of net effects, but also of component effects along a causal route. We can formulate an appropriate version of Comp in terms of counterfactuals, but this will not provide us with a reductive analysis of component effects unless we have in hand a reductive analysis of the notion of "causal route" and "intermediate factor along a causal route." I will sketch a strategy for doing this. Since I do not wish to endorse the reductive project, I will not develop this proposal in great detail. (19) But there are philosophers who believe that the only interesting analysis of causation is a reductive analysis of causation, and I hope to say enough to persuade these philosophers that they should not give up on Comp.

As stated above, the Markov condition expresses a relationship between graphical structure and patterns of probabilistic dependence. But it is possible to formulate an analog of the Markov condition in terms of counterfactual dependence. Given a set of variables V, we can define the set PA(X) to be the smallest set of variables in V\{X} that has the following property: For every assignment of values p to the variables in PA(X), for every value x of X, for every variable Y in V that is not in PA(X) ∪ {X}, and for any distinct values y and y' of Y,

CMC The counterfactual "if PA(X) = p and Y = y, then X = x" is true iff the counterfactual "if PA(X) = p and Y = y', then X = x" is true.

That is, given the values of the variables in PA(X), the value of Y makes no difference to the value of X, where "making a difference for" is understood counterfactually. (20) Since the relevant counterfactuals are of the non-backtracking variety, there is no need to restrict Y to non-descendants of X, as is done in MC. For each variable X in V, CMC determines a unique set PA(X). Thus, the set of true counterfactual statements concerning the possible values of variables in V determines a unique causal graph over V. This causal graph will not, by itself, tell us whether X has a component effect on Z along some causal route: we still need to employ the counterfactual test described by Comp. But we can use the graph to define the two key concepts that figure in Comp: there is a causal route from X to Z if there is a directed path from X to Z in the corresponding graphical representation, and Y is an intermediate along this route if it lies on the directed path. In this way, the concepts of "causal route" and "causal intermediate" would be reductively analyzed in terms of counterfactuals. (21) Thus, the inclusion of these concepts in Comp, our proposed analysis of component effects, should pose no additional obstacle to the project of providing a reductive analysis of causation.
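One way to picture how the counterfactuals fix the parent sets is the following sketch. It rests on a simplifying assumption that goes beyond CMC as stated: the true counterfactuals about X are summarized by a single deterministic function from the values of all the other variables in V to a value of X (in the spirit of the "mechanisms" mentioned in note 20). On that assumption, the smallest set satisfying CMC is the set of variables that make a difference to X for at least one setting of the remaining variables. The names `counterfactual_parents`, `DOMAINS`, and `z_mechanism` are hypothetical.

```python
from itertools import product


def counterfactual_parents(domains, target, mechanism):
    """Variables whose value makes a difference to target for some setting of the rest."""
    others = [name for name in domains if name != target]
    found = set()
    for candidate in others:
        rest = [name for name in others if name != candidate]
        # Try every way of holding the remaining variables fixed.
        for rest_values in product(*(domains[name] for name in rest)):
            fixed = dict(zip(rest, rest_values))
            # Vary the candidate alone and collect the resulting values of target.
            outcomes = {
                mechanism({**fixed, candidate: value})
                for value in domains[candidate]
            }
            if len(outcomes) > 1:
                found.add(candidate)
                break
    return found


# Illustrative variable set: Z is settled by X and Y; W is idle with respect to Z.
DOMAINS = {name: (False, True) for name in ("W", "X", "Y", "Z")}


def z_mechanism(values):
    """Assumed counterfactual structure: Z holds just in case X holds and Y does not."""
    return values["X"] and not values["Y"]


Z_PARENTS = counterfactual_parents(DOMAINS, "Z", z_mechanism)
print(sorted(Z_PARENTS))
```

Repeating this for every variable in V would yield one parent set per variable, and hence a graph over V, from which directed paths and intermediates can then be read off.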

8. Conclusion

The philosophical literature on causation has generally proceeded as if there were but one relation--causation--that is the target of analysis. Adherents to the distinction between type- and token-level causation (such as Eells and Sober) or to the distinction between dependence and production (such as Hall) have challenged this orthodoxy. These heretics have employed similar strategies in arguing for their distinctions: they have used hypothetical examples to evoke apparently inconsistent causal judgments, and then argued that these judgments can only be reconciled if they apply to distinct causal concepts. I have employed a similar methodology in arguing for the distinction between net and component effects. A number of examples can be used to bring out the distinction with little or no strain on the intuition: the effect of birth control pills on thrombosis, the effect of railroad construction on transport costs, the effect of alcohol consumption on heart disease, and so on. The distinction imports no new theoretical apparatus; it can be articulated using the resources of existing theories of causation. Nonetheless, the distinction can be used to resolve a diverse array of extant problems in the theory of causation. It is, indeed, a distinction with a difference.

For comments on earlier drafts, I would like to thank Nancy Cartwright, Phil Dowe, Harold Hodes, Richard Miller, Judea Pearl, Jonathan Schaeffer, Jim Woodward, two anonymous referees for the Philosophical Review, and audience members at the University of Wisconsin at Madison and Arizona State University.

(1) As Pearl defines them (2000, 70, 164), these are average effects in a population. I do not intend net effects to be population averages.

(2) Apart from linear models, however, explicit treatments of indirect effects are rare. One important exception is Pearl 2001.

(3) For those who are interested in more precise characterizations, Hitchcock 2001 provides a detailed account of the notion that I am here calling component effect from a broadly counterfactual perspective. In that paper, I identify token causation with what I am here calling component effect. This was done primarily for expository reasons--in all of the cases there discussed it is our judgments about component effects that drive our causal intuitions--and I should be understood as retreating from that identification in the present paper. Pearl (2001) defines the notions of "total effect" and "path-specific effect" within his structural framework. Woodward (forthcoming) uses the broadly manipulationist approach that he favors to draw a distinction between "total causes" and "contributing causes" that is essentially the same as my distinction. Dowe (1999) offers an account of chance-lowering causes that employs an idea that is similar to my notion of a component effect. His analysis of chance-lowering causes is carried out using counterfactuals, probabilities, and causal processes. Cartwright (1989, chap. 3, esp. 3.4) formulates and criticizes a proposal along the lines of the account of component effects offered in section 3.2 below.

(4) The details of the case are somewhat complicated. Birth control pills prevent pregnancy by mimicking the hormonal effects of an actual pregnancy, so it is not surprising that some of the side effects of birth control pills are also effects of pregnancy. Most birth control pills introduce two hormones into the bloodstream: estrogen and progestin. Both are efficacious in preventing pregnancy, but only estrogen is implicated in thrombosis. This has led to the manufacture of birth control pills containing progestin only. In addition, the contraceptive effects of estrogen can be achieved using much lower doses than those used at the time of Hesslow's writing, thus decreasing the risk of thrombosis considerably.

(5) The term "alternative" is a little misleading here, since an alteration of C need not be numerically distinct from C itself. See Lewis (2000) for discussion.

(6) But not all proponents of counterfactual theories do so; see for example McDermott (1995).

(7) The terminology of "backtracking" is due to Lewis (1979), "screening off" to Reichenbach (1956). Both phrases are highly suggestive, and hence I am using them in a more general way than did their originators.

(8) For discussion of the latter, see Hitchcock forthcoming.

(9) It is true, however, that the consumption of birth control pills cannot both successfully cause and successfully prevent thrombosis in the very same women. I assume that at the type-level, the words cause and prevent are not success verbs, but rather describe causal tendencies.

(10) Thus Comp is consistent with the existence of (at most) one component effect of C on E that is unmediated. One problem that can arise in this case is how to factor out this route when evaluating component effects along other causal routes. Hitchcock 2001 suggests that it may not be desirable to factor out such a "direct" route, although Pearl 2001 provides a formal apparatus for doing so. Presumably, there are no such unmediated component effects in this world (with the possible exception of quantum phenomena), although component effects may be unmediated relative to a certain level of analysis or "grain." See notes 17 and 21, below.

(11) Pearl (2001) presents an elegant definition of this generalized notion from within the framework of his structural approach to causation.

(12) L. A. Paul (2000) argues that this is not a genuine counterexample to transitivity: the thing caused by the dog bite is not identical to the thing that causes the explosion. Many others with whom I've discussed this example also have the sense that it involves a trick of some sort. Lewis, however, bites the bullet and claims that the dog bite does indeed cause the explosion (personal communication).

(13) At any rate, it seems wrong to say that he definitely would not have pushed the button with his left hand, and that is all that my argument requires. This example is discussed in much greater detail in Hitchcock 2001.

(14) Eells (1991) discusses an example similar to this one (140-42), and proposes that we need to hold fixed factors that interact with C with respect to E, where interaction is characterized in purely probabilistic terms (131-32, 160). I fear, however, that Eells's notion is too permissive. For example, the biconditional C [equivalent to] E meets Eells's definition of a factor that interacts with C with respect to E. If we are permitted to hold such biconditionals and their negations fixed, then everything will count as a cause of everything.

(15) Since X, Y, and the members of PA(X) are all variables, this equation is really a shorthand for the following: For all values x of X, y of Y, and all realizations p of the set PA(X), P(X = x|Y = y & PA(X) = p) = P(X = x|PA(X) = p). I will adopt the convention that equalities are implicitly universally quantified, while inequalities are implicitly existentially quantified.

(16) The Markov condition implies only that this is a sufficient condition for X to be a direct cause of Z. The Minimality condition (Spirtes, Glymour, and Scheines 1993, 55) implies that it is necessary as well. Both implications assume that the rest of the graph (comprising the arrows from X to Y to Z) is correct.

(17) It is not completely insensitive, however. Suppose, for example, that the two routes in figure 7 cancel. (This involves a violation of the principle that Spirtes, Glymour, and Scheines (1993, 56) call "faithfulness"; but this principle is not intended as a metaphysical principle about causal structure, only as a reliable principle for causal inference.) Then, if we move to a new variable set V" = V'\{Y}, we will no longer be able to detect the component effect of X on Z: the intermediate variable whose value would need to be held fixed is no longer present in V".

(18) One can do this if one assumes faithfulness. Even then, however, graphical structure alone indicates only the presence or absence of net and component effects, and nothing at all about the nature of these effects.

(19) Although see Hitchcock 2001 for a more detailed account.

(20) The inspiration for this approach comes from Pearl (2000). The set of true counterfactuals whose antecedents specify the values of variables in PA(X), and whose consequents specify the value of X, can be represented as a function from values of PA(X) to values of X. Pearl calls such functions "mechanisms." In contrast to the approach sketched here, Pearl takes the mechanisms, rather than the counterfactuals, to be the basic entities.

(21) Although these concepts would still be relativized to a choice of variable set V. I argue in Hitchcock 2001 that this relativity is a virtue rather than a vice; in particular I use this relativity to address certain counterexamples to the transitivity of causation that differ in structure from McDermott's dog bite example. An alternate approach would be to develop the idea that a causal route corresponds to a directed path in a "sufficiently rich" variable set. A variable set would be sufficiently rich if the addition of new variables would not create any new directed paths between variables X and Z, but only interpolate variables along existing paths.

References

Bolino, A. 1961. The Development of the American Economy. Columbus, Ohio: Merrill.

Cartwright, N. 1979. "Causal Laws and Effective Strategies." Noûs 13:419-37.

--. 1989. Nature's Capacities and Their Measurement. Oxford: Oxford University Press, Clarendon Press.

Collins, J., N. Hall, and L. Paul, eds. Forthcoming. Causation and Counterfactuals. Cambridge: MIT Press.

Dowe, P. 1999. "The Conserved Quantity Theory of Causation and Chance Raising." Philosophy of Science 66 (Proceedings): S486-S501.

Eells, E. 1991. Probabilistic Causality. Cambridge: Cambridge University Press.

Eells, E., and E. Sober. 1983. "Probabilistic Causality and the Question of Transitivity." Philosophy of Science 50:35-57.

Fogel, R. 1964. Railroads and American Economic Growth. Baltimore: Johns Hopkins Press.

Hall, N. 2000. "Causation and the Price of Transitivity." Journal of Philosophy 97:198-222.

--. Forthcoming. "Two Concepts of Causation." In Collins, Hall, and Paul forthcoming.

Hesslow, G. 1976. "Discussion: Two Notes on the Probabilistic Approach to Causality." Philosophy of Science 43:290-92.

Hitchcock, C. 1993. "A Generalized Probabilistic Theory of Causal Relevance." Synthese 97:335-64.

--. 2001. "The Intransitivity of Causation Revealed in Equations and Graphs." Journal of Philosophy 98:273-99.

--. Forthcoming. "Do All and Only Causes Raise the Probabilities of Effects?" In Collins, Hall, and Paul forthcoming.

Krooss, H. 1959. American Economic Development. Englewood Cliffs, N.J.: Prentice-Hall.

Lebergott, S. 1966. "United States Transport Advance and Externalities." Journal of Economic History (December 1966): 437-61.

Lewis, D. 1973. "Causation." Journal of Philosophy 70:556-67.

--. 1979. "Counterfactual Dependence and Time's Arrow." Noûs 13:455-76.

--. 2000. "Causation as Influence." Journal of Philosophy 97:182-97.

McDermott, M. 1995. "Redundant Causation." British Journal for the Philosophy of Science 46:523-44.

Paul, L. A. 2000. "Aspect Causation." Journal of Philosophy 97:235-56.

Pearl, J. 2000. Causality: Models, Reasoning, and Inference. Cambridge: Cambridge University Press.

--. 2001. "Direct and Indirect Effects." Technical Report R-273, Cognitive Systems Laboratory, University of California at Los Angeles. Forthcoming in Proceedings of UAI-2001. San Mateo, Calif.: Morgan Kaufmann.

Reichenbach, H. 1956. The Direction of Time. Berkeley and Los Angeles: University of California Press.

Salmon, W. 1984. Scientific Explanation and the Causal Structure of the World. Princeton: Princeton University Press.

Sober, E. 1985. "Two Concepts of Cause." In PSA 1984, vol. 2, ed. P. Asquith and P. Kitcher, 405-24. East Lansing: Philosophy of Science Association.

Spirtes, P., C. Glymour, and R. Scheines. 1993. Causation, Prediction, and Search. New York: Springer-Verlag.

Woodward, J. Forthcoming. "Probabilistic Causality, Direct Causes, and Counterfactual Dependence." In Stochastic Dependence and Causality, ed. D. Constantini, M. C. Galavotti, and P. Suppes.
Christopher Hitchcock
California Institute of Technology
COPYRIGHT 2001 Cornell University

Author:Hitchcock, Christopher
Publication:The Philosophical Review
Date:Jul 1, 2001