Multiplant firms and innovation adoption and diffusion.
1. Introduction
One empirical regularity associated with the adoption of new technology is that large firms tend to adopt sooner than small firms (e.g., see Davies 1979; Mansfield 1968; Mansfield et al. 1982; Stoneman 1983, 1995; or Hoppe 2002). The usual explanation, of course, is that large firms expect a greater return from adoption than small firms. However, large firms do not always adopt first, as can be seen from the diffusion of several new processes in the U.S. steel industry: the basic oxygen furnace and continuous casting (Adams and Mueller 1982) and thin-slab casting (Ghemawat 1993, 1995).
This article develops a new theoretical model of innovation adoption and diffusion that explains why large firms tend to adopt first but admits conditions under which small firms adopt first. The analysis focuses on the adoption of an innovation of uncertain profitability when a firm's size is measured by the number of plants it operates.
As is well known, one reason for operating multiple plants is production costs that are increasing at the margin. Another reason is the existence of economies of multiplant operations, cost savings that result solely from the operation of multiple plants. Theoretical and empirical support for these economies is mixed. Theoretically, the operation of multiple plants can allow savings in nonproduction costs, such as transportation, distribution, and inventory. It can also allow economies of massed reserves, cost savings associated with retaining proportionately fewer spare parts, backup machines, and repair persons in reserve. Information sharing between plants can reduce production and adoption costs. However, multiplant operation can also result in greater information costs. Van Zandt and Radner (2001) show that, because information processing takes time, computational constraints limit the amount of information that can be used in reaching a decision.
This informational crowding-out effect can result in decreasing returns to size, or diseconomies of multiplant operation.
In a seminal, wide-ranging study, Scherer et al. (1975) find little empirical evidence in support of multiplant economies. Moreover, when these economies do exist, they involve savings in nonproduction costs. More recently, in their comparative study of the performance of light water nuclear reactor power plants in the United States and France, Lester and McCabe (1993) do find empirical support for production cost savings due to information sharing about learning-by-doing between plants. In his study of the adoption of thin-slab casting in the steel industry, Ghemawat (1995, 1997) finds that Nucor achieved cost savings due to information sharing between plants regarding both construction and production costs. However, he attributes these multiplant economies to specific aspects of Nucor's organizational structure, and observes that other steel firms with different organizational structures did not achieve these same cost savings.
Given these conflicting results, the analysis in this article assumes that, if there are multiplant economies, they take the form of savings in nonproduction costs. In this case, a large firm need not have a greater incentive to adopt first. Its increase in profit from the adoption of a success in all its plants is certainly greater. However, the existence of nonproduction cost economies can reduce the incentive of a large firm to adopt first. Suppose there are two firms, a large firm with two plants and a small firm with one plant. Also suppose that, when the innovation is first installed in a plant, the adopter must shut down that plant for a finite amount of time in order to learn if the innovation is a success or not.
This shutdown time can be considered as an experimental period during which the firm trains workers, produces and tests prototypes, attempts to overcome any problems associated with the use of the new technology, or makes any necessary changes in its existing organization. Thus, the opportunity cost of the first adoption of the innovation includes a "learning cost" in the form of profit foregone during this experimental period in which the firm attempts to adopt. This learning cost can be greater for a large firm if there are multiplant economies because the nonproduction cost savings from multiplant operation are lost, or at least reduced, when the firm shuts down one of its plants to adopt. That is, the profit lost when that plant is shut down exceeds the profit from production in either plant, and thus can exceed the profit lost by the small firm when it shuts down its plant to adopt.
A two-period model of a duopoly faced with an exogenous innovation is analyzed. Adoption requires converting a plant to the new technology, which can succeed or fail to reduce production costs. Conversion is instantaneous and costless, but the first adoption does not reveal immediately if the innovation is a success. An initial adopter must spend a period experimenting with the new technology to learn about it. Its true nature is revealed to all at the end of this learning period. If it is a success, any remaining plants are converted and production begins immediately. If not, any converted plants are reconverted to the old technology. This initial learning cost provides an incentive to wait and learn about the innovation from the rival's adoption (see Adams and Mueller [1982] for examples in the U.S. steel industry). This incentive to free ride on rival adoption implies the subgame perfect Nash equilibrium in pure strategies must be either no adoption or a diffusion. In fact, for some probabilities of success, a diffusion is the unique subgame perfect Nash equilibrium.
Joint adoption occurs only if both firms randomize in equilibrium, and in this case, a diffusion led by the large firm or the small firm is also possible. If there are no economies of multiplant operations, then the large firm leads the diffusion because the greater return from adoption of a success dominates. However, if there are such economies and the large firm's resulting learning cost disadvantage dominates, then the small firm leads the diffusion.
By now there is a large theoretical literature on innovation adoption and diffusion. (1) These studies typically assume that firms are identical, and the exceptions do not focus on differences in the size of the firm. (2) A noteworthy exception is David (1969), who analyzes a capital-embodied new process with lower variable cost but higher fixed cost. He shows that a diffusion occurs if wages rise over time relative to capital costs (notice this is similar to Fellner's (1951) argument that a diffusion occurs as the cost of maintaining a plant with old technology rises over time). Large firms adopt sooner because larger output means larger labor savings. Most empirical studies simply conjecture large firms adopt first because they expect to earn more from adoption. For example, Davies (1979) assumes a firm adopts if the expected time to pay off the adoption cost is less than a critical payoff period. Thus, large firms adopt sooner because their higher profits allow them to pay off the adoption cost sooner. More recently, Ghemawat (1993) shows that large firms are first adopters of new processes with large enough minimum optimal scales, but small firms are first adopters of new processes with low enough minimum optimal scales. The analysis herein differs in that a large firm may be the first adopter of a new technology with a zero minimum optimal scale.
The results of the analysis also contribute to the literature on multiplant firms.
The most closely related studies are those that analyze exit in industries where demand and profit are steadily declining over time and find that small firm size is an advantage in this environment. Ghemawat and Nalebuff (1985) show that, given capacity constraints such that firms must produce at a fixed capacity or shut down, a large firm exits before a small firm. When firms have different numbers of equal-sized plants, Whinston (1988) shows that the largest firms shut down plants first. The analysis herein is complementary because it focuses on a case where profit increases, in expectation, but a firm must shut down a plant to adopt initially. Because the large firm adopts first if there are no economies of multiplant operation, this analysis indicates that the results of these exit studies might differ if they were extended to allow for multiplant economies.
The next section introduces the two-stage adoption model and discusses the Nash equilibria in the period-two subgames. Section 3 derives initial subgame perfect adoption behavior and the conditions under which each firm leads the diffusion. An algebraic example with linear demand and quadratic cost is also provided. Section 4 concludes and discusses the implications of relaxing several of the key assumptions. Proofs are relegated to an Appendix.
2. A Model of Cost-Reducing Technology
Consider a two-period model of a duopoly faced with an exogenously developed innovation. The new technology can succeed or fail in that it may or may not reduce marginal cost at all output levels. When it appears, each firm's common knowledge estimate of the probability of success is $p \in [0, 1]$. The large firm L has two plants and the small firm S has one plant. The innovation is adopted by converting a plant to the new technology. That is, the innovation must be a new machine, technique, or process that can be grafted onto existing plants.
Conversion at any date is instantaneous and costless, but the first adoption by either firm does not immediately reveal if the innovation succeeds or fails. After the initial adoption, the adopter must spend one period experimenting with the new technology to learn if it succeeds or not. Thus, although conversion is costless, initial adoption is not because it entails a learning cost, the loss of profit from shutting a plant down to experiment with the innovation and learning whether it succeeds. The true nature of the innovation is revealed to all at the end of the learning period. (3) If it succeeds, production with it can begin immediately. If it fails, production can occur only if that plant is reconverted to the old technology.
Assumption 1. Converting a plant to the new technology or reconverting it back to the old technology is instantaneous and costless. After initial adoption, one period is required in order to determine whether the innovation is a success or failure. No production can occur in the plant during this period. At the end of this period, the success or failure of the innovation becomes common knowledge.
This assumption is made to focus the analysis sharply on the effects of the output and profit reduction required to adopt and learn about the innovation. As noted above, it is reasonable to assume that adoption of a new technology requires some experimentation, during which time production is halted or at least reduced. Thus, one can also view this as an extreme version of adjustment costs in investment. That is, this assumption could be replaced with the following. Conversion requires a lump-sum cost and adoption at any date takes time, during which the plant is shut down or output is substantially reduced, but the initial adoption takes longer than subsequent ones (e.g., due to learning by doing in adoption).
Now consider the natural assumptions on flow profits. Each firm's profit at any date depends on how it and its rival use their plants.
Each plant can be operated with the new technology, if it succeeds (denoted by n), operated with the old technology (denoted by o), or shut down (denoted by d). Let L's two plants be denoted by $(a_1, a_2)$ and S's plant be denoted by $a_3$, where $a_j \in \{n, o, d\}$ for j = 1, 2, 3. Then the industry profile of plants $(a_1, a_2, a_3)$ determines flow profits, which are written as $\Pi_L(a_1, a_2; a_3)$ for L and $\Pi_S(a_1, a_2; a_3)$ for S. First assume that preinnovation profit is positive, old plants can always be operated at a profit (i.e., a success is not drastic), and profit from a plant shut down is zero.
Assumption 2. Plants with the old technology earn positive profit if operated, but zero profit if shut down.
(a) $\Pi_L(o, o; a_3) > \Pi_L(o, d; a_3) > \Pi_L(d, d; a_3) = 0$ for any $a_3$.
(b) $\Pi_S(a_1, a_2; o) > \Pi_S(a_1, a_2; d) = 0$ for any $(a_1, a_2)$.
Next, assume plants with the same technology are identical. This has two implications. Given $a_3$, the plant profiles $(a_1, a_2)$ and $(a_2, a_1)$ must give L and S the same profit. Similarly, if each firm operates one plant with the same technology, then each should earn the same profit.
Assumption 3. Plants with the same technology are identical.
(a) $\Pi_i(a_1, a_2; a_3) = \Pi_i(a_2, a_1; a_3)$ for all $(a_1, a_2; a_3)$ and i = S, L.
(b) $\Pi_L(a_1, d; a_3) = \Pi_S(a_1, d; a_3)$ for $a_1 = a_3 = n$ or $a_1 = a_3 = o$.
Finally, assume adoption of a successful new technology has the usual effects of increasing the adopter's profit but decreasing a nonadopter's profit.
Assumption 4. If the innovation succeeds, adoption in one plant increases the adopter's profit and decreases its rival's profit. Adoption in all plants increases profit for both firms, but the increase is greater for the large firm.
(a) $\Pi_L(n, a_2; a_3) > \Pi_L(o, a_2; a_3)$ for all $a_2$ and $a_3$.
(b) $\Pi_S(a_1, a_2; n) > \Pi_S(a_1, a_2; o)$ for all $(a_1, a_2)$.
(c) $\Pi_L(a_1, a_2; n) < \Pi_L(a_1, a_2; o)$ for all $(a_1, a_2)$.
(d) $\Pi_S(n, n; a_3) < \Pi_S(n, o; a_3) < \Pi_S(o, o; a_3)$ for all $a_3$.
(e) $\Pi_i(n, n; n) > \Pi_i(o, o; o)$ for i = S, L.
(f) $\Pi_L(n, n; n) - \Pi_L(o, o; o) > \Pi_S(n, n; n) - \Pi_S(o, o; o)$.
Three remarks are in order. First, Assumptions 2(a), 3, and 4(a) imply that, under the same technology, the large firm earns greater flow profit than the small one. Assumption 4(f) says that the large firm's gain from adoption of a success in both of its plants exceeds the small firm's gain in its one plant. Nothing else is assumed about how firm size affects flow profits at this time (this issue is deferred until section 3). Second, these profits should be viewed as the Nash equilibrium profits of a static market game in which firms maximize profit by choosing outputs if they produce a homogeneous product or by choosing either outputs or prices if they produce differentiated products. That is, assuming the firms earn positive profits is not very restrictive because it rules out only Bertrand competition with a homogeneous product. Third, Assumptions 2-4 can be derived from any of several algebraic examples of such duopolies. One example with linear demand and quadratic cost is presented in section 3.
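To make the second and third remarks concrete, the following Python sketch computes Cournot flow profits for an arbitrary plant profile. It assumes, purely for illustration, the linear-demand, quadratic-cost specification of the example in section 3: inverse demand P = A - (q_1 + q_2 + q_3) and plant cost k_j q_j + q_j^2. The function names `solve_linear` and `flow_profits` and the parameter values A = 10, k = 4, epsilon = 1 are hypothetical choices, not part of the original analysis, and the sketch assumes an interior equilibrium in which every operating plant produces a positive output.

```python
from fractions import Fraction

OWNER = ("L", "L", "S")  # plants 1 and 2 belong to L, plant 3 to S


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def flow_profits(profile, A=10, k=4, eps=1):
    """Return (Pi_L, Pi_S) for a profile (a1, a2, a3) with entries 'n', 'o', 'd'.

    Assumed model (illustrative): P = A - Q, plant cost k_j*q_j + q_j**2,
    with k_j = k for old technology and k - eps for a successful innovation.
    """
    marginal = {"o": Fraction(k), "n": Fraction(k) - Fraction(eps)}
    active = [j for j in range(3) if profile[j] != "d"]
    q = [Fraction(0)] * 3
    if active:
        # First-order condition for operating plant i:
        #   Q + (output of i's owner) + 2*q_i = A - k_i
        matrix = [[1 + (OWNER[i] == OWNER[j]) + 2 * (i == j) for j in active]
                  for i in active]
        rhs = [A - marginal[profile[i]] for i in active]
        for j, q_j in zip(active, solve_linear(matrix, rhs)):
            q[j] = q_j
    price = A - sum(q)
    profit = {"L": Fraction(0), "S": Fraction(0)}
    for j in active:
        profit[OWNER[j]] += (price - marginal[profile[j]]) * q[j] - q[j] ** 2
    return profit["L"], profit["S"]


pi_L_old, pi_S_old = flow_profits(("o", "o", "o"))
pi_L_new, pi_S_new = flow_profits(("n", "n", "n"))
pi_L_one, pi_S_one = flow_profits(("o", "d", "o"))

# Assumption 3(b): one old plant each gives both firms the same profit.
same_profit_one_plant_each = pi_L_one == pi_S_one
# Assumption 4(f): L gains more from industrywide adoption of a success.
larger_gain_for_L = (pi_L_new - pi_L_old) > (pi_S_new - pi_S_old)
```

Under these assumed parameters, the two flags check Assumptions 3(b) and 4(f) for this one specification only; other plant profiles and other parts of Assumptions 2-4 can be checked the same way by calling `flow_profits` on the relevant profiles.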
To find the subgame perfect Nash equilibria, the analysis begins with determination of Nash equilibrium behavior in each of the subgames that can arise in period two. First note that, if neither firm adopts initially, then neither firm adopts in period two either. If a firm did adopt in period two, the required learning period under Assumption 1 implies a firm cannot benefit from a success until the next period, so no firm adopts in the "last" period. (4) In this case, period-two profits are $\Pi_L(o, o; o)$ and $\Pi_S(o, o; o)$. Suppose instead that the innovation is adopted initially in at least one plant, so its success or failure becomes common knowledge at the end of period one. Then Assumptions 1 and 2 guarantee that, if the innovation is a failure, then both firms revert to the old technology for period two and again earn profits $\Pi_L(o, o; o)$ and $\Pi_S(o, o; o)$. If the innovation is a success, then Assumption 4 implies industrywide adoption of the new technology because conversion of the remaining plants is costless and so the firms earn profits $\Pi_L(n, n; n)$ and $\Pi_S(n, n; n)$ in period two.
3. Initial Equilibrium Behavior
Given the Nash equilibrium subgame payoffs when the true state is revealed, the payoffs to the entire game can now be written in reduced form as functions of only the firms' actions in period one. Let firm i's action (pure strategy) be $\sigma_i$, the number of plants in which it adopts, so $\Sigma_L = \{0, 1, 2\}$ and $\Sigma_S = \{0, 1\}$ are the strategy sets. Let $P_L(\sigma_L, \sigma_S)$ and $P_S(\sigma_L, \sigma_S)$ be the expected payoffs to L and S from period-one actions $(\sigma_L, \sigma_S)$, given equilibrium behavior in the period-two subgames.
A subgame perfect Nash equilibrium (SPNE) in pure strategies for this dynamic adoption game is then a period-one strategy pair $(\sigma_L^*, \sigma_S^*)$, and corresponding equilibrium behavior in the period-two subgames, such that $P_L(\sigma_L^*, \sigma_S^*) \geq P_L(\sigma_L, \sigma_S^*)$ for all $\sigma_L \in \Sigma_L$ and $P_S(\sigma_L^*, \sigma_S^*) \geq P_S(\sigma_L^*, \sigma_S)$ for all $\sigma_S \in \Sigma_S$. These payoffs are the discounted profits from both stages, expected in period one, when the common estimate of the probability of success is $p \in [0, 1]$. If the common discount factor is $\beta > 0$, then
(1) $P_L(2, \sigma_S) = \beta[p\Pi_L(n, n; n) + (1 - p)\Pi_L(o, o; o)]$ for all $\sigma_S$,
(2) $P_L(1, 1) = \Pi_L(o, d; d) + \beta[p\Pi_L(n, n; n) + (1 - p)\Pi_L(o, o; o)]$,
(3) $P_L(1, 0) = \Pi_L(o, d; o) + \beta[p\Pi_L(n, n; n) + (1 - p)\Pi_L(o, o; o)]$,
(4) $P_L(0, 1) = \Pi_L(o, o; d) + \beta[p\Pi_L(n, n; n) + (1 - p)\Pi_L(o, o; o)]$,
(5) $P_S(\sigma_L, 1) = \beta[p\Pi_S(n, n; n) + (1 - p)\Pi_S(o, o; o)]$ for all $\sigma_L$,
(6) $P_S(1, 0) = \Pi_S(o, d; o) + \beta[p\Pi_S(n, n; n) + (1 - p)\Pi_S(o, o; o)]$,
(7) $P_S(2, 0) = \Pi_S(d, d; o) + \beta[p\Pi_S(n, n; n) + (1 - p)\Pi_S(o, o; o)]$, and
(8) $P_i(0, 0) = \Pi_i(o, o; o) + \beta\Pi_i(o, o; o)$ for i = S, L.
The learning cost implies that the SPNE cannot have initial adoption either by L in both plants or by both L and S in one plant.
That is, L never adopts in both plants because it learns as much from adoption in one plant and it earns positive profit from operating its second plant with the old technology, $P_L(1, \sigma_S) > P_L(2, \sigma_S)$ for all $\sigma_S$ by Assumption 2(a). Moreover, because L can learn from S's adoption, L never adopts if S does, $P_L(0, 1) > P_L(1, 1) > P_L(2, 1)$, again by Assumption 2(a). Analogously, S never adopts if L does because $P_S(1, 0) > P_S(1, 1)$ by Assumption 2(b). Thus, there are only three possible period-one outcomes (in pure strategies) in the SPNE: no adoption, $(\sigma_L^*, \sigma_S^*) = (0, 0)$; L adopts initially in one plant but S does not, $(\sigma_L^*, \sigma_S^*) = (1, 0)$; or S adopts initially but L does not, $(\sigma_L^*, \sigma_S^*) = (0, 1)$.
Essentially, the issue is which firm, if any, adopts first. Hence, it is essential to compare each firm's incentive to adopt first, $F_L(p) = P_L(1, 0) - P_L(0, 0)$ and $F_S(p) = P_S(0, 1) - P_S(0, 0)$, where
(9) $F_L(p) = \beta p[\Pi_L(n, n; n) - \Pi_L(o, o; o)] - [\Pi_L(o, o; o) - \Pi_L(o, d; o)]$ and
(10) $F_S(p) = \beta p[\Pi_S(n, n; n) - \Pi_S(o, o; o)] - \Pi_S(o, o; o)$.
In each of these, the first term is the discounted expected gain from a success. The second term is the learning cost of initial adoption. Each firm has an incentive to adopt initially only if the expected gain exceeds this certain learning cost. Naturally, neither firm adopts initially if the innovation is a certain failure (p = 0) because the learning cost is paid, and there is no expected gain, $F_L(0) = -[\Pi_L(o, o; o) - \Pi_L(o, d; o)] < 0$ and $F_S(0) = -\Pi_S(o, o; o) < 0$.
However, even if it is a certain success (p = 1), L adopts initially only if
(11) $\beta[\Pi_L(n, n; n) - \Pi_L(o, o; o)] > \Pi_L(o, o; o) - \Pi_L(o, d; o)$,
and S adopts initially only if
(12) $\beta[\Pi_S(n, n; n) - \Pi_S(o, o; o)] > \Pi_S(o, o; o)$.
These conditions reemphasize the critical role of the length of the learning period in determining whether adoption occurs. That is, although this model has two periods, these periods do not need to be of equal length. One can interpret higher values of $\beta$ as corresponding to innovations for which the learning period is short compared with the useful life of a success. Under this interpretation, the only restriction on $\beta$ is that it is positive because it can exceed 1 if the learning period is short enough. (5) Indeed, it is possible that $\beta > 1$ is a necessary condition for Equation 11 and/or Equation 12 to hold, and thus for adoption to occur for any p.
THEOREM 1. Under Assumptions 1-4, the only possible outcomes in period one of the SPNE are as follows:
i. If Equation 11 holds and Equation 12 does not, there exists a unique $p_L \in (0, 1)$ such that no firm adopts if $p \leq p_L$ and L adopts in one plant if $p \geq p_L$.
ii. If Equation 12 holds and Equation 11 does not, there exists a unique $p_S \in (0, 1)$ such that no firm adopts if $p < p_S$ and S adopts if $p \geq p_S$.
iii. If Equation 11 and Equation 12 both hold, then no firm adopts if $p \leq \min\{p_L, p_S\}$, L adopts in one plant if $p \geq p_L$, S adopts if $p \geq p_S$, and L adopts with probability $\mu_L(p)$ and firm S adopts with probability $\mu_S(p)$ if $p \geq \max\{p_L, p_S\}$, where $\mu_L(p) \in (0, 1)$ for $p \in (p_S, 1)$ and $\mu_S(p) \in (0, 1)$ for $p \in (p_L, 1)$.
For low probabilities of success, $p < \min\{p_L, p_S\}$, the unique SPNE is neither firm adopts initially and the innovation is never adopted. For intermediate probabilities, the unique SPNE is one firm adopts initially: L if $p_L < p_S$ and $p \in (p_L, p_S)$, and S if $p_S < p_L$ and $p \in (p_S, p_L)$. But for high probabilities, $p > \max\{p_L, p_S\}$, there are multiple SPNE: either firm adopts initially or both randomize and adopt with positive probability. That is, the pure strategy SPNE involves no adoption or a diffusion.
It is worthwhile to note that, for a success, a diffusion is the unique SPNE outcome for intermediate probabilities of success. If $p_S < p_L$ and $p \in (p_S, p_L)$, then S adopts in period one and L follows by adopting in both plants in period two. This is an intraindustry diffusion led by S. But if $p_L < p_S$ and $p \in (p_L, p_S)$, then L adopts in one plant in period one and it adopts in its second plant and S follows in period two. This is an intrafirm diffusion within L as well as an intraindustry diffusion led by L. Of course, these outcomes can also occur for high probabilities, but not uniquely (in the mixed strategy equilibrium, initial adoption by neither firm or both firms can occur with positive probability).
Nevertheless, a diffusion occurs as the unique SPNE for some p unless $p_L = p_S$, an unlikely razor's-edge case in this model. Hence, it is natural to say that L is more likely to lead a diffusion if $p_L < p_S$ and S is more likely to lead if $p_S < p_L$.
Whether L is more likely to lead depends on the relative magnitudes of the learning costs. To see this, notice that
(13) $p_L = \frac{\Pi_L(o, o; o) - \Pi_L(o, d; o)}{\beta[\Pi_L(n, n; n) - \Pi_L(o, o; o)]}$ and
(14) $p_S = \frac{\Pi_S(o, o; o)}{\beta[\Pi_S(n, n; n) - \Pi_S(o, o; o)]}$.
In each of these, the numerator is the firm's learning cost, while the denominator is its discounted gain from industrywide adoption of a success in period two. Because L's gain from adoption is larger by Assumption 4(f), it is more likely to lead unless its learning cost is also larger. The crucial issue now is the effect of L's larger size on the relative magnitudes of the learning costs. Notice, however, that, if there are no economies of multiplant operation, then L's learning cost is smaller because it loses less profit from its second plant than S loses from shutting down its only plant,
(15) $\Pi_L(o, o; o) - \Pi_L(o, d; o) < \Pi_S(o, o; o)$.
L is then more likely to lead because Equation 15 implies $p_L < p_S$. To see this, consider the outcome when each firm has one (identical) plant, so they produce the same amount and earn the same duopoly profit. If one firm opens another identical plant, then in the new equilibrium it does not simply double its previous output and act as a triopolist. Although its total output increases, its output per plant falls and is less than the other firm's output. Similarly, its total profit increases, but profit per plant falls and is less than the other firm's profit. This property holds in several specific algebraic market models, including the one presented below.
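The comparison of Equations 13 and 14 can be sketched numerically. The Python fragment below evaluates $p_L$ and $p_S$ using the closed-form profits of the linear-demand, quadratic-cost example in this section, normalized so that A - k = 1, with epsilon = 1/2 and beta = 1 (illustrative values only). The function name `adoption_thresholds` and the lump-sum saving `E` are assumptions introduced here: `E` stands for a hypothetical nonproduction cost saving that L earns only when both of its plants operate, so it raises L's learning cost without changing its gain from a success.

```python
from fractions import Fraction


def adoption_thresholds(pi_L_old, pi_L_one, pi_L_new, pi_S_old, pi_S_new, beta=1):
    """Return (p_L, p_S) as defined in Equations 13 and 14."""
    learning_cost_L = pi_L_old - pi_L_one   # Pi_L(o,o;o) - Pi_L(o,d;o)
    learning_cost_S = pi_S_old              # Pi_S(o,o;o)
    p_L = learning_cost_L / (beta * (pi_L_new - pi_L_old))
    p_S = learning_cost_S / (beta * (pi_S_new - pi_S_old))
    return p_L, p_S


# Closed-form profits from the example, with a = A - k (assumed values).
a, eps = Fraction(1), Fraction(1, 2)
pi_L_old = 6 * (Fraction(3, 22) * a) ** 2          # Pi_L(o,o;o)
pi_S_old = 2 * (Fraction(2, 11) * a) ** 2          # Pi_S(o,o;o)
pi_L_new = 6 * (Fraction(3, 22) * (a + eps)) ** 2  # Pi_L(n,n;n)
pi_S_new = 2 * (Fraction(2, 11) * (a + eps)) ** 2  # Pi_S(n,n;n)
pi_L_one = 2 * (Fraction(1, 5) * a) ** 2           # Pi_L(o,d;o)

# No multiplant economies: Equation 15 holds, so p_L < p_S.
p_L, p_S = adoption_thresholds(pi_L_old, pi_L_one, pi_L_new, pi_S_old, pi_S_new)

# Hypothetical multiplant economy E, lost when L shuts one plant down.
E = Fraction(1, 10)
p_L_econ, p_S_econ = adoption_thresholds(
    pi_L_old + E, pi_L_one, pi_L_new + E, pi_S_old, pi_S_new
)

print("no economies:  ", float(p_L), float(p_S))
print("with economy E:", float(p_L_econ), float(p_S_econ))
```

With these assumed numbers the ordering of the thresholds reverses once `E` is large enough, which is the sense in which a learning cost disadvantage can make the small firm more likely to lead. Because beta scales both thresholds equally, changing it moves both values but not their ranking.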
As noted above, the evidence on economies of multiplant operation is mixed and tends to support such economies primarily when they involve nonproduction cost savings. If there are such cost economies, then they are lost, or at least reduced, when L shuts down one of its plants to adopt initially. These lost economies increase L's learning cost above production profit in the plant shut down. However, if there are no such economies or if they are not substantial, then L's learning cost is smaller and it is more likely to lead.
THEOREM 2. Under Assumptions 1-4, the large firm is more likely to lead if there are no economies of multiplant operation or if they exist but are small.
Conversely, if economies of multiplant operation are substantial, then L's learning cost is greater, in which case S is more likely to lead.
THEOREM 3. Under Assumptions 1-4, the small firm is more likely to lead only if there are substantial economies of multiplant operation.
Note well, however, that a greater learning cost for L is not sufficient to guarantee that S is more likely to lead. L must have not only a greater learning cost, but also a learning cost disadvantage that outweighs its gain from industrywide conversion of a success. That is, $p_S < p_L$ only if
$\Pi_L(o, o; o) - \Pi_L(o, d; o) > \left\{\frac{\Pi_L(n, n; n) - \Pi_L(o, o; o)}{\Pi_S(n, n; n) - \Pi_S(o, o; o)}\right\}\Pi_S(o, o; o) > \Pi_S(o, o; o)$.
It is interesting that this condition does not depend on the discount factor $\beta$. Hence, although the length of the learning period is critical in determining whether adoption occurs, as noted above, it does not affect which firm is more likely to lead.
Example. The firms produce a homogeneous good with inverse demand $P = A - (q_1 + q_2 + q_3)$, where P is price; $q_i$ is L's output in plant i = 1, 2; $q_3$ is S's output; and A > 0 is a constant.
Their cost functions are $C_L = (k_1 q_1 + q_1^2) + (k_2 q_2 + q_2^2)$ and $C_S = k_3 q_3 + q_3^2$, where $k_i = k$ for old technology and $k_i = k - \epsilon$ for a success, and k and $\epsilon$ are constants such that $A > k > \epsilon > 0$. Note that L's cost function assumes no multiplant economies. Then $q_i(o, o; o) = (3/22)(A - k)$ for i = 1, 2; $q_3(o, o; o) = (2/11)(A - k)$; $\Pi_L(o, o; o) = 6q_i(o, o; o)^2$; and $\Pi_S(o, o; o) = 2q_3(o, o; o)^2$. Similarly, $q_i(n, n; n) = (3/22)(A - k + \epsilon)$ for i = 1, 2; $q_3(n, n; n) = (2/11)(A - k + \epsilon)$; $\Pi_L(n, n; n) = 6q_i(n, n; n)^2$; and $\Pi_S(n, n; n) = 2q_3(n, n; n)^2$. Finally, $q_1(o, d; o) = q_3(o, d; o) = (1/5)(A - k)$ and $\Pi_L(o, d; o) = \Pi_S(o, d; o) = 2q_1(o, d; o)^2$. It then follows that
$\Pi_L(n, n; n) - \Pi_L(o, o; o) = (27/242)[2(A - k)\epsilon + \epsilon^2] > \Pi_S(n, n; n) - \Pi_S(o, o; o) = (16/242)[2(A - k)\epsilon + \epsilon^2]$,
$\Pi_L(o, o; o) - \Pi_L(o, d; o) = (191/6050)(A - k)^2 < \Pi_S(o, o; o) = (400/6050)(A - k)^2$, and
$p_L = (191/675)\{(A - k)^2 / \beta[2(A - k)\epsilon + \epsilon^2]\} = (191/675)p_S < p_S$.
4. Conclusions
This article shows diffusions must occur, at least for some probabilities of success, if firms are not identical in size, if a plant must be shut down to adopt initially and learn about the innovation, and if adoption reveals success or failure to all after the learning period. Economies of multiplant operation do not necessarily imply the large firm is more likely to lead a diffusion. Such economies can imply that the learning cost (the profit lost from shutting down a plant) is greater for the large firm.
Hence, the small firm can adopt first if such economies exist and are large enough. However, multiplant operation also implies that the large firm's profit increase from adoption is greater, so it leads if the learning cost is not significant. Thus, this article shows diffusions are more likely than joint adoption when firms are not identical. It also provides an intuitively appealing reason for why large firms tend to adopt first, but need not always do so.
The analysis is easily extended in several directions. First, the same results hold for drastic innovations when Assumptions 3 and 4 are modified appropriately. The results also hold if the assumption of a learning period is replaced by the assumption that conversion always takes time (during which a plant is shut down), but the initial conversion takes longer than subsequent ones. This extension is more cumbersome because differential times for conversion and adoption make the use of discrete time somewhat more arbitrary, although a switch to continuous time as indicated in footnote 5 is simple. In this case, there is also a profit loss when a firm adopts after the new technology is revealed to be a success. The main difference, however, is that initial adoption now provides a future gain. If a firm adopts first in a plant and the innovation succeeds, then that plant is operated in the conversion period after success is revealed (rather than shut down to be converted then). Thus, both may adopt initially unless the initial profit loss outweighs this future gain for both firms. Nevertheless, L's advantage in adoption of a success persists, so it is more likely to lead unless multiplant economies give it a large learning cost disadvantage.
Finally, the assumption that adoption by one firm reveals the true state to its rival may seem too strong. Although firms do often have an incentive to share such information, it may be impossible for them to do so.
Differences in organizational structure apparently prevented large, integrated steel firms from free riding on adoption by their smaller rivals who operated minimills (Ghemawat 1995, 1997). One could analyze this by assuming initial adoption by a firm requires a learning period even if its rival has already adopted. However, because the large firm should be able to free ride on information provided by its own adoption, the only change would be that the small firm's incentive to wait and learn from rival adoption becomes zero, which increases its incentive to adopt first. Because the large firm still gains more from adoption in both plants, it is still more likely to lead unless multiplant economies give it a large learning cost disadvantage (although first adoption by the small firm is now more likely, in general).
Appendix
Subgames When Success Revealed
If L adopted in one plant and S adopted, then L adopts in its other plant in period two if $\Pi_L(n, n; n) \geq \Pi_L(n, o; n)$. If S adopted but L did not, then L adopts in both plants in period two if $\Pi_L(n, n; n) \geq \max\{\Pi_L(n, o; n), \Pi_L(o, o; n)\}$. If L adopted in both but S did not, then S adopts in period two if $\Pi_S(n, n; n) \geq \Pi_S(n, n; o)$. Finally, if L adopted in one plant but S did not, then in period two, they simultaneously decide to adopt or not. Notice S adopts if $\Pi_S(n, o; n) \geq \Pi_S(n, o; o)$ and $\Pi_S(n, n; n) \geq \Pi_S(n, n; o)$ (i.e., adopt is strongly dominant for S). In this case, L's best reply is to adopt if $\Pi_L(n, n; n) \geq \Pi_L(n, o; n)$. All these inequalities follow from the conditions in Assumption 4.
PROOF OF THEOREM 1. 
From Equations 1-3, $P_L(1, 0) - P_L(2, 0) = \Pi_L(o, d; o) > 0$ and $P_L(1, 1) - P_L(2, 1) = \Pi_L(o, d; d) > 0$ for all $p \in [0, 1]$ by Assumption 3(a). This proves that neither $(\sigma_L^*, \sigma_S^*) = (2, 0)$ nor $(\sigma_L^*, \sigma_S^*) = (2, 1)$ can be a SPNE. It also proves that, if a mixed strategy SPNE does exist, then in it, L must place zero probability on initial adoption in both plants. Next, from Equations 2 and 4-6, $P_L(0, 1) - P_L(1, 1) = \Pi_L(o, o; d) - \Pi_L(o, d; d) > 0$ and $P_S(1, 0) - P_S(1, 1) = \Pi_S(o, d; o) > 0$ for all $p \in [0, 1]$ by Assumption 3. Hence, $(\sigma_L^*, \sigma_S^*) = (1, 1)$ cannot be a SPNE for any $p \in [0, 1]$. The only remaining pure strategy SPNE candidates are the following: neither firm adopts, $(\sigma_L^*, \sigma_S^*) = (0, 0)$; L adopts in one plant but S does not, $(\sigma_L^*, \sigma_S^*) = (1, 0)$; and S adopts but L does not, $(\sigma_L^*, \sigma_S^*) = (0, 1)$. Hence, $(\sigma_L^*, \sigma_S^*) = (1, 0)$ is a SPNE if $P_L(1, 0) \geq P_L(0, 0)$ and $P_S(1, 0) \geq P_S(1, 1)$; $(\sigma_L^*, \sigma_S^*) = (0, 1)$ is a SPNE if $P_L(0, 1) \geq P_L(1, 1)$ and $P_S(0, 1) \geq P_S(0, 0)$; and $(\sigma_L^*, \sigma_S^*) = (0, 0)$ is a SPNE if $P_L(0, 0) \geq P_L(1, 0)$ and $P_S(0, 0) \geq P_S(0, 1)$. But, as shown above, $P_L(0, 1) > P_L(1, 1)$ and $P_S(1, 0) > P_S(1, 1)$ for all $p$. 
Therefore, from Equations 9 and 10, $(\sigma_L^*, \sigma_S^*) = (1, 0)$ is a SPNE if $F_L(p) \geq 0$, $(\sigma_L^*, \sigma_S^*) = (0, 1)$ is a SPNE if $F_S(p) \geq 0$, and $(\sigma_L^*, \sigma_S^*) = (0, 0)$ is a SPNE if $F_L(p) \leq 0$ and $F_S(p) \leq 0$. Finally, $(\sigma_L^*, \sigma_S^*) = (1, 0)$ is the unique SPNE if and only if $F_L(p) > 0 > F_S(p)$, and $(\sigma_L^*, \sigma_S^*) = (0, 1)$ is the unique SPNE if and only if $F_S(p) > 0 > F_L(p)$. As noted in the text, using Equations 3, 5, and 8-10, one can show that Assumption 3 implies $F_L(0) < 0$ and $F_S(0) < 0$, while $F_L(1) > 0$ if and only if Equation 11 holds, and $F_S(1) > 0$ if and only if Equation 12 holds. Because $F_L(p)$ is linear, if Equation 11 holds, then there exists a unique $p_L \in (0, 1)$, defined by Equation 13, such that $F_L(p_L) = 0$ and $F_L(p) \gtreqless 0$ if and only if $p \gtreqless p_L$. Similarly, if Equation 12 holds, then there exists a unique $p_S \in (0, 1)$, defined by Equation 14, such that $F_S(p_S) = 0$ and $F_S(p) \gtreqless 0$ if and only if $p \gtreqless p_S$. Finally, let $D(p) = P_L(0, 1) - P_L(1, 1)$ and $E(p) = P_S(1, 0) - P_S(1, 1)$. Then a mixed strategy equilibrium in which L adopts in one plant with probability $\mu_L$ and S adopts with probability $\mu_S$ exists if and only if $\mu_L = \{F_S(p)/[F_S(p) + E(p)]\} \in (0, 1)$ and $\mu_S = \{F_L(p)/[F_L(p) + D(p)]\} \in (0, 1)$. Because $D(p) > 0$ for all $p$ and $E(p) > 0$ for all $p$, it follows that $\mu_L \in (0, 1)$ if and only if $p > p_S$ and $\mu_S \in (0, 1)$ if and only if $p > p_L$.
PROOFS OF THEOREMS 2 AND 3. 
From Equations 13 and 14, $p_L - p_S$ has the sign of $[\Pi_L(o, o; o) - \Pi_L(o, d; o)][\Pi_S(n, n; n) - \Pi_S(o, o; o)] - \Pi_S(o, o; o)[\Pi_L(n, n; n) - \Pi_L(o, o; o)]$. And because $\Pi_L(n, n; n) - \Pi_L(o, o; o) > \Pi_S(n, n; n) - \Pi_S(o, o; o)$ by Assumption 4(f), it follows that $p_L < p_S$ if and only if $\Pi_L(o, o; o) - \Pi_L(o, d; o) < \{[\Pi_L(n, n; n) - \Pi_L(o, o; o)]/[\Pi_S(n, n; n) - \Pi_S(o, o; o)]\}\Pi_S(o, o; o)$. So Equation 15 is sufficient but not necessary for $p_L < p_S$. Similarly, $p_S < p_L$ if and only if $\Pi_L(o, o; o) - \Pi_L(o, d; o) > \{[\Pi_L(n, n; n) - \Pi_L(o, o; o)]/[\Pi_S(n, n; n) - \Pi_S(o, o; o)]\}\Pi_S(o, o; o) > \Pi_S(o, o; o)$, so the converse of Equation 15 is necessary but not sufficient for $p_S < p_L$.
I thank Heidrun Hoppe, Mort Kamien, Rob Masson, Mark McCabe, and two anonymous referees for helpful comments and suggestions on earlier versions of this article. The research for this article was conducted in part while the author was visiting at the Department of Managerial Economics and Decision Sciences, Kellogg School of Management, Northwestern University, whose support is gratefully acknowledged.
(1) See Reinganum (1989), Rogers (1995), Stoneman (1995), and Hoppe (2002) for surveys.
(2) In decision-theoretic studies, Jensen (1982, 1983) assumes firms differ in their prior beliefs; Bhattacharya, Chatterjee, and Samuelson (1986) assume they receive different information; and Jensen (1988) assumes they have different capacities to process information. The game-theoretic studies that derive diffusions by assuming an exogenous decrease in the supply price of the innovation over time (see the surveys in footnote 1) differ in two important ways. 
First, in this study, the decrease in adoption cost is endogenous, occurring if and only if some firm adopts and the innovation succeeds. Second, these studies also assume identical firms, so when a diffusion occurs, it can be led by any firm. This study provides conditions under which a diffusion led by the large firm is a unique equilibrium and conditions under which a diffusion led by the small firm is a unique equilibrium. In an alternative approach, Stenbacka and Tombak (1994) assume an implementation period of uncertain length after adoption, where firms differ in their hazard rates for successful implementation. Goetz (2000) extends their work to show that the firm with the higher hazard rate adopts sooner and earns more, in expectation. However, neither of these studies draws conclusions about firm size and diffusion leadership.
(3) That lone adoption by one firm reveals the true state to its rival may seem a strong assumption. However, for a wide range of possibilities in a similar adoption model, Jensen (1993) shows that firms would enter into an ex ante agreement to share ex post information about whether a new technology adopted is a success. The implications of relaxing this assumption are discussed further in the conclusion.
(4) This is true for any game with a finite number of stages. However, it is worth noting that this result also holds in an infinite horizon model with a learning period of length $T > 0$. In this case, if no firm adopts initially, then absent some deus ex machina (to tell the firms the truth or change their profits or conversion cost), nothing changes to induce a firm to adopt in the future.
(5) For example, consider a continuous-time model in which the length of the learning period is $T > 0$, and the innovation can be used forever after. If the interest rate is $r$, then the learning period payoffs are discounted by $(1 - e^{-rT})/r$ and the remaining payoffs by $e^{-rT}$. 
An increase in $T$ then increases the weight $(1 - e^{-rT})/r$ on the learning period payoffs and decreases that on the remaining payoffs. Moreover, the payoffs in Equations 1-8 can be obtained by dividing these continuous-time payoffs by $(1 - e^{-rT})$ and setting $\beta = e^{-rT}/(1 - e^{-rT})$. This "modified" discount factor is decreasing in $T$ and greater than 1 for all $T < (\log 2)/r$.
References
Adams, Walter, and Hans Mueller. 1982. The steel industry. In The structure of American industry, edited by Walter Adams. New York: MacMillan, pp. 74-125.
Bhattacharya, Sudipto, Kalyan Chatterjee, and Larry Samuelson. 1986. Sequential research and the adoption of innovations. Oxford Economic Papers 38(Supp):219-43.
David, Paul. 1969. A contribution to the theory of diffusion. Stanford Center for Research in Economic Growth Memorandum No. 71.
Davies, Steven. 1979. The diffusion of process innovations. New York: Cambridge University Press.
Fellner, William. 1951. The influence of market structure on technological progress. Quarterly Journal of Economics 65:560-7.
Ghemawat, Pankaj. 1993. Commitment to a process innovation: Nucor, USX, and thin-slab casting. Journal of Economics and Management Strategy 2:135-61.
Ghemawat, Pankaj. 1995. Competitive advantage and internal organization: Nucor revisited. Journal of Economics and Management Strategy 3:685-717.
Ghemawat, Pankaj. 1997. Games businesses play. Cambridge, MA: MIT Press.
Ghemawat, Pankaj, and Barry Nalebuff. 1985. Exit. Rand Journal of Economics 16:184-94.
Goetz, Georg. 2000. Strategic timing of adoption of new technologies under uncertainty: A note. International Journal of Industrial Organization 18:369-79.
Hoppe, Heidrun C. 2002. The timing of new technology adoption: Theoretical models and empirical evidence. The Manchester School 70:56-76.
Jensen, Richard. 1982. Adoption and diffusion of an innovation of uncertain profitability. Journal of Economic Theory 27:182-93.
Jensen, Richard. 1983. Innovation adoption and diffusion when there are competing innovations. Journal of Economic Theory 29:161-71.
Jensen, Richard. 1988. Information capacity and innovation adoption. International Journal of Industrial Organization 6:335-50.
Jensen, Richard. 1993. Sharing cost information: A counterexample. Economic Theory 3:589-92.
Lester, Richard K., and Mark J. McCabe. 1993. The effect of industrial structure on learning by doing in nuclear power plant operation. Rand Journal of Economics 24:418-38.
Mansfield, Edwin. 1968. The economics of technological change. New York: Norton.
Mansfield, Edwin, Anthony Romeo, Mark Schwartz, David Teece, Samuel Wagner, and Peter Brach. 1982. Technology transfer, productivity, and economic policy. New York: Norton.
Reinganum, Jennifer. 1989. The timing of innovation: Research, development, and diffusion. In Handbook of industrial organization, edited by Richard Schmalensee and Robert Willig. New York: North Holland, pp. 849-908.
Rogers, Everett M. 1995. Diffusion of innovations. New York: Free Press.
Scherer, Frederick M., Alan Beckenstein, Erich Kaufer, Dennis R. Murphy, and Francine Bougeon-Massen. 1975. The economics of multiplant operation: An international comparisons study. Cambridge, MA: Harvard University Press.
Stenbacka, Rune, and Mikhel Tombak. 1994. Strategic timing of adoption of new technologies under uncertainty. International Journal of Industrial Organization 12:387-411.
Stoneman, Paul. 1983. The economic analysis of technological change. New York: Oxford University Press.
Stoneman, Paul. 1995. Handbook of the economics of innovation and technological change. Oxford: Blackwell Publishers.
Van Zandt, Timothy, and Roy Radner. 2001. Real-time information processing and returns to scale. Economic Theory 17:545-75.
Whinston, M. 1988. Exit with multiplant firms. Rand Journal of Economics 19:568-88.
Richard A. Jensen, Department of Economics and Econometrics, University of Notre Dame, Notre Dame, IN 46556-0783, USA; E-mail: rjensen1@nd.edu.
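The coefficients reported in the linear-cost example that precedes the conclusions can be rechecked with exact rational arithmetic. The sketch below is illustrative only and is not part of the original analysis. It assumes inverse demand of the form P = A - Q (the demand form is an assumption here), normalizes A - k to 1, and uses hypothetical helper names (`marginal_profit`, `example_coefficients`). It takes the ratio of thresholds to be the ratio of learning cost to adoption gain for each firm, as in the sign condition used in the proofs of Theorems 2 and 3.

```python
from fractions import Fraction as F

# Work in units where A - k = 1: outputs are multiples of (A - k) and
# profits are multiples of (A - k)**2, so exact fractions suffice.
Q_L = F(3, 22)   # output per plant of L at (o, o; o), as stated in the example
Q_S = F(2, 11)   # output of S at (o, o; o), as stated in the example
Q_D = F(1, 5)    # output per operating plant at (o, d; o), as stated


def marginal_profit(own_plant_output, own_total, rival_total):
    """First-order condition for one plant with A - k normalized to 1.

    ASSUMPTION: inverse demand P = A - Q and plant cost k*q + q**2.
    A value of zero means the stated output is a Cournot best reply.
    """
    total = own_total + rival_total
    return 1 - total - own_total - 2 * own_plant_output


def example_coefficients():
    """Recompute the profit coefficients reported in the example."""
    pi_L_ooo = 6 * Q_L**2      # L's profit with both plants operating
    pi_S_ooo = 2 * Q_S**2      # S's profit with old technology everywhere
    pi_L_odo = 2 * Q_D**2      # L's profit with one plant shut down
    return {
        "gain_L": 6 * Q_L**2,  # coefficient on 2(A - k)eps + eps**2 for L
        "gain_S": 2 * Q_S**2,  # same coefficient for S
        "learning_cost_L": pi_L_ooo - pi_L_odo,
        "learning_cost_S": pi_S_ooo,
    }


coeffs = example_coefficients()
# Ratio p_L / p_S implied by learning cost relative to adoption gain.
threshold_ratio = (coeffs["learning_cost_L"] / coeffs["gain_L"]) / (
    coeffs["learning_cost_S"] / coeffs["gain_S"]
)
print(threshold_ratio)  # 191/675
```

Under these assumptions the three stated outputs satisfy the first-order conditions exactly, and the ratio of 191/675 matches the example's conclusion that L's threshold probability lies below that of S when there are no multiplant economies.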
