Adaptive incrementalism and complexity: experiments with two-person cooperative signaling games.
A large literature addresses decision making in decentralized environments with limited information. Decision makers are limited both in what they can control in the decision domain and in what information they can obtain. Under these circumstances, decision makers adapt to decisions made by others in a decision process, not in a singular event. The overall outcome of the decision process is the collective result of the interactions among the decision makers. The process consists of sequences of many, often relatively small, moves made by all players. Decision makers in this environment are said to "incrementally adapt."
The study of incremental adaptation spans several disciplines. Scholars of public budgeting explain the incremental allocations of budget resources from one year to the next using a model of interaction among the chief executive, public agencies, and the legislature (Wildavsky 1964; Crecine 1969; Davis, Dempster, and Wildavsky 1966). Similarly, public policy and bureaucratic-politics scholars have examined the "muddling through" and "disjointed incrementalism" of policy making among interest groups, executive agencies, and the legislature (Lindblom 1959; Bendor and Moe 1985; Rourke 1984). In most of these models and in organization theory, the literature has focused on "bounded rationality" and trial-and-error decision making between organizations and their environments (Simon 1957; March and Olsen 1976). Jones (1999) provides an excellent review.
In each case, incremental adaptation is explained by the complexity of the problems combined with the cognitive limitations, or bounded rationality, of decision makers (Simon 1957; Williamson 1985). In Crecine's (1969, 38) simulation of the budget process, for example, the entire budget process is viewed "as an organized means for the decision-maker to deal with the potential complexity of the budgetary problem." In contrast to the idealized model of homo economicus, boundedly rational individuals acquire information through trial-and-error processes, basing their decisions on past information, standard operating procedures, and projections from the current state of affairs. Often they "satisfice," that is, search for proximate, locally optimal, or even tolerably satisfactory solutions near the current alternatives. They constrain the decision space by "pruning" the decision tree of alternatives and often cannot form transitive preference orderings of alternatives, even when considering all of them (Simon 1957, chapter 14). Steinbruner (1974), in comparing incrementalism to a thermostat control, indicates that goals become aspiration levels that rise or fall depending on the success or failure of past efforts, and that there is no attempt to rationally integrate different values. March and Olsen's (1976) "garbage can" models of institutional choice elaborate this notion. (1)
Despite its wide use as a concept across disciplines, the incrementalism literature lacks a firm connection with the literature in institutional economics. A major work such as Williamson (1985), however, demonstrates the considerable overlap between the two literatures, and, indeed, in his discussion of contracting he frequently cites Simon's (1957) bounded rationality. Although both literatures deal with very similar problems, they do so from different perspectives. The incrementalism literature in organization theory and public administration has emphasized the cognitive limitations of decision makers in coping with the ill-defined nature or enormity of problems. Presumably, with faster computers, better-educated and smarter decision makers, and better models, decision making would become less incremental. Even the quantitative political science literature on budgeting (e.g., Davis, Dempster, and Wildavsky 1966) focuses on the sheer size of the budget and on measuring the size of the increment, not on the politics of interaction among the actors. This point was raised by McCubbins and Schwartz (1984), who considered two different monitoring styles for members of Congress, one of which looked incremental but was in fact consistent with members' incentives, that is, chosen rationally.
In contrast, institutional economics emphasizes the strategic aspects of such situations more than cognitive limits. Coase (1937) provided the fundamental insight, namely, that transaction costs are the essential component for understanding the outcomes of joint decisions. An important element of transaction costs is the bargaining and strategy for producing coordinated outcomes. Similarly, Tsebelis (1989) shows how the actions of one decision maker alter the choices of the other decision maker through mutual interdependence, introducing probabilistic uncertainty through mixed strategies. This strategic aspect is found in early writings on incrementalism, such as Lindblom (1965) and Wildavsky (1964), but is discussed at a general level with examples and anecdotes. Careful definitions of strategic interaction are not provided at a level of detail sufficient to make testable statements. Empirical studies of incrementalism have tended to focus on measuring whether a decision process is incremental without showing why and how it occurs.
In this article we are concerned with how strategic interdependence explains incremental adaptation. We address this question by designing and implementing an experimental game that adds levels of risk and uncertainty to the strategic interaction of a decentralized decision situation. We are interested in determining how these aspects of complexity in the decision situation affect incremental adaptation in decision making. The remainder of the article is divided into six sections: (1) a rationale for the game we have chosen to examine, (2) details of the experimental design for testing the game's predictions, (3) the laboratory setting of the experiment, (4) the results of the experiment, (5) a discussion, and (6) the conclusion.
Incremental Adaptation in a Duopoly Game
The Duopoly Game
To study the effect of strategic interdependence on incrementalism requires a decision setting precise enough to make testable predictions. In The Intelligence of Democracy Lindblom (1965, 33) describes a two-person game for explaining the requirements for adaptive incremental adjustment. A decision maker, let us call him Xavier, simply adapts to the decisions of the other decision maker, whom we will name Yolanda, without directly negotiating or threatening a desired outcome. In effect, each decision maker, choosing only in his or her own dimension, either ignores the other dimension entirely, adjusts to the other dimension in order to improve his or her own outcome, or adjusts to his or her own advantage while showing some concern for the effects on the other decision maker. Lindblom limits the degree of discussion, negotiation, and bargaining between the two players in this most basic incremental adaptation game.
This kind of two-person game can be thought of as an institutional structure in which one person is allowed to make nonamendable proposals in one dimension and another player is allowed to make nonamendable proposals in another dimension. Suppose, for example, that the institutional rules let player Xavier choose an action from the set X, and player Yolanda has been handed the authority to choose an action from set Y. This can be thought of as a two-person game, but it can also be thought of as a generalized institutional structure. As Shepsle (1979) says, institutional rules allocate jurisdictional and amendment-controlling authority to various actors.
In this article, therefore, we are concerned with a two-person signaling game, which is comparable to a duopoly situation in economics. We examine this model because the setting that Lindblom (1965) describes seems to be that of two-person game theory and appears to be quite comparable to how duopolies adapt. In a duopoly, two firms that jointly serve a market must determine their outputs. Given a particular level of overall market demand, each firm decides how much to produce based on what it believes its competitor will produce, so the market clears.
Although a duopoly game emphasizes strategic interaction, we want this interaction to occur in the context of a complex problem. In Wildavsky's (1964) book The Politics of the Budgetary Process, for example, the author includes sections on calculation and strategy, emphasizing both the cognitive difficulty of making choices over the magnitude of the federal budget and the political strategy of decentralized interaction. To this end, we embed the duopoly game in a context that allows for several choice options for each play. We also build in a sequence of choices over time to establish a decision process, not just a singular choice. This multiple set of choices and the sequential decision process are held constant across games.
We then focus our attention on the risk and uncertainty of the strategic interdependence between the two players, which occurs even when the players understand the game completely. Situations of strategic interdependence involve varying degrees of risk and uncertainty. Risk arises when one's rival chooses an option that is a gain for the rival but, combined with one's own choice, produces a mutual result that is a loss for oneself, or vice versa. To add to the risk, it is possible to increase the size and alter the probability of loss. Uncertainty occurs when players have more than one reasonable strategy for playing the game. To add to uncertainty, we can alter the structure of the payoff scheme in the game. With more than one plausible strategy, coordination of moves becomes more difficult, and players may not as readily understand their opponent's set of reactions. (2)
What kinds of decision behavior do we expect to find in the duopoly game? The basic game situation is one in which each player has private information not shared by the other player. Because a player does not know everything, it is necessary for the player to form beliefs about this private information in order to make choices. To try to predict the different forms of decision behavior in the duopoly game, therefore, we need to understand the beliefs that players hold about the probability of different moves by the other players under different conditions of uncertainty and risk. When it is feasible we also want to formalize a player's belief structure in order to better identify the player's strategic choices.
We hypothesize that, in situations in which strategic interaction entails relatively low risk and low uncertainty, simple maximizing rational choice algorithms will characterize adaptive behavior. Under these conditions, players can hold fairly confident beliefs about the probabilities of their rival's intentions. The process of converging to the equilibrium point, though requiring interdependent choices, presents no major uncertainties or risks. Some of the better strategic players will reason forward through the set of moves to the equilibrium point and either choose that point right away or follow a few steps to get there. We thus expect to find quick updating of information on the other player's moves, confident assumptions about the other player's likely next move, and a maximizing path to the equilibrium point.
The classic solution to the duopoly problem, which adopts this maximizing path, is associated with Antoine Augustin Cournot, a French economist who in 1838 addressed the particular two-person interactive game that arises in a duopoly. Fouraker and Siegel (1963), who conducted a set of experiments to test bargaining behavior in duopoly situations, found that under conditions of incomplete information the results supported the Cournot solution. This model thus has empirical support, matches the Lindblom description, is found in every economics textbook that treats the duopoly problem (e.g., Henderson and Quandt 1980 or Pindyck and Rubinfeld 2001), and is considered a standard solution for this kind of situation. Furthermore, the Cournot solution can easily be extended to predict a path of adjustment, which gives us a hypothetical baseline path against which to compare observed behavior.
Cournot assumed that each player would select the outcome that maximizes his own payoff given any known choice for his opponent. He labeled this set of choices a "reaction function." The reaction function notion requires that a player know what the opponent's choice will be. The assumption that a Cournot decision maker adopts is that "each duopolist acts as if his rival's output were fixed" (Henderson and Quandt 1980, 215). That is, each person assumes that his opponent will pick in the upcoming period exactly what he picked in the previous period. The assumption of the Cournot adjustment model, namely that one's opponent will make the same move in the next play as in the last play, is a rational assumption when no other information is available. The assumption allows for a maximizing strategy under constraint. The Cournot solution therefore assigns all probability to the statement "my competitor will play the same as last time."
The Cournot model, however, ignores the risks and uncertainties of interdependence. In the first play of the game, the Cournot player operates under uncertainty because the other player has not yet made any choice. The player in the first move will thus have little reason to assume anything other than a random opponent. But in the second play of the game, the Cournot player suddenly and completely switches from decision making under uncertainty to game-theoretic strategic decision making against a rational, self-interested, and therefore predictable, opponent. Under conditions of risk and uncertainty, this shift may not be reasonable.
Sequential Bayesian Updating
Bayesian updating is an alternative general framework in which to understand choices in decentralized games. A player's beliefs are represented by a subjective probability calculus, that is, by Bayesian probability (Dixit and Skeath 1999; Gintis 2000). The Bayesian player is less confident that the immediate prior move of the other player is 100 percent predictive of the next move. Having no prior knowledge of the other player's preferences at the start of the game, the Bayesian player begins with a uniform prior. It is assumed that she will choose the first move rationally based on a maximum expected value (MEV) criterion and hence will choose the MEV row. In the second play of the game, the Bayesian player will move based on the average of the prior probability distribution and the probability distribution of the first observation. For all subsequent plays of the game, the Bayesian player will update beliefs about probabilities based on all prior observations using a weighted average with recent moves weighted more than earlier moves. Over time the Bayesian players converge to the Nash equilibrium. As the mathematics is rather complex, the calculations for these Bayesian moves are provided in appendix 2. (3)
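The full calculations appear in appendix 2 and are not reproduced here; the following is only a minimal sketch of recency-weighted belief updating of this general kind. The 15 x 15 payoff matrix, the mixing weight, and the sequence of observed moves below are all hypothetical, not taken from the experiment:

```python
import numpy as np

N = 15  # fifteen rows/columns, as in the experimental game sheet

def update_beliefs(beliefs, observed_col, weight=0.5):
    """Recency-weighted update: mix the current belief vector with a
    point mass on the opponent's last observed column, so recent moves
    count more than earlier ones."""
    observation = np.zeros(N)
    observation[observed_col] = 1.0
    return (1 - weight) * beliefs + weight * observation

def mev_row(payoffs, beliefs):
    """Choose the row with maximum expected value (MEV) against the
    current beliefs about the opponent's column choice."""
    return int(np.argmax(payoffs @ beliefs))

# Hypothetical payoff matrix (NOT the table 1 payoffs): payoffs rise
# toward the diagonal, with the best mutual cell at (14, 14).
payoffs = np.fromfunction(lambda r, c: r + c - 2 * np.abs(r - c), (N, N))

beliefs = np.full(N, 1.0 / N)       # uniform prior over the 15 columns
for observed in [2, 7, 9, 12, 14]:  # a hypothetical run of opponent moves
    beliefs = update_beliefs(beliefs, observed)
    print(mev_row(payoffs, beliefs))
```

The key contrast with Cournot play is visible in the belief vector: a point mass on the opponent's last move is replaced by a weighted mixture over all observed moves, so the MEV row trails the opponent's latest choice rather than best-responding to it outright.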
Cournot and sequential Bayesian convergence are examples of change over time and mutual adjusting behavior. Are they then examples of incrementalism? Observations of Cournot behavior in real-world decision settings may look like incrementalism because of the mutually adjusting behavior over time. Similarly, a sequential Bayesian player could choose incrementally, depending on the structure of the game. Following Bendor and Moe (1985), however, we want to include in the definition of incremental adaptive behavior more limited moves in the direction of greater utility. Thus adaptive incrementalism includes nonoptimizing behavior based on limited information, whereas Cournot behavior is based on maximizing behavior and confident predictions about the opponent's behavior. For this reason, we are led to try to understand what nonoptimizing, incremental adaptive behavior would look like and what conditions would cause it to occur.
We hypothesize that under conditions of greater uncertainty and risk, players will move in smaller steps because of the greater difficulty of establishing credibility and commitment (Dixit and Skeath 1999, 302-13). They will attempt to take small steps to establish a pattern of interaction and be rather cautious about updating information about the decision pattern of the other players. Hence they will not move away from their initial choices very quickly, and when they do move away, they will move in a nonoptimizing adaptive fashion, with smaller steps than might be predicted by rationality assumptions. According to Schelling (1960, 85), players attempt to create stable expectations about the other player's motives and intentions in order to establish mutual trust that the rival will not double-cross or renege on cooperative moves. He writes, "Players have to understand each other, to discover patterns of individual behavior that make each player's actions predictable to the other; they have to test each other for a shared sense of pattern or regularity." This stability of "convergent expectations" depends on credibility, which can be enhanced through signaling, demonstrating commitment to cooperation, "trial balloons," attempts at consistent behavior, and a series of small agreements (Dixit and Nalebuff 1991). It is in this area of cautious moves and attempts at creating stable expectations under risk and uncertainty where we hypothesize that incremental adaptive behavior occurs.
A second type of outcome predicted by incremental theory is a satisficing one. If participants act in a satisficing way, they will incrementally move toward the Nash equilibrium but halt at a suboptimal outcome where both players are minimally accepting of the rewards they receive. Why might this occur? The nature of the risks and uncertainties might be such that both players derive a medium payoff with only small risk, yet moving farther toward higher mutual payoffs entails larger risks. Due to the uncertainty of keeping a mutual understanding intact while moving through the more dangerous larger risks, the players satisfice on a less than optimal outcome.
We maintain that players behave in this incremental and satisficing fashion due to the risk and uncertainty associated with making interdependent decisions. The risks and uncertainties of misjudging rivals' or partners' choices can lead to smaller, cautious steps toward the Nash equilibrium than Cournot rationality assumptions predict and in some situations a failure to reach the Nash equilibrium at all.
To systematically examine these ideas about how strategic interdependence influences decision making, we run a set of games in a factorial experiment (Oehlert 2000). In each, the experimental unit is a pair of players who are presented with a decision task based on the duopoly model (with some alterations). They have a set of preferences but do not know their partner's preferences and, because the experiment is blind, cannot communicate except by making simultaneous moves. They play for a fixed number of moves unknown in advance of play. Each move is recorded.
Table 1 shows an iterative, two-person game (Knott and Miller 1992). Several features of this game simulate the uncertain, complex, limited-information environment that Lindblom and others identify with incremental decision making. First, the game involves strategies that are continuous variables, such that there is a series of best-response (maximizing) reactions by each player to the choices made by the other player. The matrix is a set of payoffs in which the row chooser, Yolanda, has fifteen alternatives to choose from. The payoffs are a function of the row choice and the column choice. The game is also symmetrical: the column chooser, Xavier, has the same matrix of payoffs, only transposed. That is, if Xavier picks 3 and Yolanda picks 5, then Xavier gets the same payoff that Yolanda would get if Xavier picked 5 and she picked 3. In addition, the game involves simultaneous moves by the players, thus creating a decision environment of imperfect information. During play there is no communication except the past choices of the players; neither player knows the other player's payoff chart, and no information that would reveal it is provided during the game. The only information provided is the past plays of the other player. (4)
In this game we are concerned with the strategic interdependence of the players. In particular, we examine two factors--uncertainty and risk--both of which should lead players to behave more incrementally, that is, make smaller moves toward the Nash equilibrium. By uncertainty, we mean that players are given more than one reasonable strategy to play. By risk, we mean that players face higher negative payoffs for being outside the reaction curve. The table 1 game sheet shows the row and column players' payoffs. The column player's payoffs, which are indicated in boldface type, are a transpose of the row player's, which are underscored. (5) Note that the players in the game only saw their own payoffs, not those of their partners. The Nash equilibrium play is (14, 14), whereas row 7 has the highest average value. In addition, we provide an extra column and an extra row (unavailable to the players) that show the average of the payoffs in each row and column. The Cournot reaction curves in table 1 are indicated by an asterisk. Because of symmetry, the reaction curves cross at (14, 14). The reaction curves shown are derived by simple maximizing behavior. That is, for any possible choice by one's partner, the reaction curve indicates the best possible choice of the row chooser.
We developed four game sheets representing the treatment conditions by crossing the two factors. (These game sheets are provided in appendix 1.) The sheet shown in table 1 and in appendix 1(a) represents the U+R+ condition of high uncertainty and high risk. The high negative numbers off the reaction curve raise the level of risk to players, depending on the play of their opponents. The separation of the maximum average value row from the Nash row introduces alternative strategies for playing the game. In the low uncertainty condition (appendix 1[b]), the row with the highest average value is moved from row 7 to row 14 (the Nash row), while maintaining the reaction curve. To do this, we altered the values of cells (14, 9) to (14, 13) by making them much larger. This preserved the structure outside the reaction curve but made row 14's average value the largest on the table. This change should reduce the strategic uncertainty for the players by making moves toward the Nash the only reasonable strategy for maximizing payoffs. In the low risk case (appendix 1[c]), all negative payoffs were reduced to 25 percent of their original values (and rounded to the nearest "nice" number). Thus in the U+R- condition, the negative payoffs were 25 percent of the original, but row 7 had the highest average. In the U-R+ condition the exact opposite was true, and in the U-R- (appendix 1[d]) condition, both the negative payoffs were reduced to 25 percent of the original, and the highest average row and the Nash row were the same. Players made sequential moves on sheets of paper, and the investigators provided updates on each play, to a total of fourteen plays, though players were not told this number in advance. (6)
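The R- transformation of a game sheet can be sketched as follows. The 25 percent scaling is from the text; the rounding rule is an assumption (the text says only "nearest 'nice' number"), illustrated here as rounding to multiples of five, and the payoff fragment is hypothetical, not from table 1:

```python
def reduce_risk(payoffs, factor=0.25, nice=5):
    """Scale only the negative payoffs by `factor`, rounding to the
    nearest multiple of `nice`; positive payoffs are left untouched."""
    return [[int(round(v * factor / nice)) * nice if v < 0 else v
             for v in row]
            for row in payoffs]

# Hypothetical payoff fragment, not the actual table 1 values.
print(reduce_risk([[100, -200], [-60, 40]]))  # -> [[100, -50], [-15, 40]]
```

Because only the negative cells change, the reaction curves and the Nash equilibrium are preserved; only the downside of a mismatch shrinks.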
Can we predict how the individuals would play the game? The Cournot solution rests on quick updating and confident assumptions about the opponent's choices. Again, a Cournot decision maker adopts the view that his rival's output is fixed. That is, each person assumes that his opponent will pick in the upcoming period exactly what he picked in the previous period. Thus if we assume that both people pick outcome 2 in the first period, both would pick outcome 7 in the second period and outcome 12 in the third period. Both players would arrive at the Nash equilibrium at outcome 14 at the fourth period. We thus offer the following hypotheses:
H1a: With low risk and uncertainty, players will follow a Cournot rational-maximizing strategy, which will bring them to the Nash equilibrium in four plays of the game.
H1b: Sophisticated players will reason through the four plays and choose the Nash equilibrium row (14, 14) in the first play of the game.
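The Cournot adjustment path underlying H1a can be sketched as a fixed-point iteration on the reaction function. The reaction function below is hypothetical, chosen only to reproduce the path described in the text (2 to 7 to 12 to 14); the experiment's actual reaction curve is derived from the table 1 payoffs, which are not reproduced here:

```python
def cournot_path(start, reaction, max_plays=14):
    """Iterate Cournot best responses to the opponent's last move until
    a fixed point (the Nash equilibrium) is reached. With symmetric
    players starting from the same row, one trajectory describes both."""
    path = [start]
    while len(path) < max_plays:
        nxt = reaction(path[-1])
        path.append(nxt)
        if nxt == reaction(nxt):  # best response to itself: Nash
            break
    return path

# Hypothetical reaction function consistent with the path in the text.
reaction = lambda x: min(x + 5, 14)
print(cournot_path(2, reaction))  # -> [2, 7, 12, 14]: Nash in four plays
```

The fixed-point check is what makes this Cournot play: each player treats the opponent's previous move as fixed and best-responds to it, so play converges in four periods, as H1a predicts.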
With high risk and uncertainty, decision makers will more cautiously update their assessment of the probability of choices by their opponent. If there is a growing risk in moving toward the Nash equilibrium and the row chooser is risk averse, then making confident predictions about the column chooser's choices may seem too risky. This is especially the case if there are potential high costs in making an incorrect prediction. (In the game, these costs would be indicated by negative or low positive payoffs.)
Under these conditions, a sequential Bayesian player may prefer to make an assessment of the average of the prior moves of the other player. As play proceeds, he may prefer to give more weight to recent moves by his opponent and less weight to earlier moves. Under this kind of belief structure, there is no single predicted path for all games. In games in which the Nash equilibrium row and the MEV row are the same, the sequential Bayesian player is likely to choose the Nash row on the first move. When the MEV row is separate from the Nash equilibrium row, the sequential Bayesian player will take longer than the Cournot player to reach the Nash equilibrium. Based on calculations of Bayesian probability, we offer two additional hypotheses:
H2a: Under conditions of high risk and uncertainty, the sequential Bayesian players will average probabilities of prior moves by the other player, which will take them six plays to reach the Nash equilibrium.
H2b: In games in which the MEV row and the Nash equilibrium row are the same, sequential Bayesian players will choose the Nash equilibrium on the first play of the game.
For some incremental players, risky and uncertain conditions may convince them that it would be better to stick to a safe alternative (a row with no or very few negative numbers). The row choosers may prefer to stick with their original risk-averse or risk-neutral choices based on the assumption that any column choice is equally likely. They are reluctant to "update" their original uncertain expectations about column choices and will only make small, incremental steps toward the Nash equilibrium.
If Yolanda, for example, behaves according to this view of adaptation, then she will not follow the Cournot reaction curve or sequential Bayesian updating. On the contrary, if she and Xavier selected outcome 2 in the first period, she will see that she could be better off by moving to row 3 or row 4, and that will be sufficient. If Xavier also responds adaptively, they could move from outcome (2,2) to (3,3) to (4,4) and so on. They should still end up at the Nash equilibrium of (14, 14) as long as the process of adaptive responses continues, but it will take perhaps twelve periods, instead of the six periods (at most) in the sequential Bayesian model.
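The adaptive path just described can be sketched schematically. The fixed unit step is an assumption for illustration; real incremental steps are behavioral and may vary in size:

```python
def incremental_path(start, target=14, step=1, max_plays=20):
    """Small-step adaptation: move one row at a time toward the Nash
    row rather than jumping along the reaction curve."""
    path = [start]
    while path[-1] != target and len(path) < max_plays:
        path.append(min(path[-1] + step, target))
    return path

path = incremental_path(2)
print(path)       # rows 2 through 14, one step per period
print(len(path))  # 13
```

Starting from row 2, unit steps require twelve moves after the initial choice to reach the Nash row, roughly the dozen periods suggested in the text, compared with four plays under Cournot adjustment and at most six under sequential Bayesian updating.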
An alternative hypothesis that we consider is that under conditions of high risk, following Lustick (1981), decision makers can ill afford to "muddle through" and will seek more aggressively to find the optimal strategy. With low risk, the surface of choices is flatter, and players have the luxury of "wandering around" a bit without fear of major negative consequences. With high risk, the surface is more peaked, and decision makers behave in a more rationally maximizing fashion. We offer the following three hypotheses:
H3a: Under conditions of high risk and high uncertainty, players will update their beliefs about the probability of their opponent's set of choices more slowly and hence take longer than four plays to reach the Nash equilibrium.
H3b: Under conditions of either high risk or high uncertainty, players will update their beliefs about their opponents' choices more slowly than under conditions of low uncertainty and risk, but not as slowly as when both uncertainty and risk are high.
H3c: Under conditions of high risk, players will update their beliefs quickly about opponents' choices because of the cost of making a "wrong" decision.
Satisficing behavior may result from uncertainty over the most effective strategy for playing the game. If the decision maker is attracted to safe choices, for example, the row that maximizes the average value might look attractive as long as he assumes that the column chooser has some probability of picking any of those columns that result in the high payoffs. Playing this way, the player can get reasonably high payoffs without incurring the risk of potential negative payoffs. In this case, such a player may cling to the highest average payoff row until a string of column choices forces him to concede that those nice columns will not be selected. If both players cling to safe choices, they may take a long time to reach the Nash equilibrium or may not reach it at all. If it becomes riskier to move beyond the safe row, this reinforces the reluctance to change. This combination of risk and strategic uncertainty constitutes a reason for a satisficing choice over a maximizing one because both players would be better off if they could move to the Nash equilibrium. We offer the following two hypotheses:
H4a: Under conditions of high uncertainty, some players will settle on a satisficing level of payoffs to avoid the potential risks and uncertainties of moving farther toward the Nash equilibrium.
H4b: Under conditions of high risk and uncertainty, some players will not reach the Nash equilibrium in the duration of the game, thus producing a satisficing outcome.
Thus, taking two definitions of optimizing behavior in an interactive game and two simple definitions of nonoptimizing adaptive behavior in the same game, we have four alternative predictions. These predictions are not alternative end-point predictions but rather alternative process predictions. One hypothesis predicts that individuals will follow the reaction curves to the Nash equilibrium in four periods or fewer. This is Cournot play. The second hypothesis predicts that individuals will not follow the reaction curves and will reach the Nash equilibrium in six plays, depending on which row maximizes expected value. This is Bayesian play. A third hypothesis predicts that individuals will not follow the reaction curves and will take much longer than six periods to reach the Nash equilibrium. This is incremental play. The fourth hypothesis states that individuals also will not follow the reaction curve but will get stuck at intermediate outcomes and possibly not reach the Nash equilibrium at all during the course of the game. This is satisficing play. These alternative paths are represented in figure 1. As a summary of the end-point hypotheses in terms of the number of steps until convergence by experimental condition, note that U-R- < U+R- ≤ U-R+ < U+R+.
[FIGURE 1 OMITTED]
The Experimental Setting
We have chosen a laboratory setting with university students as subjects for playing these experimental games. Some experiments in public administration or related fields have been carried out on a sample from the relevant agency or executive population. Zimmermann and Zysno (1983), for example, used experimental procedures to model the "cognitive hierarchy"--the decision criteria themselves and the way they connect--of how agency clerks make decisions about the creditworthiness of clients. They tried to ensure that their study had strong external validity by using actual credit clerks as subjects and used simulated applications (which served as the experimental task) that were close to the kind faced in actual practice. This task is ideal for experimentation because it is highly structured and resembles the real-world problem quite closely.
Experimentation, however, often cannot be carried out with agency employees. Agencies typically are protective of their decision processes and may feel that experiments with employees disrupt work. Executives generally are too busy or have schedules that do not easily accommodate experiments. Drawing a sample of experimental subjects from agencies is also difficult. Agencies are hierarchies with varying decision authority at different levels. Would a representative sample of individuals from each level accurately reflect how decisions are actually made? The organization chart is likely to be out of date or not fully reflective of actual decision authority in the organization, and informal networks of decision makers may have more power than the formal chart would predict.
For these reasons, we have chosen a laboratory setting, with university students as subjects, for carrying out the experimental game. This approach is not without its own limitations and potential biases, and we would be remiss not to highlight them. Undergraduate students at large universities may be much more homogeneous than the real population. This restriction of variance may result in the experimenters' missing a particularly influential cause in the real world because it is held constant in the sample. If the population from which the sample is drawn is itself unrepresentative of the population at large, randomization will not control for this bias. For instance, students may perform well above the population average at tasks such as multiple-choice tests but may lack sufficient experience to understand certain kinds of complex decisions and thus perform below the population average on such tasks. In general, agency heads are much more experienced strategic thinkers than students; indeed, a J. Edgar Hoover or a Donald Rumsfeld is a world-class strategist.
The laboratory setting itself also can be highly biasing. Experimenters such as Quattrone and Tversky (1984) or Milgram (1974) make use of "white coat" authority in the treatment design itself, but more often it is a source of nuisance variation as subjects respond to the unwanted and uninteresting treatment of simply being in a laboratory. For instance, in studying work-group tasks in a laboratory experiment, the problem is that real work groups are rarely as ad hoc and short lived as experimental work groups, and real tasks are usually much more salient and well understood than experimental ones.
In this experiment we recruited students from three distinct sources to make the subject pool as heterogeneous as possible, given that all subjects came from one university. Students came from introductory political science and mathematics courses, both of which draw quite broadly from different majors on campus. In addition, we recruited subjects from one of the campus sports clubs, which also has a broad range of majors. In all there were eighty subjects of mixed race and major, with ten pairs per cell; thirty-six were men and forty-four were women. So as not to confound treatment with a particular run of the experiment, each run included one pair of each treatment type. A particular run of the experiment took about an hour.
During the experiment the subjects were read the following instructions, which also appeared at the top of their game sheets (reproduced in appendix 1):
In this experiment, you will play a game with an anonymous partner. Your task will be to choose a row to play. Your partner will also choose a play at the same time. Depending on what you both choose, you will receive the payoff (in dollars) indicated in the box on the game matrix below. You pick rows. Depending on what your partner plays, you receive the value in the box. For instance, if you picked 1 and your partner picked 1, you would receive $1.00. After each play, you will get to see your partner's play history.
They were also explicitly told that their partner's sheet could be different from theirs and that they would be informed only of their partner's past play during the course of the game. There was no recruitment fee; subjects' earnings came entirely from game play, though we set a minimum payout of $5 and, because of the budget, a maximum payout of $25.
The primary data gathered in this experiment--the play sequences--have two aggregate aspects that we focus on here. Then we examine some of the deviant play sequences. The first aspect is that of the path taken to equilibrium, and the second aspect is the equilibrium reached. Are there systematic differences in the path taken to equilibrium between treatment groups? Are there systematic differences in the number of steps taken to equilibrium between the treatment groups? If the Cournot duopoly model is correct, the treatments should have no systematic effect, and the plays should follow the predicted Cournot path, at least on average. If, on the other hand, the treatments do have an effect, systematic differences should be observed. In fact, systematic differences are observed, as we show in the following discussion, though they are not entirely consistent with the hypotheses we set out.
Figure 2 shows the median play by treatment group as well as the path predicted by the Cournot duopoly game. Clearly the overall paths in all treatment groups move from low row numbers toward row 14, which is the basic shape predicted by the Cournot duopoly game. However, some systematic differences are apparent. In each treatment group, subjects were willing to start with a substantially higher row play than one would expect, with the U- conditions on average starting higher than the U+ conditions. The U+R+ and U+R- conditions were closer to the Cournot play than the U-R+ and U-R- conditions, at least in the aggregate, and converged at the predicted stage; they greatly resemble the sequential Bayesian solution, both in where they start and in how long they take to converge. The U-R+ and U-R- conditions, however, did not converge on average until several steps later than the Cournot prediction, and for them the sequential Bayesian prediction of immediate convergence is completely inaccurate. In fact, the U-R+ median pulls away from the Cournot play of 14 in round twelve, as can be seen by the slight dip in the figure. Thus at a gross level the Cournot duopoly model fits, particularly for the U+R+ and U+R- conditions, but there are systematic differences due to the treatments. At this point it seems that U dominates R, because the curves differ primarily on that treatment. We examine these systematic differences more fully in subsequent paragraphs and show that this claim is supported by the experimental evidence.
[FIGURE 2 OMITTED]
Visually, most of the difference lies in the first few rounds of play, and the main line of separation runs between the U+ and U- conditions. To examine the early plays formally, we define two dummy variables: U = 1 if the treatment group is U+R+ or U+R- and 0 otherwise; R = 1 if the treatment group is U+R+ or U-R+ and 0 otherwise. Using the nonparametric (7) Mann-Whitney test for difference of location between two distributions, we see that the U factor clearly separates the distributions in the first round of play and is at least suggestive in the second round, with the z-statistic slightly less than 2 in absolute value. All difference is gone by the third round of play. In contrast, there are no statistically significant differences between the two levels of R. Table 2 shows these results. (8)
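The Mann-Whitney comparison can be illustrated with a small stdlib-only sketch. The first-round plays below are invented for illustration, not the experimental data, and the normal approximation omits the tie correction.

```python
import math

def mann_whitney_z(x, y):
    """Mann-Whitney U statistic for sample x versus y, with its
    normal-approximation z score (no tie correction; a rough check only)."""
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return u, (u - mu) / sigma

# Hypothetical first-round row plays for U+ versus U- pairs (not the actual data):
u_plus = [3, 4, 4, 5, 6]
u_minus = [6, 7, 8, 8, 9]
u_stat, z = mann_whitney_z(u_plus, u_minus)
print(round(z, 2))  # |z| > 2 would suggest the two distributions differ in location
```

In practice one would use a library routine (e.g., scipy.stats.mannwhitneyu), which handles ties and exact small-sample p-values properly.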
Next we consider the convergence of pairs of players to the Nash equilibrium. Figure 3 shows, by treatment group, the round at which a given percentage of the pairs had converged to the Nash equilibrium. A notable feature is that a substantial proportion of pairs never reach equilibrium, and those that do take longer than the Cournot solution would predict. A Cournot pair of players should converge by round four. As we saw earlier, the median player has in fact converged by round five in some treatment conditions, but this fact says nothing about pairs of players. If players played according to the Cournot strategy, we should expect pairs to converge to the Nash equilibrium near round four. The U+R+ and U+R- conditions converge somewhat later than that, with the median round to convergence being round seven in both cases. This is consistent with sequential Bayesian play on these games and is not a large deviation from the Cournot prediction. However, the median round to convergence in the U-R+ and U-R- conditions is round thirteen and censored (i.e., greater than fourteen), respectively. None of the pairs in these conditions converged by round four or before. Evidently players do not follow the Cournot game but instead converge more incrementally, and the chance that a pair will deviate from the Cournot solution is increased by the U manipulation. The sequential Bayesian solution fits reasonably well on the two U+ games but not at all on the U- games.
[FIGURE 3 OMITTED]
There is a further subtlety about this figure, given that a substantial proportion of the pairs did not reach equilibrium. Because the game was played for only fourteen rounds, a number of the pairs are censored before convergence. (9) As such, the figure combines both quantitative information--"At what round do pairs converge?"--and qualitative information--"Do they converge at all in the scope of the experiment?" Clearly there are systematic differences between the curves, with the U- conditions taking substantially longer to reach convergence than the U+ conditions. A substantially larger proportion of the U-R- condition pairs never reached convergence at all, and many of the U-R+ condition pairs only converged in the last round or two of play. In contrast, the R condition seems to make no difference at all, at least in what we observed. (10)
To examine more formally the differences between treatments in the duration taken to convergence to the Nash equilibrium, taking censoring into account, we used a survival analysis model, the Cox Proportional Hazard (CPH) model (Kalbfleisch and Prentice 1980). The Cox model is a semiparametric duration model that focuses on comparing the effect of covariates on the survival curve. In this case survival means "has not converged to the Nash equilibrium." The model computes a baseline hazard--basically a nuisance variable in this case--nonparametrically and models the effect of covariates multiplicatively, thus the term "proportional hazard." Ordinary regression analysis has no means to deal with the partially informative observations of the censored cases except in an ad hoc fashion, but the Cox model uses the information from the censored cases appropriately.
The model shown in table 3 uses the two dummy variables, U and R, defined previously. They are entered additively into the CPH equation. (11) To interpret a coefficient, one examines its natural exponential, exp([beta]). A value near 1 indicates no proportional change from the baseline hazard function. In this case, U = 1 increases the proportional hazard of convergence by roughly a factor of 2. That is, taking censoring into account, when U = 1, pairs converge to the equilibrium a bit more than twice as fast as when U = 0. R = 1 increases the hazard of convergence slightly, but the effect is not statistically significant. This formal test supports figure 3. In summary, the aggregate results show that, contrary to our previous end-point hypothesis, U+R+ = U+R- < U-R+ ≤ U-R-.
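The multiplicative reading of the Cox coefficients can be sketched as follows. The baseline hazard and coefficient values are hypothetical stand-ins chosen so that exp(beta_u) is about 2, mimicking the pattern reported in table 3; they are not the fitted model.

```python
import math

# Hypothetical Cox coefficients: exp(beta_u) ~ 2 (U roughly doubles the hazard
# of converging) and exp(beta_r) ~ 1 (R has essentially no effect), standing in
# for the pattern in table 3.
beta_u = math.log(2.0)
beta_r = 0.05

def hazard(t, u, r, baseline=lambda t: 0.1):
    """Proportional-hazards form: h(t | U, R) = h0(t) * exp(beta_u*U + beta_r*R)."""
    return baseline(t) * math.exp(beta_u * u + beta_r * r)

# The hazard ratio between U=1 and U=0 pairs is exp(beta_u), regardless of t or R:
ratio = hazard(5, u=1, r=0) / hazard(5, u=0, r=0)
print(round(ratio, 6))  # U=1 pairs converge about twice as fast, all else equal
```

Because the baseline hazard cancels in the ratio, this interpretation does not depend on the nonparametric baseline at all, which is what makes the Cox model convenient here.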
Finally, we consider some of the deviant play sequences in more detail, as they both provide more context and shed some light on why play differed from the given hypotheses. One of our hypotheses was that satisficing play might be observed, with a pair of players converging to a suboptimum. Though satisficing was not supported in the aggregate results, a few pairs did show evidence of it. One particularly striking pair rapidly converged to (7,7) and never moved from there, and some others eventually settled on a satisficing-type equilibrium. One player commented in the questionnaire handed out at the end of the experiment, "My partner didn't change his answer much as it was easier." Similarly, in game U+R+, two players followed small incremental steps to reach convergence at the Nash equilibrium on the seventh play of the game. One player started out in the Nash row but bounced to lower rows to accommodate the choices of the other player, who started in row 7 and incrementally moved over the next seven plays to row 14.
One particularly clever pair of players in the U-R- condition converged to a two-step trading cycle of (12,14) and (14,12), which is clearly superior to the Nash play in payoff. This trading cycle points to an interesting issue that explains the reversal in the end-point hypotheses. The U- treatment was implemented by moving the safe row with the highest average value so that it was consistent with the Nash row. This was done by putting some large positive values on the reaction curve but not on the Nash equilibrium. In the Cournot story these values should make no difference at all. However, as is evident from the aggregate results presented, subjects responded quite strongly to them. Unlike the standard Cournot duopoly, subjects in this game did not know the reaction curve but instead had to infer it, so perhaps this is not so surprising. We speculate that subjects were attracted to the large numbers and thus tended to pull away from the Nash equilibrium in an attempt to lure their partners to play a lower value. If this strategy was successful, as in the case of the players who set up the trading cycle, it was possible for both to do better than simply playing the Nash, similar perhaps to logrolling in legislatures. Although the Nash equilibrium in this game is Pareto-efficient in any one play, play here is repeated, so a pair can alternate between asymmetric outcomes whose average payoff exceeds the Nash payoff. That said, it is not terribly surprising that players find it very difficult actually to manage this trading cycle, given how little they know. In fact, one of the earliest experiments in game theory--discussed extensively in Poundstone (1992, 106-21)--was performed by Merrill Flood and Melvin Dresher, using economists Armen Alchian and John Williams as subjects. That experiment exhibited exactly this pattern, though only after a substantial learning time (dozens of plays).
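Reading the relevant cells off the U-R- game sheet in appendix 1 shows why the trading cycle beats steady Nash play: alternating (12,14) and (14,12) averages more per round than repeating (14,14). A minimal check using those three payoff values:

```python
# Payoffs to a single player, read from the U-R- sheet in appendix 1:
#   (your row, partner's column) -> your payoff in dollars
nash_payoff = 2.50   # play (14, 14): the Nash cell
cycle_low = 2.00     # play (12, 14): you concede this round
cycle_high = 5.00    # play (14, 12): your partner concedes this round

cycle_average = (cycle_low + cycle_high) / 2
print(cycle_average)  # 3.5 per round, versus 2.5 from repeated Nash play
assert cycle_average > nash_payoff
```

By symmetry the partner earns the same average, so the cycle is a mutual improvement over repeated Nash play, provided both players can sustain the alternation.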
More commonly, a number of pairs exhibited "wandering" behavior in which partners seemed not to understand each other's motivations. In such cases, usually one partner clearly tried to play something more or less like the Cournot strategy, but the other player did not understand it and thus did not respond in kind (which led to substantial monetary losses), so the first partner stopped playing Cournot. Sometimes a pair of players would "dance" with each other one round out of sequence. As mentioned, we asked the subjects a battery of open-ended questions after the experiment, including one about whether they thought their partners understood the game. The responses were generally consistent with the lack-of-coordination interpretation. In one particularly bizarre case, one player seriously misunderstood the game itself and played a pattern that was highly regular but totally unrelated to the payoff structure (it was tent-shaped, starting low, moving to 14, and then moving back down again). His partner noted in the open-ended response that he was totally confused by this play, as he had expected the partner to respond to the payoffs in the game matrix.
In games U-R+ and U-R-, we found a pattern of negative reinforcement and distrust in two pairs of players. In both cases, one player would start with a high row and the other would pick a middle row. In the next play, their choices would be reversed. This inverse pattern caused each of them to lose money and set them on a pattern of interaction that prevented convergence to the Nash equilibrium. One player commented about her partner, "She countered me with numbers opposite mine, or a set of numbers to make larger amounts of money." The other player observed of her partner, "[Her choices] seemed random, especially after the first seven or so plays."
The other pattern that emerged from the U-R+ and U-R- games was a mismatch between one player who moved to the Nash row quickly and another who bounced up and down between high and low rows, unsure of a consistent strategy. One player who consistently picked row 14 commented about the erratic play of her partner, "[She] seemed quite uncooperative. [My partner] was for greed and had a stubborn reluctance to grow as a team." About her own play, she said, "I was trying to accomplish cooperation, so that we might both get the most amount of money."
Another pattern observed in U-R- was incremental convergence over several plays of the game. One player in particular played a very cautious game, starting with row 2, then 4, 5, 6, 12, 13, 15, 14, 12, 10, and finally converging back at 14. Her partner had sought to move to row 14 by the third play of the game but moved to lower rows to avoid the costs incurred by the partner's incremental approach. In some cases, therefore, a consistent incrementalist can induce more incremental play in a partner.
The concept of incremental adaptation entered the social sciences literature because empirical observations of behavior did not fit a fully rational approach to decision making. Early explanations of incrementalism included an ambiguous mixture of the cognitive limits of decision makers faced with complex problems and the politics of decentralized interaction. For the most part, however, the public administration, budgetary, and organization literature has emphasized the cognitive-limits explanation for incrementalism. Indeed, Model 2 in Graham Allison's Essence of Decision (1971) is based on the March and Simon (1958) notion of how simplified organizational routines serve as heuristics to cope with the complexity of decisions.
The development of institutional economics and game theory has focused much more on the effects of strategic interdependence under limited information and uncertainty on the collective effect of individual choices. In this article, therefore, we have sought to induce incremental behavior in players faced with strategic interdependence through a two-person, simultaneous-play game. We have incorporated risk and uncertainty into the interdependence of the game situation while holding constant the complexity of the choice set and the sequence of play. Our results from this game show that strategic interdependence in the game situation does induce incremental behavior along the lines predicted. After fourteen plays of the game, of the forty pairs in the experiment, 35 percent had not reached the Nash equilibrium. Cournot behavior, defined as convergence before or at four plays of the game, was exhibited in only 30 percent of the pairs.
On the other hand, the aggregate results for number of plays to reach convergence to the Nash equilibrium showed a predictable, although slow, Cournot pattern. This overall median pattern of convergence occurred despite the fact that individual pairs of players varied in several directions in each of the games. Hence we conclude from these results that although some decision makers exhibited incremental behavior in these complex decision situations, the overall average follows a pattern of choice relatively close to the one predicted by the Cournot strategy. This pattern is not, however, necessarily exhibited within a given pair of players.
The predictions about the effects of specific dimensions of interdependence were not borne out. The results did reveal examples of classic incremental adaptation and satisficing behavior. A few players settled on row 7 as a medium strategy that gave them satisfactory payoffs without the greater risks of moving toward the Nash equilibrium. More players moved cautiously from low-numbered rows in a stepwise fashion over several plays to reach the Nash equilibrium row. Players also found themselves in an unstable pattern of inverse choices that reinforced mistrust and lack of credibility, preventing them from reaching the Nash equilibrium at all. In some cases, steady play by one player at the Nash row slowly induced the other player to conform; in other cases, steady play of this sort did not produce this result.
If the overall pattern approximated the Cournot solution, how much importance should we place on the incremental subpatterns and the unstable inverse patterns that occurred in some cases? One answer is that we should not place much emphasis on them, because on average play works out according to Cournot rational theory. Perhaps an analogy could be made to the economy, where under conditions of competition the overall economy performs efficiently even if individual firms make mistakes and go out of business. Of course, the number of players in these limited experiments is far too small for this kind of analogy to work, but the average success of an approximate Cournot strategy suggests the power of adaptive Cournot solutions.
The literature often asserts that incrementalism is not rational but rarely explains that claim. We ask, instead, whether it might be rational under some conditions to choose incrementally. The answer turns on a given player's level of risk aversion or acceptance, as well as on the cost of information. A pessimistic, risk-averse player values certainty over potential gain, whereas an optimistic, risk-acceptant player values potential gains over certainty. These are different preference structures. Rational choice theory does not claim to choose between preference structures but takes them as given; in other words, it does not purport to "account for tastes." In this respect, a Cournot player represents an optimistic player who confidently (arguably too confidently, if the sequential Bayesian rule is the standard of rationality) updates information. In addition, some sorts of incremental play could well be rational in the sense that they would be consistent with a player's preferences and beliefs. Similarly, if information is costly, incrementalism might be rational, depending on how players value it. That said, incremental play would clearly be irrational if the player made choices inconsistent with preferences and beliefs. In our view, claiming that incremental play is irrational relies excessively on retrospective assessment (i.e., "Monday morning quarterbacking"). But players in a limited-information game do not have the luxury of such assessments, because they do not possess the information available in retrospect.
Another way of looking at these results is in terms of the lessons to be learned about strategic interaction from the behavior of adaptive incrementalism. In some instances a subpattern of instability persisted for the entire play of the game, preventing a convergence to the Nash equilibrium. From these patterns we can learn how negative feedback causes instability and the possible ways to break out of inverse reactions. In other instances, subpatterns of incremental adaptation led to several plays before the equilibrium point was achieved. From these patterns we can learn how incremental moves in the right direction can induce mutual gain and even possibly prevent inverse instability from occurring. A few players settled on row 7 as a medium strategy that gave them satisfactory payoffs for the entire play of the game without the greater risks of moving toward the Nash equilibrium. From this pattern we can learn how initial instability and inverse decisions can lead to suboptimal patterns that persist due to prior negative interactions.
What causes the mismatch remains to be seen, but we speculate that it is the result of two effects, one relating to preferences and the other to cognitive ability. It is possible that the mismatch represents variation in acceptance of risk. It may be that when an optimistic player willing to take chances partners with a pessimistic player who is not, the players have difficulty forming stable mutual expectations because the optimistic player, by playing aggressively, ends up being inadvertently punished by the more conservative player. Mismatch may also be caused by differences in cognitive ability between players, in a similar fashion. It was quite clear from the free-response items that some players had a much better understanding of the game than others. A player who manages to solve the game quickly and plays a high row early might, however, be punished by a partner who does not.
These observations show that it is too limiting to identify incrementalism only with problem complexity and limitations in rationality for dealing with it. In Allison's models of decision making (1971), for example, Model 2 is based squarely on the incremental literature in organization theory and decision making. Model 2, however, ignores the effect of strategic interdependence on incremental behavior (Bendor and Hammond 1992). Allison includes interdependence in Model 3, the bureaucratic politics model, through his references to Lindblom (1965), although there is still a strong emphasis on limited analysis. Under some conditions of decentralized interaction, given limited information, sequential play, and low risk acceptance, decision makers will behave incrementally in Model 3. In addition, more knowledge and analysis about the decision problem or faster computational ability cannot eliminate this kind of incremental behavior. Mismatches in risk preferences, moreover, can lead to unstable interactions, but the players are not necessarily behaving irrationally. Interestingly, a player who does not understand the problem playing against an opponent who does can still produce this kind of unstable interaction.
In Model 1, Allison allows for variants in the classical-rational model to include sequential play and the preference functions of a leadership clique, or "hawks and doves," as decision makers. Such variants can produce Cournot behavior. Less confident assumptions about risk preferences and other preference functions of the opponent can produce incremental behavior. Given the mutual effect of interdependence on players' choices (Tsebelis 1989), cautious updating of assumptions is not necessarily irrational. Consequently, these models can overlap in concept and in predicted behavior without abandoning rationality assumptions. Recognition of the effects of strategic interdependence on decision making, therefore, may help integrate rational explanations of the patterns of sequential interaction among committees, interest groups, and agencies, as well as among foreign policy decision makers.
In sum, the game shows not only the explanatory power of the Cournot and sequential Bayesian solutions but also the influence of strategic interdependence on inducing other patterns of behavior. In addition, it points to serious anomalies in the standard theoretical explanations in some cases, which indicates that current theory is in need of further elaboration. To test the specific hypotheses of how dimensions of interdependence influence behavior would require additional players over more runs of the game. Because some players never converged, adding more plays may also give insight into the persistence of instability and incremental patterns. Unfortunately, our experience running this as an experimental game shows that it would be difficult, perhaps impossible, to run the experiment for more plays. Developing a computer simulation of the game would permit both more players and a larger number of plays. In addition, it would also allow explicit control over (virtual) players with mismatched strategies (e.g., one playing a Cournot strategy while the other satisfices, something that cannot be done in the laboratory at all). The extent to which players do not reinforce each other's expectations may be an important predictor of convergence. Finally, it would be possible to formalize the prior probabilities of the players, examining the effects of differing belief structures or schemes of updating.
The results of our experiment do point to one central unanswered question in the incrementalism literature: How would one empirically recognize incrementalism? We endeavored to create a situation in which there was a clearly defined, rational alternative through which to determine if players would play incrementally. Unfortunately, in most real world settings the objective functions of players are not known, and much pertinent information is hidden from view. In fact, a large part of the task that decision makers in a duopoly situation face is to find out what their partner's objective function is. Thus in some circumstances, rational play might well appear--or indeed even be--incremental, given certain preference and belief structures.
APPENDIX 1
Game Sheets Given to Subjects with Treatments Noted

Each cell is the payoff, in dollars, to "Your Play" (row) given "Partner's Play" (column).

(a) U+R+

Row\Col     1      2      3      4      5      6      7      8      9     10     11     12     13     14     15
  1      1.00   1.00   0.75   0.85   0.85   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90
  2      2.00   1.20   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
  3      0.00   1.40   1.25   1.15   1.15   1.10   1.10   1.10   1.10   1.10   1.10   1.10   1.10   1.10   1.10
  4     -1.00   1.60   1.50   1.30   1.25   1.20   1.20   1.20   1.20   1.20   1.20   1.20   1.20   1.20   1.20
  5     -2.00   1.80   1.75   1.45   1.35   1.30   1.30   1.30   1.30   1.30   1.30   1.30   1.30   1.30   1.30
  6     -3.00   2.00   2.00   1.50   1.45   1.40   1.40   1.40   1.40   1.40   1.40   1.40   1.40   1.40   1.40
  7     -4.00   2.20   2.20   1.75   1.55   1.55   1.50   1.50   1.50   1.50   1.50   1.50   1.50   1.50   1.50
  8     -4.50   0.00   2.50   1.90   1.65   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60
  9     -4.75  -1.00   0.00   2.05   1.75   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70
 10     -5.00  -1.25  -1.00   0.00   1.85   1.80   1.75   1.75   1.80   1.80   1.80   1.80   1.80   1.80   1.80
 11     -5.25  -1.50  -1.25  -1.00   0.00   1.90   1.80   1.80   1.80   1.80   1.90   1.90   1.90   1.90   1.90
 12     -5.50  -1.75  -1.50  -1.25  -1.00   0.00   1.95   1.90   1.95   1.95   2.00   2.00   2.00   2.00   2.00
 13     -5.75  -2.00  -1.75  -1.50  -1.25  -1.00   0.00   2.50   2.00   2.00   2.10   2.10   2.10   2.10   2.10
 14     -6.00  -2.25  -2.00  -1.75  -1.50  -1.25  -1.00   0.00   2.50   2.50   2.50   2.50   2.50   2.50   2.50
 15     -6.25  -2.50  -2.25  -2.00  -1.75  -1.50  -1.25  -1.00   0.80   1.00   1.20   1.40   1.60   1.80   2.00

(b) U-R+: identical to the U+R+ sheet except for row 14, which reads

 14     -6.00  -2.25  -2.00  -1.75  -1.50  -1.25  -1.00   0.00   5.00   7.00   7.00   7.00   5.00   2.50   2.50

(c) U+R-

Row\Col     1      2      3      4      5      6      7      8      9     10     11     12     13     14     15
  1      1.00   1.00   0.75   0.85   0.85   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90
  2      2.00   1.20   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
  3      0.00   1.40   1.25   1.15   1.15   1.10   1.10   1.10   1.10   1.10   1.10   1.10   1.10   1.10   1.10
  4     -0.30   1.60   1.50   1.30   1.25   1.20   1.20   1.20   1.20   1.20   1.20   1.20   1.20   1.20   1.20
  5     -0.50   1.80   1.75   1.45   1.35   1.30   1.30   1.30   1.30   1.30   1.30   1.30   1.30   1.30   1.30
  6     -0.80   2.00   2.00   1.50   1.45   1.40   1.40   1.40   1.40   1.40   1.40   1.40   1.40   1.40   1.40
  7     -1.00   2.20   2.20   1.75   1.55   1.55   1.50   1.50   1.50   1.50   1.50   1.50   1.50   1.50   1.50
  8     -1.10   0.00   2.50   1.90   1.65   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60
  9     -1.20  -0.30   0.00   2.05   1.75   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70
 10     -1.30  -0.30  -0.30   0.00   1.85   1.80   1.75   1.75   1.80   1.80   1.80   1.80   1.80   1.80   1.80
 11     -1.30  -0.40  -0.30  -0.30   0.00   1.90   1.80   1.80   1.80   1.80   1.90   1.90   1.90   1.90   1.90
 12     -1.40  -0.40  -0.40  -0.30  -0.30   0.00   1.95   1.90   1.95   1.95   2.00   2.00   2.00   2.00   2.00
 13     -1.40  -0.50  -0.40  -0.40  -0.30  -0.30   0.00   2.50   2.00   2.00   2.10   2.10   2.10   2.10   2.10
 14     -1.50  -0.60  -0.50  -0.40  -0.40  -0.30  -0.30   0.00   2.50   2.50   2.50   2.50   2.50   2.50   2.50
 15     -1.60  -0.60  -0.60  -0.50  -0.40  -0.40  -0.30  -0.30   0.80   1.00   1.20   1.40   1.60   1.80   2.00

(d) U-R-: identical to the U+R- sheet except for row 14, which reads

 14     -1.50  -0.60  -0.50  -0.40  -0.40  -0.30  -0.30   0.00   4.00   5.00   5.00   5.00   4.00   2.50   2.50
Derivation of Updating Equations and the Sequential Bayesian Strategy
In the context of these games, Bayesian updating and Cournot adjustment can be placed within the same mathematical framework, using updating equations to alter the probability distribution representing a player's belief over the opponent's possible plays. (It is assumed that the player knows what plays are possible but does not know the payoffs the opponent attaches to them.) Because of the information structure of the games described earlier, a number of useful simplifications naturally arise.
Let t = 0, 1, ..., t_max index the round of play; let p_t be a vector of probabilities over the opponent's possible plays at round t; let o_t be the observation vector, which is zero everywhere except for a 1 in the position of the play selected by the opponent (making it a degenerate probability distribution); let B be the player's payoff matrix; and let v ∈ (0, 1) be an updating-equation weight parameter, with larger values indicating more weight given to prior information. If we adopt an expected-value-maximizing strategy, the player should choose the row that maximizes the expected value at round t, e_t, given by the vector product e_t = B p_t. All we need to project probabilities is an initial probability assignment and the observation of the last round's play. Assume that p_0, the prior probability, is given. Then p_{t+1} = v p_t + (1 - v) o_t, which is simply an invocation of the theorem of total probability. It is not difficult to back-substitute so as to replace all p_t terms for t > 0 with the prior p_0 and the observations o_t, o_{t-1}, ..., o_1 alone, weighted by powers of v and (1 - v).
In the games given here it is rational to choose an uninformative (i.e., uniform) prior, because nothing is known about the opponent's payoffs aside from the number of available plays. The weighting parameter v has a large impact on the updating of probabilities because it determines the value of new information: in terms of the theorem of total probability, it is the subjective probability the player assigns to past states as against the observed play. For v = 0.5, the player weights old and new information equally. For v < 0.5, new information is weighted relatively more than old, whereas for v > 0.5, old information is weighted relatively more than new. If v = 0, the updating equations give Cournot adjustment, because the player in this case assigns no probability to past states whatsoever. In contrast, if v = 1, the player does not update at all. The prior decays geometrically in v^t, so an informative prior will have relatively limited effect after a few rounds of play, assuming that v is at least moderate in size.
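The updating rule and the expected-value choice are simple to state in code. The following sketch uses a small hypothetical 3 x 3 payoff matrix (for illustration only; it is not one of the experimental game sheets) to show how v = 0 collapses to Cournot adjustment and v = 1 to never updating:

```python
import numpy as np

def update(p, o, v):
    """One step of the updating equation: p_{t+1} = v*p_t + (1 - v)*o_t."""
    return v * p + (1 - v) * o

def best_row(B, p):
    """Index of the row maximizing the expected value e_t = B p_t."""
    return int(np.argmax(B @ p))

# Hypothetical payoff matrix (rows = own plays, columns = partner's plays);
# not taken from the experiment.
B = np.array([[1.0, 0.5, 0.0],
              [1.2, 1.1, 0.1],
              [0.2, 0.4, 1.5]])

p0 = np.ones(3) / 3            # uninformative (uniform) prior
o = np.array([0.0, 0.0, 1.0])  # partner observed playing column 3

# v = 0: all weight on the new observation, so the belief becomes o itself
# and the chosen row is the Cournot best reply to the last observed play.
p_cournot = update(p0, o, v=0.0)

# v = 1: all weight on the prior, so the belief never moves off p0.
p_static = update(p0, o, v=1.0)

print(best_row(B, p_cournot), best_row(B, p_static))
```

With this matrix the two regimes choose different rows: the Cournot player best-responds to column 3 alone, while the non-updating player keeps best-responding to the uniform prior.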
In symmetric games such as the ones used in this experiment--which, recall, were constructed to all have the same Cournot path--we can calculate the sequential Bayesian path without too much trouble if we assume a uniform prior and the same weighting parameter for both players, though the paths differ across game sheets. Games U+R+ and U+R- have the path (7, 8, 12, 12, 13, 14, ...), predicting convergence by round six; this resembles the Cournot path but is somewhat more gradual and starts from a higher value. In contrast, games U-R+ and U-R- should both converge in the first round of play, because row 14 has the highest expected value under a uniform prior. Other solutions are straightforward to calculate given the updating equations and different prior probability vectors.
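Readers who want to trace such paths themselves can do so with a short simulation. The sketch below assumes, as in the text, that both players are expected-value maximizers with a uniform prior and a common weight v, and that in a symmetric game both read the same sheet as "row players"; the 2 x 2 matrix is a toy example, not one of the experimental sheets:

```python
import numpy as np

def s_bayes_path(B, v=0.5, rounds=6):
    """Joint play path for a symmetric game in which both players read the
    same payoff sheet B as row players, start from a uniform prior, and
    update beliefs with p_{t+1} = v*p_t + (1 - v)*o_t."""
    n = B.shape[1]
    beliefs = [np.ones(n) / n, np.ones(n) / n]  # each player's belief about the other
    path = []
    for _ in range(rounds):
        plays = [int(np.argmax(B @ p)) for p in beliefs]  # expected-value maximizing rows
        path.append((plays[0] + 1, plays[1] + 1))         # 1-indexed, as on the sheets
        # Each player observes the other's play and updates her belief.
        beliefs = [v * beliefs[0] + (1 - v) * np.eye(n)[plays[1]],
                   v * beliefs[1] + (1 - v) * np.eye(n)[plays[0]]]
    return path

# Toy symmetric 2x2 game: under a uniform prior, row 2 has the higher
# expected value, so both players open with play 2 and stay there.
B = np.array([[2.0, 0.0],
              [0.0, 3.0]])
print(s_bayes_path(B, v=0.5, rounds=4))
```

Running the same routine on the 15 x 15 sheets in Appendix 1 with different priors reproduces the kind of path calculations described above.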
We wish to thank Janet Glaser, Vice Chancellor for Research, and the Research Board for generous funding as well as help navigating through the paperwork necessary for research involving human subjects, James Kuklinski and Hamish Gow for important advice at various points, Mary Parker, the students and instructors of the classes and athletic team that made up the subject pool, Jennifer Jerit and Phil Habel for help administering the experiment, and the participants on panels at the American Political Science Association conference in August 2000 and the Midwest Political Science Association conference in April 2001, where previous versions of this article were given.
Table 1
Game Sheet with Both Players' Payoffs Shown

For each row of play, the upper line gives the row player's payoff and the lower line the column player's payoff.

Row\Col    1      2      3      4      5      6      7      8      9     10     11     12     13     14     15    Row avg
  1      1.00   1.00*  0.75   0.85   0.85   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.90    0.90
         1.00   2.00*  0.00  -1.00  -2.00  -3.00  -4.00  -4.50  -4.75  -5.00  -5.25  -5.50  -5.75  -6.00  -6.25
  2      2.00*  1.20   1.00   1.00   1.00   1.00   1.00*  1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00    1.08
         1.00*  1.20   1.40   1.60   1.80   2.00   2.20*  0.00  -1.00  -1.25  -1.50  -1.75  -2.00  -2.25  -2.50
  3      0.00   1.40   1.25   1.15   1.15   1.10   1.10   1.10*  1.10   1.10   1.10   1.10   1.10   1.10   1.10    1.06
         0.75   1.00   1.25   1.50   1.75   1.50   2.20   2.50*  0.00  -1.00  -1.25  -1.50  -1.75  -2.00  -2.25
  4     -1.00   1.60   1.50   1.30   1.25   1.20   1.20   1.20   1.20*  1.20   1.20   1.20   1.20   1.20   1.20    1.11
         0.85   1.00   1.15   1.30   1.45   1.45   1.75   1.90   2.05*  0.00  -1.00  -1.25  -1.50  -1.75  -2.00
  5     -2.00   1.80   1.75   1.45   1.35   1.30   1.30   1.30   1.30   1.30*  1.30   1.30   1.30   1.30   1.30    1.16
         0.85   1.00   1.15   1.25   1.35   1.40   1.55   1.65   1.75   1.85*  0.00  -1.00  -1.25  -1.50  -1.75
  6     -3.00   2.00   2.00   1.50   1.45   1.40   1.40   1.40   1.40   1.40   1.40*  1.40   1.40   1.40   1.40    1.20
         0.90   1.00   1.10   1.20   1.30   1.40   1.50   1.60   1.70   1.80   1.90*  0.00  -1.00  -1.25  -1.50
  7     -4.00   2.20*  2.20   1.75   1.55   1.55   1.50   1.50   1.50   1.50   1.50   1.50*  1.50   1.50   1.50    1.25
         0.90   1.00*  1.10   1.20   1.30   1.40   1.50   1.60   1.70   1.75   1.80   1.95*  0.00  -1.00  -1.25
  8     -4.50   0.00   2.50*  1.90   1.65   1.60   1.60   1.60   1.60   1.60   1.60   1.60   1.60*  1.60   1.60    1.17
         0.90   1.00   1.10*  1.20   1.30   1.40   1.50   1.60   1.70   1.75   1.80   1.90   2.50*  0.00  -1.00
  9     -4.75  -1.00   0.00   2.05*  1.75   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70   1.70*  1.70    1.00
         0.90   1.00   1.10   1.20*  1.30   1.40   1.50   1.60   1.70   1.80   1.80   1.95   2.00   2.50*  0.80
 10     -5.00  -1.25  -1.00   0.00   1.85*  1.80   1.75   1.75   1.80   1.80   1.80   1.80   1.80   1.80*  1.80    0.83
         0.90   1.00   1.10   1.20   1.30*  1.40   1.50   1.60   1.70   1.80   1.80   1.95   2.00   2.50*  1.00
 11     -5.25  -1.50  -1.25  -1.00   0.00   1.90*  1.80   1.80   1.80   1.80   1.90   1.90   1.90   1.90*  1.90    0.64
         0.90   1.00   1.10   1.20   1.30   1.40*  1.50   1.60   1.70   1.80   1.80   2.00   2.10   2.50*  1.20
 12     -5.50  -1.75  -1.50  -1.25  -1.00   0.00   1.95*  1.90   1.95   1.95   2.00   2.00   2.00   2.00*  2.00    0.45
         0.90   1.00   1.10   1.20   1.30   1.40   1.50*  1.60   1.70   1.80   1.80   2.00   2.10   2.50*  1.40
 13     -5.75  -2.00  -1.75  -1.50  -1.25  -1.00   0.00   2.50*  2.00   2.00   2.10   2.10   2.10   2.10*  2.10    0.25
         0.90   1.00   1.10   1.20   1.30   1.40   1.50   1.60*  1.70   1.80   1.80   2.00   2.10   2.50*  1.60
 14     -6.00  -2.25  -2.00  -1.75  -1.50  -1.25  -1.00   0.00   2.50*  2.50*  2.50*  2.50*  2.50*  2.50*  2.50*   0.12
         0.90   1.00   1.10   1.20   1.30   1.40   1.50   1.60   1.70*  1.80*  1.80*  2.00*  2.10*  2.50*  1.80*
 15     -6.25  -2.50  -2.25  -2.00  -1.75  -1.50  -1.25  -1.00   0.80   1.00   1.20   1.40   1.60   1.80*  2.00   -0.58
         0.90   1.00   1.10   1.20   1.30   1.40   1.50   1.60   1.70   1.80   1.80   2.00   2.10   2.50*  2.00

Col avg  0.90   1.08   1.06   1.11   1.16   1.20   1.25   1.17   1.00   0.83   0.64   0.45   0.25   0.12  -0.58

Note: Payoffs are in dollars. Row player's reaction curve payoffs are underscored. Column player's reaction curve payoffs are in boldface type. *, Cournot reaction curves; **, Nash equilibrium play and highest average value.

Table 2
Medians by Factor in First Five Rounds

                      Round 1   Round 2   Round 3   Round 4   Round 5
Cournot play             2         7        12        14        14
U = 0  Median play       8        11        10.5      12        12.5
U = 1  Median play       4.5       8        10.5      11        14
       Z                -2.269    -1.914     -.054     -.256    -1.016
R = 0  Median play       7        10         9.5      11        13.5
R = 1  Median play       7        10        12         9.5      14
       Z                 -.083     -.200     -.906     -.803     -.890

Note: U = 1 for treatment groups U+R+ and U+R-, and 0 otherwise. R = 1 for treatment groups U+R+ and U-R+, and 0 otherwise.

Table 3
Cox PH Model Results

Factor    β      SE     p-Value   exp(β)
U        .833   .400    .037      2.301
R        .325   .398    .414      1.384

Note: U = 1 for treatment groups U+R+ and U+R-, and 0 otherwise. R = 1 for treatment groups U+R+ and U-R+, and 0 otherwise.
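As a quick arithmetic check on Table 3, the hazard ratios in the exp(β) column are simply the exponentiated coefficients; the sketch below recomputes them from the reported β values and matches the table up to rounding:

```python
import math

# Coefficients as reported in Table 3.
coefficients = {"U": 0.833, "R": 0.325}

# In a Cox proportional hazards model, exp(beta) is the hazard ratio:
# U+ pairs converge at roughly 2.3 times the rate of U- pairs, while the
# risk factor's ratio of about 1.38 is not statistically significant.
hazard_ratios = {factor: math.exp(beta) for factor, beta in coefficients.items()}
print(hazard_ratios)
```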
(1) For an excellent review of critiques of incrementalism, see Lustick (1981) and Bendor (1995).
(2) Some may object to our definition of uncertainty, which differs somewhat from the one commonly used by game theorists to denote a game in which there is a move by "nature." To some extent our games do have this element experimentally, because the level of risk acceptance and the cognitive ability of a player's partner are random. However, this is not under our control, so it is not something we can manipulate. Nevertheless, we believe, following Tsebelis (1989), that there are two sources of probability in a game: unpredictable moves by "nature" and the strategic interaction of the players themselves. We increase uncertainty in the strategic sense by introducing more than one attractive strategy and are thus using the term uncertainty in a broader way. It is also important to recall that the games are of limited information.
(3) We should also note that there is more than one way to use Bayes's rule in this situation. We have chosen one that matches up with the overall decision environment. To distinguish it from other possible uses of Bayes's rule, we call this "sequential Bayes" or "S-Bayes."
(4) During play, many players seem to assume the payoffs are the same, however.
(5) In the experiment, this was accomplished by having all players be row players. Thus one's partner is naturally the column player.
(6) The experiment was run for fourteen plays to avoid any end-point effects that might be present if the subjects were to guess the experiment would end at a "nice" number, such as fifteen.
(7) Because the variables considered in this analysis are row selections, they are best treated as ordinal, not interval. Thus the analysis undertaken here is based entirely on nonparametric or semiparametric methods. In general, the relative efficiency of nonparametric methods is somewhat lower than that of parametric alternatives when the assumptions behind the parametric methods are true. However, nonparametric methods require weaker assumptions and thus lead to more robust inferences. When all goes well, conclusions are similar, but nonparametric methods provide useful protection against method artifacts.
(8) One criticism that might be leveled here is that the data are structured into pairs of players, and the observations are thus not independent. Note however that in the first round of play they are in fact independent because the players have no information about past play and are thus responding to the stimulus alone. The second round has some correlation due to last round's information, but it is still relatively little.
(9) It is unclear whether nonconverging pairs would ever converge to the Nash equilibrium if given sufficient rounds to play. In most cases we believe they would. In a few other cases the players converged to a regular pattern of play--for instance, (7,7), or alternating between (12,14) and (14,12)--but it was not the Nash equilibrium.
(10) However, because the experiment is censored before all pairs converge, there is no way of knowing when the U-R- condition would have converged, though it is interesting to note that the U-R+ condition has a rush of convergence near the end of the game. Without further plays of the game, we cannot know the outcome. Unfortunately, subjects' attention spans are substantially taxed even by fourteen plays, and it would be very difficult to continue the experiment much beyond that point while maintaining validity.
(11) We fit the model with U+ and R+ interacted, but the coefficient is not significant and the entire model becomes insignificant. It is not shown here.
Allison, Graham T. 1971. Essence of decision: Explaining the Cuban missile crisis. Boston: Little, Brown.
Bendor, Jonathan. 1995. A model of muddling through. American Political Science Review 89, no. 4:819-40.
Bendor, Jonathan, and Terry Moe. 1985. An adaptive model of bureaucratic politics. American Political Science Review 79, no. 3:755-74.
Bendor, Jonathan, and Thomas H. Hammond. 1992. Rethinking Allison's models. American Political Science Review 86, no. 2:301-22.
Coase, Ronald. 1937. The nature of the firm. Economica 4:386-405.
Crecine, John P. 1969. Governmental problem solving: A computer simulation of municipal budgeting. Chicago: Rand McNally.
Davis, Otto A., M. A. H. Dempster, and Aaron Wildavsky. 1966. A theory of the budgetary process. American Political Science Review 60, no. 3:529-47.
Dixit, Avinash, and Barry Nalebuff. 1991. Thinking strategically: The competitive edge in business, politics, and everyday life. New York: W. W. Norton.
Dixit, Avinash, and Susan Skeath. 1999. Games of strategy. New York: W. W. Norton.
Fouraker, Lawrence E., and Sidney Siegel. 1963. Bargaining behavior. New York: McGraw-Hill.
Gintis, Herbert. 2000. Game theory evolving. Princeton, N.J.: Princeton University Press.
Henderson, James M., and Richard E. Quandt. 1980. Microeconomic theory: A mathematical approach. 3d ed. New York: McGraw-Hill.
Jones, Bryan D. 1999. Bounded rationality. Annual Review of Political Science 2:297-321.
Kalbfleisch, J. D., and R. L. Prentice. 1980. The statistical analysis of failure time data. New York: Wiley.
Knott, Jack H., and Gary Miller. 1992. The dynamics of convergence and cooperation: Rational choice and adaptive incrementalism. Unpublished ms. Michigan State University.
Lindblom, Charles. 1959. The science of "muddling through." Public Administration Review 19, no. 1:78-88.
Lindblom, Charles. 1965. The intelligence of democracy. New York: Free Press.
Lustick, Ian S. 1981. Explaining the variable utility of disjointed incrementalism. American Political Science Review 74, no. 3:342-53.
March, James G., and Herbert A. Simon. 1958. Organizations. New York: Wiley.
March, James G., and Johan P. Olson. 1976. Ambiguity and choice in organizations. Bergen, Norway: Universitetsforlaget.
McCubbins, Mathew D., and Thomas Schwartz. 1984. Congressional oversight overlooked: Police patrols vs. fire alarms. American Journal of Political Science 28, no. 1:165-79.
Milgram, Stanley. 1974. Obedience to authority. New York: Harper & Row.
Oehlert, Gary W. 2000. A first course in design and analysis of experiments. New York: W. H. Freeman.
Poundstone, William. 1992. Prisoner's dilemma. New York: Doubleday.
Pindyck, Robert S., and Daniel L. Rubinfeld. 2001. Microeconomics. 5th ed. New York: Prentice Hall.
Quattrone, George A., and Amos Tversky. 1984. Causal versus diagnostic contingencies: On self-deception and on the voter's illusion. Journal of Personality and Social Psychology 46, no. 2:237-48.
Rourke, Francis. 1984. Bureaucracy, politics, and public policy. Boston: Little, Brown.
Schelling, Thomas. 1960. The strategy of conflict. Cambridge, Mass.: Harvard University Press.
Shepsle, Kenneth A. 1979. Institutional arrangements and equilibrium in multidimensional voting models. American Journal of Political Science 23, no. 1:27-59.
Simon, Herbert A. 1957. Models of man. New York: Wiley.
Steinbrunner, John D. 1974. The cybernetic theory of decision. Princeton, N.J.: Princeton University Press.
Tsebelis, George. 1989. The abuse of probability in political analysis: The Robinson Crusoe fallacy. American Political Science Review 83, no. 1:77-91.
Wildavsky, Aaron. 1964. The politics of the budgetary process. Boston: Little, Brown.
Williamson, Oliver. 1985. The economic institutions of capitalism. New York: Free Press.
Zimmerman, H.-J., and P. Zysno. 1983. Decisions and evaluations by hierarchical aggregation of information. Fuzzy Sets and Systems 10:243-60.
Jack H. Knott
University of Illinois at Urbana-Champaign
Gary J. Miller
University of Illinois at Urbana-Champaign
Author: Knott, Jack H.; Miller, Gary J.; Verkuilen, Jay
Publication: Journal of Public Administration Research and Theory
Date: Jul 1, 2003