New Evidence About Circumstantial Evidence

Abstract

Judicial factfinders are commonly instructed to determine the reliability and weight of any evidence, be it direct or circumstantial, without prejudice to the latter. Nonetheless, studies have shown that people are reluctant to impose liability based on circumstantial evidence alone, even when this evidence is more reliable than direct evidence. Proposed explanations for this reluctance have focused on factors such as the statistical nature of some circumstantial evidence and the tendency of factfinders to assign low subjective probabilities to circumstantial evidence. However, a recent experimental study has demonstrated that even when such factors are controlled for, the disinclination to impose liability based on non-direct evidence--dubbed the anti-inference bias--remains.

The present Article describes eight new experiments that explore the scope and resilience of the anti-inference bias. Among other things, it shows that this bias is significantly reduced when legal decision-makers confer benefits, rather than impose liability. In so doing, the Article points to a new legal implication of the psychological phenomenon of loss aversion. In contrast, the Article finds no support for the hypothesis that the reluctance to impose legal liability on the basis of circumstantial evidence correlates with the severity of the legal sanctions. It thus casts doubt on the common belief that sanction severity affects the inclination to impose legal liability. Finally, the Article demonstrates the robustness of the anti-inference bias and its resilience to simple debiasing techniques. These and other findings reported in the Article show that the anti-inference bias reflects primarily normative intuitions, rather than merely epistemological ones, and that it reflects conscious intuitions, rather than wholly unconscious ones. The Article discusses the policy implications of the new findings for procedural and substantive legal norms.
TABLE OF CONTENTS

ABSTRACT
INTRODUCTION
I.   EXPERIMENT 1: THE ANTI-INFERENCE BIAS AND AMOUNT OF
     EVIDENCE
       A. Motivation
       B. Participants, Materials, and Procedure
       C. Results and Discussion
II.  THE ANTI-INFERENCE BIAS AND LOSS AVERSION
       A. Theoretical Background
       B. Experiment 2a: Dairy Farmer
            (1) Participants, Materials, and Procedure
            (2) Results
       C. Experiment 2b: Dairy Farmer Disputants
            (1) Participants, Materials, and Procedure
            (2) Results
       D. Experiment 2c: A Within-Subject Design
            (1) Motivation
            (2) Participants, Materials, and Procedure
            (3) Results
       E. Discussion
III. SEVERITY OF SANCTIONS
       A. The Severity-Leniency Hypothesis
       B. Experiment 3a: Highway
       C. Experiment 3b: Bus
       D. Experiment 3c: Antibiotics
       E. Discussion
IV.  EXPERIMENT 4: DEBIASING THE ANTI-INFERENCE BIAS?
       A. Participants, Materials, and Procedure
       B. Results
       C. Discussion
V.   GENERAL DISCUSSION
CONCLUSION
APPENDIX


INTRODUCTION

In common and legal parlance alike, a distinction is often drawn between direct and circumstantial evidence. Direct evidence, such as the testimony of an eyewitness who saw the defendant committing the crime, aims to prove a material fact without the mediation of a deductive process. In contrast, circumstantial evidence, such as the testimony of an eyewitness who saw the defendant fleeing from the crime scene with a gun in his hand, requires an additional mental step of inference to determine whether the alleged material fact is or is not true. (1)

This distinction has been analytically contested on the ground that any evidence involves an inference. (2) For example, when a witness testifies that she saw the accused shooting the victim, and the factfinder's impression is that the witness is telling the truth, the factfinder is inferring that the accused actually shot the victim. Direct evidence may thus be described as requiring a single inference (from the evidence to the material fact), while circumstantial evidence requires at least two: from the evidence to an underlying fact, and from the underlying fact to the material one. Arguably, the distinction between the two types of evidence is a matter of degree: some evidence requires more inferential steps than others.

Echoing the analytical claim that there is no fundamental difference between direct and circumstantial evidence, the modern trend is to abandon rules that limited the use of circumstantial evidence in the past. (3) Instead, factfinders are instructed to determine the reliability and weight of any evidence without prejudice to circumstantial evidence, (4) and commentators generally support this position. (5) However, empirical and experimental studies have long demonstrated that factfinders are far more inclined to base their conclusions on direct evidence than on indirect evidence. (6) This inclination has been described as an "unsettling paradox," given that circumstantial evidence is often more reliable than direct evidence (7) due to the various pitfalls of eyewitness testimony. (8)

Many explanations have been put forward for the tendency not to impose liability based on circumstantial evidence alone. These include the claims that circumstantial evidence is objectively less conclusive, or is subjectively perceived as such; (9) that the vivid and concrete nature of direct evidence facilitates the formation of a coherent story--which makes the imagination of an alternative story less likely (compared to circumstantial evidence, which is often pallid, abstract, and general); (10) that unlike direct evidence, circumstantial evidence, particularly in statistical form, is often not case-specific, but instead rests on too little information and is less resilient to contradictory information; (11) that basing liability on circumstantial evidence alone entails greater responsibility being taken by the factfinder; (12) and more. (13)

However, all these explanations are at best incomplete because they conflate the distinction between direct and circumstantial evidence with related distinctions, such as those between eyewitness testimonies and scientific data, between concrete and statistical proof, and between probabilistic and conclusive evidence. Once one recognizes that circumstantial evidence may be eyewitness or forensic, concrete or statistical, probabilistic or conclusive, and so forth, the limitations of these explanations become apparent. Indeed, in a series of controlled experiments, Eyal Zamir, Ilana Ritov, and Doron Teichman have demonstrated that none of the previous explanations fully account for the reluctance to impose liability based on circumstantial evidence alone. (14) Using first-year law students and advanced LL.B. and LL.M. students as subjects, they found that this reluctance was not at all, or only partially, mediated by respondents' subjective probability assessments. Moreover, it was manifested across different scenarios; held true for both criminal and civil liability; was apparent when both kinds of evidence were of the same type (e.g., technological, scientific, or eyewitness); and held irrespective of whether the question was who had committed the wrong or whether a wrong had been committed at all. In their experiments, the anti-inference bias pertained to case-specific circumstantial evidence, rather than merely statistical evidence. The strong bias was exhibited in ex-post judicial decision-making, in ex-ante policymaking, and when respondents were asked to make the decision themselves or to prescribe how another person should decide a case.

For example, in one experiment, subjects were asked whether they would convict a driver for exceeding the speed limit when the permitted speed on a certain road was 100 kilometers per hour (kph), and according to the camera system installed on that road, he drove his car at a speed of 125 kph late at night. In the Direct-Evidence (Direct) condition, the car's speed was documented by a speed camera. In the Inference condition, the cameras in the system did not document the speed of the passing vehicles, but rather the precise time at which each vehicle passed by each of two cameras at either end of a section of the road, and the driver's speed in that section was then inferred from the distance between the cameras and the time that had elapsed between the two points. In both conditions, it was stated that "the probability of an error in the camera system is 2%." Nonetheless, the rate of conviction in the two conditions differed significantly: while in the Direct condition 81.4% of the respondents convicted the driver, only 60% of respondents did so in the Inference condition. (15)
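To make the inferential step in the two-camera condition concrete, the following back-of-the-envelope calculation shows how a speed reading can be derived from two timestamps. The camera spacing and elapsed time below are illustrative assumptions of ours, not figures stated in the vignette:

```latex
% Illustrative numbers only; the vignette does not specify the camera spacing.
\text{inferred speed} \;=\; \frac{\text{distance between cameras}}{\text{elapsed time}}
\;=\; \frac{5\ \text{km}}{144\ \text{s}} \times 3600\ \tfrac{\text{s}}{\text{h}}
\;=\; 125\ \text{kph}.
```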

These and other comparable results suggest that there is a deep-seated psychological bias against basing liability on inferences--an anti-inference bias, which may result from over-generalization:
   Much like other simplifying heuristics, the anti-inference
   heuristic functions as a substitute for extensive algorithmic
   processing, and yields judgments that are usually accurate.
   Ordinarily, when we see something with our own eyes, or
   when someone tells us that she saw something herself, that
   event actually happened; this is not necessarily true of
   conclusions drawn from circumstantial evidence. We tend
   to trust our power of vision more than we trust our power
   of deduction. However--like other biases and heuristics--the
   anti-inference heuristic also gives rise to systematic
   error; it strongly affects decisions even when the objective
   and subjective probabilities of the pertinent occurrence are
   equal according to circumstantial and direct evidence. (16)


Even assuming that there are good reasons for the reluctance to base liability on some forms of circumstantial evidence, such as naked statistical evidence (a hotly debated topic), (17) the general disinclination to rely on circumstantial evidence cannot be justified. Hence, the anti-inference bias is normatively troubling, and legal policymakers should consider ways to overcome or circumvent it. Four conceivable reactions to the anti-inference bias are (1) debiasing, (2) reforming evidence law, (3) changing substantive legal rules, and (4) altering initial behavior. Debiasing may, for example, take the form of specifically instructing factfinders that circumstantial evidence must not be treated differently from direct evidence, and perhaps even drawing attention to the existence of the anti-inference bias. (18) Evidence-law reforms may strive to attain similar results by introducing legal presumptions, such that once a basic fact is proven, the existence of the ultimate fact is presumed as a matter of law. (19) Substantive legal rules may obviate the need for an inference altogether--as, for example, when criminal liability is imposed for preparatory offenses without having to prove a criminal attempt to commit the target offense. (20) Finally, it is sometimes possible to take measures such that, if judicial factfinding becomes necessary, factfinders would have more and better direct evidence at their disposal, thereby rendering the reliance on circumstantial evidence unnecessary. A pertinent example is the use of body cameras by law enforcement officers, (21) and an even better one might be the use of CCTV systems.

The previous studies leave important questions unanswered and raise new ones, which are of interest to both jurists and psychologists. One question is whether the anti-inference bias may simply stem from differences in the amount of evidence necessary to establish legal liability (and hence in the meaning of the information provided regarding the objective reliability of the evidence). Arguably, in the experiments conducted by Zamir and his colleagues, minor differences of this sort were present between the Direct and Inference conditions, which could have affected participants' decisions. In Part I we describe a new experiment designed to examine this possibility. The experiment refutes the conjecture that the anti-inference bias is a product of the difference in the amount of evidence necessary to establish liability.

Another major question is whether the anti-inference bias exists only when legal decisions involve the imposition of liability, or whether it also exists when they confer benefits. A central tenet of Prospect Theory--arguably the most influential theory in judgment-and-decision-making studies and behavioral economics--is the gain-loss asymmetry in people's judgments and decisions. (22) As further described below, (23) people ordinarily perceive outcomes--for themselves and for others--as gains and losses, rather than as final states of wealth or welfare. The disutility generated by a loss is greater than the utility produced by a similar gain, hence people tend to display loss aversion. The notion that losses loom larger than gains is reflected in prevailing moral intuitions, as well. Commonsense morality prioritizes the prohibition of harming other people (encapsulated in the Latin phrase primum non nocere) over the imperative to benefit others. (24) Hence, it is possible that the anti-inference bias would be more pronounced in the realm of losses than in the realm of gains. Since in civil litigation, one litigant's losses are the other's gains, and since legal decisions may be framed as primarily entailing gains or losses, the interaction between the anti-inference bias and loss aversion is of considerable theoretical and practical significance. Accordingly, Part II of the Article describes three new experiments in which we sought to explore the relationships between the anti-inference bias and loss aversion. The experiments revealed that the anti-inference bias is indeed stronger in the realm of losses than in the realm of gains. We discuss the normative and policy implications of this finding.

In addition to exposing the relationship between the anti-inference bias and loss aversion, the experiments described in Part II contribute to the understanding of the anti-inference bias in two important respects. First, while previous experiments ruled out the possibility that the anti-inference bias is merely a product of differences in the subjective probability assessments of the disputed facts, they did not examine the possibility that it is a product of differences in people's confidence in their probability assessments. As we shall see, our experiments rule out this possibility as well, thus reinforcing the claim that there is an independent psychological reluctance to rely on inferential evidence.

The second contribution of the new experiments pertains to the experimental design. Previous studies used a between-subjects experimental design--meaning that each participant was presented with either direct or circumstantial evidence. (25) They revealed statistically significant differences in the willingness to impose liability between the two conditions. However, some heuristics and biases that have been demonstrated in between-subjects experimental designs disappear, or at least diminish, in within-subject designs, namely when the same subject is called to make a decision under both scenarios. A within-subject design draws participants' attention to the differences between the decision-tasks presented to them, thus making the researcher's intent more transparent. It provides participants with a chance to consider whether these differences are significant and to correct their errors. (26) As detailed below, Experiment 2c found that the anti-inference bias is evident in a within-subject design as well, thus attesting to its robustness.

The finding that the anti-inference bias is stronger in the realm of losses than in the realm of gains raised the question of whether it would be stronger when the consequences of legal liability are more severe. There is a prevalent, albeit contested, belief that the harsher the expected sanction, the more reluctant judicial decision-makers are to find a defendant liable. (27) Would subjects be more reluctant to rely on circumstantial evidence when the sanctions are harsher? Part III describes three new experiments in which we set out to examine this question by varying the severity of the sanction presented to the subjects. These experiments replicated the finding of the anti-inference bias but found no correlation between this bias and the severity of sanctions. Moreover, these experiments found no correlation between the severity of sanctions and the inclination to impose legal liability based on either direct or indirect evidence--thus contributing to the doubts concerning the common belief about the relationship between sanction severity and the inclination to impose legal liability.

Finally, the identification of the anti-inference bias and its problematic ramifications for judicial factfinding raises the issue of debiasing, which has never been tested experimentally in this context. To fill this gap, Part IV of the Article describes an experiment in which we examined the robustness and resiliency of the anti-inference bias by applying two debiasing techniques. One technique was to emphasize the positive, direct findings that establish the basic fact from which the inference is drawn. The other technique--inspired by Sherlock Holmes's famous dictum that "When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth" (28)--highlighted the logical inevitability of the inference. We found that neither using each technique alone, nor applying them together, eliminated the anti-inference bias, or even statistically significantly reduced it. While these results do not establish that nothing can be done about the anti-inference bias, they do show that the bias is robust and merits the attention of legal policymakers.

The new findings--in particular (1) the considerably greater effect of the anti-inference bias when factfinders impose losses, rather than confer gains; (2) the manifestation of the bias in a within-subject design that draws subjects' attention to the similarity between the two types of evidence; and (3) the bias's persistence even when subjects are confronted with the logical inevitability of the inference (29)--shed new light on the very nature of the anti-inference bias. These findings indicate that the bias is not merely (or even primarily) associated with the prevailing notions of how facts can be determined but also (and perhaps primarily) with normative judgments. People seem to believe that imposing liability based on circumstantial evidence is less appropriate even when the direct and circumstantial evidence do not differ in terms of the objective or subjective probability of the truth of the material facts, or in terms of their confidence in the probability assessment. People adhere to this belief even when their attention is drawn to the logical inevitability of the inference. These findings also indicate that the bias is not necessarily unconscious.

The remainder of this Article consists of five parts. Each of the first four describes experimental studies designed to examine the robustness of the anti-inference bias when the amount of available evidence is equal in the direct and inference conditions (Part I), the interaction of the anti- inference bias with loss aversion (Part II), its interaction with severity of sanction (Part III), and the possibility of debiasing this bias (Part IV). Each of these parts first surveys the theoretical background and motivation for the experiments, then describes the experiments and their results, and finally discusses the implications of those results. Part V pulls the threads together and offers general observations about our findings and their implications.

I. EXPERIMENT 1: THE ANTI-INFERENCE BIAS AND AMOUNT OF EVIDENCE

A. Motivation

To establish the existence of the anti-inference bias, Zamir, Ritov, and Teichman used vignettes that differed in terms of the type of evidence supporting the imposition of liability: direct or inferential. To isolate the effect of inference, the objective reliability of the evidence was made as similar as possible. However, there were still minor differences between the direct and inference conditions in terms of the amount of evidence necessary to establish liability, and hence in the possible interpretation of the information regarding the reliability of the evidence. Thus, in the Highway experiment described above, (30) although in both conditions subjects were informed that "the probability of an error in the camera system is 2%," some of them might have believed that since the two-camera system comprises two cameras and the speed-camera system comprises only one, the overall probability of error in the former is larger. Similarly, in their Antibiotics experiment, (31) the direct-evidence condition described two lab examinations, each of which independently supported the imposition of liability. In contrast, the inference condition contained only one examination supporting liability. Although the probability of mistake in the laboratory findings was described in the same way in both conditions, two independent pieces of evidence may have appeared more reliable than one.

Although Zamir, Ritov, and Teichman found that subjects' subjective probability assessments did not mediate the differences found between the direct and inference conditions, we sought to reexamine their results with the amount of evidence in the two conditions equalized, so that the information about evidence reliability could not be interpreted differently in the two conditions.

B. Participants, Materials, and Procedure

A total of 183 participants took part in Experiment 1, of whom 164 (58 women and 106 men) correctly answered the two comprehension questions included in the experiment, and only their responses feature in the analysis. Participants' ages ranged from 12 to 59, with a mean of 26.3.

Like all the experiments described in this Article, Experiment 1 was conducted through Amazon's Mechanical Turk (M-Turk)--a popular web-based platform used to study people's judgments and decision-making. The participants were mostly non-lawyer U.S. residents, who may accordingly fulfill the role of judicial factfinders when serving as jurors. To prevent participants from figuring out the nature of the experimental manipulations, no participant was allowed to take part in more than one experiment. All experiments included comprehension checks, and only respondents who answered those checks correctly were included in the reported analyses (the inclusion of all respondents, however, did not significantly change any of the results). Participants in Experiment 1 were paid $0.70 for their participation in the experiment, which lasted a few minutes.

The vignette used in this experiment was a modified version of the vignette used by Zamir, Ritov, and Teichman in their Experiment 4 ("Antibiotics"). (32) We employed a between-subjects design with two conditions, Direct and Inference (see Appendix). Each participant was randomly assigned to one of the two conditions.

In the Direct condition, respondents were asked to imagine that they were serving as a judge in a monetary suit filed by a small dairy against a dairy farmer. The dairy buys the farmer's milk. According to the contract, the farmer must make sure that there are no antibiotics residues in the milk (that is, he must ensure that if one of the cows is treated with antibiotics, her milk will not be sent to the dairy) because such residues obstruct the production of various products. According to the contract, for every day on which the milk delivered by the farmer contains antibiotics residues, he must pay the dairy an agreed sum of $5,000. The dairy claims that the farmer delivered milk with antibiotics residues, so he must pay it the agreed sum.

The vignette further indicated that the milk is delivered to the dairy by a tank truck that transports the milk of two farmers. Since the milk of the two farmers is mixed in the tank, a sample is taken from each farmer's milk before pumping it into the tanker, and the samples are delivered to a laboratory, where they will be examined if necessary. When the yogurt production process failed, the milk samples of the two farmers were to be examined (it was undisputed that the source of the antibiotics residues could only be the milk of one of them), but it turned out that the sample of the other farmer's milk had been lost in the laboratory; hence it was only possible to examine the defendant's sample. This examination revealed that there were antibiotics residues in the defendant's milk. The reliability of the laboratory testing was not perfect. According to the vignette, the probability that the results of the laboratory examinations were correct (that is, that there were antibiotics residues in the defendant's milk) was 85%. Based on these results, the dairy claimed that the farmer should pay it the agreed sum.

In the Inference condition, the case was similar to the Direct condition--except that now it was not the other farmer's sample that was lost. Instead, here it turned out that the sample of the defendant's milk was lost in the laboratory; hence it was only possible to examine the other farmer's sample. This examination revealed that there were no antibiotics residues in the other farmer's milk. Here too, it was undisputed that the source of the antibiotics residues could only be the milk of one of the farmers, and the reliability of the laboratory testing was 85%. Based on these results, the dairy claimed that the farmer should pay it the agreed sum.

Evidently, in this experiment both the Direct and Inference conditions refer to a single, equally reliable, piece of evidence: a laboratory examination of one milk sample (in contradistinction to Zamir, Ritov, and Teichman's experiment, where the direct evidence involved the examination of the two farmers' samples, and only the inference conditions involved a lost sample and the examination of the remaining one). Since it was undisputed that the source of the antibiotics residues could only be the milk of one of the two farmers, the inference that if there were no such residues in the other farmer's milk, there must have been residues in the defendant's milk, is straightforward. Moreover, since according to the contract the defendant's duty to pay the agreed sum was independent of the presence or absence of antibiotics in the other farmer's milk, not knowing whether there were such residues in the Direct condition (because the other farmer's sample was lost) was immaterial for the defendant's liability. If anything, it should have decreased the inclination to impose liability in the Direct condition because the production process might have failed anyway (if there were residues in both farmers' milk). Finally, in Zamir, Ritov, and Teichman's Antibiotics experiment, only the Inference condition involved the loss of a sample, which might have shed negative light on the laboratory's trustworthiness in that condition. In contrast, in the present experiment the two conditions were similar in this regard, as both involved a lost sample.
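The straightforwardness of the inference can be spelled out explicitly. The following sketch uses our own notation (not the Article's) to show why, given the vignette's stipulations, the inferential route yields the same 85% probability as the direct lab result:

```latex
% D = "antibiotics residues in the defendant's milk"; O = "residues in the other farmer's milk".
% Stipulated: exactly one of D and O is true; the lab reports "not O" with reliability 0.85.
P(D) \;=\; P(\text{lab correct})\cdot \underbrace{P(D \mid \lnot O)}_{=\,1}
      \;+\; P(\text{lab wrong})\cdot \underbrace{P(D \mid O)}_{=\,0}
      \;=\; 0.85\cdot 1 + 0.15\cdot 0 \;=\; 0.85 .
```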

After reading the case description, respondents were asked to indicate how they would decide the case on a scale of 1 to 9, where 1 indicated "I would certainly accept the claim and order the farmer to pay the dairy the agreed sum" and 9 indicated "I would certainly dismiss the claim and would not order the farmer to pay the dairy the agreed sum." They were also asked to specify the probability, in percentage, that there were antibiotics residues in the defendant's milk. We dubbed these questions Decision and Probability, respectively.

C. Results and Discussion

Respondents were more likely to accept the claim in the Direct than in the Inference condition. On a scale of 1 to 9, where 1 meant certainly accepting the claim and 9 certainly dismissing it, the mean scores in Direct and Inference were 4.2 and 5.84, respectively. An independent-samples t-test revealed that this difference was highly significant (t(162)=4.325, p<0.001). (33) Comparing the probability assessments revealed a significant difference as well--the mean scores being 59.53% in Inference and 75.46% in Direct (an independent-samples t-test yielded t(162)=4.651, p<0.001). However, the assessed probability did not mediate the effect of evidence type on Decision. Including it as a covariate in an ANOVA of Decision still yielded a significant effect of type of evidence (F(1,163)=7.055, p<.01). (34)
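For readers interested in the structure of these analyses, the following Python sketch runs an independent-samples t-test and an ANOVA of Decision with Probability as a covariate. The per-respondent data are hypothetical (simulated around the reported means, since the raw responses are not reproduced in the Article), and the use of scipy and statsmodels is our assumption; the Article does not state which software the authors used.

```python
# Sketch of the reported analyses on hypothetical data (not the study's raw responses).
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)

# Hypothetical per-respondent records: Decision (1-9) and Probability (0-100)
n_direct, n_inference = 82, 82
df = pd.DataFrame({
    "condition": ["Direct"] * n_direct + ["Inference"] * n_inference,
    "decision": np.concatenate([
        np.clip(rng.normal(4.2, 2.4, n_direct), 1, 9),
        np.clip(rng.normal(5.84, 2.4, n_inference), 1, 9),
    ]),
    "probability": np.concatenate([
        np.clip(rng.normal(75.46, 20, n_direct), 0, 100),
        np.clip(rng.normal(59.53, 20, n_inference), 0, 100),
    ]),
})

# Independent-samples t-test on Decision (Direct vs. Inference)
direct = df.loc[df.condition == "Direct", "decision"]
inference = df.loc[df.condition == "Inference", "decision"]
t, p = stats.ttest_ind(direct, inference)
print(f"t({len(df) - 2}) = {t:.3f}, p = {p:.4f}")

# ANCOVA-style check: Decision by condition, with Probability as a covariate
model = ols("decision ~ C(condition) + probability", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```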

Experiment 1 was meant to address the concern that the results obtained in the series of studies conducted by Zamir, Ritov, and Teichman had not established a deep-seated psychological bias, but might have been due to different understandings of the information about evidence reliability in the two conditions. By slightly modifying the vignette, Experiment 1 allowed us to rule out this possibility, thus reinforcing the existence and robustness of the anti-inference bias.

II. THE ANTI-INFERENCE BIAS AND LOSS AVERSION

A. Theoretical Background

The hypothesis that people are less reluctant to make decisions on the basis of circumstantial evidence in the realm of gains than in the realm of losses stems from the close relationship between the psychological phenomenon of loss aversion and the moral prohibition of harming other people. Hence, a few words about the two are in order.

According to rational choice theory, among the available options, people choose the one that would maximize their expected utility, as determined in absolute terms. (35) In contrast, prospect theory posits that people generally do not perceive outcomes as final states of wealth or welfare, but rather as gains and losses. (36) Gains and losses are defined relative to a baseline or reference point. The value function is normally steeper for losses than gains, indicating loss aversion. (37) People's choices therefore crucially depend on how they frame any given choice. In particular, an individual's reference point determines whether she perceives changes as gains or losses. The centrality of reference points and the notion that losses loom larger than gains hold true for risky and riskless choices alike. (38)
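For concreteness, the asymmetry can be expressed with the standard parameterization of the prospect-theory value function. This formulation is taken from the decision-theory literature, not from this Article:

```latex
v(x) =
\begin{cases}
  x^{\alpha} & x \ge 0 \ (\text{gains})\\[2pt]
  -\lambda\,(-x)^{\beta} & x < 0 \ (\text{losses})
\end{cases}
\qquad \lambda > 1 \text{ captures loss aversion (commonly estimated at roughly } 2.25\text{)}.
```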

Loss aversion and related psychological phenomena, such as the endowment effect (people's tendency to value things they already have more highly than things they have yet to acquire) and the status quo bias (the inclination to stick to the status quo when departing from it involves both gains and losses, prospects and risks), have had a significant impact on the law and legal theory. (39) This impact is due in part to the correspondence between loss aversion and prevailing moral convictions. By and large, the law conforms to prevailing moral intuitions, (40) and since the latter are closely linked to notions of reference-points and loss aversion, (41) these notions shape the law as well.

Commonsense morality is deontological. People believe that enhancing good outcomes is desirable, yet they also hold that attaining this goal is subject to moral constraints. Most importantly, there is a moral prohibition on intentionally/actively harming other people. It is immoral, for example, to kill one person and harvest her organs to save the lives of three other people, even though the benefit of such act (saving three people) outweighs its cost (killing one person). (42)

Deontological morality distinguishes between harming someone and not benefiting her. Were promoting the good as compelling as eliminating the bad, the distinctions between doing and allowing and between intending and foreseeing, which are essential to the deontological moral constraint against harming people (or at least one of the two is), would have collapsed. According to these distinctions, while it is forbidden to intentionally/actively harm others, there is a considerably less stringent constraint against merely foreseeing or allowing people to suffer an injury or a loss. (43) The prohibition on killing one person in order to save the lives of three other people necessarily implies that intentionally or actively killing an involuntary "donor" is worse than merely foreseeing or allowing the death of three people to occur. Otherwise, there would be a prohibition against both killing the one and not killing her (thus foreseeing or allowing the death of the three). Now, whenever an agent abides by the prohibition against intentionally or actively doing harm (e.g., she refrains from killing one person), she simultaneously avoids intending or doing harm to the one and avoids intending or doing good to the three. The intending/foreseeing and doing/allowing distinctions thus inevitably entail a distinction between intending good and intending bad and between doing good and doing bad. Promoting the good is less morally compelling than eliminating the bad. (44)

The moral distinction between promoting the good and eliminating the bad corresponds straightforwardly with the psychological notions of reference points and loss aversion. Losses, unhappiness, disutility, and harm loom larger than gains, happiness, utility, and benefit. As various experimental studies have demonstrated, loss aversion characterizes not only people's perceptions and choices about their own health, wealth, or welfare, but also regarding the effects of their decisions on the health, wealth, or welfare of others. (45)

We therefore conjectured that, since most people share the conviction that inflicting losses on other people is more objectionable than not conferring benefits upon them, people would be more reluctant to rely on inferential evidence in the realm of losses than in the realm of gains.

Another goal of the experiments described in this Part was to examine an issue that was raised, but not resolved, in previous studies--namely whether the reluctance to impose liability based on circumstantial evidence can be explained by the subjects' degree of confidence in their subjective probability assessments. (46) If people's disinclination to impose liability based on circumstantial evidence is not (or only partly) explained by their subjective assessment of probability--as shown by Zamir, Ritov, and Teichman--perhaps it might be explained by their lower degree of confidence in the accuracy of their subjective assessment (in principle, the two are separable: a person may be very confident in her assessment that the probability that something has happened is low, or conversely judge the probability of an event to be high, but have little confidence in that assessment).

This possibility may be linked to ambiguity aversion--the aversion to situations in which not only the outcomes, but also their probabilities, are unknown, or only partially known. Ambiguity aversion is often demonstrated by the famous Ellsberg Paradox. (47) Imagine that there are two urns, containing red and black balls, from which a single ball is drawn at random. It is known that one of them contains precisely 50 red balls and 50 black ones. The other urn contains 100 balls as well, but it is unknown how many of them are red and how many are black. You win a prize if you draw a red ball from either urn: which urn would you prefer to draw a ball from? Most people prefer to draw a ball from the former urn (the one with the known probabilities). They prefer drawing a ball from the urn with the known probabilities, even if immediately after making their first choice they are offered a similar prize if they draw a black ball from one of the two urns--thus ruling out the possibility that they preferred this urn because they suspected that there were fewer red--or black--balls in the other one. (48) Thus, when people are less confident about the accuracy of their probability assessment, they might display a greater anti-inference bias due to ambiguity aversion.
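A brief way to see why this pattern is paradoxical (using our own notation): let q denote the decision-maker's subjective probability of drawing red from the unknown urn.

```latex
\text{Preferring the known urn for a bet on red implies } q < 0.5;\quad
\text{preferring it also for a bet on black implies } 1 - q < 0.5,\ \text{i.e., } q > 0.5.
```

No single value of q satisfies both inequalities, so the preference pattern cannot be explained by beliefs about the unknown urn's composition; it reflects an aversion to the ambiguity itself.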

The idea that people may be more reluctant to rely on circumstantial evidence because they are less confident about their subjective assessment of the probability that the material fact is true may also be associated with the theoretical distinction between the probability that a certain event happened and the weight or resiliency of the evidence supporting that probability. The latter distinction is one of the justifications offered for the reluctance to impose liability based on naked statistical evidence (although, in principle, it is relevant to any type of evidence). According to this claim, decision-makers may dismiss a lawsuit even when the probability that the plaintiff's version is correct meets the controlling standard of proof, if their assessment of this probability rests on too little information, or on general, non-case-specific evidence. For example, the fact that there were a thousand spectators at a rodeo, but only 300 tickets were sold, means that for any spectator chosen at random, there is a 70% probability that she is a gatecrasher (since 700 of the 1000 spectators did not purchase a ticket). The factfinder may nonetheless dismiss a claim based solely on this evidence on the grounds that it is too thin an evidentiary basis for a confident assessment of guilt with regard to any given defendant. (49) To be sure, one's stated confidence in one's own probability assessment, the phenomenon of ambiguity aversion, and the notion of weight of evidence are three distinct concepts. There is no logical entailment between them--and yet they are somewhat related. (50)

To test these issues, we conducted three experiments through Amazon's Mechanical Turk (M-Turk). Participants were paid $0.70 (in the first two experiments described below) or $1.00 (in the third experiment).

B. Experiment 2a: Dairy Farmer

(1) Participants, Materials, and Procedure

A total of 403 participants took part in Experiment 2a, of whom 366 (144 women and 222 men) correctly answered the two comprehension questions included in the experiment, and only their responses feature in the analysis. Participants' ages ranged from 19 to 92, with a mean of 35.

The vignette used in this experiment was again a modified version of the vignette used by Zamir, Ritov, and Teichman in their Experiment 4 ("Antibiotics") (51) because it was the most amenable to gain and loss versions. We employed a 2 (Domain: Gain, Loss) x 2 (Evidence type: Direct, Inference) between-subjects factorial design (see Appendix). Each participant was randomly assigned to one of the four conditions, which differed from one another only in those two independent variables. By comparing participants' decisions in the four conditions, we could determine how the domain and type of evidence affect decisions and whether they interact with one another.

In the Gain-Direct condition, respondents were asked to imagine that they were serving as a judge in a monetary suit filed by a dairy farmer against a small dairy that buys the farmer's milk. According to the contract, every day in which the milk's protein content is above 5%, the dairy must pay the farmer $1000 over and above the regular payment. It was further explained that the milk is delivered to the dairy by a tank truck that delivers the milk of two farmers and that the two farmers produce the same amount of milk. Since the milk of the two farmers is mixed in the tank, a sample is taken from each farmer's milk before it is pumped into the tanker, and the samples are delivered to a laboratory, where they are tested if necessary.

One day, the milk in the tanker contained 6% protein. When the dairy asked the lab to test each of the farmers' milk samples, the results revealed that the protein content in the milk produced by the plaintiff was 7%, while the protein content in the other farmer's milk was 5%. However, the reliability of the lab examinations was not perfect: the probability that the results were accurate (namely, that the protein content in the plaintiff's milk was indeed 7%, and that of the other farmer's milk 5%) was said to be 80%. The dairy paid the plaintiff the regular payment, and he claimed that he was entitled to the extra $1000 for high protein content.

In the Gain-Inference condition, the same description was provided--except that when the dairy asked the lab to test the milk samples, it was found that the plaintiff's sample had been lost at the lab, so only the other farmer's sample could be tested. The results revealed that the protein content in the milk produced by the other farmer was 5%. Once again, the description went on to explain that the reliability of the lab examination was not perfect: the probability that the result was accurate (namely, that the protein content in the other farmer's milk was 5%) was 80%.

In the Loss-Direct condition, the case was similar to the Gain-Direct condition--except that now the contract stipulated that the dairy deducts $1000 from its regular payment to the farmer if the protein content in his milk falls below 5%. When it was found that the milk in the tanker contained 4% protein, the two samples were tested, and the results showed that the protein content in the milk produced by the plaintiff was 3%, and the protein content in the milk produced by the other farmer was 5%. Here too, the probability that the results were correct was said to be 80%. The dairy deducted $1000 from the regular pay for low protein content, and the plaintiff claimed that he was entitled to the regular payment.

In the Loss-Inference condition, the contract was similar to the one in the Loss-Direct condition, as was the rest of the description--except that the plaintiff's sample had been lost at the lab, so the dairy deducted the $1000 from his payment based on the results of the examination of the other farmer's sample (which was 5%).

In all conditions, the farmer was the plaintiff, but in the Gain conditions he sued for the additional amount, while in the Loss conditions he sued to recover the deduction, that is, to eliminate his loss. This formulation was used to overcome decision-makers' tendency to dismiss claims unless the plaintiff's evidence is very compelling (due to their omission bias), as has been demonstrated in previous experimental studies. (52) By describing the farmer as the plaintiff in all conditions, we sought to neutralize the possible effect of describing him as a plaintiff in the Gain conditions and as a defendant in the Loss conditions. Consequently, in the Loss conditions, reliance on the (direct or inferential) evidence would lead to dismissal of the plaintiff's suit, whereas in the Gain conditions such reliance would lead to accepting it.

After reading the case description, respondents were asked the following questions:

How would you decide this case? Please specify a number between 1 and 9, where 1 indicates "I will surely accept the claim" and 9 indicates "I will surely dismiss the claim."

In your opinion, what is the probability, in percentage, that the [Gain: high; Loss: low] protein level is attributable to the plaintiff's milk?

How confident are you about your probability assessment in the former question? Please specify a number between 1 and 9, where 1 indicates "I am not confident at all" and 9 indicates "I am absolutely confident."

How fair do you think it will be to accept the claim in this case? Please specify a number between 1 and 9, where 1 indicates that accepting the claim will be "absolutely fair" and 9 indicates that accepting the claim will be "absolutely unfair."

We dubbed these questions Decision, Probability, Confidence, and Fairness, respectively.

(2) Results

To compare the results in the Gain conditions with those in the Loss conditions, we created a new variable that reflects the degree to which the decision is in the direction indicated by the evidence. We dubbed this measure Reliance. Thus, higher values of Reliance indicate a greater tendency to decide in line with the (direct or inferential) evidence. In the Loss conditions, Reliance is identical to the decision rating: the higher the Reliance value, the more likely the respondents are to reject the claim. In the Gain conditions, we inverted the decision rating: the higher the Reliance value, the more likely the respondent is to accept the claim. The mean Reliance ratings are presented in Figure 1.
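A minimal Python sketch of this transformation may help clarify the coding. The specific inversion rule (10 minus the rating on the 1-to-9 scale) is our assumption about the natural way to implement the inversion described above, not a detail reported in the Article:

```python
# Sketch of the Reliance transformation (our reconstruction).
# Decision is rated 1-9, where 1 = "surely accept the claim" and 9 = "surely dismiss".
# In the Loss conditions the evidence supports dismissal, so Reliance = Decision;
# in the Gain conditions the evidence supports acceptance, so the scale is inverted.
def reliance(decision: int, domain: str) -> int:
    if domain == "Loss":
        return decision          # higher rating = decision follows the evidence
    elif domain == "Gain":
        return 10 - decision     # assumed inversion of the 1-9 scale
    raise ValueError(f"unknown domain: {domain!r}")

# Example: a Gain-condition respondent who rates 2 ("accept") relies strongly on the evidence.
assert reliance(2, "Gain") == 8
assert reliance(8, "Loss") == 8
```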

An ANOVA of Reliance by Gain/Loss domain and Direct/Inference evidence type as between-subject independent variables yielded the following results. First, as predicted, the data replicated the anti-inference bias found by Zamir, Ritov, and Teichman: participants were significantly more reluctant to rely on inferential evidence than on direct evidence: mean Reliance scores were 5.04 in the Inference conditions, and 6.08 in the Direct conditions--where 1 indicates no reliance on the evidence, and 9 indicates total reliance. (53) The effect of Gain versus Loss was also highly significant: the mean Reliance score was higher in the Gain conditions than in the Loss ones: 6.34 vs. 4.71, respectively. (54)

We also found a significant interaction between the Gain/Loss and the Direct/Inference factors: (55) respondents were more reluctant to rely on the inferential evidence in the domain of losses than in the domain of gains. The mean Reliance scores in the Loss conditions were 3.91 and 5.51 for Inference and Direct, respectively, while the corresponding means in the Gain conditions were 6.06 and 6.63 for the Inference and Direct conditions, respectively. In fact, while the difference between the Loss-Direct and Loss-Inference was highly statistically significant (p<.001), the difference between Gain-Direct and Gain-Inference was only marginally significant (p=.092).
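The following hedged sketch shows how a 2 x 2 between-subjects ANOVA with an interaction term is typically specified. The per-respondent values are simulated around the reported cell means (the raw data are not reproduced here), the per-cell sample size is illustrative, and the use of statsmodels is our assumption rather than a description of the authors' software:

```python
# Sketch of the 2 (Domain) x 2 (Evidence type) ANOVA of Reliance, with interaction,
# on hypothetical data simulated around the reported cell means.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(1)
cells = {  # (domain, evidence): reported mean Reliance
    ("Loss", "Inference"): 3.91, ("Loss", "Direct"): 5.51,
    ("Gain", "Inference"): 6.06, ("Gain", "Direct"): 6.63,
}
rows = []
for (domain, evidence), mean in cells.items():
    for value in np.clip(rng.normal(mean, 2.3, 90), 1, 9):  # 90 per cell, illustrative
        rows.append({"domain": domain, "evidence": evidence, "reliance": value})
df = pd.DataFrame(rows)

model = ols("reliance ~ C(domain) * C(evidence)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects of domain, evidence, and their interaction
```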

The respondents' assessment of the probability that the deviation in the protein content was attributable to the farmer in question was higher in the Gain than in the Loss conditions. (56) Evidence type also had a marginally significant effect on this assessment of probability: (57) participants estimated the probability to be lower in the Inference conditions than in the Direct ones. The interaction between domain and evidence type did not reach a significant level (p=.22)--meaning that we found no support for the possibility that they affect each other, as far as the probability assessment is concerned.

The higher the probability estimates, the stronger the expressed Confidence in the assessment of probability. (58) Furthermore, Confidence was influenced by the type of evidence, such that it was higher in the Direct conditions than in the Inference ones (6.38 and 6.63 in Gain-Direct and Loss-Direct, respectively, and 5.78 and 5.82 in Gain-Inference and Loss-Inference, respectively). (59) Respondents were more confident about their assessment of probability when the evidence was direct. They were marginally more confident in the Gain conditions than in the Loss ones. (60) However, the interaction of domain and evidence type was not significant (p=.88). Thus, both the assessment of probability and the Confidence judgment were similarly affected by the type of evidence, regardless of the domain.

Furthermore, contrary to the previously noted possibility, the Probability and Confidence judgments did not mediate the effect of evidence type. Including Confidence and Probability as covariates in the ANOVA of Reliance by type of evidence and domain still yielded significant effects of type of evidence, (61) domain, (62) and a marginally significant effect of the interaction between them. (63) Finally, the fairness of accepting the claim was highly correlated with the decision to accept it. (64)

In summary, across the two domains, respondents relied on direct more than on inferential evidence. This effect cannot be accounted for by respondents' probability assessments or by their confidence in those assessments. More importantly for the present study, the tendency to rely on direct more than on inferential evidence was greater for decisions involving losses than for decisions involving gains.

C. Experiment 2b: Dairy Farmer Disputants

In Experiment 2a, the direct and inferential evidence pointed in opposite directions in the Loss and Gain conditions: in Gain, the evidence supported the claim, while in Loss the evidence supported its rejection. This design was meant to neutralize the possible effect of changing the identity of the plaintiff, which might have affected participants' decisions. However, there was a concern that the results were affected by the different role of the evidence in each condition: supporting the claim or supporting its rejection. To ensure the results were not driven by this factor, and to examine the generality of our findings, in Experiment 2b we described the same dispute without depicting the disputants as either plaintiffs or defendants.

(1) Participants, Materials, and Procedure

A total of 348 people took part in the experiment--274 of whom (120 women and 154 men) correctly answered the three comprehension questions included in the experiment, and only their responses were included in the analysis. (65) Participants' ages ranged from 19 to 76, with a mean of 35.

Experiment 2b employed a 2 (Domain: Gain, Loss) x 2 (Evidence type: Direct, Inference) between-subjects factorial design, using a vignette similar to the one used in Experiment 2a. However, unlike Experiment 2a, neither party was specified as being plaintiff or defendant. Thus, instead of "Assume that you are serving as a judge in a monetary suit that was filed by a dairy farmer against a small dairy," the opening sentence of the vignette was: "Assume that you are serving as a judge in a monetary dispute between a dairy farmer and a small dairy." Accordingly, the first question was not whether the suit should be accepted, but rather:
   How would you decide this dispute? Please specify a
   number between 1 and 9, where 1 indicates "A sum of
   $1000 should certainly be [Gain: added to; Loss: reduced
   from] the farmer's payment" and 9 indicates "A sum of
   $1000 should certainly not be [Gain: added to; Loss:
   reduced from] the farmer's payment."


(2) Results

The results of Experiment 2b substantially replicated those of Experiment 2a. Here, too, for the sake of clarity, we inverted the scale of Decision, such that Reliance increases when the decision is supported by the evidence. Figure 2 shows mean Reliance ratings in the four conditions. Once again, an ANOVA of Reliance by domain (Gain/Loss) and evidence type (Direct/Inference) revealed an anti-inference bias, as participants were significantly more reluctant to base their decision on the lab examination evidence when the evidence was inferential rather than direct. (66) Domain also had a highly significant effect: respondents were more inclined to increase the dairy farmer's remuneration in the Gain conditions than to decrease it in the Loss conditions. (67) More importantly for the present purpose, domain significantly interacted with evidence type (68)--indicating a stronger reluctance to rely on the inferential evidence when the decision involved losses as opposed to gains. As in Experiment 2a, while the difference between the Loss-Direct and Loss-Inference was statistically significant (p<.001), the difference between Gain-Direct and Gain-Inference was only marginally so (p=.056). Judgments of Probability, Confidence, and Fairness similarly replicated the patterns found in Experiment 2a. (69)

In summary, Experiments 2a and 2b showed a consistent pattern, whereby decisions about monetary outcomes are more strongly affected by evidence type when they involve losses rather than gains. This interaction effect was not evident in the attribution of probabilities, Confidence, or Fairness. While a greater inclination to rely on direct evidence than on inferential evidence was found in both the domain of losses and the domain of gains, in the Gain conditions the difference was only marginally significant.

D. Experiment 2c: A Within-Subject Design

(1) Motivation

In the third experiment, we set out to examine the robustness and generality of the findings of Experiments 2a and 2b, in several dimensions. Some heuristics and biases that have been demonstrated in between-subjects experimental designs disappear, or at least diminish, in within-subject designs because the within-subject design draws participants' attention to the differences between the decision-tasks, thus providing them with a chance to consider whether these differences are significant. (70) Biases that are manifested in both between- and within-subject experimental designs may thus be described as particularly strong.

In terms of external validity, on the one hand, judges (and juries all the more so) usually face a single decision, which is either in the domain of losses or in the domain of gains, and involves direct or merely circumstantial evidence. Hence, the between-subjects design, used in Experiments 2a and 2b, appears to be more appropriate. On the other hand, it is possible that professional judges, who make numerous decisions of all sorts, gain experience and learn to disregard those differences. Be that as it may, to test the robustness of the previous findings, we tested the anti-inference bias and its interaction with the loss/gain distinction in a within-subject design.

Experiment 2c also sought to ensure that the anti-inference bias was not a product of the subjects' inability to draw simple, straightforward inferences. To that end, we added a simple inferential question. Finally, similarly to Experiment 1 (and in contrast with Zamir, Ritov, and Teichman's Antibiotics experiment), all vignettes in Experiment 2c involved a lost sample, thus equalizing the amount of evidence and ruling out the possibility that the laboratory's perceived reliability differed between the direct and inference versions.

(2) Participants, Materials, and Procedure

A total of 102 people took part in the experiment--92 of whom (41 women and 51 men) correctly answered the two comprehension questions included in the experiment, and only their answers were included in the analysis. (71) Participants' ages ranged from 19 to 76, with a mean of 35.

Experiment 2c employed a 2 (Domain: Gain, Loss) x 2 (Evidence type: Direct, Inference) fully within-subjects factorial design, using a vignette similar to the one used in Experiment 2b. Since all subjects were presented with all four versions, the opening paragraph included both the loss and gain options:
   According to the contract, the payment to the farmer varies
   according to the percentage of protein in the milk. Every
   day in which the protein percentage in the milk is higher
   than 5%, $1000 is added to the regular payment, and every
   day in which the protein percentage is lower than 5%,
   $1000 is reduced from the regular payment.


Participants were then introduced to all four versions of Experiment 2b--Gain-Direct, Gain-Inference, Loss-Direct, and Loss-Inference--in random order. To minimize the differences between the Direct and Inference decision-tasks, each condition referred to only one lab result--pertaining to the sample provided by the farmer in question (in the Direct conditions) or that of the other farmer (in the Inference conditions)--since the other sample was lost. Thus, the Direct conditions stated: "When the dairy approached the lab to examine each of the farmers' milk samples, it was found that the milk sample of the other farmer was lost in the lab, so only the sample of the farmer in question could be examined."

After each scenario, participants were presented with the Decision and Probability questions, as in Experiment 2b. In addition, at the end of the experiment, participants were asked a new question (dubbed Ball) to check their basic inference comprehension:
   Imagine that you are asked to determine the color of a ball
   that you cannot see. The ball can be either blue or yellow.
   A friend tells you that the ball is not yellow. Assuming that
   your friend is telling the truth, what color is the ball? (Blue
   / Yellow / Impossible to know for sure / Another color).


(3) Results

The results of Experiment 2c substantially replicated those of Experiments 2a and 2b. Once again, for the sake of clarity, we inverted the scale of Decision, such that Reliance increases when the decision is supported by the evidence. Figure 3 shows mean Reliance ratings in the four scenarios. A repeated measure ANOVA with two within-subject factors--type of evidence (Direct vs. Inference) and domain (Gain vs. Loss)--yielded the following results. First, type of evidence and domain both significantly affected the decisions, as participants were significantly less willing to rely on the evidence when it was inferential and when it was in the Loss domain. (72) More importantly, the two factors significantly interacted with each other. (73) As Figure 3 shows, the type of evidence affected the decision more strongly in the Loss domain than in the Gain domain. Similarly to Experiments 2a and 2b, while the difference between the Loss-Direct and Loss-Inference was statistically significant (p<.001), the difference between Gain-Direct and Gain-Inference was only marginally so (p=.069).
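For the within-subject design, the analogous analysis is a repeated-measures ANOVA with two within-subject factors. The sketch below uses statsmodels' AnovaRM on simulated data (one row per participant per scenario); the cell means, the error structure, and the software choice are our assumptions, intended only to show the structure of the analysis:

```python
# Sketch of a repeated-measures ANOVA of Reliance with two within-subject factors,
# on hypothetical, balanced data (one observation per subject per condition).
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(2)
cell_means = {("Loss", "Inference"): 4.0, ("Loss", "Direct"): 5.5,   # hypothetical means
              ("Gain", "Inference"): 6.0, ("Gain", "Direct"): 6.6}
rows = []
for subject in range(92):                      # 92 respondents, as in Experiment 2c
    baseline = rng.normal(0, 1)                # subject-level shift shared across scenarios
    for (domain, evidence), mean in cell_means.items():
        rows.append({
            "subject": subject,
            "domain": domain,
            "evidence": evidence,
            "reliance": float(np.clip(mean + baseline + rng.normal(0, 1.5), 1, 9)),
        })
df = pd.DataFrame(rows)

result = AnovaRM(df, depvar="reliance", subject="subject",
                 within=["evidence", "domain"]).fit()
print(result)  # F-tests for evidence, domain, and the evidence x domain interaction
```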

Analysis of the probability judgments yielded significant main effects of evidence type and domain (74) but no significant interaction between them (p=.759). The subjective probability assessments were significantly higher in the Gain conditions than in the Loss conditions, and higher in the Direct conditions than in the Inference conditions. (75)

Finally, of the ninety-two respondents, only two did not answer the independent inference question (Ball) correctly. Thus, restricting the analysis to participants who answered the question correctly did not have a substantial influence on the results.

E. Discussion

The three experiments described above replicated the findings of Zamir, Ritov, and Teichman regarding the anti-inference bias. (76) They reinforced the generality and robustness of those findings, and for the first time revealed an important interaction between the anti-inference bias and loss aversion.

The experiments demonstrated that the reluctance to base decisions on circumstantial evidence extends far beyond the sphere of naked statistical evidence, which has previously been the focus of psychological research on circumstantial evidence. (77) The experiments were conducted with laypersons, who may nevertheless fulfill the role of judicial factfinders when serving as jurors. Similarly to Experiment 1, Experiment 2c demonstrated that the anti-inference bias persists with smaller differences between the Direct and Inference scenarios (in terms of the number of laboratory results), and that it is not a product of a failure to draw simple logical inferences. The robustness of the bias was further established in a within-subject design, which juxtaposed direct and inference evidence.

In Experiments 2a and 2b we asked the participants about the fairness of the decision based on the available evidence. We found a statistically significant positive correlation between the participants' willingness to rely on the evidence and their judged fairness of doing so. However, it is not clear how to interpret this correlation. Possibly, the fairer a decision to impose liability is perceived to be, the more respondents find such imposition of liability appropriate and acceptable.

Previous studies examined seven theories as to why the anti-inference bias may be justified or explained, and found that none of them can fully account for it. (78) One theory that had not been tested was that the anti-inference bias reflects the factfinders' limited confidence in their ability to assess the probability that the disputed incident had taken place based on the circumstantial evidence. (79) Our findings reveal that this theory does not, in fact, account for the anti-inference bias, since the respondents' degree of confidence in their assessment of probability did not mediate their disinclination to rely on the inferential evidence (Experiments 2a and 2b).

Indirectly, these findings shed light on another possible explanation for the disinclination to rely on circumstantial evidence, namely its greater complexity compared to direct evidence. Were factfinders more reluctant to rely on circumstantial evidence because of their difficulty in understanding and drawing conclusions from such evidence, one would expect this difficulty to affect, first and foremost, their confidence in their assessment of probability. The fact that confidence does not mediate the disinclination to rely on circumstantial evidence suggests that the anti-inference bias cannot be fully accounted for by the greater complexity of such evidence.

More importantly, the current set of experiments extends our understanding of the anti-inference bias and of the extensively studied phenomenon of loss aversion. The anti-inference bias appears to be considerably stronger in the domain of losses: in all three experiments, while subjects were significantly more reluctant to rely on circumstantial evidence in the domain of losses, they were only marginally so in the domain of gains. The finding that the anti-inference bias is more pronounced in the domain of losses than in the domain of gains reveals a new facet of the well-studied phenomenon of loss aversion, namely, a greater reluctance to inflict a loss than to provide a gain, based on inferential evidence. This reluctance echoes the consistency found between loss aversion and the prevailing moral convictions that distinguish between the moral duty to benefit others and the much stricter moral prohibition against harming others--thus adding a new facet to these psychological and normative distinctions.

The finding that the anti-inference bias is more pronounced in the domain of losses than in the domain of gains suggests that this bias is not merely epistemological. It is not only a matter of people's convictions about knowledge, how it can be acquired and what justifies one's beliefs, but also, and perhaps primarily, a normative matter: pertaining to how one ought to act, morally speaking. This observation is supported by the correlation between the willingness to rely on the evidence and the judged fairness of doing so. It is further corroborated by Experiment 2c, in which both the anti-inference bias and the gain/loss distinction were demonstrated in a fully within-subject design. Even when subjects' attention was drawn to the similarities between the direct and indirect evidence and between gains and losses, most of them felt that it would be less appropriate to rely on circumstantial evidence--particularly when it involved inflicting losses. While the results of the between-subject design might be interpreted as pointing to an unconscious bias, the within-subject design indicates that the anti-inference bias is at least partly conscious. Of course, this does not mean that it is justified.

Be that as it may, the finding that the anti-inference bias is more pronounced in the domain of losses than in that of gains has practical implications, especially inasmuch as the perception of an outcome as involving gains or losses is subject to framing effects. The perception of outcomes as belonging to the domain of gains or to the domain of losses is sometimes manipulable. Thus, in the famous Asian Disease experiment, Amos Tversky and Daniel Kahneman demonstrated that people may be induced to perceive the same situation as involving either gains or losses simply by describing the expected results of a medical treatment in terms of either the patients' survival rate (gains) or their mortality rate (losses). (80) In the judicial context, criminal sanctions are almost invariably perceived as inflicting losses on the defendant. However, in civil litigation, a judgment that entails a loss to one party (for example, because it requires the defendant to pay damages) at the same time typically provides gains to the other party, the plaintiff. Accordingly, a civil suit may be framed either way. In fact, previous studies have demonstrated that the same outcomes of litigation can sometimes be framed as either providing gains or eliminating losses to the same party, depending on the induced reference point. (81) Similarly, in the three experiments described above, although the farmer's gain inevitably implied a loss to the dairy and the farmer's loss implied a gain to the dairy, we managed to frame the decision tasks as involving either gains or losses. Insofar as our findings adequately capture the inclinations of legal factfinders, it follows that when a litigant relies exclusively on circumstantial evidence, her opponent would do well to emphasize the losses that would be inflicted if the evidence is relied upon, rather than the ensuing gains--thereby triggering a stronger anti-inference bias. Of course, the other party might try to invoke the opposite framing.

Having studied the relationship between the anti-inference bias and loss aversion, we now turn to our second mission: to explore the relationship between the anti-inference bias and the severity-leniency hypothesis.

III. SEVERITY OF SANCTIONS

A. The Severity-Leniency Hypothesis

There is a common belief among jurists and legal commentators that judicial decision-makers adjust their verdicts to the severity of the expected sanction: the harsher the sanction, the less judges and jurors are inclined to convict, hence they apply a higher standard of proof. (82) This belief--commonly dubbed the severity-leniency hypothesis--has an intuitive normative appeal, as the costs of erroneous conviction increase the harsher the sanction is. (83) However, this reasoning is inconclusive, as judicial factfinders may also be concerned about Type II errors (acquitting the guilty) or apply the prescribed standard of proof regardless of the severity of sanction. (84)

Turning to empirical studies, there is support for the proposition that the rate of acquittals increases with the severity of offenses and prescribed penalties. (85) However, one difficulty with observational studies of court decisions is that reduced conviction rates for more serious offenses and harsher sanctions may result from variables other than the factfinders' inclinations--such as the effort defendants put into their defenses. It stands to reason that defendants who face more serious charges and more severe sanctions put more effort into their defenses and therefore do better. (86)

Moreover, even some of the experimental studies that controlled for such additional variables failed to distinguish between the severity of the sanction and the seriousness of the offense. (87) A more serious offense would normally include additional elements, such as the requirement of premeditation for first-degree, but not for second-degree, murder. Hence, a lower conviction rate for more serious offenses does not establish a correlation between the severity of sanction and the inclination to convict, since it is perfectly sensible that a given set of evidence would result in fewer convictions when more elements are necessary for conviction. While there is consistent and unsurprising empirical support for a negative correlation between the seriousness of charges and the inclination to convict, there is mixed evidence (and disagreement about the interpretation of the available evidence) regarding the correlation between the severity of sanctions (controlling for the charge) and the conviction rate. (88) Even when capital punishment is involved, the picture is not unequivocal. (89)

A recent study by Angela Jones, Shayne Jones, and Steven Penrod possibly sheds light on the mixed results regarding the severity-leniency hypothesis. (90) It found that while harsher sanctions had no effect on the overall conviction rate, they did affect factfinders' inclination to convict, though in opposite directions. While civil libertarians (people who place more emphasis on civil liberties and the rights of the accused) were less likely to convict at higher levels of punishment, so-called legal authoritarians (people who put less emphasis on liberties and the rights of the accused) were more likely to convict. Perhaps tough-on-crime people are more inclined to convict when the sanction is higher because they perceive the sanction as a mark of the importance of fighting a certain type of offense.

Resolving the controversy over the severity-leniency hypothesis lies beyond the scope of the present Article. Nevertheless, we sought to examine the related question of whether the severity of the outcomes of imposing legal liability would affect people's inclination to rely on inferential evidence. We hypothesized that, inasmuch as the severity-leniency hypothesis is correct, the reluctance to impose liability when its outcomes are more severe might also exacerbate the anti-inference bias. Accordingly, we conducted three experiments in which we varied the severity of the outcome. As with the experiments described above, these, too, were conducted through M-Turk, and participants were paid $0.70 for their participation.

B. Experiment 3a: Highway

A total of 346 people took part in Experiment 3a. We report the answers of the 303 participants (129 women and 174 men) who correctly answered three comprehension questions. Participants' ages ranged from 20 to 72, with a mean of 34.

The experiment used a speed violation scenario based on Zamir, Ritov, and Teichman (see Appendix). (91) We employed a 2 (Sanction severity: Mild, Severe) x 2 (Evidence type: Direct, Inference) between-subject factorial design. In the Direct conditions, a speed camera documented a driver exceeding the speed limit by 15 mph. In the Inference conditions, the camera system did not document the cars' speed, but there were two cameras that documented the precise time that each car passed by them, allowing the driver's speed to be inferred from the distance between the cameras and the time that elapsed between the two points. In all conditions, participants were told that "the probability of an error in the camera system is 2%." In the Mild-Sanction conditions, the penalty for exceeding the speed limit was a $50 fine. In the Severe-Sanction conditions, the penalty was suspension of the driving license for one month, plus a $100 fine.
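
To make the nature of the inference concrete, the following minimal Python sketch illustrates the calculation on which the Inference vignette relies; the camera spacing and timestamps are illustrative values chosen only so that the computed speed matches the 70 mph figure in the vignette, not parameters of the actual experimental materials.

   from datetime import datetime

   SPEED_LIMIT_MPH = 55
   CAMERA_DISTANCE_MILES = 7.0  # illustrative spacing between the two cameras

   # Illustrative timestamps recorded by the two cameras for the same car.
   t1 = datetime(2017, 5, 1, 23, 10, 0)
   t2 = datetime(2017, 5, 1, 23, 16, 0)  # six minutes later

   elapsed_hours = (t2 - t1).total_seconds() / 3600
   inferred_speed_mph = CAMERA_DISTANCE_MILES / elapsed_hours  # distance / time

   print(round(inferred_speed_mph))             # 70
   print(inferred_speed_mph > SPEED_LIMIT_MPH)  # True: 15 mph over the limit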

An ANOVA of Reliance by sanction (Severe/Mild) and type of evidence revealed a significant anti-inference bias, (92) but no significant effect for sanction severity, (93) and no interaction between the two factors (p=.578).
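
For illustration, the following minimal Python sketch shows how a two-way between-subjects ANOVA of this kind might be run for Experiments 3a-c; the file name and column names are illustrative placeholders rather than our actual data files or analysis code.

   import pandas as pd
   import statsmodels.api as sm
   from statsmodels.formula.api import ols

   # Illustrative data: one row per participant (between-subjects design),
   # with factors 'sanction' (Mild/Severe) and 'evidence' (Direct/Inference).
   df = pd.read_csv("experiment3a.csv")  # columns: sanction, evidence, reliance

   # Two-way factorial ANOVA with an interaction term.
   model = ols("reliance ~ C(sanction) * C(evidence)", data=df).fit()
   print(sm.stats.anova_lm(model, typ=2))  # main effects and interaction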

C. Experiment 3b: Bus

To examine the generality of the results of Experiment 3a, Experiment 3b employed a similar design but with a different vignette (see Appendix). A total of 359 people took part in Experiment 3b--311 of whom (148 women and 163 men) correctly answered the comprehension questions, and only their answers were included in the analysis. Participants' ages ranged from 19 to 72, with a mean of 35.

A 2 (Sanction Severity: Mild, Severe) x 2 (Evidence Type: Direct, Inference) between-subject factorial design was used in this experiment as well. In the Direct conditions, participants were asked to imagine that a tourist bus was stuck on an isolated backroad late in the evening. A policeman who arrived at the scene helped the driver summon minibuses from nearby to drive the tourists to their destination. While doing so, the policeman got on the bus and counted 54 tourists on it--four more than the 50 passengers allowed under the terms of the bus's permit. Based on the policeman's report, the driver was charged with carrying an excessive number of passengers. The driver pleaded not guilty. The judge got the impression that the policeman was reliable and assessed the chances that he had made a counting error to be very low: 1 in 25. After being reminded that the required standard of proof was beyond reasonable doubt, participants were asked whether the judge should convict the driver.

The only difference between the Direct and Inference conditions was that in the latter, the policeman did not get on the bus. Instead, he noted that after two minibuses had arrived empty, filled up with passengers (the policeman counted the number of seats in each minibus and made sure that they were filled before the minibuses drove away), four tourists were left, for whom an additional vehicle was ordered. Since there were 25 seats in each minibus, the policeman inferred that the bus had carried 54 tourists--four more than the number allowed under the terms of the bus's permit. The judge's assessment of the policeman's reliability and the remaining instructions were identical to those in the Direct conditions.
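
The inference in this vignette amounts to simple arithmetic; the following snippet (purely illustrative, not part of the experimental materials) spells it out:

   MINIBUS_SEATS = 25
   FULL_MINIBUSES = 2        # minibuses observed to fill up and drive away
   REMAINING_TOURISTS = 4    # tourists left after the two minibuses departed
   PERMIT_LIMIT = 50

   inferred_passengers = FULL_MINIBUSES * MINIBUS_SEATS + REMAINING_TOURISTS
   print(inferred_passengers)                 # 54
   print(inferred_passengers - PERMIT_LIMIT)  # 4 passengers over the permitted 50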

As for the sanctions, in the Mild-Sanction conditions, the penalty for carrying an excessive number of passengers was $50. In the Severe-Sanction conditions, the penalty was suspension of the driving license for one month, plus a $100 fine.

Similarly to Experiment 3a, an ANOVA of Reliance by Sanction (Severe/Mild) and type of evidence revealed a significant anti-inference bias, (94) but no significant effect for sanction severity (95) and no interaction between the two factors (p=.785).

D. Experiment 3c: Antibiotics

Since the difference between the mild and severe sanctions in Experiments 3a and 3b was relatively small (although a one-month license suspension is quite harsh for a professional driver), and since the between-subjects design may have made this difference inconspicuous in the absence of a reference point, we conducted yet another experiment. In this experiment, the damages sought in the Severe-Sanction conditions were 25 times greater than in the Mild-Sanction conditions.

A total of 292 people took part in the experiment, 250 of whom (133 women and 117 men) correctly answered the two comprehension questions, and only their answers were included in the analysis. (96) Participants' ages ranged from 19 to 70, with a mean of 35.

As with Experiments 3a and 3b, Experiment 3c used a between-subjects design with four conditions. Similarly to Experiment 1, the case description was based on Zamir, Ritov, and Teichman's Antibiotics Experiment. (97) It began with the following description:
   Imagine that you are a judge in a suit for damages filed by
   a small dairy against a dairy farmer. The farmer sells the
   dairy the milk he produces. According to the contract, the
   farmer should make sure that there are no antibiotics
   residues in the milk (that is, he must ensure that if one of
   the cows is treated by antibiotics, her milk will not be sent
   to the dairy), because such residues obstruct the production
   of various products. The dairy claims that the farmer
   delivered milk with antibiotics residues, and that
   consequently the yogurt production process failed and it
   had to discard all of the raw materials it used in the
   process. As a result, it suffered [Mild: a small loss of
   $1,000; Severe: a large loss of $25,000].


The vignettes went on to describe the delivery of the milk from the two farmers in a tank truck, the sampling of each farmer's milk, and its testing in a laboratory, if necessary. In this experiment, the reliability of the laboratory tests was said to be 85%.

In the Direct conditions, both samples were tested when the yogurt production process failed--revealing antibiotics residues in the defendant's milk and no such residues in the other farmer's milk. In the Inference conditions, the sample of the defendant's milk was lost at the laboratory--so only the other farmer's sample was tested, revealing no antibiotics residues in the other farmer's milk.

The four questions at the end of the vignettes--Decision, Probability, Confidence, and Fairness--were based on the same four questions used in Experiments 2a-c, while emphasizing the severity of the sanction. Thus, the Decision question was worded as follows:
   How would you decide this case? Please specify a number
   between 1 and 9, where 1 indicates "I will surely accept
   the claim and order the farmer to compensate the dairy for
   its [Mild: small; Severe: large] loss" and 9 indicates "I
   will surely dismiss the claim."


An ANOVA of Reliance by Sanction (Severe/Mild) and evidence type (Direct/Inference) once again revealed an anti-inference bias, as participants were more reluctant to rely on the lab examination evidence when the evidence was inferential rather than direct. (98) However, Sanction had no significant effect, (99) nor did it interact significantly with evidence type. (100)

E. Discussion

Experiments 3a-c demonstrated an anti-inference bias, but no effect for sanction severity. Due to the between-subjects design of the experiments, and in the absence of a reference point, it may be argued that the differences in the sanctions were inconspicuous. However, this is not the case in Experiment 3b, where the penalty is a one-month driving license suspension for a professional driver, plus an increase in the monetary fine, or in Experiment 3c, where there is a 1:25 ratio between the respective penalties, which is further underscored by the adjectives "large" and "small." It appears, therefore, that the anti-inference bias is unaffected by sanction severity, at least in the range of differences examined in our experiments.

These results shed light on the underlying debate as to whether or not the severity of sanction per se affects the inclination to impose legal liability, since no such effect was found in our experiments. While drawing conclusions from the absence of an effect is obviously very risky, our experiments strengthen the doubts about the belief that severity of sanctions per se has a net effect on people's inclination to impose liability. As previously noted, a more modest hypothesis, according to which the inclination to impose liability depends on the seriousness of the misconduct and the severity of its consequences, is more plausible and enjoys greater empirical support. We have not tested the latter hypothesis because it is less relevant to the anti-inference bias, which is the focus of our inquiry. A lesser inclination to impose legal liability when more elements are substantively required for such liability is perfectly sound irrespective of the type of evidence involved. More studies are necessary to tease out the connection between the seriousness of the legal accusation, in terms of its constituent elements, and the anti-inference bias.

Be that as it may, all six experiments described above replicated the finding of a robust anti-inference bias, and some of them further extended its generality and robustness. We thus turn to our final topic--namely the possibility of debiasing this bias.

IV. EXPERIMENT 4: DEBIASING THE ANTI-INFERENCE BIAS?

Inasmuch as the anti-inference bias accurately reflects judicial decision-making in real life, it may result in problematic and hardly justifiable decisions. Hence, an important policy question is whether this bias can be eliminated or reduced. If this bias stems from a superficial processing of the information, it might perhaps be reduced by inducing respondents to consider the evidence more extensively. In particular, we set out to examine whether highlighting the logical inevitability of the inference, and/or emphasizing the positive, direct findings that give rise to the inference in the inference condition, would affect respondents' decisions.

A. Participants, Materials, and Procedure

A total of 362 subjects took part in Experiment 4 through M-Turk (being paid $0.70 for their participation), 314 of whom (153 women and 161 men) correctly answered the two comprehension questions, and only their responses were included in the analysis. (101) The participants' ages ranged from 18 to 78, with a mean of 35.

Experiment 4 included five conditions in a between-subjects design. The case description was the same as used in Experiment 3c--only this time the dairy's loss was $5,000, and it did not vary between the conditions. Only the Direct condition used the direct evidence, while the other four used the inferential evidence, as described in previous experiments. In addition to the questions asked in Experiment 3c--i.e., Decision, Probability, Confidence, and Fairness (hereinafter, collectively referred to as the original questions)--the present experiment included two additional questions, which we dubbed Logic and Proof. Logic went as follows:
   Given that the source of the antibiotics residues could only
   be the milk of one of the two farmers, would a proof that
   there were no antibiotics residues in the other farmer's
   milk necessarily imply that there were antibiotics residues
   in the defendant's milk? (Yes / No).


The Proof question read:
   Was it proven that there were no antibiotics residues in the
   other farmer's milk? Please specify a number between 1
   and 9, where 1 indicates "It was definitely proven" and 9
   indicates "It was definitely not proven."


While all six questions appeared in all five conditions, their order varied between conditions. In all conditions, the original questions appeared in the same order: Decision, Probability, Confidence, and Fairness. In both the Direct (D) and No-Guidance (NG) conditions, the Logic and Proof questions were added, in that order, after the four original ones. Since participants could not change their answers to previous questions, answering Logic and Proof could not have affected the answers to the original questions in these conditions. In the Proof-Before (PB) condition, the Proof question was presented before the four original questions, and Logic was presented at the end. In the Logic-Before (LB) condition, only Logic was posed before the four original questions, and Proof was posed at the end. Finally, in the Logic-and-Proof-Before (LPB) condition, both Logic and Proof were presented before the original questions. Thus, while the PB and LB conditions examined the possible debiasing effect of the Proof and Logic questions, respectively, the LPB condition examined the cumulative effect of both questions. (102)
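
To make the ordering manipulation easier to follow, the following schematic representation (purely illustrative; in particular, the relative order of Logic and Proof within the LPB condition is stated here by assumption) lists the question order in each of the five conditions:

   ORIGINAL_QUESTIONS = ["Decision", "Probability", "Confidence", "Fairness"]

   # Question order per condition: D = Direct, NG = No-Guidance,
   # PB = Proof-Before, LB = Logic-Before, LPB = Logic-and-Proof-Before.
   QUESTION_ORDER = {
       "D":   ORIGINAL_QUESTIONS + ["Logic", "Proof"],
       "NG":  ORIGINAL_QUESTIONS + ["Logic", "Proof"],
       "PB":  ["Proof"] + ORIGINAL_QUESTIONS + ["Logic"],
       "LB":  ["Logic"] + ORIGINAL_QUESTIONS + ["Proof"],
       "LPB": ["Logic", "Proof"] + ORIGINAL_QUESTIONS,  # internal order assumed
   }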

B. Results

For the sake of clarity of presentation and consistency with the earlier studies, we converted the decision ratings into a Reliance score, such that higher values reflect the respondent's greater willingness to rely on the evidence and accept the claim. Figure 4 presents the mean Reliance score for the five conditions. An ANOVA of Reliance by condition yielded a highly significant result. (103) A planned comparison test comparing Direct with the four Inference conditions revealed a significantly lower inclination to accept the claim in the latter conditions--thus, once again, demonstrating a highly statistically significant anti-inference bias. (104)

Asking the guidance questions before the original questions did not, however, eliminate the anti-inference bias--as post-hoc Tukey's HSD tests show: (105) every difference between Direct and each of the other conditions was significant. (106) Focusing solely on the Inference conditions, a one-way ANOVA of Reliance yielded no significant effect of condition (p=.212). Furthermore, comparing the No-Guidance (NG) inference condition with the three guided conditions (PB, LB, and LPB) yielded only a marginally significant result (t(257)=1.792, p=.074).
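
The following minimal Python sketch illustrates this kind of analysis, including the conversion of the 1-9 decision ratings into Reliance scores and a Tukey HSD comparison across the five conditions; the file name and column names are illustrative placeholders rather than our actual data files or analysis code.

   import pandas as pd
   from statsmodels.stats.multicomp import pairwise_tukeyhsd

   # Illustrative data: one row per participant, with the condition label
   # (D, NG, PB, LB, LPB) and the 1-9 decision rating (1 = surely accept).
   df = pd.read_csv("experiment4.csv")  # columns: condition, decision

   # Convert decisions into Reliance scores: higher values = greater
   # willingness to rely on the evidence and accept the claim (assumed coding).
   df["reliance"] = 10 - df["decision"]

   # Post-hoc pairwise comparisons of Reliance across the five conditions.
   tukey = pairwise_tukeyhsd(endog=df["reliance"], groups=df["condition"], alpha=0.05)
   print(tukey.summary())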

C. Discussion

It appears that guiding participants by asking them about the logical conclusion of the inference from the proven facts, and/or about the proof of those facts, does not eliminate or even statistically significantly mitigate the anti-inference bias. These findings attest to the robustness of the anti-inference bias and to its practical implications, as there seems to be no simple way to undo or circumvent its impact. Arguably, these findings also support the understanding of the anti-inference bias as being associated with people's normative judgments, rather than their epistemological convictions, as the combined effect of the Logic and Proof questions should have presumably convinced them that there is no epistemological basis for differentiation between the two types of evidence.

To be sure, the experimental findings do not (and could not) prove that there is no way to overcome the anti-inference bias. They do, however, show that the bias is quite strong and resilient, and therefore merits the attention of legal policymakers.

V. GENERAL DISCUSSION

The eight experiments reported in this Article extend our understanding of the judicial factfinding process, specifically its susceptibility to the anti-inference bias. They replicate previous findings of this bias under new experimental designs and more exacting conditions, including in a within-subject design and with fewer differences between the pertinent conditions than in previous studies. They demonstrate, for the first time, that this bias is considerably more pronounced in the realm of losses than in the realm of gains, thus contributing to the study of loss aversion and people's moral judgments. Within limits, they show that the anti-inference bias is evident whether the expected sanction is large or small. They demonstrate the great difficulty of debiasing the anti-inference bias by simple, but apparently powerful, means.

In none of our experiments did the direct and circumstantial evidence differ in terms of the nature of the evidence. In all experiments, both types of evidence were of the same kind: laboratory results (Experiments 1, 2a-c, 3c, and 4), technological evidence (Experiment 3a), or eyewitness testimony (Experiment 3b). In that respect, the anti-inference bias is fundamentally different from, and more general than, the disinclination to impose liability based on purely statistical (rather than case-specific) evidence, as in the gatecrasher example mentioned above (the Wells effect). (107) However, there are also important similarities between the anti-inference bias and the Wells effect.

Much like the latter, the anti-inference bias reflects a deep-seated normative intuition about the appropriateness of imposing liability based on a certain type of evidence, which cannot be accounted for by subjects' inability to draw logical conclusions from the available information. The normative nature of this intuition is highlighted by the findings that the bias was not mediated by the participants' subjective probability assessments or by their confidence in those assessments (Experiments 2a-b), was considerably more pronounced in the realm of losses than in the realm of gains (Experiments 2a-c), persisted when the similarity between the two types of evidence could not be ignored (Experiment 2c), and was exhibited even when subjects were confronted with the logical inevitability of the inference (Experiment 4). There appears to be a common intuition that imposing liability based on circumstantial evidence is inherently less appropriate than imposing it on the basis of direct evidence. Moreover, the fact that even subjects whose attention was drawn to the inevitability of the logical inference (and who were able to draw simple logical inferences) displayed the anti-inference bias (Experiment 4) reveals a conscious objection to relying on circumstantial evidence, especially in the domain of losses.

The systematic aversion to basing liability on circumstantial evidence may drive judicial decision-making astray. It may lead to unjust verdicts (treating substantively similar cases differently) and compromise the deterrent effect of the law. Inasmuch as the legal system strives to gain popular support, reflecting prevalent misconceptions about circumstantial evidence may have a desirable side effect. (108) However, this does not appear to be a compelling reason to favor biased decision-making. Therefore, there is at least a prima facie case for trying to debias the anti-inference bias. Alas, the results of Experiment 4 do not give rise to much optimism in this regard. Drawing factfinders' attention to the inevitability of an inference did not eliminate or even statistically significantly reduce the anti-inference bias. Other measures may perhaps prove more effective. Two straightforward options that come to mind are instructing factfinders to attribute equal weight to direct and circumstantial evidence, and drawing their attention to the existence of the anti-inference bias, perhaps through expert testimony. However, the experience of using such techniques in similar contexts does not give much basis for optimism, either. Empirical studies of the effect of jury instructions in other areas show that instructions are often ineffective. (109)

Since the impact of the anti-inference bias in the domain of gains is considerably smaller than in the domain of losses, one indirect way of diminishing its effect may possibly be to reframe the decision as involving gains, rather than losses. While such reframing of criminal sanctions is highly unlikely, it is more conceivable in civil litigation, where one party's loss is typically the other's gain. (110) The fact that the litigants have conflicting interests regarding the framing of the decision may induce them to propose such competing framings.

Other, possibly more promising, techniques, which are already in use, include adopting evidentiary presumptions, redefining the grounds for legal liability, and improving initial evidence gathering. One example of the first measure is the rule that, in the absence of a reasonable explanation, a person in possession of recently stolen property is presumed to have stolen it. Presumptions steer factfinders toward more accurate decisions. An example of the second measure is redefining certain acts of preparation for a crime as punishable per se--thereby removing the need to prove that the defendant had attempted or contemplated a crime. (111) An example of the third measure is the use of CCTV systems. In reestablishing the broad scope and robustness of the anti-inference bias, the present study corroborates the need for such measures (notwithstanding, of course, conflicting normative considerations, which must be considered in each case).

As demonstrated in our discussion of studies of the severity-leniency hypothesis, experimental studies are uniquely suited to studying the effect of particular variables on people's judgment and decision-making. At the same time, such studies raise concerns about their external validity, as there are various differences between the experimental decision-making environment and the real-world judicial environment. Given such differences--in terms of the method of presenting evidence, the time frame, and the interaction between the various participants in the litigation process (litigants, lawyers, judges, juries, and witnesses)--one must be very cautious about drawing policy recommendations from laboratory results.

While our findings resolve some issues concerning the anti-inference bias, they leave considerable room for further research. For one thing, it would be useful to examine the generality of our findings with other vignettes, experimental designs, and populations--including, if possible, professional judges. Future research should also directly examine the effectiveness of other debiasing techniques, such as drawing factfinders' attention to the existence of the anti-inference bias, requiring them to provide explanations for their decisions (accountability), and so forth. Finally, since judicial factfinding is often carried out in groups--juries or panels of judges--it would be interesting to examine how group decision-making affects this bias. (112)

CONCLUSION

For several decades, behavioral studies have greatly enhanced our understanding of judicial decision-making and provided important normative and policy implications. (113) More recently, with the emergence of empirical legal studies, legal scholars are no longer content with borrowing insights from the psychological literature, but rather directly engage in experimental studies that are designed to clarify legal problems. The involvement of legal scholars in such studies may contribute to the identification of questions that are of particular interest to the law, expose differences between the experimental environment and the real world, and underline the need for caution in drawing policy conclusions from empirical findings. Hopefully, the experiments reported in this Article and the ensuing analyses meet these challenges.

An additional, more ambitious, goal of experimental legal studies may be to advance psychological research beyond the legal context. While we have not dealt with such broader implications of the anti-inference bias in this Article, our findings may well motivate further research in this direction. After all, judicial factfinding is not the only context where people often follow the maxim "seeing is believing."

APPENDIX

Experiment 1--Antibiotics--One Sample

Imagine that you are a judge in a monetary suit filed by a small dairy against a dairy farmer. The farmer sells the dairy the milk he produces. According to the contract, the farmer should make sure that there are no antibiotics residues in the milk (that is, he must ensure that if one of the cows is treated by antibiotics, her milk will not be sent to the dairy), because such residues obstruct the production of various products. Every day in which the milk delivered by the farmer contains antibiotics residues, according to the contract he must pay the dairy an agreed sum of $5,000. The dairy claims that the farmer delivered milk with antibiotics residues, so he must pay it the agreed sum.

The milk is delivered to the dairy by a tank truck that transports the milk of two farmers. Since the milk of the two farmers is mixed in the tank, a sample is taken from each farmer's milk before pumping it into the tanker, and the samples are delivered to a laboratory, where they will be examined if necessary.

When the yogurt production process failed, the milk samples of the two farmers were to be examined (it is undisputed that the source of the antibiotics residues could only be the milk of one of them), but it turned out that the sample of [Direct: the other farmer's milk was lost in the laboratory; hence it was only possible to examine the defendant's sample. This examination revealed that there were antibiotics residues in the defendant's milk; Inference: the defendant's milk was lost in the laboratory; hence it was only possible to examine the other farmer's sample. This examination revealed that there were no antibiotics residues in the other farmer's milk]. The reliability of the laboratory testing is not perfect. The probability that the results of the laboratory examinations are correct (that is, that there [Direct: were antibiotics residues in the defendant's milk; Inference: were no antibiotics residues in the other farmer's milk]) is 85%.

Based on these results, the dairy claims that the farmer should pay it the agreed sum.

1. How would you decide this suit? Please specify a number between 1 and 9, where 1 indicates "I would certainly accept the claim and order the farmer to pay the dairy the agreed sum" and 9 indicates "I would certainly dismiss the claim and would not order the farmer to pay the dairy the agreed sum."

2. In your opinion, what is the probability, in percentage, that there were antibiotics residues in the defendant's milk?

3. The milk of how many farmers was mixed in the tank truck? (1 / 2 / 3 / 4).

4. What was the accuracy rate of the lab examinations? (80% / 85% / 90% / 95%).

Experiment 2a--Dairy Farmer

Assume that you are serving as a judge in a monetary suit that was filed by a dairy farmer against a small dairy. The farmer sells the dairy the milk he produces. According to the contract, the dairy [Gain: increases; Loss: decreases] the payment to the farmer if the milk he produces is [Gain: rich; Loss: low] in protein, because this kind of milk has a [Gain: higher; Loss: lower] value. Every day in which the protein percentage in the milk is [Gain: higher; Loss: lower] than 5%, $1000 is [Gain: added to; Loss: reduced from] the regular payment.

The milk is delivered to the dairy by a tank truck that transports the milk of two farmers. Each one of the farmers produces the same amount of milk. Since the milk of the two farmers is mixed in the tank, a sample is taken from each farmer's milk before pumping it into the tanker, and the samples are delivered to a laboratory, where they will be examined if necessary.

One day, the milk in this tanker contained [Gain: 6%; Loss: 4%] protein. When the dairy approached the lab to examine each of the farmer's milk samples, Direct: the results revealed that the protein percentage in the milk produced by the plaintiff was [Gain: 7%; Loss: 3%], and the protein percentage in the milk produced by the other farmer was 5%. Inference: it was found that the plaintiff's milk sample was lost in the lab, so only the other farmer's sample could be examined. The results revealed that the protein percentage in the milk produced by the other farmer was 5%.

However, the reliability of the lab examinations is not perfect. The probability that the results are accurate (namely, Direct: that the protein percentage in the plaintiff's milk was 7%, and the protein percentage in the other farmer's milk was 5%; Inference: that the protein percentage in the other farmer's milk was 5%) is 80%.

[Gain: The dairy paid the plaintiff the regular payment, and he claims that he is entitled to the additional $1000 for the high percentage of protein; Loss: The dairy deducted $1000 from the regular pay for the low percentage of protein, and the plaintiff claims he is entitled to the regular payment.]

1. The milk of how many farmers is mixed in the tank truck? (1 / 2 / 3 / 4 or more).

2. Who is the plaintiff in the suit described above? (The farmer / The dairy).

3. How would you decide this case? Please specify a number between 1 and 9, where 1 indicates "I will surely accept the claim" and 9 indicates "I will surely dismiss the claim."

4. In your opinion, what is the probability, in percentage, that the [Gain: high; Loss: low] protein level is attributable to the plaintiff's milk?

5. How confident are you about your probability assessment in the former question? Please specify a number between 1 and 9, where 1 indicates "I am not confident at all" and 9 indicates "I am absolutely confident."

6. How fair do you think it will be to accept the claim in this case? Please specify a number between 1 and 9, where 1 indicates that accepting the claim will be "absolutely fair" and 9 indicates that accepting the claim will be "absolutely unfair."

Experiment 3a--Highway

Direct: Speed cameras were installed on a toll road. The probability of an error in the camera system is 2%. The speed limit on this road is 55 MPH. According to the camera, a driver drove his car at a speed of 70 MPH late at night.

Inference: Cameras that document the exact time at which each vehicle passes by them were installed on a toll road. The cameras do not document the speed of the passing vehicle, but from the distance between the cameras and the time that elapses between the points they document, it is possible to infer the driver's speed in that section of the road. The probability of an error in the camera system is 2%. The speed limit on this road is 55 MPH. According to the time elapsed and the distance between the two cameras, a driver drove his car at a speed of 70 MPH late at night.

The penalty for driving beyond the speed limit is [Mild: a $50 fine; Severe: suspension of the driving license for one month and a $100 fine].

1. What was the speed limit on the described road? (55 / 60 / 65 / 70).

2. To convict a person of an offense, including a traffic violation, the person's guilt must be proven beyond a reasonable doubt. Under the described circumstances, would you convict the driver of driving beyond the speed limit? Please specify a number between 1 and 9, where 1 indicates "I will surely convict the driver" and 9 indicates "I will surely acquit the driver."

3. In your opinion, what is the probability, in percentage, that the driver drove beyond the speed limit?

4. How confident are you with your probability assessment in the previous question? Please specify a number between 1 and 9, where 1 indicates "I am not confident at all" and 9 indicates "I am absolutely confident."

5. To what extent is it fair to convict the driver in this case? Please specify a number between 1 and 9, where 1 indicates that conviction would be "absolutely fair" and 9 indicates that conviction would be "absolutely unfair."

6. At what speed was the driver driving his car? (55 / 60 / 65 / 70).

7. What is the probability of error in the camera system? (0.1% / 0.2% / 1% / 2%).

Experiment 3b--Bus

Imagine that late in the evening a tourist bus was stuck on an isolated backroad. A policeman that arrived at the place assisted the driver to summon minibuses from nearby to drive the tourists to their destination. [Direct: While doing so, the policeman got on the bus and counted 54 tourists on it, despite the fact that according to the bus's permit it was allowed to carry only 50 passengers. Inference: After two minibuses that arrived empty were filled up and drove away, four tourists were left, and an additional vehicle was ordered for them (the policeman counted the number of seats in the minibuses and made sure that they were filled up before driving away). Since there were 25 seats in each minibus, the policeman inferred that the bus driver carried 54 tourists, which is a violation of the bus's permit, allowing it to carry only 50 people.]

Based on the policeman's report, the driver was accused of carrying an excessive number of passengers. The driver pleaded not guilty. The judge got the impression that the policeman was a reliable person and assessed the chance of error in counting to be very low: 1 in 25. To convict a person of an offense, including a traffic violation, the person's guilt must be proven beyond a reasonable doubt. The penalty for carrying an excessive number of passengers is [Mild: $50; Severe: suspension of the driving license for one month and a $100 fine].

1. According to the bus's permit, how many passengers should be allowed on it? (45 / 50 / 55 / 60).

2. In your opinion, should the judge find the driver guilty of carrying an excessive number of people? Please specify a number between 1 and 9, where 1 indicates "the judge should surely convict the driver" and 9 indicates "the judge should surely acquit the driver."

3. In your opinion, what is the probability, in percentage, that the driver actually carried an excessive number of people?

4. How confident are you with your probability assessment in the previous question? Please specify a number between 1 and 9, where 1 indicates "I am not confident at all" and 9 indicates "I am absolutely confident."

5. To what extent is it fair to convict the driver in this case? Please specify a number between 1 and 9, where 1 indicates that conviction would be "absolutely fair" and 9 indicates that conviction would be "absolutely unfair."

6. How many tourists did the policeman say were on the bus? (50 / 52 / 54 / 56).

7. What was the judge's assessment of the probability of counting error? (1 in 25 / 1 in 30 / 1 in 50 / 1 in 100).

(1.) Charles T. McCormick, McCormick on Evidence 308 (Kenneth S. Broun et al. eds., 6th ed. 2006); Albert J. Moore, Paul Bergman & David A. Binder, Trial Advocacy: Inferences, Arguments, and Techniques 2-3 (1996); Peter Murphy, Murphy on Evidence 20-21 (10th ed. 2008).

(2.) 1A John Henry Wigmore, Evidence in Trials at Common Law 952-56 (revised by Peter Tillers, 1983); Richard K. Greenstein, Determining Facts: The Myth of Direct Evidence, 45 Hous. L. Rev. 1801 (2009).

(3.) See, e.g., Wigmore, supra note 2, at 957-61, 963 (describing 19th and 20th century English and American law); Irene Merker Rosenberg & Yale L. Rosenberg, "Perhaps What Ye Say is Based Only on Conjecture"--Circumstantial Evidence, Then and Now, 31 Hous. L. Rev. 1371, 1376-402 (1995) (discussing the Talmudic prohibition on criminal convictions based on circumstantial evidence and 19th century American law). On the roots of the trend to abandon those rules in early modern English law, see Barbara J. Shapiro, Beyond Reasonable Doubt and Probable Cause: Historical Perspectives on the Anglo-American Law of Evidence 200-43 (1991).

(4.) See, e.g., Holland v. United States, 348 U.S. 121, 140 (1954) (stating that, in terms of reliability, "[c]ircumstantial evidence ... is intrinsically no different from testimonial evidence"); Rosenberg & Rosenberg, supra note 3, at 1400-02 (showing that most courts in the United States follow the Holland dictum). On some traces of the old rules, see Eyal Zamir, Ilana Ritov & Doron Teichman, Seeing is Believing: The Anti-Inference Bias, 89 Ind. L.J. 195, 199-200 (2014).

(5.) See, e.g., Sidney L. Phipson, Phipson on Evidence 5 (Hodge M. Malek et al. eds., 17th ed. 2010) ("Little is to be gained from a comparison of [direct and indirect evidence's] weight, since ... both forms admit of every degree of cogency from the lowest to the highest."); Wigmore, supra note 2, at 957-64, 1120-38 (denying that direct and circumstantial evidence necessarily differ in terms of their persuasiveness).

(6.) See, e.g., Jane Goodman, Jurors' Comprehension and Assessment of Probabilistic Evidence, 16 Am. J. Trial Advoc. 361, 375 (1992) ("where the probabilistic evidence is most incriminating and most probative, jurors may tend to underuse or undervalue the evidence"); Dale A. Nance & Scott B. Morris, Juror Understanding of DNA Evidence: An Empirical Assessment of Presentation Formats for Trace Evidence with a Relatively Small Random-Match Probability, 34 J. Legal Stud. 395, 395 (2005) (reporting the results of a large-scale empirical study indicating that jurors tend to undervalue forensic match evidence); Steven Penrod & Brian Cutler, Witness Confidence and Witness Accuracy: Assessing Their Forensic Relation, 1 Psychol. Pub. Pol'y & L. 817, 817 (1995) (explaining the tendency for jurors to "overbelieve eyewitnesses").

(7.) Kevin Jon Heller, The Cognitive Psychology of Circumstantial Evidence, 105 Mich. L. Rev. 241, 244 (2006).

(8.) Id.; see also Greenstein, supra note 2, at 1803 ("at least some types of circumstantial evidence are actually more reliable than familiar categories of direct evidence") (emphasis added). On the deficiencies of eyewitness testimonies and the limited ability of fact finders to determine the truthfulness of testimonies, see generally Dan Simon, In Doubt: The Psychology of the Criminal Justice Process 57, 167 (2012).

(9.) Laurence H. Tribe, Trial by Mathematics: Precision and Ritual in the Legal Process, 84 Harv. L. Rev. 1329, 1344-50 (1971).

(10.) Heller, supra note 7, at 258-303 (discussing the ease-of-simulation explanation for the disinclination to rely on circumstantial evidence).

(11.) See Zamir, Ritov & Teichman, supra note 4, at 201-04 (surveying the literature about the circumstantial evidence versus direct evidence phenomenon).

(12.) Heller, supra note 7, at 287.

(13.) See Zamir, Ritov & Teichman, supra note 4, at 201-04.

(14.) Zamir, Ritov & Teichman, supra note 4.

(15.) Id. at 205-07.

(16.) Id. at 220. On over-generalization as a source of psychological heuristics and biases, see generally Jonathan Baron, Heuristics and Biases, in The Oxford Handbook of Behavioral Economics and the Law 3, 15-17 (Eyal Zamir & Doron Teichman eds., 2014).

(17.) See, e.g., Alex Stein, Foundations of Evidence Law 40-56, 80-106 (2005) (justifying the reluctance to impose liability on the basis of naked statistical evidence); Amit Pundik, What is Wrong with Statistical Evidence? The Attempts to Establish an Epistemic Deficiency, 27 Civ. Just. Q. 461 (2008) (criticizing this reluctance).

(18.) Zamir, Ritov & Teichman, supra note 4, at 221-24.

(19.) Id. at 224-25.

(20.) Id. at 225-27; see also Doron Teichman, Convicting with Reasonable Doubt: An Evidentiary Theory of Punishment, 93 Notre Dame L. Rev. 22-28 (forthcoming 2017), https://ssrn.com/abstract=2932743 (discussing the evidentiary function of "proxy crimes").

(21.) Rikkilee Moser, As If All the World Were Watching: Why Today's Law Enforcement Needs To Be Wearing Body Cameras, 7 N. Ill. U. L. Rev. Online J. 1, 16-18 (2015).

(22.) Daniel Kahneman & Amos Tversky, Prospect Theory: An Analysis of Decision Under Risk, 47 Econometrica 263 (1979). For an overview, see Eyal Zamir, Law, Psychology, and Morality: The Role of Loss Aversion 3-16 (2015).

(23.) See infra Section II.A.

(24.) Eyal Zamir & Barak Medina, Law, Economics, and Morality 91-93 (2010). See also infra Section II.A.

(25.) See, e.g., Gary L. Wells, Naked Statistical Evidence of Liability: Is Subjective Probability Enough?, 62 J. Personality & Soc. Psychol. 739, 741-744 (1992); Zamir, Ritov & Teichman, supra note 4, at 205-214.

(26.) See generally Baruch Fischhoff, Paul Slovic & Sarah Lichtenstein, Subjective Sensitivity Analysis, 23 Org. Behav. & Hum. Performance 339, 340 (1979); Gary Charness, Uri Gneezy & Michael A. Kuhn, Experimental Methods: Between-Subject and Within-Subject Design, 81 J. Econ. Behav. & Org. 1, 1 (2012); Daniel Kahneman & Amos Tversky, On the Reality of Cognitive Illusions, 103 Psychol. Rev. 582, 583 (1996).

(27.) See infra Section III.A.

(28.) See, e.g., Arthur Conan Doyle, The Adventure of the Blanched Soldier, reprinted in The Case-Book of Sherlock Holmes 33, 54 (House of Stratus 2001) (1927).

(29.) See infra Part II (Experiments 2a-c), Section II.D (Experiments 2c), and Part IV (Experiment 4), respectively.

(30.) See Zamir, Ritov & Teichman, supra note 4, at 205-07. See also supra note 15 and accompanying text.

(31.) Zamir, Ritov & Teichman, supra note 4, at 211-14.

(32.) Id.

(33.) T-test is a common statistical measure used to determine if two sets of data are significantly different from one another. A significant effect in statistical measurement refers to the odds that a certain result is the product of mere chance. The common threshold for statistical significance in the social sciences is .05, meaning that the probability that the results were obtained by chance is 5% or less. When the probability is between 5 and 10 percent, it is commonly referred to as "marginally significant." In the present analysis, p<.001 means that the probability that the reported result was obtained by chance is lower than one in one thousand, which is highly significant.

(34.) ANOVA is a common statistical measure used to identify the causes of variance among participants. It allows us to determine how much of the differences between the two conditions in Decision can be attributed to the differences in subjects' answers to Probability.

(35.) Russell B. Korobkin & Thomas S. Ulen, Law and Behavioral Science: Removing the Rationality Assumption from Law and Economics, 88 Cal. L. Rev. 1051, 1060-66 (2000) (discussing several versions of the rational choice theory).

(36.) Kahneman & Tversky, supra note 22, at 274.

(37.) Id. at 279.

(38.) Richard H. Thaler, Toward a Positive Theory of Consumer Choice, 1 J. Econ. Behav. & Org. 39, 43-47 (1980).

(39.) For a book-long elaboration of this claim, see Zamir, supra note 22. For a synopsis, see Eyal Zamir, Law's Loss Aversion, in The Oxford Handbook of Behavioral Economics and the Law, supra note 16, at 268.

(40.) See Zamir, supra note 22, at 193-95 (showing that this assertion holds true under practically any philosophical theory of law).

(41.) Id. at 182-88 (surveying experimental studies that demonstrate that the moral convictions of most people are moderately deontological).

(42.) Shelly Kagan, Normative Ethics 70-78 (1998) (describing the essence of deontological morality); see also Zamir & Medina, supra note 24, at 41-42.

(43.) See generally Kagan, supra note 42, at 94-100. For psychological studies substantiating the prevalence of this judgment, see Ilana Ritov & Jonathan Baron, Reluctance to Vaccinate: Omission Bias and Ambiguity, 3 J. Behav. Decision Making 263 (1990); Jonathan Baron & Ilana Ritov, Reference Points and Omission Bias, 59 Org. Behav. & Human Decision Proc. 475 (1994).

(44.) On this distinction, see Shelly Kagan, The Limits of Morality 121-25 (1989) (a critique); F.M. Kamm, Non-Consequentialism, the Person as an End-in-Itself and the Significance of Status, 21 Phil. & Pub. Aff. 354, 381-82 (1992) (a defense).

(45.) See, e.g., Fredrick E. Vars, Attitudes Toward Affirmative Action: Paradox or Paradigm?, in Race Versus Class: The New Affirmative Action Debate 73 (Carol M. Swain ed., 1996) (studying people's attitudes to affirmative action plans that involve either not giving or taking from others); Avital Moshinsky & Maya Bar-Hillel, Loss Aversion and the Status Quo Label Bias, 28 Soc. Cognition 191 (2010) (studying policy choices); Ritov & Baron, supra note 43 (describing experiments in which participants were asked whether they would support a law mandating the vaccination of all children, where the vaccination would eliminate the risk of death from flu but may have fatal side effects); Baron & Ritov, supra note 43, at 483-89 (describing an experiment in which respondents were asked to imagine themselves as government officials facing a decision whether to change a policy that is expected to affect unemployment rate). While the last two studies focused on the omission bias, as the second study has established, this bias is closely connected to loss aversion.

(46.) Zamir, Ritov & Teichman, supra note 4, at 229.

(47.) Daniel Ellsberg, Risk, Ambiguity, and the Savage Axioms, 75 Q.J. Econ. 643 (1961).

(48.) See also Stefan T. Trautmann & Gijs van de Kuilen, Ambiguity Attitudes, in 1 The Wiley Blackwell Handbook of Judgment and Decision Making 89 (Gideon Keren & George Wu eds., 2015) (reviewing the literature on ambiguity aversion).

(49.) For detailed discussions of this claim, see L. Jonathan Cohen, The Probable and the Provable 36-39 (1977); Stein, supra note 17, at 40-56, 80-106; L. Jonathan Cohen, The Role of Evidential Weight in Criminal Proof, 66 B.U. L. Rev. 635 (1986); Christian Dahlman, Lena Wahlberg & Farhan Sarwar, Robust Trust in Expert Testimony, 28 Humana.Mente J. Phil. Stud. 17, 17-20 (2015) (discussing the issue under the heading of "robustness"); D.H. Kaye, Apples and Oranges: Confidence Coefficients and the Burden of Persuasion, 73 Cornell L. Rev. 54 (1987). See also Pundik, supra note 17, at 474-87 (criticizing the argument).

(50.) For a fourth meaning of confidence in probability assessments, see Neil B. Cohen, Confidence in Probability: Burdens of Persuasion in a World of Imperfect Knowledge, 60 NYU L. Rev. 385, 397-421 (1985).

(51.) Zamir, Ritov & Teichman, supra note 4, at 211-14.

(52.) Eyal Zamir & Ilana Ritov, Loss Aversion, Omission Bias, and the Burden of Proof in Civil Litigation, 41 J. Legal Stud. 165 (2012); Mark Schweizer, Loss Aversion, Omission Bias and the Civil Standard of Proof, in European Perspectives on Behavioural Law and Economics 125 (Klaus Mathis ed., 2015). These studies have shown that, due to factfinders' omission bias, to prevail in court plaintiffs must meet a considerably higher standard of proof than the proclaimed preponderance of the evidence or balance of probabilities.

(53.) F(1,362)=19.811, p<.001.

(54.) F(1,362)=45.400, p<.001.

(55.) F(1,362)=4.466, p<.05.

(56.) F(1,362)=11.662, p=.001, in an ANOVA with domain and evidence type as independent variables. Assessments of probability were 70.89% and 69.27% in Gain-Direct and Gain-Inference, respectively, and 65.54% and 57.99% in Loss-Direct and Loss-Inference, respectively.

(57.) F(1,362)=3.548, p=.06.

(58.) r=.389, p<.001.

(59.) F(1,361)=7.807, p=.005, in an ANOVA with domain and type of inference as independent variables, controlling for probability.

(60.) F(1,361)=2.820, p=.094.

(61.) F(1,360)=13.979, p<.001.

(62.) F(1,360)=35.268, p<.001.

(63.) F(1,360)=3.162, p=.076.

(64.) r=.596, p<.001. Fairness ratings were significantly affected by both domain and evidence type, such that accepting the claim was deemed fairer in the Gain conditions than in the Loss conditions (F(1,362)=47.663, p<.001). It was also deemed fairer in the Direct than in the Inference conditions (F(1,362)=8.026, p=.005). However, the two factors did not interact with each other in their effect on the judged fairness (p=.90).

(65.) The comprehension questions were: (1) The milk of how many farmers was mixed in the tank truck? (2 / 3 / 4 / 5); (2) What was the lab result for the percentage of protein in the milk of "the other farmer" (not the farmer in question)? (4% / 5% / 6% / 7%); (3) What is the probability that the lab test results are accurate? (80% / 85% / 90% / 95%). While a 21% failure rate in the comprehension questions may seem high, it is not uncommon when subjects are not inherently motivated to answer a questionnaire diligently. See Daniel M. Oppenheimer, Tom Meyvis & Nicolas Davidenko, Instructional Manipulation Checks: Detecting Satisficing to Increase Statistical Power, 45 J. Experimental Soc. Psychol. 867, 868 (2009) (discussing the use of manipulation checks in experiments and reporting a 28.7% failure rate in a non-motivated sample).

(66.) F(1,274)=22.064, p<.001.

(67.) F(1,274)=41.147, p<.001.

(68.) F(1,274)=4.481, p<.05.

(69.) The different domains influenced the assessment of the probability that the deviation in the protein content was attributable to the farmer in question (F(1,274)=8.818, p<.005, in an ANOVA with domain and evidence type as independent variables), where the probability assessments were higher in the Gain than in the Loss conditions (75.36% and 67.83% in Gain-Direct and Gain-Inference, respectively, and 68.75% and 57.62% in Loss-Direct and Loss-Inference, respectively). The assessment of probability was also significantly affected by the type of evidence (F(1,274)=10.854, p=.001): participants estimated the probability to be lower in the Inference conditions than in the Direct ones. The interaction between domain and evidence type did not reach a significant level (p=.527). Furthermore, the interaction of domain and type of evidence in deciding the case was not mediated by the assessment of probability: an ANOVA of Reliance by domain and type of evidence, including the probability estimate as a covariate, still yielded a significant interaction of the two predictors (F(1,273)=4.058, p<.05), as well as significant effects of type of evidence (F(1,273)=14.566, p<.001), domain (F(1,273)=32.431), and probability (F(1,273)=26.055, p<.001). As in Experiment 2a, the higher the probability estimates, the stronger the expressed confidence in the probability assessment (r=.319, p<.001). Unlike in Experiment 2a, however, an ANOVA with domain and type of evidence as independent variables, controlling for probability (as a covariate), revealed that neither domain and evidence type nor their interaction significantly affected Confidence. Finally, the fairness of accepting the claim showed a similar pattern, whereby both domain and evidence type significantly affected the judged Fairness (F(1,274)=34.676, p<.001 for domain; F(1,274)=7.872, p=.005 for type of evidence), but the interaction of the two factors did not reach a significant level (p=.09).

(70.) See supra note 26 and accompanying text.

(71.) The comprehension questions were the same as in Experiment 1 (see Appendix).

(72.) F(1,91)=14.310, p<.001 and F(1,91)=5.968, p=.017 for type of evidence and domain, respectively.

(73.) F(1,91)=4.394, p=.039.

(74.) F(1,91)=7.316, p<.01 and F(1,91)=8.856, p<.01, respectively.

(75.) 72.61% and 66.14% in Gain-Direct and Gain-Inference, respectively, and 68.07% and 60.51% in Loss-Direct and Loss-Inference, respectively.

(76.) Zamir, Ritov, & Teichman, supra note 4.

(77.) See, e.g., Wells, supra note 25, at 741-42 (showing that while people realize that if 80% of the buses in a given town belong to the blue bus company and 20% to the gray bus company, the probability that an accident caused by an unidentified bus was caused by a bus of the blue company is around 80%, they are nevertheless unwilling to impose liability for the accident based on these facts alone; people's reluctance to impose liability based on naked statistical evidence has come to be known as the Wells effect); see also Hal R. Arkes, Brittany Shoots-Reinhard & Ryan S. Mayers, Disjunction Between Probability and Verdict in Juror Decision Making, 25 J. Behav. Decision Making 276 (2012) (examining alternative explanations for the Wells effect); Keith E. Niedermeier, Norbert L. Kerr & Lawrence A. Messe, Jurors' Use of Naked Statistical Evidence: Exploring Bases and Implications of the Wells Effect, 76 J. Personality & Soc. Psychol. 533 (1999) (examining alternative explanations for the Wells effect).

(78.) Zamir, Ritov, & Teichman, supra note 4.

(79.) Id. at 229; see also supra notes 46-50 and accompanying text.

(80.) Amos Tversky & Daniel Kahneman, The Framing of Decisions and the Psychology of Choice, 211 Sci. 453, 453 (1981) (showing that people tend to choose the riskier treatment when outcomes are framed as involving losses and the less risky one when outcomes are framed as gains). See also Anton Kuhberger, The Influence of Framing on Risky Decisions: A Meta-Analysis, 75 Org. Behav. & Hum. Decision Processes 23 (1998) (offering a typology and meta-analysis of numerous studies of framing effects); Irwin P. Levin, Sandra L. Schneider & Gary J. Gaeth, All Frames Are Not Created Equal: A Typology and Critical Analysis of Framing Effects, 76 Org. Behav. & Hum. Decision Processes 149 (1998) (offering another typology and meta-analysis of numerous studies of framing effects).

(81.) See Russell Korobkin & Chris Guthrie, Psychological Barriers to Litigation Settlement: An Experimental Approach, 93 Mich. L. Rev. 107, 129-38 (1994) (showing that framing of litigation outcomes as gains or losses affects decisions about settlement offers); Eyal Zamir & Ilana Ritov, Revisiting the Debate over Attorneys' Contingent Fees: A Behavioral Analysis, 39 J. Legal Stud. 245, 262-64 (2010) (demonstrating that manipulation of the perceived reference point affects plaintiffs' choice of fee arrangement).

(82.) See, e.g., Talia Fisher, Constitutionalism and the Criminal Law: Rethinking Criminal Trial Bifurcation, 61 U. Toronto L.J. 811, 816 (2011) (stating that research by social scientists and legal scholars points to the possible effect of expected sentences on guilt determination); Ehud Guttel & Doron Teichman, Criminal Sanctions in the Defense of the Innocent, 110 Mich. L. Rev. 597, 597 (2012) ("Experimental studies as well as real world examples indicate ... that fact finders often adjust the evidentiary threshold for conviction in accordance with the severity of the applicable sanction."). For a survey of earlier expressions of this widely accepted belief among legal practitioners and academics, see Norbert L. Kerr, Severity of Prescribed Penalty and Mock Jurors' Verdicts, 36 J. Personality & Soc. Psychol. 1431, 1432 (1978).

(83.) See, e.g., Henrik Lando, The Size of the Sanction Should Depend on the Weight of the Evidence, 1 Rev. L. & Econ. 277 (2005) (arguing that from a deterrence perspective, the sanction should depend on the weight of the evidence); Erik Lillquist, Recasting Reasonable Doubt: Decision Theory and the Virtues of Variability, 36 U.C. Davis L. Rev. 85 (2002) (advocating a variable standard of proof in criminal cases); Fisher, supra note 82 (extending Lando's argument); Elisabeth Stoffelmayr & Shari Seidman Diamond, The Conflict Between Precision and Flexibility in Explaining "Beyond a Reasonable Doubt," 6 Psychol. Pub. Pol'y & L. 769, 778-84 (2000) (highlighting the advantages of a flexible standard of proof, which makes it possible to adjust its stringency to the severity of the offense and the costs of error).

(84.) Jonathan L. Freedman et al., Severity of Penalty, Seriousness of the Charge, and Mock Jurors' Verdicts, 18 Law & Hum. Behav. 189, 190 (1994).

(85.) See, e.g., Martha A. Myers, Rule Departures and Making Law: Juries and Their Verdicts, 13 Law & Soc'y Rev. 781, 793-94 (1979) (analyzing a large sample of actual jury verdicts and finding that juries generally follow the instructions and that they are less likely to convict in more serious crimes); Carol M. Werner et al., The Impact of Case Characteristics and Prior Jury Experience on Jury Verdicts, 15 J. Applied Soc. Psychol. 409, 417 (1985) (finding a similar correlation).

(86.) James Andreoni, Criminal Deterrence in the Reduced Form: A New Perspective on Ehrlich's Seminal Study, 33 Econ. Inquiry 476, 476 (1995).

(87.) See, e.g., Neil Vidmar, Effects of Decision Alternatives on the Verdicts and Social Perceptions of Simulated Jurors, 22 J. Personality & Soc. Psychol. 211 (1972) (studying the effect of the number and severity of the decision alternatives on conviction rate); see also Kerr, supra note 82, at 1432-34 (analyzing previous studies).

(88.) See, e.g., Martin F. Kaplan & Sharon Krupa, Severe Penalties Under the Control of Others Can Reduce Guilt Verdicts, 10 Law & Psychol. Rev. 1 (1986) (finding a correlation between punishment severity and conviction rate when decisions were real rather than hypothetical, the penalty was decided by someone else, and the evidence was weak); Kerr, supra note 82 (finding a mild correlation between severity of sanction and conviction rate in some conditions but not in others); Freedman et al., supra note 84 (pointing to the inconclusiveness of previous studies and finding neither a correlation between conviction rate and severity of sanction, nor a correlation between conviction rate and seriousness of charge, once the amount of evidence necessary to prove guilt is equated for all charges); Martin F. Kaplan, Setting the Record Straight (Again) on Severity of Penalty: A Comment on Freedman et al., 18 Law & Hum. Behav. 697 (1994) (criticizing Freedman and his colleagues' claims); Jonathan L. Freedman, Penalties and Verdicts: Keeping the Record Straight, 18 Law & Hum. Behav. 699 (1994) (replying to Kaplan's critique); Martin F. Kaplan, Keeping the Record Complete, 18 Law & Hum. Behav. 702 (1994) (responding to Freedman's reply).

(89.) See, e.g., Jonathan L. Freedman, The Effect of Capital Punishment on Jurors' Willingness to Convict, 20 J. Applied Soc. Psychol. 465, 472 (1990) (reporting that 30% of jurors who convicted a defendant of a first-degree murder for which there was no capital punishment responded that they would have been less likely to convict had there been a death penalty); Reid K. Hester & Ronald E. Smith, Effects of a Mandatory Death Penalty on the Decisions of Simulated Jurors as a Function of Heinousness of the Crime, 1 J. Crim. Just. 319 (1974) (finding that a mandatory death penalty significantly reduced the conviction rate in a gang-war-murder condition, but not in a heinous-murder one).

(90.) Angela M. Jones, Shayne Jones & Steven Penrod, Examining Legal Authoritarianism in the Impact of Punishment Severity on Juror Decisions, 21 Psychol. Crime & Law 939 (2015).

(91.) Zamir, Ritov & Teichman, supra note 4, at 205-07.

(92.) F(1,299)=6.58, p<.05.

(93.) F(1,299)=1.339, p=.248.

(94.) F(1,307)=6.23, p<.05.

(95.) F(1,307)=0.05, p=.815.

(96.) The comprehension questions were the same as those in Experiment 1.

(97.) Zamir, Ritov & Teichman, supra note 4, at 211-14.

(98.) F(1,246)=44.858, p<.001.

(99.) F(1,246)=0.039, p=.844.

(100.) F(1,246)=1.897, p=.17.

(101.) The comprehension questions were identical to those in Experiment 1.

(102.) To sum up, the orders of questions in the five conditions were as follows: Direct and No-Guidance: Decision, Probability, Confidence, Fairness, Logic, and Proof; Proof-Before: Proof, Decision, Probability, Confidence, Fairness, and Logic; Logic-Before: Logic, Decision, Probability, Confidence, Fairness, and Proof; Logic-and-Proof-Before: Logic, Proof, Decision, Probability, Confidence, and Fairness.

(103.) F(1,315)=5.830, p<.001.

(104.) t(315)=4.307, p<.001.

(105.) Tukey's HSD (honestly significant difference) test is applied after an ANOVA to determine which particular groups in the experiment differed from one another.
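
As a purely illustrative sketch (hypothetical ratings, not the study's data; the Python statsmodels library is assumed), such pairwise comparisons can be computed as follows:

    # A minimal illustrative sketch (hypothetical data): Tukey's HSD pairwise
    # comparisons of reliance ratings across experimental conditions, run after an ANOVA.
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    ratings    = [6, 7, 6, 4, 5, 4, 5, 4, 5, 5, 6, 5, 5, 5, 6]
    conditions = ["D"]*3 + ["NG"]*3 + ["LB"]*3 + ["PB"]*3 + ["LPB"]*3

    result = pairwise_tukeyhsd(endog=ratings, groups=conditions, alpha=0.05)
    print(result.summary())  # shows which pairs of conditions differ significantly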

(106.) p<.05, except for the difference between Direct-Evidence and LPB, where p=.052.

(107.) See supra note 49 and accompanying text. Notable contributions to this line of psychological research include Wells, supra note 25; Niedermeier, Kerr & Messe, supra note 77; Arkes, Shoots-Reinhard & Mayers, supra note 77.

(108.) See Zamir, Ritov, & Teichman, supra note 4, at 223; see also Jeffrey J. Rachlinski, A Positive Psychological Theory of Judging in Hindsight, 65 U. Chi. L. Rev. 571, 601 (1998) (concluding that a judgment produced by a debiased factfinder "might seem less fair than an uncorrected, biased judgment").

(109.) Joel D. Lieberman & Jamie Arndt, Understanding the Limits of Limiting Instructions: Social Psychological Explanations for the Failures of Instructions to Disregard Pretrial Publicity and Other Inadmissible Evidence, 6 Psychol. Pub. Pol'y & L. 677, 703 (2000) ("the majority of extant empirical research indicates that jurors do not adhere to limiting instructions"); see also Zamir, Ritov & Teichman, supra note 4, at 221-23.

(110.) See generally Zamir, supra note 22, at 226-29 (discussing ways to overcome legal decision-makers' susceptibility to framing effects).

(111.) Id. at 224-27.

(112.) Studies have shown that group decision-making sometimes overcomes individuals' cognitive biases--but in other instances it exacerbates them or has no effect. See generally Norbert L. Kerr, Robert J. MacCoun & Geoffrey P. Kramer, Bias in Judgment: Comparing Individuals and Groups, 103 Psychol. Rev. 687 (1996).

(113.) See generally Doron Teichman & Eyal Zamir, Judicial Decision-Making: A Behavioral Perspective, in The Oxford Handbook of Behavioral Economics and the Law, supra note 16, at 664.

Eyal Zamir, Elisha Harlev & Ilana Ritov. Eyal Zamir is the Augusto Levi Professor of Commercial Law at the Faculty of Law of the Hebrew University in Jerusalem. Elisha Harlev is a student at the Faculty of Law and the Department of Psychology of the Hebrew University. Ilana Ritov is a professor at the School of Education and the Center for Rationality, Hebrew University of Jerusalem. We are grateful to Netta Barak-Corren, Meirav Furth, Ori Katz, Daphna Lewinsohn-Zamir, Ofer Malcai, Fred Vars, and the participants in the workshop on Bayesian Networks and Argumentation in Evidence Analysis, held at the Isaac Newton Institute for Mathematical Sciences, Cambridge University, for valuable comments on earlier drafts. This study was supported by the I-CORE Program of the Planning and Budgeting Committee and the Israel Science Foundation (Grant No. 1821/12).
Figure 1: Reliance in Experiment 2a (mean reliance by condition)

Condition    Gain   Loss
Direct       6.63   6.06
Inference    5.51   3.91

Note: Table made from bar graph.

Figure 2: Reliance in Experiment 2b (mean reliance by condition)

Condition    Direct   Inference
Gain         7.03     6.28
Loss         5.78     3.79

Note: Table made from bar graph.

Figure 3: Reliance in Experiment 2c (mean reliance by scenario)

Scenario     Gain   Loss
Direct       6.5    5.95
Inference    6.21   4.82

Note: Table made from bar graph.

Figure 4: Reliance in Experiment 4 (mean reliance by condition)

Condition   Reliance
D           6.27
NG          4.42
LB          4.73
PB          5.12
LPB         5.14

D-Direct evidence; NG-No guidance; LB-Only Logic preceded the original questions; PB-Only Proof preceded the original questions; LPB-Logic and Proof preceded the original questions.

Note: Table made from bar graph.