
Cherry-picking memories: why neuroimaging-based lie detection requires a new framework for the admissibility of scientific evidence under FRE 702 and Daubert.


Neuroimaging techniques have been in heavy rotation in the news lately. Increasingly, companies have used neuroimaging techniques--specifically, functional magnetic resonance imaging (fMRI)--in an attempt to determine whether an individual is telling a falsehood. More troublingly, these companies have proffered factual conclusions for use in jury trials. This Article discusses the capabilities and limitations of the technique. In doing so, the Article also discusses why the technology will require the federal judiciary to reevaluate its current interpretation of Federal Rule of Evidence 702 and the Daubert doctrine for admitting novel sources of scientific evidence.


     A. Technical Background
        1. Fundamentals of fMRI
        2. Mechanics and Experimental Methodology of fMRI-Based Studies
     B. Scientific Background

     A. Legal Admissibility of Neuroimaging Evidence is Mixed
        1. Executive Background
        2. Judicial Background
     B. Functional MRI-based Lie Detection is Neither Reliable Nor Valid
        1. Technical Concerns
        2. Scientific Concerns
        3. Epistemological Concerns
        4. Practical Concerns
     C. Lie Detection via Functional Neuroimaging is Uncertain Under
        1. The Technique Should Have a Clearly Defined and Low Error
        2. The Technique Should Have Standards Controlling Its
        3. The Technique Should Be Testable or Falsifiable
        4. The Technique Should Have Survived Peer Review and Be
           Accepted Within the Relevant Field
        5. Limitations of FRE 702 and Daubert's Four Factors
     D. Improving Daubert: A New Model for Scientific Validity Under
        FRE 702
     E. Lie Detection via Functional Neuroimaging is Not
        Admissible Under the Proposed Model

     A. Why Not Just Let it in For What it is Worth?
        1. Probative Value is Outweighed by Prejudicial Nature
        2. Judicial Efficiency is Not Promoted
        3. Differences of Degree, Not Kind
     B. New Scientific Methods Require a Restrained Approach


It is the dream (or nightmare) of every trial lawyer. A witness is placed into a black box; a question asked ("Did you kill Mr. Smith?"); a response given ("No, I did not"); and then a klaxon blares, dissonant enough to rouse the most catatonic juror, accompanied by an unavoidable flashing red sign: "THAT'S A LIE."

How much faster would trials be, how much less costly the proceedings, how much more justice done, if only witnesses always told the truth? Or, the next best thing: If they could be tested by a lie detector with perfect accuracy and reliability? But how does the story sour if the witness could deceive the machine by pressing his toe onto a thumbtack placed in his shoe? Or the expert administering the test could, like a carnival operator, place a foot on the guy-wire, weighing the answer toward one side or the other?

Two private firms trying to make the dream of perfect truth verification into a reality have recently proposed that functional neuroimaging, a technology used in medicine and cognitive neuroscience, can be used to distinguish truth from deception. (1) Functional neuroimaging records brain activity in a specific location across moments in time. These firms offer to detect deception (or "verify truth") by matching an individual's own pattern of brain activity during interrogation to generalized patterns of brain activity observed when people are known to be engaging in deception. (2) Because functional magnetic resonance imaging (fMRI) is generally considered the most popular functional neuroimaging technique, both firms have adopted its use. (3)

Part I of this Article provides an overview of the technical and scientific background relevant to fMRI-based lie detection in order to apply it to existing doctrinal standards. This background also provides a practical working knowledge necessary to attack the results of a lie detection test.

Part II has five Sections. The first covers the extant legal standards for a court in a jurisdiction that follows Federal Rule of Evidence 702 and Daubert to accept in evidence exhibits and expert opinions based on novel scientific techniques. The second Section attacks the reliability and validity of fMRI-based results, in four subsections. The first subsection raises concerns about the validity of methodological and technical factors in fMRI studies. The second discusses the state of scientific knowledge about the regions purported to be involved in deception and memory. The third subsection raises questions about the general meaning of findings in fMRI, and the fourth discusses the ease with which a subject may engage in malicious countermeasures so as to willfully distort the test result. The next Section then applies these concerns to the Daubert standard. After finding that the analysis under Daubert's recommended factors leads to an uncertain result, contrary to the plain meaning of Federal Rule of Evidence 702 (FRE 702), the fourth Section reexamines the fundamental meaning and reasoning behind Daubert. With that in mind, this Article proposes a novel interpretation of legal "scientific validity," one more adaptable to the rapidly changing nature of technology and its impact on our understanding of scientific knowledge.

Because the current approach to Daubert leaves the power to define what is "science" in the hands of the attorneys--inherently biased sources (and rightfully so)--this Article argues that the trial judge must have simultaneously more discretion and more guidance in order to determine, potentially with the aid of expert witnesses, whether the underlying science is sound enough to support the propositions for which it is proffered. In a jury trial, the judge should make this decision out of the presence of the jury--if not in a pre-trial agreement, then in response to a motion in limine, so as to prevent jurors from being exposed to highly persuasive, putatively "scientific" data. (4) Where an evaluation of deception detection based on the original Daubert factors is vague and provides the gatekeeper insufficient guidance, applying the approach proposed in this Article affords the trial judge a set of analytical steps to craft a decision persuasive to both counsel and the appellate bench. The last Section applies this proposed approach to neuroimaging-based lie detection and finds that the current base of scientific knowledge fails to advance the fact-finding mandate of the jury.

Finally, Part III returns to the specific fact situation delineated, and recommends, first, that jurists consider a more detailed and nuanced analysis in evaluating the admissibility of neuroimaging-based lie detection, and, second, that researchers developing neuroimaging-based lie detection tests adopt rigorous statistical and disclosure procedures. Significantly more research into lie detection must occur, as the scientific community certainly is not convinced that neuroimaging-based lie detection even detects that which it purports to detect. Even after the broader scientific community is convinced that a reliable and valid mechanism has been developed--and this Article takes no stance as to whether such a feat is even neurologically possible--the parameters of each instrument developed from that theoretical mechanism must be disclosed to opposing counsel as well as the trial judge.


This Part provides a high-level overview of the current state of fMRI technological development and the scientific community's understanding of how the human brain gives rise to how we think and remember. Uncertainty and doubt regarding any factual findings can arise because fMRI is a multi-step amalgamation of techniques starting from nuclear physics, moving through neuroscience, and ending with statistics. Even assuming that the technology accurately and precisely measures that which it is proposed to measure, the smallest oversight at any part of the chain can cause the final result to be suspect. Thus, an understanding of how fMRI works and how a researcher or test administrator employs fMRI affords the reader the ability to evaluate and challenge any proffered evidence.

A. Technical Background

Functional neuroimaging is used to measure the function, interaction, and behavior of the living brain during cognitive tasks across space and time. (5) Functional magnetic resonance imaging (fMRI) is a method of noninvasively measuring a physiological correlate of neural activity. (6) As this Section covers both fundamental and practical aspects of fMRI analysis, the reader may find it easier to skim this Section and return only if necessary to understand later arguments.

1. Fundamentals of fMRI

A detailed discussion of nuclear magnetic resonance (NMR) is beyond the scope of this text and will be skipped. (7) Functional MRI involves two types of observations. A structural scan is a traditional magnetic resonance imaging scan (MRI) and is, in many ways, similar to a computed axial tomography scan (CT) or three-dimensional x-ray. (8) It distinguishes between different types of brain matter, thereby providing an image of the physical shape and contours of the brain. (9) The other type of observation is a functional scan. A functional scan indirectly measures changes in neural activity throughout the brain, but it does not reveal much of the physiological structure of the brain. Thus, to create the oft-seen "brain scan," this functional data is overlaid on top of the structural scan, like two transparencies.

To understand what fMRI measures, consider the ebb and flow of traffic on a highway (function) compared to the roads that make up the highway infrastructure itself (structure). Assume further that we want to know how bad the traffic was during rush hour (amount of brain activity), but that we cannot directly count the number of cars because they move too quickly to count (neural activity). What would be a good alternative? We could estimate changes in the severity of traffic over time by measuring the change in pollution between locations from one hour to the next. Then, we could overlay the increase in pollution onto a map of the roads to create a diagram showing where the traffic was the worst during rush hour. This type of indirect measurement is analogous to the measurements that fMRI makes.

Functional MRI relies on a chain of inferences to derive an estimate of localized neural activity. (10) This chain begins when you think about something. Thinking causes neural activity in the brain; neural activity uses up energy resources; when resources are depleted, the body must replenish them; those resources are supplied through the blood. (11) Thus, when you think about something (e.g., traffic increases, which is not directly measurable), the part of the brain responsible for that kind of thinking requires more blood (e.g., increased air pollution, which is measurable). Functional MRI therefore measures how the amount of blood flowing to various parts of the brain varies with mental effort. Typically, this is done with images reflecting the amount of oxygen found in blood throughout the brain.

2. Mechanics and Experimental Methodology of fMRI-Based Studies

Applying the principles of cognitive neuroscience--such as to detect deception--involves two stages: a fundamental research phase and then a test phase. In the first phase, the scientist asks research subjects various questions and instructs them to tell truths and lies in a predetermined and known order. For example, the subject may be asked to give either their real name and city of birth or made-up ones. While doing so, an fMRI scanner records brain activity. During analysis, the scientist discovers that one part of the brain is unusually active during lying (or unusually quiescent during truth-telling), thereby creating a telltale sign. A conclusion may thus be drawn that when humans deceive, that particular part of the brain is especially active.

During the test phase, a witness is asked a question, the answer to which is unknown. If it happens that the telltale region of the brain is unusually active, it may be presumed that he or she is lying; if, however, the telltale region is no more active than the rest of the brain, it is presumed that he or she is telling the truth. The careful reader's suspicions should now be aroused, and for good reason. At any point during either phase, it is possible for the test-giver or test-taker to inadvertently or maliciously affect the outcome of the test in both random and meaningful ways. For example, in the first phase, if the researcher mistakenly relies on a widely accepted but erroneous statistical test, the test-taker could be shown to be lying when he or she is in fact telling the truth. Alternately, if the witness thinks especially hard about the truth while responding with a falsehood, it may be possible to trick the test into signaling that the response was truthful.
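The two-phase logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the activity values are invented, the single "telltale region" is reduced to one number per question, and the midpoint rule is a stand-in chosen for simplicity, not the procedure used by any actual firm or study. The function names fit_threshold and classify are hypothetical.

```python
from statistics import mean

def fit_threshold(truth_activity, lie_activity):
    """Research phase: midpoint between mean activity for known truths and known lies."""
    return (mean(truth_activity) + mean(lie_activity)) / 2

def classify(activity, threshold):
    """Test phase: label a new reading by the side of the threshold it falls on."""
    return "lie" if activity > threshold else "truth"

# Invented readings (arbitrary units) from subjects told when to lie.
known_truths = [1.0, 1.2, 0.8, 1.0]
known_lies = [2.0, 2.2, 1.8, 2.0]
threshold = fit_threshold(known_truths, known_lies)

print(classify(2.1, threshold))  # reading resembles the known lies
print(classify(0.9, threshold))  # reading resembles the known truths
# A truthful witness whose region is active for some unrelated reason
# (here, an invented reading of 1.7) still lands on the "lie" side.
print(classify(1.7, threshold))
```

The last line shows why the rule is fragile: the test reports only that the region was active, not why it was active.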

There are two major technical areas where errors may arise: the choice of baseline task and the techniques used to discriminate signal from noise. These two facets are critical because their inadvertent or malicious misapplication has the potential to change a finding of deception into truth, and of truth into deception.

a. The Baseline Task

The incorrect selection of a baseline task can create or suppress a result that the witness was not truthful. The vast majority of fMRI studies rely on the blood-oxygen level dependent (BOLD) effect, in which sensitivity to oxygenation levels in the blood is used to estimate local neural activity. Because these measurements are not on an absolute scale, researchers must use contrastive experimental designs that look not at a single measurement, but at the difference between two or more measurements. In fact, it is logically impossible to determine with a single measurement the "brain activity" associated with a cognitive process like lying. (12) Instead, there must be two different "brain activity" readings: one while the individual is lying (or not lying), and a comparison measurement where the brain is doing not much of anything at all or doing something largely similar, but differing only in the critical aspect (e.g., contrasting lying and not lying). This is because the brain is constantly doing many different things, some of which you are aware of and are doing intentionally, such as looking at the scanner (which, for most people, is not an everyday occurrence, and thus involves significant learning of a new situation) or recalling a grocery list for that evening's dinner, and some of which you are not usually aware of, such as breathing and digesting food. (13) Each of these actions involves a certain amount of brain activity and thus requires different amounts of blood flow. If we record your BOLD activity while you tell a lie in the scanner, there is no way to determine which part of the blood flow is due to looking around, listening, breathing and digesting, and which is a result of lying. The solution is to measure the activity of the combined entangled processes, measure the activity from one of them in isolation, and then subtract the activity of the isolated process from the combined pattern. The isolation task is called the "baseline task."

Here is an illustration: suppose you have only one jug filled with water, and you would like to determine the weight of the water alone. To do so, you could fill the jug with water and weigh the filled jug. Then you would pour the water out, weigh the empty jug, and subtract the weight of the empty jug from the weight of the filled jug. Similarly, two cognitive tasks that require different processing functions can be considered discrete and independent so that one may be subtracted from the other. (14)

Consider, then, if we ask you to read a question and give a truthful response, and then subsequently ask you to read the same sentence and give a deceitful response. There are only a few things different in the two cases. In both, you must read the question and think of the correct answer. Only in the second situation must you do something further: you must stop yourself from telling the truth, think up some kind of incorrect answer and then evaluate whether that false response is comprehensible. (15) Mathematically subtracting the "truth" response from the "deceitful" response eliminates the activity common to both situations, leaving only the part of the brain that is putatively responsible for thinking up a lie.

In an fMRI study, a researcher compares the blood flow during two tasks: one that is thought to include the mental processes under examination versus a second that contains all the same physical and mental processes, except for the one in question. The former is called the trial task; the latter, the baseline task. Contrasting the two tasks allows the cognitive process of interest to be isolated. This is of concern as an uninformed or malicious choice of baseline task can easily cause a finding of truth or lie to appear or disappear. The ramifications of this issue are discussed in more detail infra.
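The subtraction of a baseline task from a trial task can be illustrated with a short Python sketch. The region labels and activity values below are hypothetical and chosen only to make the arithmetic visible; real analyses operate on many thousands of measurements and use statistical tests, not a bare difference.

```python
def subtract_baseline(trial, baseline):
    """Return per-region activity of the trial task minus the baseline task."""
    return {region: trial[region] - baseline[region] for region in trial}

# Invented activity (arbitrary units) for three hypothetical regions.
# Baseline task: read the question and answer truthfully.
truth = {"visual": 5.0, "language": 4.0, "region_x": 1.0}
# Trial task: read the same question and answer deceitfully.
lie = {"visual": 5.0, "language": 4.0, "region_x": 3.0}

# A well-matched baseline cancels everything the two tasks share.
contrast = subtract_baseline(lie, truth)
print(contrast)  # {'visual': 0.0, 'language': 0.0, 'region_x': 2.0}

# A poorly matched baseline (here, resting quietly) leaves reading-related
# activity in the result, so more regions appear tied to lying.
rest = {"visual": 1.0, "language": 1.0, "region_x": 1.0}
poor_contrast = subtract_baseline(lie, rest)
print(poor_contrast)  # {'visual': 4.0, 'language': 3.0, 'region_x': 2.0}
```

The same trial data yield different "deception" maps depending solely on which baseline is subtracted, which is the concern raised in the text.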

b. Discriminating Signal From Noise

Certain techniques and procedures have been developed to compensate for the inherent physical limitations of fMRI technology. Some of these procedures, such as the need for the witness to remain absolutely motionless during the scan, or for the witness to repeat the lie many times over a testing period, make it significantly easier for the witness to deceive the test itself.

The practicalities of fMRI analysis constrain how data can be gathered and analyzed; the MR signal change during activation is minuscule--approximately 1% signal change per 50% change in rate of blood flow. (16) Therefore, the primary technical issue in fMRI data acquisition is maximizing the usable signal-to-noise ratio during the experiment. The only direct method of increasing signal strength is by increasing the strength of the magnetic field in the fMRI scanner. (17) However, due to imperfections in the magnetic field and the fact that different materials react differently to magnetic fields, increasing the magnet strength can cause distortions and blurring in the image. (18) For that reason, the only practicable method of increasing a subject's signal is by repeating the test several times. An area that is consistently active across repeated testing can then be considered to be truly active and not just the result of noise. To illustrate this principle, imagine that someone is whispering his phone number to you in a very noisy room. The first time you hear the number, you might not be sure about all the digits. But after hearing it repeated a few times, you become increasingly confident that the number you heard is correct. (19)
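The benefit of repetition can be simulated in Python. The sketch below assumes, purely for illustration, that each measurement is a fixed true signal plus independent Gaussian noise; the signal and noise values are invented and do not describe any real scanner. Under that assumption, averaging N trials is expected to shrink the noise in the estimate by roughly the square root of N.

```python
import random
from statistics import mean, stdev

TRUE_SIGNAL = 1.0  # hypothetical true activation, arbitrary units
NOISE_SD = 5.0     # hypothetical noise level, far larger than the signal

def averaged_reading(n_trials, rng):
    """Average n_trials noisy measurements of the same underlying signal."""
    return mean(TRUE_SIGNAL + rng.gauss(0.0, NOISE_SD) for _ in range(n_trials))

def spread_of_estimate(n_trials, repeats=500, seed=0):
    """Standard deviation of the averaged reading across simulated sessions."""
    rng = random.Random(seed)
    return stdev(averaged_reading(n_trials, rng) for _ in range(repeats))

single = spread_of_estimate(1)      # one whisper in a noisy room
repeated = spread_of_estimate(100)  # the same whisper heard 100 times
print(f"spread with 1 trial:    {single:.2f}")
print(f"spread with 100 trials: {repeated:.2f}")
```

With a single trial the noise swamps the signal; with one hundred trials the averaged reading settles close to the true value. The improvement holds only if the signal is the same on every repetition.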

The most significant problem that arises from the need to repeat trials is that the subject can move. First, recall that, ideally, fMRI identifies places with different activity whenever the person is being deceptive. Furthermore, the falsehood must be repeated multiple times in order to reach a point where the signal can be "heard" over the noise. Finally, recall that fMRI measures structure, or the physical parts of the brain, at a different time from function, which is recorded in a spot in space relative to the fMRI machine, not relative to the physical structure of the brain itself. Only by placing the functional data on top of the structure after the test is finished is it possible to know where in the brain the activity is.

Thus, if the location of the signal relative to the fMRI machine changes between repetitions of the test, it would be as though the person whispering to you changes one of the numbers every time he or she says it. This can happen if the subject shifts his or her head a few millimeters in any direction. Obviously, doing so renders the data meaningless. Several physical methods have been adopted to ensure stereotaxic precision, including body straps, rigid thermoplastic full-face molds, head cages, and bite bars. (20)
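A toy Python example makes the effect of head movement concrete. It assumes a one-dimensional "brain" of nine measurement positions with a single active spot and no noise at all; the positions and shifts are invented for illustration only.

```python
def scan(n_positions, active_index, shift=0):
    """One noiseless scan: signal 1.0 at the active spot, displaced by shift."""
    row = [0.0] * n_positions
    row[active_index + shift] = 1.0
    return row

def average(scans):
    """Average the scans position by position, relative to the scanner."""
    return [sum(column) / len(column) for column in zip(*scans)]

# Four repetitions with the head perfectly still.
still = average([scan(9, 4) for _ in range(4)])
# Four repetitions with the head displaced by a different amount each time.
moved = average([scan(9, 4, shift) for shift in (-1, 0, 1, 2)])

print(max(still))  # 1.0: the active spot reinforces itself
print(max(moved))  # 0.25: the same activity is smeared over four positions
```

Even without any noise, movement alone cuts the apparent peak to a quarter of its true size and spreads it across neighboring positions.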

Furthermore, overlap between scans is not only critical within a subject, but also across subjects. The combination of relatively small areas of activation and the need to integrate trials to reach a sufficiently high signal-to-noise ratio means that minor movements of the brain relative to the scanner--for example, movements due to compression and expansion of the chest cavity during respiration (!)--not only produce motion artifacts, but also reduce the amount of overlap between the brains of different people because every brain is distinctively different. (21) Aligning the scans from one brain over time, or from one brain to another (a procedure called coregistration) requires the use of computationally intensive mapping and warping techniques to ensure that activation in one subject's location is correctly matched up to activation in the same part of the brain as all the other subjects. (22)

This need to repeat trials both within the same subject and across multiple subjects causes more than just technical problems. Subjects tend to get better at the task over time, so their minds wander, causing inconsistencies in mental activity between trials. (23) Worse, people start to respond differently to the same question repeated multiple times; while a subject might respond "no" the first time, they may become exasperated when asked the same question for the twentieth time. And, most damning, people can respond differently when accused, falsely or correctly, of a crime. Some may become angry, while others may be calm. As different areas may be active both within and across subjects, it becomes difficult to identify which areas are consistently active when all the trials are merged. Thus, in relation to earlier-discussed issues, the baseline task becomes less and less effective at subtracting out the non-deception cognitive processes as the experiment accrues additional data.

These issues posed by research methodology and physical constraints can have a dispositive effect on the result of a lie detection test and will be discussed in more detail infra. For now, the discussion proceeds to the next level of detail: cognitive processes that are implicated while lying.

B. Scientific Background

Cognitive neuroscience studies the relationship between how we think and the structure and organization of our brains. (24) A fundamental axiom of psychology is that cognition is the result of neural activity within the brain. (25) A further tenet is that neurons are connected to and organized by specific regions or networks of regions in the brain. (26) Therefore, activity in a specific structure or network of regions previously determined to be correlated with a specific cognitive function is proposed to be evidence that said function occurred. For example, it is well known that certain areas in the brain are more active while viewing faces of unfamiliar people, (27) while different, non-overlapping portions of the brain are more active while viewing buildings. (28) It is thus possible to predict based solely on observations of brain activity, although not with certainty, whether an individual was viewing a face or a building. (29) Similarly, neuroimaging-based lie detection would ideally identify the "locus of prevarication," viz., an area of the brain that alters its neural activity whenever the individual tells a falsehood. (30)

Two human cognitive functions are pertinent to the discussion: memory and the central executive. The concept of memory is self-explanatory; it is the process by which the brain stores and retrieves information on a long-term or permanent basis. (31) Pursuant to current federal jurisprudence, a witness may not testify unless he or she has personal knowledge of the matter about which he or she is testifying. (32) Personal knowledge requires that the testimony is regarding matters that the witness personally observed firsthand and has stored as a memory. (33)

The executive functions refer to the collective of regulatory and goal-directed cognitive processes that harness, coordinate, and control other cognitive functions. (34) Put simply, these parts of the brain tell other parts of the brain what to do. After experiencing something, a witness who later wants to lie about it must do at least seven other mental things: (1) recall from memory the events as they actually occurred; (2) construct a plausible but counterfactual explanation (or recall a factually similar but contextually irrelevant experience); (3) suppress the urge to recall and recount events as they actually occurred; (4) detect and correct implausible or truthful elements of the falsehood; (5) monitor the interaction of the separate elements of the falsehood; (6) predict potential questions that may be asked; and (7) recount the false experience verbally. This is why lying on the spot is so difficult to do convincingly; it takes a lot of effort and thinking to coordinate. Lying certainly feels easier and more convincing after you have the opportunity to come up with a plan and mentally rehearse it until it feels, not surprisingly, as though you actually experienced it. These tasks broadly correlate to the executive functions of strategic episodic retrieval, (35) selection and control of incoming information (e.g., attention), (36) temporary memory for information actively being considered (e.g., working memory), (37) planning and decision-making, (38) problem solving, (39) logical reasoning, (40) error detection and correction, (41) and initiation of activity and impulse control. (42)

The next Part will discuss how the background material gives rise to potential problems when conclusions about behavior and cognition are drawn without adequate attention to fMRI technique and cognitive neuroscience research.


An advocate attempting to introduce fMRI-based lie detection evidence to a jury must overcome a number of obstacles. In the federal judiciary, the court determines the admissibility of proffered testimonial evidence. (43) An opinion developed through the use of fMRI is clearly not testimony by a lay witness, and therefore must qualify as expert testimony in order to be admissible. (44) Furthermore, the danger of unfair prejudice resulting from the evidence must not outweigh its probative value. (45) Finally, evidence obtained through functional neuroimaging may be excludable by the protections in the Fourth (46) and Fifth Amendments. (47) This Article focuses on the constraints of FRE 702 and the Daubert standard.

FRE 702 originally allowed an expert witness to provide scientific, technical, or other specialized knowledge to the jury in the form of expert witness testimony. (48) In 2000, following Daubert v. Merrell Dow Pharmaceuticals, Inc. (49) and its progeny, (50) Congress amended Rule 702 to require that "the testimony is based upon sufficient facts or data, the testimony is the product of reliable principles and methods, and the witness has applied the principles and methods reliably to the facts of the case." (51) In maintaining the role of the trial judge as gatekeeper, (52) the court must "ensure that any and all scientific testimony or evidence admitted is not only relevant, but reliable." (53) A theory or technique is reliable for evidentiary purposes if it is scientific knowledge; it is "scientific knowledge... [if] it can be (and has been) tested." (54) To evaluate whether expert testimony regarding a theory or technique is "scientific knowledge," Justice Blackmun lists four factors that the court may consider: (55) (1) testability or falsifiability, (2) peer review and publication, (3) error rate, and (4) general acceptance within the relevant field. (56) Testability and falsifiability go to whether the theory can be construed in a scientifically appropriate fashion. Publication, peer review, and error rate go to whether it has been tested.

Section A discusses the legal requirements of Federal Rule of Evidence 702 and the Daubert standard. (57) Section B discusses how many of the aspects of fMRI technique and scientific knowledge of cognitive neuroscience and cognitive psychology limit the ability to draw conclusions from fMRI evidence. Section C applies the facts and capabilities of fMRI to the Daubert standard and argues that the recommended Daubert factors do not provide judges and litigants with adequate guidance about whether the results of a novel diagnostic instrument developed from scientific knowledge are admissible as evidence. Section D proposes a four-stage schema for evaluating the admissibility of a diagnostic instrument based on principles of scientific validity, compatible with Daubert. Finally, Section E applies fMRI-based lie detection to the new model and finds that the technique falls far short of the strict requirements in FRE 702.

A. Legal Admissibility of Neuroimaging Evidence is Mixed

This Section discusses legal rules affecting the admissibility of neuroimaging evidence. Functional MRI is a relatively nascent technology, and, as such, is a novel source of legal evidence. Not surprisingly, evidence based on fMRI emerged in the trial courts before Congress had a chance to pass a statute, adopt a Rule of Evidence, or add an Advisory Committee Note regarding its use. Two underlying questions will serve as a preview for future parts: first, is the admissibility of neuroimaging evidence clearly permitted or denied? Second, if admissible, what propositions or arguments may it be used to support?

1. Executive Background

While there has been limited review of the role of neuroimaging in particular, the executive branch has weighed in on other lie detection and forensic science methods. In 2003, the U.S. Department of Energy asked the National Research Council to conduct a scientific review of the validity and reliability of polygraph examinations. (58) Although it found that polygraph tests for specific incidents (59) could discriminate truth from falsehood at rates well above chance, (60) the use of countermeasures, particularly by individuals with a strong incentive to use them effectively, seriously undermined any value of polygraph testing, especially for security screening. (61) The report also discussed alternate methods of detection of deception using practices grounded in central nervous system psychophysiology such as neuroimaging, but cautioned against their adoption without further critical review. (62)

In 2009, the National Academy of Sciences was tasked with studying the current state of affairs in forensic science. (63) It presented Congress with a study indicating that "many of the techniques and technologies used in forensic science lack rigorous scientific discipline," and that there was "a lack of standard accreditation processes for individual labs and the technicians who collect and process evidence." (64) Because of such failures to ensure accuracy and reliability, the Academy found that:
   [I]n some cases, substantive information and
   testimony based on faulty forensic science analyses
   may have contributed to wrongful convictions of
   innocent people. This fact has demonstrated the
   potential danger of giving undue weight to evidence
   and testimony derived from imperfect testing and
   analysis. Moreover, imprecise or exaggerated
   expert testimony has sometimes contributed to the
   admission of erroneous or misleading evidence. (65)

It therefore recommended further research to "establish[] the scientific bases demonstrating the validity of forensic methods" and to develop "quantifiable measures of the reliability and accuracy of forensic analyses." (66) Although not addressed in the report, fMRI and lie detection had been listed on the Committee meeting agenda. (67)

2. Judicial Background

As of the time of this writing, no federal statute or regulation addressing the admissibility of fMRI evidence or deception detection could be found. Furthermore, no opinion by a federal court could be found admitting or affirming the admission of fMRI-based lie detection evidence at trial. However, functional neuroimaging demonstrating mental capacity has been persuasive at the Supreme Court and has been admitted in at least one trial at the District Court level.

a. Functional Neuroimaging in Amici Curiae at the Supreme Court

In Roper v. Simmons, the Supreme Court held that the Eighth Amendment, incorporated against the states through the Fourteenth Amendment, prohibited states from imposing the death penalty on offenders younger than 18 years of age at the time of their crime. (68) The Court's rationale was that because the death penalty is reserved for the most serious crimes, and because, among other things, a juvenile, whose character is less fixed than that of an adult, is less likely to have "an irretrievably depraved character," juveniles are subject to diminished culpability compared to adult offenders despite having committed a crime of a similarly heinous nature. (69)

Although the majority opinion does not specifically cite neuroimaging data, two amici curiae briefs refer to functional neuroimaging studies of children and adolescents. (70) One study referenced purported to show that the prefrontal cortex of the brain was found to be still in development throughout adolescence. (71) Because those areas are thought to control emotion, aggression, and impulsive behavior, the briefs argued that a juvenile is less likely to have a fixed character regarding their moral nature. (72) Although the briefs cited to fMRI-based data, and some observers note that such data was likely persuasive to the Justice who authored the opinion, (73) the Court did not have to rule on whether fMRI-based evidence is admissible, as that evidence was not presented in the court below and therefore was not in dispute. (74)

b. Functional Neuroimaging Admissible to Demonstrate Generalized Mental Illness

In contrast, the district court in Entertainment Software Ass'n v. Blagojevich heavily cited expert witnesses using neuroimaging data. (75) There, plaintiffs brought an action against several state officials, seeking to enjoin them from enforcing a state statute establishing criminal penalties for, among other things, selling or renting violent or sexually explicit video games to minors. (76) An expert witness for the defendants testified that his unpublished fMRI-based research suggested that when children with behavioral disorders played highly violent video games, they experienced decreased brain activation in certain prefrontal cortices compared to children without behavioral disorders. (77) He concluded that exposure to media violence was correlated with reduced executive functioning. (78) Ultimately, the court found that the expert witness proffering fMRI-based evidence "[could not] support the weight he attempt[ed] to put on them via his conclusions," which were that "minors who play violent video games are more likely to '[e]xperience a reduction of activity in the frontal lobes of the brain which is responsible for controlling behavior.'" (79) In Blagojevich, the court entertained fMRI-based evidence and related expert witness testimony but ultimately found that the expert testimony was not credible. (80)

c. Functional Neuroimaging Not Admissible to Prove Individual Mental Illness

While fMRI evidence was introduced in Blagojevich to support a generalized assertion, in U.S. v. Mezvinsky the defendant offered positron emission tomography (PET) evidence to support a specific assertion that he, as an individual, was mentally impaired. (81) In U.S. v. Pohlot, the Third Circuit held that "evidence of mental abnormality [may] negate specific intent or any other mens rea, which are elements of the offense." (82) When applied to alleged fraud, a lack of mens rea may be proven if the defendant's "clinical condition and symptomology [sic] can be logically connected to his subjective belief that his assertions were not false, baseless, or reckless vis-a-vis the truth." (83) In Mezvinsky, the defendant argued that he did not have the mens rea to commit the offense, because he did not have the capacity to form an intention due to "frontal lobe organic brain damage[,] which was revealed in a Positron Emission Tomography Scan (PET)." (84) However, expert witnesses for both parties agreed that no reliable inference could be drawn about the state of the defendant's brain, much less his capacity to deceive, from a single PET scan. (85) The court therefore held that since the evidentiary conclusions from the proffered neuroimaging scan were not only insufficiently supported by scientific consensus but also not relevant to a jury's decision-making process, the scan and accompanying testimony would not be admitted. (86) In Mezvinsky, the court examined the substance of a PET scan as well as proposed expert witness testimony accompanying the scan and concluded that such testimony could not be admitted because it was neither reliable nor relevant. (87)

As shown above, the legal status of functional neuroimaging is mixed, marked by much skepticism, but also by instances of some influence. Not surprisingly, the more spectacular the leap required from the data presented to the factual conclusion drawn, the more likely the court was to refuse to admit that conclusion. Similarly, this Article argues that courts should follow this general schema when neuroimaging data is proffered for the purpose of deception detection. Because human memory is itself so unreliable, great care must be exercised before burdening the finder of fact with unreliable "proof" of mendacity.

B. Functional MRI-based Lie Detection is Neither Reliable Nor Valid

This Section discusses the limitations of fMRI as applied to lie detection. These arguments may be divided into four major categories. First, insufficient technical rigor forecloses any opportunity to legitimately draw certain conclusions. Second, current scientific knowledge limits practical inferences about whether we are lying. Our understanding of brain function, despite having grown dramatically in the last century, is still in its infancy. Theories of brain function are, at best, still developing and will continue to evolve. This tentative state of scientific knowledge demonstrates why the overwhelming majority of scientists do not believe that assertions about lie detection may be reasonably drawn at this point in time. Third, contemporary scientific knowledge poses epistemological barriers to developing a cohesive and scientifically convincing lie detection instrument. This section steps back from the science and instead considers the broader question of how a lie detection test administrator, conducting what is ostensibly a standardized test with a deterministic outcome, can be sure that what the test indicates is, in fact, a reasonably close approximation of the truth. Finally, practical concerns suggest that fMRI-based deception detection is likely to be unreliable if the test-taker intentionally engages in countermeasures.

1. Technical Concerns

As discussed earlier, fMRI studies are deeply complex and rely on a chain of inferences and assumptions. A violation of any one of those assumptions can invalidate the accuracy and reliability of the conclusion. In this Section, we discuss three major tactical research methodology issues: the correction for multiple comparisons, the correction for nonindependence error, and the choice of baseline task. If the method for any one of these three is chosen or applied incorrectly, the resulting technical error is prone to producing a false positive.

a. Inadequate Correction for Multiple Comparisons

Because fMRI relies so heavily on statistical analysis, one of the most important areas for careful review is that of statistical rigor. Contrary to popular perception, a statistical test cannot guarantee that the witness is telling the truth or is lying. Instead, it can only provide that the answer is within a certain margin of error--commonly set by the researcher at a 5% chance that the observed result arose purely from chance and noise (which translates roughly to a 95% chance that the result is real). This margin of error is present every time a statistical test is performed on a new set of data taken from the same or a substantially similar experiment. Because there is a chance for a wrong result every time a statistical test is used, doing more than one test per experiment will necessarily increase the odds of getting a false positive purely from chance. (88) This error is called the familywise error rate and represents the probability that, when a series ("family") of statistical tests is performed, at least one of them will come up erroneously positive by chance. (89) To illustrate, a rough analogy is keeping track of the number of heads that occur during a series of coin flips in a very dark room. Each time you flip the coin, you are uncertain about whether it came up heads or tails. While the odds of making a mistake for one flip are fairly low, the odds of making at least one mistake over a thousand tosses are fairly high.

This effect is the problem of multiple comparisons. Essentially, the mathematical nature of statistics is such that as the researcher performs more comparisons between conditions, the more likely it is that an erroneous result will occur by chance. (90) A typical whole-brain fMRI study involves around 60,000 voxels, and therefore, as many statistical tests. (91) With so many comparisons, it is a near certainty that a substantial number of activated voxels are false positives. (92) In a typical experiment, there would be so many false positives that if they were all coincidentally clustered together, the resulting blob would cover a space nearly half the size of a human hippocampus, (93) which is the part of the brain considered to be primarily responsible for memory. (94)
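The arithmetic behind the familywise error rate can be sketched in a few lines. The sketch below assumes, purely for illustration, that every test is statistically independent and that the per-test threshold is either 0.05 or 0.001; neighboring voxels are in fact correlated, so the figures are indicative only. The function names are hypothetical and are not drawn from any of the studies cited.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Average number of tests that come up positive by chance alone."""
    return alpha * n_tests


# 60,000 is the whole-brain voxel count quoted in the text;
# the alpha values and the independence of tests are assumptions.
for alpha in (0.05, 0.001):
    for n_tests in (1, 20, 1000, 60000):
        fwer = familywise_error_rate(alpha, n_tests)
        count = expected_false_positives(alpha, n_tests)
        print(f"alpha={alpha:<6} tests={n_tests:<6} "
              f"FWER={fwer:.4f} expected false positives={count:.1f}")
```

Under these assumptions, a single test at the 0.05 threshold errs one time in twenty, while twenty such tests yield at least one false positive roughly 64% of the time.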

The problem of multiple comparisons is so pervasive and powerful that it is capable of fallaciously creating the appearance of brain activity in a long-dead fish. (95) Fortunately, this problem is easily addressed by using a principled correction technique; in the dead fish study, after proper statistical correction, the activity disappeared. (96) Most relevant and troubling is a meta-analysis finding that the putative activity in approximately 27% of neuroimaging studies in a survey vanished once an adequate correction for multiple comparisons had been applied. (97) In a recent survey, as many as 25% to 30% of all articles in six major neuroimaging journals published in 2008 did not use a correction for multiple comparisons adequately scaled for the size of their dataset, suggesting that there could be a surprising number of findings composed wholly of false positives. (98)

Although some fMRI studies have attempted to control for multiple comparisons by adopting an arbitrary fixed threshold cluster size, (99) this technique is primitive as it provides no information about the actual error rate. (100) This Article instead recommends using a more modern technique, such as random field theory or permutation simulations, to limit the familywise error rate sufficiently. (101) This method is better because it affords a correction for the actual error rate of the recorded data, rather than a predetermined and arbitrary threshold. (102) An improper correction for multiple comparisons is easy to remedy: use modern and rigorous statistical methods and phantom activations will disappear.
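How a correction makes phantom activations disappear can be shown with simulated noise. The sketch below is not random field theory or a permutation simulation; it substitutes the simpler and more conservative Bonferroni correction (dividing the threshold by the number of tests) because it fits in a few lines. The voxel count, sample size, and the use of a z-test with known variance are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

random.seed(0)

N_VOXELS = 2000      # hypothetical; far fewer than a real whole-brain scan
N_SAMPLES = 20       # hypothetical number of measurements per voxel
ALPHA = 0.05

std_normal = NormalDist()


def p_value_of_noise_voxel() -> float:
    """Two-sided z-test p-value for a voxel that contains only noise."""
    samples = [random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
    mean = sum(samples) / N_SAMPLES
    z = mean * N_SAMPLES ** 0.5          # known unit variance assumed
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


# No voxel carries a real signal, so every "activation" is a false positive.
p_values = [p_value_of_noise_voxel() for _ in range(N_VOXELS)]

uncorrected_hits = sum(p < ALPHA for p in p_values)
bonferroni_hits = sum(p < ALPHA / N_VOXELS for p in p_values)

print("active voxels, uncorrected:", uncorrected_hits)
print("active voxels, Bonferroni: ", bonferroni_hits)
```

Because the simulated data contain nothing but noise, any voxel that survives the uncorrected threshold is spurious; the corrected threshold is expected to remove nearly all of them.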

b. Inadequate Correction for Nonindependence Error

A second statistical objection that is less easy to understand is nonindependence error, also called circularity error. The nonindependence error is particularly pertinent for lie detection as it only arises during fMRI studies of particular subregions in the brain. (103) Here, specific regions of interest (ROI) are defined using the same data set as for the results statistics. (104) Two problems arise: not only does the region of interest limit the voxels chosen in the second analysis, but it also creates an additional set of comparisons that have not been statistically corrected. (105) Put simply, by using the same dataset to create the regions of interest as well as for detecting the effect of experimental interventions, the researcher has effectively "double-dipped" from the same variance pool. (106) Failing to correct for the nonindependence error creates "impossibly high correlations," which indicate the existence of false positives. (107) Researchers have suggested several solutions. One technique is to define ROIs using test runs independent from the substantive contrasts. (108) This solution is easy to implement but time-consuming and expensive. Another technique is to define ROIs independently from the results statistics. (109) In studies of clearly defined anatomical regions, this can be done by preselecting regions anatomically. (110) Finally, it is possible merely to acknowledge that the data were acquired circularly; however, this treatment solves nothing at all, but rather only alerts the reader that any conclusions drawn from the data are likely meaningless. (111)
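The "double-dipping" problem can be reproduced with data that contain no signal at all. In the sketch below, every number is an assumption made for illustration: sixteen hypothetical subjects, two thousand hypothetical voxels of pure noise, and a random behavioral score. A region of interest is chosen as the voxels that best correlate with behavior in one run; the correlation is then reported once from that same run (the circular analysis) and once from an independent run.

```python
import random

random.seed(1)

N_SUBJECTS = 16     # hypothetical
N_VOXELS = 2000     # hypothetical
N_SELECTED = 20     # hypothetical size of the region of interest


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def noise_run():
    """One scanning run: pure noise for every voxel and subject."""
    return [[random.gauss(0, 1) for _ in range(N_SUBJECTS)]
            for _ in range(N_VOXELS)]


behavior = [random.gauss(0, 1) for _ in range(N_SUBJECTS)]
run_a = noise_run()   # used to choose the region of interest
run_b = noise_run()   # independent run, same subjects

# Choose the voxels that correlate best with behavior in run A.
r_in_a = [pearson(voxel, behavior) for voxel in run_a]
ranked = sorted(range(N_VOXELS), key=lambda v: r_in_a[v], reverse=True)
roi = ranked[:N_SELECTED]

circular_r = sum(r_in_a[v] for v in roi) / N_SELECTED
independent_r = sum(pearson(run_b[v], behavior) for v in roi) / N_SELECTED

print(f"mean r, same data used twice: {circular_r:.2f}")
print(f"mean r, independent run:      {independent_r:.2f}")
```

Since nothing in the simulated data relates brain activity to behavior, the high correlation in the circular analysis is produced entirely by the selection step, which is why defining the region from an independent run is the safer design.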

c. Experimental Validity of the Baseline Task

In Part I.A.2, we discussed the baseline task. The baseline task is as important as the experimental task because it sets the zero from which the change in blood flow is measured. If the task is chosen incorrectly, the findings are worse than merely incorrect; instead, it is logically impossible to determine whether they are correct or incorrect. A widely used baseline task requires the subject to rest calmly in the scanner, the rationale being that because no task was being actively performed, this choice of baseline reflected zero activity. (112) However, Stark & Squire showed conclusively that using rest as a baseline task could erroneously eliminate activity previously found to occur while viewing a novel or familiar picture. (113) The reason, of course, is obvious upon explanation: resting calmly usually involves daydreaming or thinking of any variety of matters in an entirely uncontrolled and indeterminate fashion. Furthermore, using rest as a baseline may even cause the fallacious appearance of "deactivation" during many cognitive tasks known to require substantial processing. (114) While it is likely that certain tasks gradually require less neural processing over time, phantom deactivations that result from an ill-chosen baseline task will reappear with the adoption of a different baseline, indicating that the findings are meaningless. (115)

Similar to the variability resulting from an irrelevant or comparison question in polygraph practice, the choice of a baseline task is critical to accurate and reliable results. (116) Using a different baseline task may cause a previously observed activation to vanish or, where no activation was previously seen, to spontaneously occur. Any baseline task for a particular technique must therefore be subjected to similar review and standardization before it can be accepted by the scientific community.

Although the statistics of neuroimaging seem arcane or technical, the entire foundation of fMRI analysis is the statistics. Seemingly minuscule factors interact to push an effect past a generally accepted statistical threshold, whereupon a scientific conclusion is recognized. (117) Flawed statistics thereby create findings out of thin air, born of an oversight in the analysis. The error rate is incalculable and unpredictable, and therefore potentially could be very high. Moreover, the type of error resulting from a poorly selected correction for multiple comparisons or nonindependence is invariably overgenerous; adjustments inadequately applied (or not applied at all) always act to raise the possibility of a false positive. Furthermore, these errors are endogenous and inherent to the specific methodology chosen. As it stands now, fMRI-based research, if not conducted with care and precision, has an unpredictable error rate as well as a dearth of standards controlling its operation--because it is, in fact, still nascent research. (118) Research is an inherently progressive process, where techniques and methodology are constantly being developed and refined and thus, while many studies are rigorous and produce replicable (and replicated) results, some are not and do not. Furthermore, unlike law, where the "truth" can have life-changing consequences on liberty or livelihood, the cost of a false positive in science is the time and energy to repeat the experiment. While this cost is admittedly substantial, it is rare that a scientist is locked up for choosing the wrong parameters for a statistical test. Thus, because science has the luxury of repeated testing, the standards of evidence are founded on a different set of incentives from those in law.

2. Scientific Concerns

Because this work addresses the use of fMRI in lie detection, it is constrained by the limits of fMRI technology. This Section discusses fMRI's evidentiary reliability in relation to the memory and executive functions, and the structures within which they are believed to be instantiated. Below, this Article discusses reasons why the scientific knowledge upon which neuroimaging-based lie detection is based is still in its infancy.

a. Executive Functions and Lying

As discussed earlier, the ability to lie convincingly requires a plethora of executive functions. For example, an individual who intends to lie about not having been at the scene of a crime must remember the details of the actual incident and what the scene looked like, dream up an alternate story, check that the story is possible and plausible, check that nothing else he said conflicts with the new story, and resist the urge to tell the truth.

Executive functions have been localized to a wide variety of regions in the prefrontal cortex (PFC). (119) Puzzle-solving is an experimental task that recruits many executive functions and is conceptually similar to that of thinking up and retelling a plausible falsehood; brain activity for such tasks has been observed in anterior PFC. (120) Similarly, tasks requiring sustained focused attention to one task while suppressing distracting stimuli have been found to correlate with successful performance in lateral PFC. (121) The corollary to sustained attention is impulse control, the capacity to resist the urge to do something. Tasks emphasizing response inhibition have similarly evoked activity in right or medial PFC (122) and in bilateral dorsolateral and inferior PFC. (123) Free generation of verbal responses to a constrained but underdetermined set of stimuli has elicited activity in left PFC. (124) The converse of free generation is error detection and correction; what good is it to be able to come up with responses without the ability to determine if those responses fit the requirements? The anterior cingulate cortex (ACC) has been implicated in error detection and correction. (125) Combined, the PFC and ACC span about a quarter to a third of the volume of the entire brain. As demonstrated, scientific knowledge is currently unable to correlate prefrontal activity with the executive functions implicated in lying to any degree of reliability or validity.

b. Memory Functions and Lying

The relationship between memory and neural structures is even less well understood than that for the executive functions. The episodic memory system, which describes the ability to recall prior events experienced firsthand, (126) is necessarily associated with a personal and individualized experience at a specific time and place, (127) thereby satisfying the personal knowledge requirement of Federal Rule of Evidence 602. (128) An episodic memory is encoded while learning or experiencing an event; later, it is retrieved while the individual is remembering or recalling the memory of the event. Because lie detection can only be used to validate the authenticity of a witness's testimony at the moment the memory is recalled, this Section focuses on the functions and structures involved during the process of retrieval.

The capacity to clearly differentiate regions required to make a memory (encoding a novel experience) from those required to recall a memory (retrieval of a previously experienced event) is critical to many lie detection paradigms. Suppose a defendant asserts that he has never seen the murder weapon before. Suppose further that it has been convincingly demonstrated that a certain region in the brain is known to be more active while examining an object that has never been seen before (thus requiring encoding), while a different region is active when the object is familiar (thus requiring recognition). It would then be trivial to present the defendant with a photograph of the object and observe, without questioning the defendant, whether his brain considers it novel or familiar. (129) A number of studies have indeed demonstrated that encoding and retrieval processes may occur in different locations. (130)

However, there are two problems with such a proposition. First, it is notoriously difficult (or potentially impossible) to capture activity related to pure retrieval: while you are in the process of remembering a particular event, you also are simultaneously forming a new memory of yourself engaged in remembering the event. (131) Second, recalling an old and familiar memory is likely to be accompanied by the creation of new memories. (132) Suppose you encounter an object with which you are very familiar, such as a pen you have owned for many years. The fact that you recognize it when you pick it up does not mean that you cannot learn anything more about it; in fact, even if you are intimately familiar with every detail of the pen itself, it is likely that you will remember where you left it the last time you used it, indicating that you were making new memories, however minor, while you were remembering it. The theory of incidental encoding, or a unitary model of encoding and retrieval, has been proposed for both animal models (133) and neural network models. (134) Another explanation comes from a competing theory that suggests a single process affects both storage and retrieval, (135) or that there are a number of overlapping processes not solely dissociable by location. (136) If any of these alternate models are accurate representations of the memory functions, it would be neurologically impossible for any functional neuroimaging-based lie detection test based on the localization of novelty detection or encoding and retrieval to be either reliable or valid.

In summary, the raw memory functions are generally agreed to be localized to the MTL, whereas executive control is located in the PFC. However, there is no scientific consensus on how the finer-grained functions are instantiated within each subregion, much less an understanding of how the two functions interact.

3. Epistemological Concerns

Scientific experiments often are more concerned with finding the answer to specific questions than with the philosophy of science, but there are broader issues regarding the nature of knowledge--how we know that what we know is true. In this Section, we discuss two considerations that should prompt caution, at the strategic level, when reading conclusions drawn from forensic cognitive neuroscience research.

a. Correlation versus Causation

The fallacy of cum hoc ergo propter hoc is better known as the admonition against confounding correlation and causation. (137) Functional neuroimaging is a tool to measure correlations between changes in brain activity and observable behaviors and nothing more. The naive observer will note that when a subject is asked to lie, he evinces greater activity in region X, and without hesitation will conclude that region X is responsible for deception. Then, when he observes activity in that same area when the veracity of a statement is not known, he will conclude that because the same area lit up, the subject must be lying. There is, however, the possibility that an independent factor caused both observations. The regions active during lying might instead reflect independent cognitive processes that lying requires but that are not unique to it, such as strategic memory retrieval, response inhibition and/or performance monitoring, (138) or even the decision-making process to determine whether to lie or refrain from lying. (139) Christ et al. examined overlapping regions between executive functions and deception and found that ten of thirteen functional ROIs activated in deception were also activated by the executive function tasks of working memory, inhibitory control, and/or task switching. (140) The remaining three regions were in areas implicated in maintaining and switching attention. (141)

How else can this question be explored? Rather than using techniques that measure correlation, how about causative methods? (142) One research group found that using transcranial magnetic stimulation (143) to "shut down" activity in the PFC caused subjects to respond more quickly when lying than when telling the truth. (144) They suggest that the regions suppressed were involved in controlling antisocial or moral behavior. (145)

b. Negative Results in Hypothesis Testing

One of the pitfalls of a diagnostic test is that the results are not always intuitive. Hypothesis testing, as used in statistical methodology, asks the question, "What is the probability that the outcome of the manipulation occurred purely by chance?" A statistical test evaluates the evidence gathered to determine whether this chance explanation may be rejected. (146) The chance explanation is rejected when the probability of the event falls below certain generally accepted thresholds. The statistical tests in fMRI are commonly set at a threshold between one in twenty (p < 0.05) and one in a thousand (p < 0.001). If a finding occurred at a probability below these thresholds, it may be concluded that whatever happened did not occur due to random chance and that, instead, the observed behavior occurred as a result of the manipulation.

The converse, however, is not necessarily also true. A result that occurred at a probability above the threshold does not afford the conclusion that the manipulation did not cause the behavior, but rather, only that there is no statistically valid relationship between the manipulation and the behavior in this particular set of data. (147) Furthermore, that manipulation could be contaminated by other sources, making it impossible to observe an effect that truly exists. Consider a hypothetical lie detector test. Suppose it is scientifically reliable and valid. If this imaginary lie detector indicates that deception occurred during a witness's statement, a conclusion may be drawn that the witness was being deceptive. But if it does not indicate that deception occurred, this does not mean that the witness is telling the truth. Perhaps there was insufficient evidence to confidently determine the witness was lying; perhaps the test was not powerful enough; perhaps the data had too much noise. (148) The only thing that can be said from a negative result is that the test was unable to determine whether the witness was lying--nothing more and nothing less.
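The point that a negative result may simply reflect an underpowered test can be made concrete. The sketch below computes the statistical power of a two-sided z-test, meaning the probability that the test detects an effect that truly exists. The effect size of 0.3 standard deviations and the sample of 20 trials are hypothetical values chosen for illustration, not figures from any study cited here.

```python
from statistics import NormalDist

std_normal = NormalDist()


def power_two_sided_z(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Probability that a two-sided z-test detects a true effect."""
    critical = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * n ** 0.5
    upper_tail = 1.0 - std_normal.cdf(critical - shift)
    lower_tail = std_normal.cdf(-critical - shift)
    return upper_tail + lower_tail


# Hypothetical numbers: a real but modest effect, measured on 20 trials.
power = power_two_sided_z(effect_size=0.3, n=20)
miss_rate = 1.0 - power

print(f"chance the test detects the real effect: {power:.2f}")
print(f"chance of a negative result anyway:      {miss_rate:.2f}")
```

With these assumed values the real effect is detected only about a quarter of the time, so a negative result from such a test says very little about whether the effect exists.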

4. Practical Concerns

As with any other forensic technique involving human interaction, neuroimaging-based methods are prone to countermeasures. Countermeasures are the intentional adoption of specific behaviors for the purpose of influencing the responses being measured, thereby producing a spurious result (usually indicating truthfulness). (149) Such techniques have been widely discussed in the legal literature (150) as well as pop culture. Although they have not yet been broadly tested, there is a strong possibility that similar techniques adapted for fMRI-specific weaknesses could fool fMRI-based lie detection. These techniques may be performed within the scanner or rehearsed outside the scanner in advance such that they would be effectively undetectable.

a. Countermeasures Inside the Scanner

The reliance on stereotaxic precision suggests that gross motor movements may make it difficult to gather an adequate amount of data. Although the use of highly constricting physical restraints and aggressive motion control software may help to eliminate the problem of within-subject coregistration, movements within the skull (e.g. jaw movements or subtle shifts of the musculature) may cause distortions, reducing data accuracy. (152) Specifically, these kinds of movements would necessarily lead to a conclusion of no activation where in fact activation had occurred. Depending on the test framework, this means that merely clenching one's jaw at random moments could cause the test to incorrectly indicate that it was not possible to determine whether the witness was lying.

All fMRI studies require the use of a baseline task. (153) Most deception studies contrast test trials where the subject is asked to lie versus control baseline trials where the subject is asked to tell the truth. (154) Similar to how polygraph countermeasures involve strengthening the response to control questions, a subject interested in malingering could engage in similar tasks during the baseline task to artificially strengthen the baseline signal. (155) By conjuring up a deceitful scenario or attempting to remember the details of a faint and unrelated episodic memory during the control tasks, it may be possible maliciously to increase the activity recorded during the truth-telling task. Doing so would reduce the difference between the baseline task and the test task, thereby reducing the test's power to detect deception.

b. Countermeasures Outside the Scanner

Many of the deception studies so far have informed the subject about their choice of deception at the last minute, out of convenience for both the researcher and the subject. (156) The activations observed in the frontal network attributed to deception could have easily been the result of intense problem-solving and error detection in order to create an internally valid scenario on the spot. In contrast, defendants and witnesses will almost certainly have the luxury of time. They will have the opportunity and motive to construct a plausible, believable, and internally consistent alternative explanation. They may engage in significant rehearsal of this alternative scenario, including actual reenactment and visualization, well in advance of the test. Such preparation could reduce the amount of executive processing during the test trials in the deception phase.

The effects of countermeasures are not yet known, as they have not been broadly tested. In one brilliant study, a combination of inside- and outside-scanner countermeasures affected the baseline task measurement to such a degree that lie detection accuracy dropped to well below chance. (157) In the polygraph literature, it is clear that the adoption of countermeasures is highly detrimental to the polygraph's error rate. Although the polygraph is subject to the additional abstraction layer of measuring a bodily response compared to fMRI-based lie detection, the requirement of a baseline task still provides opportunities for malfeasance.

C. Lie Detection via Functional Neuroimaging is Uncertain Under Daubert

Daubert advises a trial court to evaluate whether proffered expert witness testimony is testable or falsifiable, has been subject to peer review and publication, has a sufficiently low error rate, has standards controlling the technique's operation, and has been generally accepted within the relevant field. (158) These factors relate to whether the technique can be tested and whether it has been tested. We now apply each factor to fMRI-based lie detection.

1. The Technique Should Have a Clearly Defined and Low Error Rate

The first Daubert factor that courts should consider is the known or potential rate of error. (159) As with the factors of testability and falsifiability, commentators have expressed concern that this error rate factor is notoriously difficult to apply. (160) A critical issue is that a single "rate of error" masks the complexity of any diagnostic instrument. A lie detection test is characterized by two types of errors: a false positive (no lie occurred, but the test says the statement was a lie), and a false negative, also called a miss (the statement was a lie, but the test says there was no lie). (161) Sensitivity measures the test's ability to identify every individual who in fact has the criterion in question, i.e., the test's capacity to avoid misses. (162) Conversely, specificity is the test's ability to exclude every individual who does not in fact have the criterion in question, i.e., the test's capacity to avoid false positives. (163) Poor sensitivity means that many people who lied will not be detected (a miss); poor specificity means that many people who did not lie will be falsely identified as having lied (false positive). (164) Combined, these two characteristics represent the test's capacity to detect hits, which is its "error rate."
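The distinction between the two error types can be made concrete with a short calculation. The following Python sketch is illustrative only: the function names and the tally of outcomes are hypothetical assumptions, not figures drawn from any study cited in this Article.

```python
def sensitivity(true_positives, false_negatives):
    """Share of actual lies that the test flags as lies."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of truthful statements that the test clears as truthful."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical tally (assumed numbers): 100 lies and 100 truthful statements.
counts = {"tp": 90, "fn": 10, "tn": 70, "fp": 30}

sens = sensitivity(counts["tp"], counts["fn"])  # lies correctly caught
spec = specificity(counts["tn"], counts["fp"])  # truths correctly cleared

# A single pooled "error rate" lumps misses and false positives together.
overall_error = (counts["fn"] + counts["fp"]) / sum(counts.values())

print(f"sensitivity={sens:.2f} specificity={spec:.2f} pooled error={overall_error:.2f}")
```

Under these assumed numbers, a pooled error rate of twenty percent would conceal that the test misses one lie in ten but wrongly flags nearly one truthful statement in three.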

Critically, these two error rates are not necessarily the same. Sensitivity and specificity describe complementary approaches to the likelihood of detection for a faint or uncertain signal. Consider a time you were in the shower while awaiting an important telephone call. You may recall hearing the phone ring and rushing out, only to find that it was a figment of your imagination because you so strongly anticipated hearing a ring. Here, specificity was compromised for increased sensitivity. There is a minuscule chance that you will miss the call (low miss rate), but at the cost of constantly running out of the shower (high false positive rate).

Similarly, the diagnostic value of a lie detection instrument depends on how its error rates are set, regardless of how it is implemented. The National Research Council's report noted that "there is little awareness ... in polygraph practice ... that false positives may be traded off against false negatives simply by adjusting the threshold" for a finding of deception. (165) Results from fMRI may be tweaked in precisely the same fashion by adjusting the parameters used in the statistical analysis. It is for this reason that Greve et al. argue that a lie detection instrument should explicitly report the sensitivity and specificity of the technique to better survive a Daubert challenge. (166)
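The trade-off described in the National Research Council's report can be sketched in a few lines of Python. The "deception scores" below are invented for illustration, and the helper `error_rates` is a hypothetical name; nothing here reflects the scoring method of any actual polygraph or fMRI vendor.

```python
# Hypothetical "deception scores" (assumed values; higher means more lie-like).
lie_scores = [0.9, 0.8, 0.7, 0.6, 0.4]      # statements that were in fact lies
truth_scores = [0.7, 0.5, 0.3, 0.2, 0.1]    # statements that were in fact true


def error_rates(threshold):
    """Return (miss_rate, false_positive_rate) when scores at or above
    the threshold are reported as deceptive."""
    misses = sum(score < threshold for score in lie_scores)
    false_alarms = sum(score >= threshold for score in truth_scores)
    return misses / len(lie_scores), false_alarms / len(truth_scores)


# Sweep the cutoff: the same data yield very different error profiles.
for threshold in (0.25, 0.45, 0.65, 0.85):
    miss_rate, fp_rate = error_rates(threshold)
    print(f"threshold={threshold:.2f}  miss rate={miss_rate:.1f}  "
          f"false positive rate={fp_rate:.1f}")
```

With these assumed scores, raising the cutoff steadily lowers the false positive rate while raising the miss rate. The underlying measurements never change; only the analyst's choice of threshold does.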

The following example demonstrates why disclosure of these instrument characteristics is so crucial. Suppose a court is presented with evidence from a lie detection instrument administered to a witness. The specific application of this test is known to be more likely to miss an actual lie (i.e., reduced sensitivity) because the witness has a particularly large sinus cavity; (167) however, simultaneously, this test also is known to be more likely to indicate the witness lied when he in fact had not (i.e., reduced specificity) because the event occurred a very long time ago, and parts of the brain implicated in recalling distant events are also implicated while lying. (168) How will the court balance these two factors if it is not informed of the precise error rates for specificity and sensitivity? A monolithic "error rate" is extraordinarily deceptive. The next Section presents arguments focusing on, but not limited to, the falsifiability and error rate factors of fMRI-based lie detection.

The error rates of seven to ten percent reported in scientific (169) and legal (170) articles are not relevant to a discussion of practical applications. First, experimental methods are constantly evolving and improving. Some of the newest techniques for controlling false positives have only been developed in the last few years. (171) This Article evaluated the combined corpus of thirty-two articles that Cephos and No Lie MRI promote in support of their claims to determine how many used modern statistical correction techniques. (172) Of the thirty-two articles, three did not contain original research. Of the remaining twenty-nine, three did not utilize any identifiable correction for multiple comparisons. Seven used an arbitrary and potentially insufficient correction for multiple comparisons. Fewer than half used a principled correction method. No reports tested specificity or sensitivity, or the rate of false positives or false negatives.
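A back-of-the-envelope calculation shows why a correction for multiple comparisons matters. The Python sketch below rests on stated assumptions: the voxel count is a round hypothetical figure, the tests are treated as independent (which real fMRI voxels are not), and the Bonferroni adjustment is used only because it is the simplest correction to illustrate, not because the surveyed studies used it.

```python
n_voxels = 50_000   # assumed number of voxels tested in one brain volume
alpha = 0.05        # conventional per-test significance level

# With no correction, pure noise is expected to produce roughly this many
# voxels that appear "active" (about 2,500 under these assumptions).
expected_false_positives = n_voxels * alpha

# Bonferroni correction: divide the significance level by the number of tests.
bonferroni_alpha = alpha / n_voxels


def familywise_error(per_test_alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - per_test_alpha) ** n_tests


uncorrected_risk = familywise_error(alpha, n_voxels)
corrected_risk = familywise_error(bonferroni_alpha, n_voxels)

print(f"expected false-positive voxels, uncorrected: {expected_false_positives:.0f}")
print(f"chance of any false positive, uncorrected:  {uncorrected_risk:.3f}")
print(f"chance of any false positive, Bonferroni:   {corrected_risk:.3f}")
```

Under these assumptions an uncorrected analysis is all but certain to report spurious activation somewhere in the brain, while the corrected analysis holds that risk just under five percent.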

Second, the majority of the experiments provided have not been conducted in an ecologically valid scenario. Subjects had not been incentivized to deceive to the degree that a man facing life in prison, bankruptcy, or a huge punitive damages award might be. Additionally, the vast majority of studies are conducted on volunteer undergraduate students for the sake of convenience and frugality. While some may call that subject pool experienced at deception, few would consider them as practiced as career criminals.

Finally, the experiments did not account for countermeasures. The National Research Council warned that effective countermeasures could seriously undermine any value of polygraph security screening. (173) Although countermeasures are effective in theory, the dearth of testing on their effects makes it unknown whether they are effective in practice. Here, the error rates of fMRI-based lie detection techniques are not only not low, they are simply unknown. Again, the general Daubert factors are unclear: a gatekeeper must examine the "error rate," but the error rate of what, precisely, must it examine?

2. The Technique Should Have Standards Controlling Its Operation

That the test administration process can be automated and that a subject only interacts with a computer in an fMRI-based technique misses the point. (174) In theory, a polygraph examiner could also pose predetermined questions to a test-taker via a computer screen, but this does not reduce the need for human interpretation. For the theory phase of an fMRI study, a human researcher must examine and align the functional regions of interest, develop the baseline task, and define the cluster thresholds. If multiple test trials are required, specific test and control questions must be chosen by a human. These questions are prone to distortion and bias through human interaction. (175) What must be automated and standardized? Is it the fMRI machine, which applies magnetic gradient planes and electromagnetic pulses? Is it the computer software, which blindly stacks together each plane of voxels into a volume? Or, more troubling, must it be the statistical controls that are standardized, or the particular set of test questions asked of the subject?

3. The Technique Should Be Testable or Falsifiable

Chief Justice Rehnquist said, "I defer to no one in my confidence in federal judges; but I am at a loss to know what is meant when it is said that the scientific status of a theory depends on its 'falsifiability,' and I suspect some of them will be, too." (176) The Chief Justice was right on the mark. A recent survey of 400 judges showed that they had a great deal of difficulty operationalizing the falsifiability and error rate factors, and therefore tended to rely more on the rhetoric but not the substance of Daubert. (177) It is simpler to gauge whether there have been many papers published on a topic, rather than whether half of the papers indicate support for the theory while the other half refute it.

Testability or falsifiability means that there is a conceivable possibility that the conjecture being proffered can be shown to be false. (178) Interpreted strictly, this requirement is moot because modern experimental studies are designed with a hypothesis that must be tested or falsified. (179) What could the Daubert Court have meant? Returning to first principles, the focus of Rule 702 "must be solely on principles and methodology, not on the conclusions that they generate." (180) These principles and methodology must establish a standard of "evidentiary reliability," which the Court opined "will be based upon scientific validity." (181) This phrase engenders much confusion, because reliability and validity are neither derivative of, nor synonymous with, each other in the scientific literature. (182) In fact, they are generally considered orthogonal characteristics. (183) Reliability is a measure of stability--whether repeated measurements produce the same result. (184) Scientific validity is a principle that an instrument actually measures what it purports to measure. (185)

In the schema of lie detection, a diagnostic instrument measures one easily observable characteristic in order to draw an inference about a scientifically related but less-observable process. One commentator argues that the "technology is capable of being tested because (1) the procedure is repeatable and (2) the results can be validated." (186) However, whether the technology itself is testable is not sufficient to make a diagnostic instrument valid. What about the reliability and validity of the scientific theory upon which the technology is based, or the test that utilizes that theory?

The lifecycle of a diagnostic instrument begins with an idea. The idea is tested on a small group and then generalized to a population. At this point, the knowledge becomes a theory that posits a relationship between a measurement and the inferred characteristic. In order practicably to use the theory, an instrument must be developed by specifying procedures, boundaries and standards. This instrument is then applied to an individual. Because of this cascading nature, insufficient validity at any stage taints the final validity of the instrument's result.

Therefore, in a scientific test such as the kind discussed here, reliability and validity must be evaluated at two levels. The first phase involves developing a general theory or framework (also called "general causation"). The second phase involves applying a set of facts to that theory or framework to derive a conclusion (also called "specific causation"). (187) Therefore, the testability and falsifiability factor is, like the other three, simply another indication of the reliability and validity of any scientific knowledge. It is only moot that the theory itself (general causation) be testable and falsifiable. More importantly, it is the specific causation--the application of the theory to a set of facts--that must be testable and falsifiable. It is specifically this second stage, where facts are applied to a theory, that is novel in the approach that we later propose. A lie detection test that is falsifiable at the research stage will not satisfy Daubert if it is not also falsifiable (meaning, reliable and valid) at the testing stage.

However, testability is only one of many facets of scientific reliability. As before, this Daubert factor misses the point. The scientific techniques providing the foundation of neuroimaging-based lie detection--nuclear physics, neuroscience, and statistics--are testable, per se. However, the specific test utilized by the instrument proponent must include not only whether it is falsifiable, but also whether alternatives have been falsified, thereby proving its falsifiability. (188) This discussion therefore continues with whether the instrument has been tested or falsified.

4. The Technique Should Have Survived Peer Review and Be Accepted Within the Relevant Field

The stance commonly adopted on the peer review factor is often, "There are over 250,000 papers referring to 'fMRI' on the PubMed database, so the technology has been subject to peer review and is generally accepted." However, there are far fewer relating specifically to the use of fMRI for lie detection. Cephos states that "[t]he theory has been tested by numerous academic groups and one commercial group." (189) While the theory has been tested, both companies could point to a total of only 32 articles researching neuroimaging-based lie detection, not all of which even use an identifiable fMRI-based technique. There are effectively no replications of the finding using similar experimental methods by independent groups. (190) The theory is certainly testable, but it has not been truly tested.

More damning, at least one third of those articles explicitly hesitate to apply the technique as tested to a practicable, forensic setting without further research. The authors of articles purporting to demonstrate lie detection were careful to urge "a careful examination of social and ethical concerns ... before fMRI can be reasonably applied in forensic settings," (191) and that "[f]uture functional MR imaging studies involving a large sample size and conventional reliability and validity methods are required to establish the utility of this method as a test for deception." (192)

5. Limitations of FRE 702 and Daubert's Four Factors

This caution brings us to consider what the Daubert Court was doing, rather than what it was saying. For expert witness testimony to be admissible under FRE 702, three conditions must be met: (1) the testimony must be "based upon sufficient facts or data," (2) the testimony must be "the product of reliable principles and methods," and (3) the witness must have "applied the principles and methods reliably to the facts of the case." (193) In interpreting FRE 702, the Court noted that error rate, standards, falsifiability, and general acceptance may be examined. (194) However, the Court provided these factors only to aid the legal community in determining what was reliable scientific knowledge, not as a canonical and authoritative list of requirements. (195) The analysis here therefore returns to the fundamental meaning of the terms "reliability" and "validity" to create a new framework for evaluating whether a proffered technique may be admissible because it qualifies as "scientific knowledge."

A low error rate and the existence of standards seem to apply to the instrument that was developed, rather than to the scientific theory or the machinery upon which the test is run. Falsifiability and, practically, general acceptance and peer review generally apply to the scientific theory, as articles in academic journals are commonly about theory and the validation of a theoretical construct, not the parameters of an instrument. By discussing the end state, but not the steps to which each goal must be applied, the Daubert Court provided scant guidance to a trial judge. How is the judge to evaluate whether evidence is based on "scientific knowledge" if it is uncertain what exactly must be falsifiable, or which standards must exist? Thus, the factors that the Court identified must be applied to determine whether the "principles and methods" are reliable and valid, as well as whether the particular "application of those principles and methods to the relevant facts" was done reliably and validly. As the original Daubert factors were so vague as to allow a judge so much discretion that the standard might as well not exist, the focus of this objection is not limited strictly to the fact scenario in this article, but instead to any novel source of scientific evidence.

D. Improving Daubert: A New Model for Scientific Validity Under FRE 702

In a trial with expert witnesses presenting scientific evidence, the Daubert Court was ultimately concerned with restricting the jury's exposure to untrustworthy testimony. (196) To prevent irrelevant or unreliable expert testimony from reaching the jury, the trial judge must evaluate the expert's proffered testimony and determine whether the testimony carries "a guarantee of trustworthiness." (197) To this end, the Court required that the trial court determine whether "the expert is proposing to testify to scientific knowledge." (198) The requirement of "scientific knowledge" must establish a standard of "evidentiary reliability," which, in a case involving scientific evidence, "will be based upon scientific validity." (199) We have discussed already the import of reliability and validity as applied to both the framework and test phases of any diagnostic instrument.

Therefore, this article proposes a detailed model for Daubert. In order for a diagnostic instrument to be generally accepted as scientific knowledge, reliability and validity must be evaluated at each of the following four, increasingly narrowly-defined stages: (1) the technology itself; (2) the scientific theory utilizing the technology; (3) the specific instrument developed from the theory; and (4) the peculiar instance where the instrument is applied to an individual. (200)

The technology refers to the underlying mechanical device. A technology is accepted as scientific knowledge if it returns the same data output over multiple trials when measuring the same inputs, and measures what it purports to measure, confirmed by other, dissimilar technologies. While the technology itself has attained the level of "scientific knowledge," fMRI-based lie detection has not.

The scientific theory refers to the principle or framework that describes an observable phenomenon and affords both explanatory and predictive power. It is reliable and valid when justifiable inferences about the relationships between variables may be drawn, thereby falsifying alternative explanations for the same outcome.

The instrument refers to an individual set of procedures, parameters, and thresholds that are applied to the theory. It is valid when the research findings can be generalized to apply across a variety of populations and contexts. A reliable and valid technique must have an adequate statistical methodology, an acceptably combined low error rate, a reasonable and disclosed sensitivity and specificity, and sufficient procedural safeguards to prevent an administrator's internal biases from affecting the test outcome. (201)

The specific application refers to the instance the instrument is conducted on the test subject. Unlike the other three stages, there is less of a distinction between reliability and validity because it is no longer in the realm of "scientific theory"; rather, it is the straightforward application of an instrument. It is valid if the administrator acted in good faith and applied the instrument in accordance with the documented procedure. It is reliable if he or she would reach the same result any other expert in the field would reach. (202) If any single stage fails to be scientifically valid, the resulting conclusion must be in doubt, and therefore cannot offer the "guarantee of trustworthiness" that Daubert, and ultimately, Rule 702 requires.

Therefore, the court's stringent reliability/validity approach must be applied to each of these four steps. First, is the technology reliable and valid (does the machine return the same result for the same object each time a measurement is repeated, and the same result as machines based on different technologies); second, is the theory reliable and valid (does the process measure that which it purports to measure); third, is the instrument reliable and valid (does it instantiate accurately and precisely the theory); and fourth, has the instrument been applied correctly (e.g., were the proper parameters and procedures applied to the witness when the test was administered)? If such an approach is followed, a trial judge can more rigorously analyze the reliability and validity of novel forms of evidence based on burgeoning technology. Here, the evaluation of deception detection based on the original Daubert factors is uncertain. Utilizing this article's proposed approach, a trial judge is afforded the clarity with which to draft an opinion that is persuasive to both counsel and the appellate bench, not only for neuroimaging-based lie detection, but for any novel form of scientific evidence.

Because the reliability and validity of each phase must be confirmed, the model is still relevant to situations where a technology is long established but is applied to a novel theory, or where a well-established technology with a confirmed theory is operationalized with a novel instrument. Only by reaffirming the scientific knowledge of every stage may the final conclusion be considered reliable and valid.

E. Lie Detection via Functional Neuroimaging is Not Admissible Under the Proposed Model

Let us now examine the scientific knowledge of the technology, theory, instrument, and application of fMRI-based lie detection. The technology of fMRI is valid. The same stimulus provides the same BOLD response; the same regions of activity are elicited when subjects perform the same task. (203) While there is uncertainty about what BOLD really measures, localization of activity in the brain has been verified by replication as well as complementary techniques. Functional MRI therefore can be said to measure what it purports to measure, and it is the correct tool to measure the location of brain activity.

Next, the scientific theory. The theory that deception elicits an identifiable "deception network" fails to satisfy a reasonable standard of validity. The major finding from current deception studies is that there is a general "falsehood" network, where executive function regions are more active during deception. (204) The reader may thus ask, so what if there is no one single "seat of prevarication"? Why not slap a "falsehood pattern" label on the network and call it a day? Whenever a subject responds to a question and this pattern lights up, we can say that he isn't telling the truth, right? The problem is that this falsehood network is comprised of many other cognitive subfunctions--subfunctions that perform common tasks unrelated to deception or the intent to deceive. Furthermore, it could very well be possible that for some regions, deception causes more activity than truth-telling, and in others, the reverse is true. It is therefore not merely a single region that is necessary and sufficient for deception; instead, the task of deception is brought about by primitive cognitive tasks, such as paying attention, carefully remembering, problem-solving, and resisting the urge to tell the truth. Combined with the requirement of a baseline contrast, the problem of subfunctions becomes clearer.

The first problem is that of the false positive. If an amalgamation of independent but inherently benign subfunctions can be evoked such as to emulate a network indicating deception, how is it possible to dissociate true from false positives? Memories are recalled through a process of reconstructing a complete memory from bits and pieces, building up when one memory evokes related memories of facts, places, events, and experiences. (205) Fainter memories, whether because the event happened a long time ago or little attention was paid when the event occurred, require greater perceived effort to recall. Imagine a time you had difficulty remembering a particularly faint or distant memory. For example: What did I eat last Friday for lunch? On Fridays, I usually go to that restaurant with fresh fish, but I didn't last Friday; why? I normally pick up my daughter from school at 4 p.m., but she had to get out earlier for a doctor's appointment at 1 p.m., so we went together to eat at the Italian place across from the clinic--Oh, right, I had pasta. This commonly experienced perception is in fact validly confirmed by fMRI; attempting to recall faint memories does require more activation in specific regions of the brain. (206) Because these regions are very similar to those in lie detection studies, scientists cannot yet distinguish between a witness struggling to recall an incident and a witness lying through their teeth. A detailed and directed retrieval process is likely to evoke subfunctions very similar to those involved in coming up with a lie and retelling it. The second problem is the false negative. As discussed above, the use of countermeasures may distort the activity of either the test or baseline conditions. Intentionally recruiting benign and easily replicable subfunctions could potentially let a lying test-taker pass by undetected with disturbing ease.

Many lie detection studies have demonstrated activation in a widespread region across the prefrontal cortex. However, there is still much uncertainty whether the "deception network" measures what it purports to measure, or whether it is simply a side effect of cognitive processes required to develop a counterfactual scenario rapidly. Any instrument developed on a "deception network" theory is therefore immediately suspect. (207)

Even if we proceed by presuming the theory is valid, the instrument is also not valid. No instrument for deception detection has yet been disclosed to the scientific community. No set of procedures, parameters, and statistical thresholds has been disclosed to the scientific community, much less evaluated, reviewed, or replicated. No error rates for the instrument, especially false positive and negative rates, have been publicly disclosed. (208) Most troubling, no studies have been conducted, much less error rates reported, regarding that particular instrument's resistance to countermeasures. Clearly, any instrument proffered is not valid. Finally, any particular application must also be questioned. Therefore, lie detection by functional neuroimaging fails to have any scientific validity under this new framework.


In its current state, conclusions from fMRI-based lie detection instruments are not valid because they are easily manipulated by the administrator, and there is no guarantee that they are resistant to manipulation by the witness. When the danger and degree of prejudice are unduly high and the probative value of the testimony is lower than that of many other generally accepted types of scientific knowledge, a court should deny admission of fMRI-based lie detection evidence and testimony.

A. Why Not Just Let it in For What it is Worth?

Professor Schauer has argued persuasively and forcefully that the scientific standards of reliability and validity should not be applied to the law. (209) While his argument seems to be directed more at what he views as an overly strict scientific standard set in Daubert than specifically at neuroimaging-based lie detection, the crux of his argument appears to boil down to the challenge that "bad science ... is not necessarily worse than the non-science that lurks in the heads of judges and jurors." (210) He continues: "[B]ecause scientific reliability and validity is not a prerequisite for the admission of all evidence, much non-scientific evidence might well fill the gap left by the excluded flawed scientific evidence." (211) Such leniency towards the admission of evidence seems to endorse a "let it in for what it's worth" standard. (212)

This Article respectfully disagrees with parts of Professor Schauer's position, arguing instead that evidence from neuroimaging-based lie detection is so unreliable that it violates the policies behind even the lower standards of the Federal Rules of Evidence, much less the strict "scientific knowledge" standard of Daubert. While it is indeed the obligation of law to come to a decision, wasting time and resources because the evidence has little to no probative value does not benefit either the litigants or the bench. Bad neuroimaging science does not merely produce random noise; worse yet, the results can be manipulated by either party. What good is calling something "scientific evidence," with the need to have special "expert witnesses" with a special Rule of Evidence, if either party can easily manipulate said evidence to support its side? Expert witnesses would become no different from occasion or reputation witnesses, but with the perceived legitimacy of an "expert" badge and a highly persuasive multi-million dollar "truth-telling brain scanner" that can "read your thoughts." (213)

1. Probative Value is Outweighed by Prejudicial Nature

Professor Schauer argues that "if incomplete or shoddy or commercially-motivated science is barred from the law in the name of science, law's own goals may suffer" (214) because "the obligation of law [is] simply to reach a decision, and the ability to postpone a judgment until better evidence is available is rarely available to law." (215) While we do not endorse restricting admissibility of scientific knowledge until it becomes undergraduate textbook material, if the legal process arrives at a wrong decision because of unreliable evidence that we know to be unreliable, surely we should be persuaded that fundamental principles of procedural fairness have been violated. (216)

Scientific data can be overwhelming. Without careful attention to detail, it is persuasive to anyone--scholars, judges, attorneys, jurors. Over the many days or weeks of never-ending expert witness testimony in a hotly contested trial, attentions waver and the ability to reason critically falters. With fMRI-based evidence, the role of the trial judge as gatekeeper is even more critical than with other types of scientific evidence, because it has become clear that neuroimaging data is unduly persuasive. Pictures of brains confer credibility to data, regardless of whether the scientific reasoning has obvious errors. (217) Irrelevant neuroscience information makes poor scientific explanations seem more convincing and satisfying even to individuals with a semester of doctorate-level training in cognitive neuroscience, and while fMRI-based lie detection evidence led to more guilty verdicts from a pool of mock jurors, that effect disappeared once the fMRI evidence was challenged on cross-examination. (218) Because of this unnaturally persuasive nature, this article argues that neuroimaging evidence has a heightened prejudicial effect. In the courtroom, overly prejudicial and unduly persuasive evidence is especially dangerous for jurors as they are more likely to stop thinking critically and instead rely on the expert's colorful brain pictures. (219)

In opposition, Professor Schauer points out that,
   [T]hose who are most insistent about finding a sound scientific and
   empirical basis for the admission of various forms of evidence seem
   often to be comfortable abandoning the science in favor of their
   own hunches when the question is about the potential downstream
   dangers of allowing certain forms of evidence to be used for a
   particular purpose. (220)

But of course this is the case. It is not that scientists are abandoning empirical studies in favor of hunches; it is that practitioners of science (and law) have an ethical responsibility to manage how it is used by those who are not versed in its intricacies and qualifications. (221) Scientific articles may be unqualified because scientists are well aware of the tacit norms of the community, where articles are vetted only through replication and falsification of alternative hypotheses. When the general population relies on primary experimental findings rather than review articles and textbook knowledge, scientists must become proactive. (222) While law has the bar examination, codes of professional responsibility, and disciplinary committees, science has only peer reviewers to restrict the development and dissemination of junk science. Neuroimaging-based lie detection simply has no probative value at this point in time.

Scientific evidence is different from reputation testimony because it carries the indicia of, or at least the appearance of, reliability. In contrast, using current neuroimaging-based lie detection techniques, expert witnesses can testify to findings that are easily manipulated by either the test administrator or the test subject. (223) A proponent with unlimited resources could easily afford an expert witness to testify that a machine-administered test "found" that the opposing party had lied by simply adjusting the statistical thresholds or the baseline task. An indigent opposing party would not only be unable to afford an expert witness for the duel, but would also be unaware of the fact that the result of the test could have (or had) been so easily manipulated. Furthermore, it is no small consideration that private corporations providing a service in the interest of profit (224) have an ongoing economic incentive not only to advocate for the widespread adoption of this new type of evidence, but also to provide results favorable to the party requesting the service. (225) Because fMRI-based lie detection currently has virtually no probative value, even if such evidence has only the slightest amount of prejudicial nature, the prejudicial nature outweighs the probative value of the testimony.

2. Judicial Efficiency is Not Promoted

A second concern is that admitting "scientific" testimony willy-nilly wastes precious judicial resources and wastes the jury's time. As discussed above, results from neuroimaging-based lie detection may be easily manipulated. The reader may be reminded of another class of evidence that carries the same stigma of being easily manipulated: hearsay. Why are certain forms of hearsay treated as inadmissible? Besides the obvious flaw of being unreliable, they are a waste of judicial resources. (226) One of the stated purposes of the Federal Rules of Evidence is the "elimination of unjustifiable expense and delay." (227) If we accept expert witness testimony that is not merely equivocal but manipulable by either party, how does admitting it promote the efficient use of judicial resources any better than excluding it altogether? Surely two well-paid and well-equipped experts trading blows with opposing counsel on cross-examination may be amusing--at least for the first ten minutes--but is that worth the confusion to jurors of wading through days of perfectly conflicting expert testimony?

Better to remain with the techniques that have been thoroughly tested. Historically, our judicial system has relied on the jury as the final gauge of witness credibility; the judicial rejection of polygraph technology when it is unreliable only reinforces this idea. (228) Lie detection is, at its core, derivative testimony that a witness's credibility is at fault. Because it cannot indicate the "truth" of a statement, only that deception had or had not occurred, it adds nothing further to the evidence except to impeach a witness.

3. Differences of Degree, Not Kind

Unfortunately, bad science is in fact worse than the nonscience that lurks in the heads of judges and jurors. In that way, this Article agrees with the outcome that Professor Schauer warns of: "[W]hat is not good enough outside of law may be good enough for parts of the law." (229) A great deal of science can take many years or decades to evolve from the seed of an idea into a full-fledged validated, confirmed, replicated, and vetted theory. If the probative value of certain scientific knowledge is extraordinarily high, waiting until the knowledge reaches "undergraduate textbook" status may be too long for the legal system. If the judicial process can use a part of science, provided however that it is both reliable and valid, it would greatly benefit. The question becomes, when is it good enough? This query is one of degree, rather than one of kind.

Ultimately, the problem at this time is not that the expert testimony is merely less reliable; rather, it is that the results are not reliable because the results can be so easily manipulated, and the scientific validity simply not yet known. Lie detection via fMRI thus fails not only the requirements of Daubert and the text of FRE 702, but more troubling, the technique fails the entire purpose of a trial: to find facts.

B. New Scientific Methods Require a Restrained Approach

Evidence is "something ... that tends to prove or disprove the existence of an alleged fact." (230) This Article has argued that fMRI-based lie detection is not ready for evidentiary admissibility at this point in time, simply because it cannot be shown reliably and validly to prove or disprove any fact. Many commentators have suggested the same. (231) In response, Professors Greely and Illes have proposed a detailed scheme in which a regulatory body conducts trials and sets standards on admissibility of lie detection techniques based on neuroimaging. (232) Such an extensive system would winnow out the invalid and unreliable techniques, but at the cost of a substantial investment in time and money to both taxpayers and private companies. (233) However, failing to put substantial safeguards in place, either by executive or judicial action, could lead to a massive travesty of justice. For example, a woman in India was convicted of killing her former fiance and sentenced to life in prison on the evidence provided by EEG-based lie detection. (234) Unless and until a comprehensive scheme is adopted, courts must still deal with admissibility under Daubert.

When should it become acceptable, if ever? As with the common law adoption of evidence from any other novel technology, courts must engage in a balancing test between celerity of acceptance and procedural safeguards. The beauty of the jurisprudence of common law is that the evolution of law occurs organically--just like the evolution of scientific knowledge. Science "consists of a growing margin ... that is interesting but often wrong. Its core is much less controversial (because it is familiar) but very reliable. There is a wide gray area in between." (235) It has been estimated that even in physics, textbook science may be 90% right, whereas primary research is probably 90% wrong. (236) Basing a judgment about a defendant's liberty on science that could be 90% wrong is hardly due process. This Article suggests that when scientific knowledge has reached the point where it is a general consensus across multiple graduate-level textbooks, it may be considered scientific knowledge and thus reliable as evidence. This degree of consensus may be analogized to the hearsay exception allowing admissibility of learned legal treatises. (237)

As for the amount of time this process can take, consider human identification via deoxyribonucleic acid (DNA). Although DNA was discovered within the cell's nucleus in 1869, it was not until 1944 that it was generally accepted as the basic genetic building block and only in 1953 was the actual double-helix structure discovered. (238) In order to duplicate the minuscule amounts of DNA usually acquired in forensic investigations, a technology called polymerase chain reaction (PCR) was developed in 1985. (239) It would not be until 1988 that the first reported appellate court accepted a trial court's admission of DNA-based evidence. (240) Soon after, some state trial courts started taking judicial notice of the reliability and validity of DNA-based evidence. (241) DNA evidence, while superficially similar to fMRI evidence, is arguably inapposite on the grounds that the former evaluates biological certainties while the latter evaluates mental states. However, even with a biological certainty, there was a 35-year development gap for the technology and theory to reach a level of reliability and validity to be admissible in the judicial system.

In 2003, the National Research Council wrote, "Not enough is known, however, to tell whether it will ever be possible in practice to identify deception in real time through brain measurement. We are confident that it will not happen within the next decade." (242) This work agrees with the NRC: not enough is known to determine whether it will ever be possible to identify deception, or, for that matter, verify the substance of memories. With the deliberate pace of both the common law and scientific knowledge, we are confident that it will not happen within the next decade, either.

What else can be done in the meantime to accelerate development and ensure that neuroimaging-based lie detection is sufficiently valid for judicial adoption? There are paths that both researchers and lawmakers may take in order to improve its adoption. Researchers must adopt the most modern and rigorous research methods as soon as they become available and reinterpret prior results as new techniques are vetted. Novel methods of correction for multiple comparisons and the nonindependence error have only been developed in the last few years. Because deception is currently thought to inherently recruit a network of otherwise benign subfunctions, there is a possibility that lie detection can never free itself from the looming specter of significant false positives and countermeasures. However, because deception is an intentional, conscious process, a test that validates recognition memory for a prior experience may bypass this layer of abstraction. Current EEG-based technologies rely on a so-called "novelty" signal that purportedly indicates whether an individual's brain has or has not encountered a particular object before. (243)
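The multiple-comparisons problem mentioned above lends itself to a short arithmetic sketch. The Python below is illustrative only: the voxel count of 50,000 is an assumed round number rather than a figure from any cited study, the familywise calculation assumes the tests are independent (which neighboring voxels are not), and Bonferroni is used as the simplest textbook correction, not as the method any particular laboratory employs.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests that cross the threshold by chance alone
    when no true effect exists anywhere."""
    return n_tests * alpha


def familywise_error_rate(n_tests, alpha):
    """Probability of at least one false positive across all tests,
    assuming (unrealistically) that the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha):
    """Per-test threshold that holds the familywise rate at or below alpha
    under the Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    n_voxels = 50_000  # assumed, illustrative voxel count
    print(expected_false_positives(n_voxels, 0.001))
    print(familywise_error_rate(n_voxels, 0.001))
    print(bonferroni_alpha(n_voxels, 0.05))
```

Under these assumptions, an uncorrected per-voxel threshold of 0.001 would be expected to flag roughly fifty voxels as "active" even if nothing in the brain distinguished lies from truth, which is why the choice and disclosure of a correction method matters to any reported error rate.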

And yet there are also significant barriers with such a test. First, an individual's brain that purportedly recognizes an object may be unconsciously recognizing the general category for that object, not the specific object. An individual unfamiliar with guns would not be able to consciously (or potentially unconsciously!) distinguish between two different makes or calibers of gun, much less two different instances of the same gun. Second, while preliminary research into false memories has demonstrated the ability to distinguish false memories from real ones, the experimental setup is hardly ethologically valid and replication has been scant. (244) It may eventually prove to be impossible to distinguish between real and false recognition. Finally, the inability to disentangle the processes of making and remembering memories may ultimately render futile the search for "mechanical truth verification."

As for lawmakers, this Article suggests that courts and legislators begin by evaluating the reliability and validity of the technology, theory, instrument, and application of any proffered evidence based on novel technological advancements. Doing so will help structure the Daubert analysis by clarifying the schema. For example, when examining the "testability or falsifiability" factor, it is critical to understand that while the testability of the technology is generally considered valid, the testability of the underlying theory of a deception network is not valid because it has neither been substantially replicated, nor have alternative explanations been ruled out. Similarly, the oft-touted low error rate of a particular application is only a partial picture; the error rate of the underlying technique must also be examined. Finally, the peer review process requires an examination of whether the technique has been validated and corroborated in the secondary review literature and published in graduate-level textbooks, rather than simply appearing in a peer-reviewed journal.
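The point that an application's error rate is only a partial picture can be made concrete with a rough calculation. The sketch below treats the four layers named above (technology, theory, instrument, and application) as independent links, each with its own probability of being sound. Both the independence assumption and every number in LAYERS are hypothetical, chosen only to show how the arithmetic behaves; none is an estimate drawn from the literature.

```python
from math import prod

# Hypothetical probability that each layer is sound. These figures are
# invented for illustration and are not estimates from any study.
LAYERS = {
    "technology": 0.99,
    "theory": 0.70,
    "instrument": 0.95,
    "application": 0.90,
}


def combined_reliability(layers):
    """Probability that every layer is sound, assuming independence."""
    return prod(layers.values())


def combined_error_rate(layers):
    """Probability that at least one layer fails, assuming independence."""
    return 1.0 - combined_reliability(layers)


if __name__ == "__main__":
    print(1.0 - LAYERS["application"])   # error rate of the application alone
    print(combined_error_rate(LAYERS))   # error rate across all four layers
```

With these invented figures, an application that looks 90% reliable in isolation sits atop a chain that is sound only about 59% of the time. The exact numbers are beside the point; the structure of the calculation is what a court would need to ask about.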

One final observation: the District Court for the Eastern District of Pennsylvania was faced with an expert witness who proffered evidence that a diagnosis of bipolar disorder made the defendant predisposed to lie. (245) On cross-examination, the court documented the following colloquy:

Adverse counsel: [Isn't it true that] we simply can't get a measure of what conscious intent is?

Witness: I don't have any reliable measure for that.

Adverse counsel: O.K. You nor anyone else. Is that right?

Witness: That's right.

In response, the court could only sputter, "What, one may ask, could a jury do with testimony like this?" (246)


Primary research articles

1. Nobuhito Abe, Maki Suzuki, Etsuro Mori, Masatoshi Itoh & Toshikatsu Fujii, Deceiving Others: Distinct Neural Responses of the Prefrontal Cortex and Amygdala in Simple Fabrication and Deception with Social Interactions, 19 J. COGNITIVE NEUROSCIENCE 287 (2007).

2. S. Bhatt, J. Mbwana, A. Adeyemo, A. Sawyer, A. Hailu & J. VanMeter, Lying About Facial Recognition: An fMRI Study, 69 BRAIN & COGNITION 382 (2009).

3. Shawn E. Christ, David C. Van Essen, Jason M. Watson, Lindsay E. Brubaker & Kathleen B. McDermott, The Contributions of Prefrontal Cortex and Executive Control to Deception: Evidence from Activation Likelihood Estimate Meta-Analyses, 19 CEREBRAL CORTEX 1557 (2009).

4. C. Davatzikos, K. Ruparel, Y. Fan, D.G. Shen, M. Acharyya, J.W. Loughead, R.C. Gur & D.D. Langleben, Classifying Spatial Patterns of Brain Activity with Machine Learning Methods: Application to Lie Detection, 28 NEUROIMAGE 663 (2005).

5. Rachael S. Fullam, Shane McKie & Mairead C. Dolan, Psychopathic Traits and Deception: Functional Magnetic Resonance Imaging Study, 194 BRIT. J. PSYCHIATRY 229 (2009).

6. Matthias Gamer, Thomas Bauermann, Peter Stoeter & Gerhard Vossel, Covariations Among fMRI, Skin Conductance, and Behavioral Data During Processing of Concealed Information, 28 HUMAN BRAIN MAPPING 1287 (2007).

7. Matthias Gamer, Olga Klimecki, Thomas Bauermann, Peter Stoeter & Gerhard Vossel, fMRI-Activation Patterns in the Detection of Concealed Information Rely on Memory-Related Effects, 4 SOC. COGNITIVE & AFFECTIVE NEUROSCIENCE 1 (2009).

8. G. Ganis, S.M. Kosslyn, S. Stose, W.L. Thompson & D.A. Yurgelun-Todd, Neural Correlates of Different Types of Deception: An fMRI Investigation, 13 CEREBRAL CORTEX 830 (2003).

9. Joshua D. Greene & Joseph M. Paxton, Patterns of Neural Activity Associated with Honest and Dishonest Moral Decisions, 106 PROC. NAT'L. ACAD. SCI. 12506 (2009).

10. J. G. Hakun, K. Ruparel, D. Seelig, E. Busch, J. W. Loughead, R. C. Gur & D. D. Langleben, Towards Clinical Trials of Lie Detection with fMRI, 4 SOC. NEUROSCIENCE 518 (2009).

11. J. G. Hakun, D. Seelig, K. Ruparel, J. W. Loughead, E. Busch, R. C. Gur & D. D. Langleben, fMRI Investigation of the Cognitive Structure of the Concealed Information Test, 14 NEUROCASE 59 (2008).

12. Tokiko Harada, Shoji Itakura, Fen Xu, Kang Lee, Satoru Nakashita, Daisuke N. Saito & Norihiro Sadato, Neural Correlates of the Judgment of Lying: A Functional Magnetic Resonance Imaging Study, 63 NEUROSCIENCE RES. 24 (2009).

13. Ahmed A. Karim, Markus Schneider, Martin Lotze, Ralf Veit, Paul Sauseng, Christoph Braun & Niels Birbaumer, The Truth About Lying: Inhibition of the Anterior Prefrontal Cortex Improves Deceptive Behavior, 20 CEREBRAL CORTEX 205 (2009).

14. F. Andrew Kozel, Kevin A. Johnson, Emily L. Grenesko, Steven J. Laken, Samet Kose, Xinghua Lu, Dean Pollina, Andrew Ryan & Mark S. George, Functional MRI Detection of Deception After Committing a Mock Sabotage Crime, 54 J. FORENSIC SCI. 220 (2009).

15. F. Andrew Kozel, Kevin A. Johnson, Steven J. Laken, Emily L. Grenesko, Joshua A. Smith, John Walker & Mark S. George, Can Simultaneously Acquired Electrodermal Activity Improve Accuracy of fMRI Detection of Deception?, 4 SOC. NEUROSCIENCE 510 (2009).

16. F. Andrew Kozel, Kevin A. Johnson, Qiwen Mu, Emily L. Grenesko, Steven J. Laken & Mark S. George, Detecting Deception Using Functional Magnetic Resonance Imaging, 58 BIOLOGICAL PSYCHIATRY 605 (2005).

17. Frank A. Kozel, Tamra M. Padgett & Mark S. George, A Replication Study of the Neural Correlates of Deception, Brief Communication, 118 BEHAVIORAL NEUROSCIENCE 852 (2004).

18. F.A. Kozel, L.J. Revell, J.P. Lorberbaum, A. Shastri, J.D. Elhai, M.D. Horner, A. Smith, A. Nahas, D.E. Bohning & M.S. George, A Pilot Study of Functional Magnetic Resonance Imaging Brain Correlates of Deception in Healthy Young Men, 16 J. NEUROPSYCHIATRY & CLINICAL NEUROSCIENCE 295 (2004).

19. Daniel D. Langleben, James W. Loughead, Warren B. Bilker, Kosha Ruparel, Anna Rose Childress, Samantha I. Busch & Ruben C. Gur, Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI, 26 HUMAN BRAIN MAPPING 262 (2005).

20. D.D. Langleben, L. Schroeder, J.A. Maldjian, R.C. Gur, S. McDonald, J.D. Ragland, C.P. O'Brien & A.R. Childress, Brain Activity During Simulated Deception: An Event-Related Functional Magnetic Resonance Study, Rapid Communication, 15 NEUROIMAGE 727 (2002).

21. Tatia M.C. Lee, Ho-Ling Liu, Li-Hai Tan, Chetwyn C.H. Chan, Srikanth Mahankali, Ching-Mei Feng, Jinwen Hou, Peter T. Fox & Jia-Hong Gao, Lie Detection by Functional Magnetic Resonance Imaging, 15 HUMAN BRAIN MAPPING 157 (2002).

22. Tatia M.C. Lee, Ricky K.C. Au, Ho-Ling Liu, K.H. Ting, Chih-Mao Huang & Chetwyn C.H. Chan, Are Errors Differentiable from Deceptive Responses when Feigning Memory Impairment? An fMRI Study, 69 BRAIN & COGNITION 406 (2009).

23. Donald H. Marks, Mehdi Adineh & Sudeepa Gupta, Determination of Truth from Deception Using Functional MRI and Cognitive Engrams, 5 INTERNET J. RADIOLOGY 1 (2006), ology.html.

24. Feroze B. Mohamed, Scott H. Faro, Nathan J. Gordon, Steven M. Platek, Harris Ahmad & J. Michael Williams, Brain Mapping of Deception and Truth Telling About an Ecologically Valid Situation: Functional MR Imaging and Polygraph Investigation--Initial Experience, 238 RADIOLOGY 679 (2006).

25. George T. Monteleone, K. Luan Phan, Howard C. Nusbaum, Daniel Fitzgerald & John-Stockton Irick, Detection of Deception Using fMRI: Better than Chance, but Well Below Perfection, 4 SOC. NEUROSCIENCE 528 (2009).

26. K. Luan Phan, A. Magalhaes, T. Ziemlewicz, D. Fitzgerald, C. Green & W. Smith, Neural Correlates of Telling Lies: A Functional Magnetic Resonance Imaging Study at 4 Tesla, 12 ACAD. RADIOLOGY 164 (2005).

27. Sean A. Spence, Tom F. D. Farrow, Amy E. Herford, Iain D. Wilkinson, Ying Zheng & Peter W. R. Woodruff, Behavioural and Functional Anatomical Correlates of Deception in Humans, 12 NEUROREPORT 2849 (2001).

28. Sean A. Spence, Catherine J. Kaylor-Hughes, Martin L. Brook, Sudheer T. Lankappa & Iain D. Wilkinson, 'Munchausen's Syndrome by Proxy' or a 'Miscarriage of Justice'? An Initial Application of Functional Neuroimaging to the Question of Guilt Versus Innocence, 23 EUR. PSYCHIATRY 309 (2008).

29. Sean A. Spence, Catherine Kaylor-Hughes, Tom F.D. Farrow & Iain D. Wilkinson, Speaking of Secrets and Lies: The Contribution of Ventrolateral Prefrontal Cortex to Vocal Deception, 40 NEUROIMAGE 1411 (2008).

Review articles or secondary sources

1. Sean A. Spence, Mike D. Hunter, Tom F. D. Farrow, Russell D. Green, David H. Leung, Catherine J. Hughes & Venkatasubramanian Ganesan, A Cognitive Neurobiological Account of Deception: Evidence from Functional Neuroimaging, 359 PHIL. TRANSACTIONS ROYAL SOC. LONDON B 1755 (2004).

2. Daniel D. Langleben & Melissa Y. De Jesus, Detection of Deception: Magnetic Resonance Imaging (MRI), in ENCYCLOPEDIA OF PSYCHOLOGY AND LAW 199 (Brian L. Cutler, ed. 2008).

3. Daniel D. Langleben, Detection of Deception with fMRI: Are We There Yet?, 13 LEGAL & CRIMINOLOGICAL PSYCHOL. 1 (2008).

(1) See CEPHOS CORP., (last visited Nov. 28, 2011); NO LIE MRI, (last visited Nov. 28, 2011).

(2) Although neither discloses its techniques on its website, this merely restates neuroimaging first principles.

(3) The popularity of fMRI in commercial applications over its cousin, positron emission tomography (PET), is likely due to its noninvasive nature (PET requires a radioactive injection while fMRI does not) and its celerity of acquisition. SCOTT A. HUETTEL ET AL., FUNCTIONAL MAGNETIC RESONANCE IMAGING 4 (2d ed. 2009).

(4) FED. R. EVID. 103(c) ("In jury cases, proceedings shall be conducted, to the extent practicable, so as to prevent inadmissible evidence from being suggested to the jury by any means, such as making statements or offers of proof or asking questions in the hearing of the jury.").

(5) A. Villringer, Physiological Changes During Brain Activity, in FUNCTIONAL MRI 3, 3 (C.T.W. Moonen & P.A. Bandettini eds., 1999). The goal of cognitive neuroscience is to "understand how brain function gives rise to mental activities such as perception, memory, and language." FRONTIERS IN COGNITIVE NEUROSCIENCE xv (Stephen Michael Kosslyn & Richard A. Andersen eds., 1995); see generally COGNITIVE NEUROSCIENCE: A READER 9-11 (Michael Gazzaniga ed., 2000).

(6) HUETTEL ET AL., supra note 3, at 159, 458-464; Paul M. Matthews, An Introduction to Functional Magnetic Resonance Imaging of the Brain, in FUNCTIONAL MRI 3, 4-5 (Peter Jezzard et al. eds., 2001).

(7) Very simply, NMR detects the energy given off when a volume of atoms is energized and subsequently allowed to relax. This local volume of atoms, quantized to a cube, is called a voxel. A voxel is a quantized volumetric element; it is a portmanteau of the words "volume" and "pixel." See generally RICHARD B. BUXTON, INTRODUCTION TO FUNCTIONAL MAGNETIC RESONANCE IMAGING: PRINCIPLES AND TECHNIQUES (2002); FUNCTIONAL MRI, supra note 6; HUETTEL ET AL., supra note 3.


(9) HUETTEL ET AL., supra note 3, at 125.

(10) For simplicity, this text drastically simplifies fMRI mechanics. For a detailed explanation of the mechanics of the hemodynamic response, and, accordingly, the blood-oxygenation level dependent (BOLD) response, including the ratio of oxygenated hemoglobin to deoxyhemoglobin, see BUXTON, supra note 7; HUETTEL ET AL., supra note 3; A.W. Song et al., Basic Principles of Functional MRI, in HANDBOOK OF FUNCTIONAL NEUROIMAGING OF COGNITION 32, 32 (Roberto Cabeza & Alan Kingstone eds., 2d ed. 2006); Dov Malonek & Amiram Grinvald, Interactions Between Electrical Activity and Cortical Microcirculation Revealed by Imaging Spectroscopy: Implications for Functional Brain Mapping, 272 SCIENCE 551, 554 (1996).

(11) See P.A. Bandettini, The Temporal Resolution of Functional MRI, in FUNCTIONAL MRI 205, 208 (C.T.W. Moonen & P.A. Bandettini eds., 1999); HUETTEL ET AL., supra note 3, at 221.

(12) G.K. Aguirre & M. D'Esposito, Experimental Design for Brain fMRI, in FUNCTIONAL MRI 369, 372 (C.T.W. Moonen & P.A. Bandettini eds., 1999). But see Jody C. Culham, Functional Neuroimaging: Experimental Design and Analysis, in HANDBOOK OF FUNCTIONAL NEUROIMAGING OF COGNITION 53, 65-68 (Roberto Cabeza & Alan Kingstone eds., 2d ed. 2006) (detailing the increasing ability to use neuroimaging to isolate functionally specific areas of the brain and predicting that this ability will only grow further in the future).

(13) HERBERT WILLIAM CONN, PHYSIOLOGY AND HEALTH 289-291 (1916), available at http://

(14) Aguirre & D'Esposito, supra note 12. See generally Eric Zarahn et al., A Trial-Based Experimental Design for fMRI, 6 NEUROIMAGE 122 (1997).

(15) Note that if you stop to double-check that your story is believable, that is yet another process.

(16) BUXTON, supra note 7, at 446. This signal change is for a 1.5T scanner; there is greater sensitivity with a stronger magnet, such as the 3.0T scanners currently used in some research institutions.


(18) Juan Alvarez-Linera, 3T MRI: Advances in Brain Imaging, 67 EUR. J. RADIOLOGY 415, 416 (2008). See generally Ning Xu et al., Simulation of Susceptibility-Induced Distortions in fMRI, 6144 PROC. SPIE (MEDICAL IMAGING 2006: IMAGE PROCESSING) 2071 (Joseph Reinhardt & Josien Pluim eds., 2006) (discussing the simulation of field inhomogeneity artifacts caused by magnetic susceptibility differences across air and tissue interfaces).

(19) In contrast to the few sources of signal strength, there are three strong sources of noise: intrinsic thermal noise and system noise from imperfections in the magnet, artifacts from motion and physiological processes, and non-task-related cognitive variability. See generally BUXTON, supra note 7, at ch. 12.

(20) HUETTEL ET AL., supra note 3, at 274. A bite bar is a piece of rigid plastic that the subject bites down on firmly during the entire scan. Because there is virtually no play between the upper teeth and the skull, as long as the subject's teeth are firmly clenched down on the bite bar, the entire head is immobilized.

(21) Jack J. Lin & John C. Mazziotta, Computational Anatomy, in 1 EPILEPSY: A COMPREHENSIVE TEXTBOOK 999, 999 (Jerome Engel, Jr. & Timothy A. Pedley eds., 2d ed. 2008).

(22) See generally Michael A. Yassa & Craig E.L. Stark, A Quantitative Evaluation of Cross-Participant Registration Techniques for MRI Studies of the Medial Temporal Lobe, 44 NEUROIMAGE 319 (2009).

(23) See Bart Krekelberg et al., Adaptation: from Single Cells to BOLD Signals, 29 TRENDS IN NEUROSCIENCE 250, 251-54 (2006).

(24) Patricia S. Churchland & Terrence J. Sejnowski, Perspectives on Cognitive Neuroscience, in COGNITIVE NEUROSCIENCE: A READER 14, 14-15 (Michael S. Gazzaniga ed., 2000); Pasko Rakic, Introduction to Evolution and Development, in THE COGNITIVE NEUROSCIENCES 3, 3 (Michael S. Gazzaniga ed., 3d ed. 2004).


(26) W. Chen & S. Ogawa, Principles of BOLD Functional MRI, in FUNCTIONAL MRI 103, 103 (C.T.W. Moonen & P.A. Bandettini eds., 1999); see also Churchland & Sejnowski, supra note 24, at 15.

(27) Nancy Kanwisher et al., The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception, 17 J. NEUROSCIENCE 4302, 4302-03 (1997).

(28) Geoffrey K. Aguirre et al., An Area within Human Ventral Cortex Sensitive to "Building" Stimuli: Evidence and Implications, 21 NEURON 373, 375-377 (1998).

(29) See, e.g., Geoffrey K. Aguirre et al., Neural Components of Topographical Representation, 95 PROC. NAT'L ACAD. SCI. USA 839, 844 (1998).

(30) However, one must always be wary of falling into the trap that devoured the phrenologist. In the early nineteenth century, phrenologists believed that people with an extreme trait would have an overly developed portion of the brain devoted to that function, creating a protrusion on the skull. Eventually, areas for "love for one's offspring" or "honesty" were defined. Such theories fell out of favor by the late 1830s when neither an experimental basis nor predictive power could be realized. HUETTEL ET AL., supra note 3, at 2.

(31) For the purpose of this Article, "memory" is used to refer to the hippocampal-dependent memory system and not working memory, which has been grouped together with the central executive, as in the work of Alan Baddeley. Alan Baddeley & Graham Hitch, Working Memory: Past, Present ... and Future?, in THE COGNITIVE NEUROSCIENCE OF WORKING MEMORY 1, 3 (Naoyuki Osaka et al. eds., 2007); see also infra note 127.

(32) FED. R. EVID. 602.

(33) See, e.g., United States v. Lyon, 567 F.2d 777, 783-784 (8th Cir. 1977), cert. denied, 435 U.S. 918 (1978); see also CHRISTOPHER B. MUELLER & LAIRD C. KIRKPATRICK, 3 FEDERAL EVIDENCE [section] 6:6 (3d ed.) ("'Personal' means that the witness must have personally experienced what she is to describe, and that means ordinarily that she must have had direct sensory input that she experienced firsthand, usually in the form of sights and sounds.").

(34) Akira Miyake et al., The Unity and Diversity of Executive Functions and Their Contributions to Complex "Frontal Lobe" Tasks: A Latent Variable Analysis, 41 COGNITIVE PSYCHOL. 49, 50 (2000).

(35) P.C. Fletcher & R.N.A. Henson, Frontal Lobes and Human Memory: Insights from Functional Neuroimaging, 124 BRAIN 849, 849-51 (2001) (suggesting that working memory rehearsal tasks may recruit executive functions).

(36) See, e.g., Celine Chayer & Morris Freedman, Frontal Lobe Functions, 1 CURRENT NEUROLOGY & NEUROSCIENCE REPORTS 547, 547 (2001).

(37) E.g., Alan Baddeley, Working Memory, 255 SCIENCE 556, 557 (1992). See generally ALAN BADDELEY, WORKING MEMORY (1986).

(38) E.g., T. Shallice, Specific Impairments of Planning, 282 PHIL. TRANSACTIONS OF THE ROYAL SOC'Y B. 199 (1982). See generally Donald A. Norman & Tim Shallice, Attention to Action: Willed and Automatic Control of Behavior, in COGNITIVE NEUROSCIENCE: A READER 376, 377 (Michael S. Gazzaniga ed., 2000).

(39) E.g., Harvey S. Levin et al., The Contribution of Frontal Lobe Lesions to the Neurobehavioral Outcome of Closed Head Injury, in FRONTAL LOBE FUNCTION AND DYSFUNCTION 318, 328 (Harvey S. Levin et al. eds., 1991).

(40) E.g., Vinod Goel, Cognitive Neuroscience of Deductive Reasoning, in CAMBRIDGE HANDBOOK OF THINKING AND REASONING 475, 475-89 (Keith James Holyoak & Robert G. Morrison eds., 2005).

(41) E.g., H. Garavan et al., Dissociable Executive Functions in the Dynamic Control of Behavior: Inhibition, Error Detection, and Correction, 17 NEUROIMAGE 1820, 1820 (2002).

(42) E.g., Paul W. Burgess & Tim Shallice, Response Suppression, Initiation, and Strategy Use Following Frontal Lobe Lesions, 34 NEUROPSYCHOLOGIA 263, 270-71 (1996).

(43) FED. R. EVID. 104(a).

(44) FED. R. EVID. 702.

(45) FED. R. EVID. 403.

(46) The Fourth Amendment protects persons from unreasonable searches and seizures. U.S. CONST. amend. IV; see also Earl L. Kellett, Admissibility, in Civil Action, of Confession or Admission Which Could Not Be Used Against Party in Criminal Prosecution Because Obtained by Improper Police Methods, 43 A.L.R.3d 1375 (1972); Marjorie A. Shields, Admissibility, in Civil Proceeding, of Evidence Obtained Through Unlawful Search and Seizure, 105 A.L.R.5th 1 (2003). See generally Benjamin Holley, It's All in Your Head: Neurotechnological Lie Detection and the Fourth and Fifth Amendments, 28 DEV. MENTAL HEALTH L. 1, 11-14 (2009).

(47) The applicability of the Fifth Amendment arises from its protection against self-incrimination. U.S. CONST. amend. V. The question is whether "brain scanning" constitutes physical evidence, which is not protected, or testimonial evidence, which is. See generally Holley, supra note 46, at 14-22; Sarah E. Stoller & Paul Root Wolpe, Emerging Neurotechnologies for Lie Detection and the Fifth Amendment, 33 AM. J.L. & MED. 359 (2007).

(48) FED. R. EVID. 702; An Act to Establish Rules of Evidence for Certain Courts and Proceedings, Pub. L. No. 93-595, 88 Stat. 1926 (1975).

(49) 509 U.S. 579 (1993).


(51) FED. R. EVID. 702 (enumerations omitted).

(52) Daubert, 509 U.S. at 597. But see infra note 202.

(53) Daubert, 509 U.S. at 589-90 (emphasis added).

(54) Id. at 593 n.9. Note the presence of the conjunction "and." This is a vitally important part of the Daubert standard that lower courts have somehow transmogrified into an "or." See, e.g., U.S. v. Sullivan, 246 F. Supp. 2d 700, 702 (E.D. Ky. 2003) ("A non-exhaustive list of factors guides the court's inquiry: (1) whether the theory or technique can or has been tested ....") (emphasis added). The fact that a technique can be tested, but has not been, as could be read according to the disjunctive "or," would be contrary to the third and fourth factors of publication and general acceptance. A theory or technique would not be accepted by the field if it had not been actually tested.

(55) Daubert, 509 U.S. at 592. This text echoes the primary purpose of FRE 702. Note that none of the factors is primus inter pares. Id. at 594 ("The inquiry envisioned by Rule 702 is, we emphasize, a flexible one."); id. at 595 n. 12. In fact, the Court later clarified that these factors were neither mandatory nor exclusive; six years after Daubert, it held that the "list of factors was meant to be helpful, not definitive." Kumho Tire Co., Ltd. v. Carmichael, 526 U.S. 137, 151 (1999).

(56) Daubert, 509 U.S. at 593-594. See generally DAVID L. FAIGMAN ET AL., MODERN SCIENTIFIC EVIDENCE: THE LAW AND SCIENCE OF EXPERT TESTIMONY [section][section] 1:15-1:24 (2005-2006 ed.); 31A AM. JUR. 2D EXPERT AND OPINION EVIDENCE [section] 25 (2009).

(57) In Daubert, the Supreme Court held that FRE 702 effectively superseded the Frye doctrine, which had previously required scientific evidence to be "generally accepted" to be admitted as expert scientific testimony. 509 U.S. at 587. Among other things, the Court required the trial judge make a preliminary assessment whether the testimony's underlying reasoning or methodology is scientifically valid and properly can be applied to the facts at issue. Id. at 589, 592.


(59) Specific-incident testing is defined as questions with little ambiguity (e.g., "Did you see John Smith on Wednesday?"), as contrasted with generic security screening questions (e.g., "Have you ever divulged a trade secret to an unauthorized party?"). NATIONAL RESEARCH COUNCIL, supra note 58, at 1.

(60) Id. at 4.

(61) Id. at 5.

(62) Id. at 227-28.

(63) Strengthening Forensic Science in the United States: The Role of the National Institute of Standards and Technology: Hearing Before the Subcomm. on Tech. and Innovation of the H. Comm. on Science, 111th Cong. 2 (2009), available at gov/Media/hearings/ets09/march10/charter.pdf (published Mar. 10, 2009).

(64) Id.


(66) Id. at 190.

(67) Id. at 310.

(68) Roper v. Simmons, 543 U.S. 551, 568 (2005).

(69) Id. at 568-571.

(70) See Brief for the American Psychological Association and the Missouri Psychological Association as Amici Curiae Supporting Respondent, Roper v. Simmons, 543 U.S. 551 (2005) (No. 03-633), 2004 WL 1636447; Brief of the American Medical Association et al. as Amici Curiae Supporting Respondent, Roper v. Simmons, 543 U.S. 551 (2005) (No. 03-633), 2004 WL 1633549.

(71) Brief for the American Psychological Association, Roper v. Simmons, at 1012.

(72) Brief for the American Medical Association, Roper v. Simmons, at 11-20.

(73) "While Justice Anthony Kennedy didn't explicitly cite fMRI scans in his majority opinion against executing people under 18, many experts think it was an influencing factor." Reyhan Harmanci, Complex Brain Imaging Is Making Waves in Court, S.F. CHRON. (Oct. 17, 2008), /c/a/2008/10/17/MN8M13ACON.DTL.

(74) Cephos Corporation, one of the two firms claiming putatively legal grounds for the admissibility of fMRI-based lie detection evidence, originally stated on its website, "The U.S. Supreme Court has reviewed fMRI evidence in Roper v. Simmons to aid in the determination of when a person may be tried as an adult. Therefore the Supreme Court and neuroscientists have supported the use of fMRI in real-world settings." fMRI Testing & Legal Admissibility, CEPHOS CORPORATION, web/20090206163614/ y.htm (accessed by searching for Cephos Corporation in the Internet Archive Index). By February 2010, Cephos changed the first sentence to read: "The U.S. Supreme Court has received at least one amicus brief based in part on brain scans in Roper v. Simmons to aid in the determination of when a person may be tried as an adult. Therefore, the Supreme Court and neuroscientists have supported the use of fMRI in real-world settings." Legal Admissibility of fMRI Testing, CEPHOS CORPORATION, (last visited Oct. 18, 2011).

(75) Entertainment Software Ass'n v. Blagojevich, 404 F.Supp.2d 1051, 1063-68 (N.D. Ill. 2005), aff'd, 469 F.3d 641 (7th Cir. 2006).

(76) Id. at 1057-58.

(77) Id. at 1065. See Vincent Mathews et al., Media Violence Exposure and Frontal Lobe Activation Measured by Functional Magnetic Resonance Imaging in Aggressive and Nonaggressive Adolescents, 29 J. COMPUTER ASSISTED TOMOGRAPHY 287 (2005) (published portion of study to which witness refers); see also William Kronenberger et al., Media Violence Exposure and Executive Functioning in Aggressive and Control Adolescents, 61 J. CLINICAL PSYCHOL. 725 (2004).

(78) Entertainment Software, 404 F.Supp.2d at 1065.

(79) Id. at 1067 (quoting Transcript of Record at 356, Entertainment Software Ass'n v. Blagojevich, 404 F.Supp.2d 1051 (2005)).

(80) The issue of whether the fMRI evidence was admissible as a matter of law was not discussed on appeal, most likely because it was neither preserved for appeal nor contested. Entertainment Software Ass'n v. Blagojevich, 469 F.3d 641 (7th Cir. 2006).

(81) U.S. v. Mezvinsky, 206 F. Supp. 2d 661, 663 (E.D. Pa. 2002).

(82) U.S. v. Pohlot, 827 F.2d 889, 890 (3d Cir. 1987). Cf. the federal standard for insanity, 18 U.S.C. [section] 17 (2006).

(83) U.S. v. Bennett, 29 F. Supp. 2d 236, 240 (E.D. Pa. 1997).

(84) Mezvinsky, 206 F. Supp. 2d at 663.

(85) Id. at 675.

(86) Id.

(87) Id.



(90) See, e.g., John Timmer, We're So Good at Medical Studies That Most of Them Are Wrong, Ars Technica, 2010/03/wereso-good-at-medical-studies-that-most-of-them-are-wrong.ars (last visited Mar. 11, 2010).

(91) Personal communication from Craig M. Bennett, University of California, Santa Barbara, to author (Feb. 4, 2010) (on file with author). A typical modern full-brain neuroimaging scan covers 64 x 64 voxels and 36 slices, producing a voxel volume of 3mm x 3mm x 3mm, or 27[mm.sup.3]. Id. A typical fMRI data set may contain approximately 25,000 to 100,000 voxels, which means that anywhere from 25 to 100 voxels could be falsely positive, that is, marked as active purely due to chance. D.W. Loring et al., Now You See It, Now You Don't: Statistical and Methodological Considerations in fMRI, 3 EPILEPSY & BEHAVIOR 539, 539 (2002) (using a data set containing 100,000 voxels); Brian Pittman, Multiple testing correction, http://afni.nimh.nih.gov/sscc/gangc/mcc.html (last visited Nov. 2, 2009) (using a data set containing 25,000 voxels).

(92) With a threshold of p < 0.001, as is commonly used in neuroimaging studies, approximately 60 voxels in a 60,000-voxel data set would be marked as active purely by chance (that is, a 0.1% probability that the statistical test for any given voxel results in a false positive, multiplied by 60,000 voxels). Nikolaus Kriegeskorte et al., Circular Analysis in Systems Neuroscience: The Dangers of Double Dipping, 12 NATURE NEUROSCIENCE 535, 539 (2009); Loring et al., supra note 91, at 540 (p < 0.001).
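
The arithmetic in this note can be checked with a short sketch. It assumes, for illustration only, that every voxel is tested independently at the same uncorrected threshold and that no voxel is truly active; the function name is hypothetical and does not come from any of the cited studies.

```python
# Expected number of false-positive voxels under an uncorrected per-voxel
# threshold. Assumption (not from the cited sources): independent tests and
# no truly active voxels, so the expectation is simply n_voxels * alpha.

def expected_false_positives(n_voxels: int, alpha: float) -> float:
    """Return the expected count of voxels flagged purely by chance."""
    return n_voxels * alpha

ALPHA = 0.001  # p < 0.001, the threshold discussed in this note

# 25,000 and 100,000 voxels are the data set sizes cited in note 91;
# 60,000 voxels is the figure used in this note.
for n_voxels in (25_000, 60_000, 100_000):
    count = round(expected_false_positives(n_voxels, ALPHA))
    print(f"{n_voxels} voxels at p < {ALPHA}: about {count} false positives")
```

Real voxels are spatially correlated, so the independence assumption is only a rough guide to the scale of the problem.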

(93) Sixty voxels at 27 [mm.sup.3] each is a volume of 1620 [mm.sup.3]. The average person has a hippocampus measuring roughly 4,000 [mm.sup.3]. Eleanor A. Maguire et al., Navigation-Related Structural Change in the Hippocampi of Taxi Drivers, 97 PROC. NAT'L. ACAD. SCI. 4398, 4400 (2000). For comparison, an average human brain has a volume of 1,450,000 [mm.sup.3]. PETER H. RAVEN & GEORGE B. JOHNSON, BIOLOGY 443 (1995).
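
As a rough check on the scale comparison in this note, the following sketch recomputes the volumes from the figures given above. The variable names are illustrative. Comparing scattered false-positive voxels to a single contiguous structure is a simplification made only to convey size.

```python
# Volume of 60 chance-level voxels compared with the hippocampus and the
# whole brain, using only the figures stated in notes 91-93.

VOXEL_VOLUME_MM3 = 3 * 3 * 3      # 3 mm x 3 mm x 3 mm = 27 mm^3
FALSE_POSITIVE_VOXELS = 60        # expected by chance at p < 0.001 (note 92)
HIPPOCAMPUS_MM3 = 4_000           # rough average (Maguire et al.)
BRAIN_MM3 = 1_450_000             # average human brain (Raven & Johnson)

false_positive_volume = FALSE_POSITIVE_VOXELS * VOXEL_VOLUME_MM3
share_of_hippocampus = false_positive_volume / HIPPOCAMPUS_MM3
share_of_brain = false_positive_volume / BRAIN_MM3

print(f"False-positive volume: {false_positive_volume} mm^3")
print(f"Relative to hippocampus: {share_of_hippocampus:.1%}")
print(f"Relative to whole brain: {share_of_brain:.2%}")
```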

(94) Howard Eichenbaum, Memory Representations in the Parahippocampal Region, in THE PARAHIPPOCAMPAL REGION: ORGANIZATION AND ROLE IN COGNITIVE FUNCTION 165, 165-66 (Menno Witter & Floris Wouterlood eds., 2002).

(95) Craig M. Bennett et al., Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument for Multiple Comparisons Correction, Poster at Human Brain Mapping 2009, posters/Bennett-Salmon-2009.pdf [hereinafter Bennett et al., Neural Correlates]; accord Alexis Madrigal, Scanning Dead Salmon in fMRI Machine Highlights Risk of Red Herrings, WIRED, Sept. 18, 2009. A post-mortem Atlantic salmon was shown a series of photographs depicting humans in social situations and was asked to guess what emotion the pictured individual was experiencing. When the data were analyzed using the parameters conventionally selected in many neuroimaging studies (p < 0.001 and k > 8), the researchers were astonished to discover that the scan ostensibly indicated neural activity approximately the size of a pencil eraser (81 [mm.sup.3]) in the salmon's brain. Bennett et al., Neural Correlates, supra; see also Craig M. Bennett et al., The Principled Control of False Positives in Neuroimaging, 4 SOCIAL COGNITIVE & AFFECTIVE NEUROSCIENCE 417 (2009) [hereinafter Bennett et al., Principled Control].

(96) Bennett et al., Neural Correlates, supra note 95. Note that the correction for multiple comparisons (either a correction for false discovery rate or family-wise error) eliminated activity even though a highly relaxed statistical threshold had been applied (p = 0.25). See also Loring et al., supra note 91, at 545 fig.4 (2002) (showing a loss of activation in right hand, right hemisphere, between p = 0.01 and p = 0.001).

(97) Thomas Nichols & Satoru Hayasaka, Controlling the Familywise Error Rate in Functional Neuroimaging: A Comparative Review, 12 STATISTICAL METHODS MED. RES. 419, 438 tbl.6 (2003). These findings should not be surprising; the point of statistical thresholds is that they set a cutoff point beyond which an observed phenomenon arguably could have occurred by chance.

(98) Bennett et al., Principled Control, supra note 95, at 418.

(99) Conventionally, p < 0.05; k > 10.

(100) Bennett et al., Principled Control, supra note 95, at 417.

(101) E.g., Christopher R. Genovese et al., Thresholding of Statistical Maps in Functional Neuroimaging Using the False Discovery Rate, 15 NEUROIMAGE 870 (2002) (false discovery rate); Thomas E. Nichols & Andrew P. Holmes, Nonparametric Permutation Tests for Functional Neuroimaging: A Primer with Examples, 15 HUMAN BRAIN MAPPING 1 (2001) (permutation tests); K.J. Worsley et al., A Three-Dimensional Statistical Analysis for CBF Activation Studies in Human Brain, 13 J. CEREBRAL BLOOD FLOW & METABOLISM 900 (1993) (Gaussian random field theory); B. Douglas Ward, Simultaneous Inference for FMRI Data, ALPHASIM, pub/dist/doc/manual/AlphaSim.pdf (AlphaSim: Estimate Statistical Significance via Monte Carlo Simulation); see Steven D. Forman et al., Improved Assessment of Significant Activation in Functional Magnetic Resonance Imaging (fMRI): Use of a Cluster-Size Threshold, 33 MAGNETIC RESONANCE MEDICINE 636 (1995); Nichols & Hayasaka, supra note 97 (review of methods). Although general statistical methods often adjust the familywise error rate with a Bonferroni correction, it can be overly conservative where the data are interrelated. See generally GEOFFREY KEPPEL & SHELDON ZEDECK, DATA ANALYSIS FOR RESEARCH DESIGNS 169-80 (1989) (traditional familywise Type I corrections).
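
To make the contrast between these corrections concrete, the following is a minimal sketch of two of the approaches named in this note: a Bonferroni familywise correction and the Benjamini-Hochberg step-up procedure for the false discovery rate. It is an illustration under stated assumptions, not the implementation used in any cited study. The function names, the 0.05 level, and the simulated pure-noise "brain" of 60,000 independent voxels are all assumptions introduced here.

```python
import random

def bonferroni_reject(p_values, alpha=0.05):
    """Flag a test only if its p-value clears alpha divided by the test count."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    reject = [False] * m
    for idx in order[:cutoff_rank]:
        reject[idx] = True
    return reject

# Hypothetical pure-noise data set: under the null hypothesis, p-values are
# uniform on [0, 1], so every "active" voxel here is a false positive.
rng = random.Random(0)
null_p = [rng.random() for _ in range(60_000)]

uncorrected = sum(p < 0.001 for p in null_p)
bonferroni = sum(bonferroni_reject(null_p))
fdr = sum(benjamini_hochberg_reject(null_p))

print("uncorrected p < 0.001:", uncorrected)
print("Bonferroni at 0.05:   ", bonferroni)
print("Benjamini-Hochberg:   ", fdr)
```

On noise alone, the uncorrected count should land near the 60 voxels discussed in note 92, while both corrections should flag few or none. The sketch treats voxels as independent; as this note observes, real data are interrelated, which is why Bonferroni can be overly conservative in practice.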

(102) Bennett et al., Principled Control, supra note 95, at 419.

(103) See Nikolaus Kriegeskorte et al., supra note 92, at 538; Edward Vul et al., Puzzlingly High Correlations in fMRI Studies of Emotion, Personality, and Social Cognition, 4 PERSP. PSYCHOL. SCI. 274, 284 (2009).

(104) Kriegeskorte et al., supra note 92, at 538.

(105) Id.

(106) Vul et al., supra note 103, at 284.

(107) Id. at 285.

(108) Kriegeskorte et al., supra note 92, at 539 fig.4 (independent split-data analysis); Vul et al., supra note 103, at 282 ("[S]elect the voxels comprising different regions of interest in a principled way that is 'blind' to the correlations of those voxels with the behavioral measure .... ").

(109) Kriegeskorte et al., supra note 92, at 539 fig.4 ("independent analysis using all data").

(110) Id. ("anatomical selection criterion").

(111) Id. ("circular results").
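
The selection problem described in notes 103 through 111 can be illustrated with simulated noise. The sketch below is an assumption-laden toy example, not a reconstruction of the analyses in Kriegeskorte et al. or Vul et al.: the subject count, voxel count, and number of selected voxels are invented, and every value is random, so any correlation between "brain" and "behavior" is spurious.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
N_SUBJECTS, N_VOXELS, TOP_K = 40, 2000, 20   # hypothetical study dimensions

# Pure noise: one behavioral score per subject, one signal per voxel per subject.
behavior = [rng.gauss(0, 1) for _ in range(N_SUBJECTS)]
voxels = [[rng.gauss(0, 1) for _ in range(N_SUBJECTS)] for _ in range(N_VOXELS)]

def top_voxels(subject_idx):
    """Select the TOP_K voxels most correlated with behavior in these subjects."""
    b = [behavior[i] for i in subject_idx]
    scored = [(pearson([v[i] for i in subject_idx], b), j)
              for j, v in enumerate(voxels)]
    return [j for _, j in sorted(scored, reverse=True)[:TOP_K]]

def mean_corr(voxel_idx, subject_idx):
    """Average brain-behavior correlation of the chosen voxels in these subjects."""
    b = [behavior[i] for i in subject_idx]
    corrs = [pearson([voxels[j][i] for i in subject_idx], b) for j in voxel_idx]
    return sum(corrs) / len(corrs)

all_subjects = list(range(N_SUBJECTS))
first_half, second_half = all_subjects[:20], all_subjects[20:]

# Circular: select and measure on the same data.
circular = mean_corr(top_voxels(all_subjects), all_subjects)
# Independent: select on one half of the subjects, measure on the other half.
independent = mean_corr(top_voxels(first_half), second_half)

print(f"circular estimate:    {circular:.2f}")
print(f"independent estimate: {independent:.2f}")
```

Because the data contain no real effect, the circular estimate is inflated by the selection step alone, while the split-data estimate should hover near zero.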

(112) Craig E. L. Stark & Larry R. Squire, When Zero Is Not Zero: The Problem of Ambiguous Baseline Conditions in fMRI, 98 PROC. NAT'L. ACAD. SCI. 12760, 12760 (2001).

(113) Id. at 12762.

(114) See Marcus E. Raichle et al., A Default Mode of Brain Function, 98 PROC. NAT'L. ACAD. SCI. 676, 678 (2001).

(115) Miranda van Turennout et al., Modulation of Neural Activity During Object Naming: Effects of Time and Practice, 13 CEREBRAL CORTEX 381, 386 & 388 (2003).

(116) THE POLYGRAPH AND LIE DETECTION, supra note 58, at 124, 346 fig.H-4 (2003); see also id. at 14 (defining irrelevant and comparison questions).

(117) Vul et al., supra note 103, at 284.

(118) Note that both Vul et al., supra note 103, and Kriegeskorte et al., supra note 92, were published in 2009. These studies indicate that the technology and mathematics are still in their infancy, with standards still debated.

(119) See generally Todd S. Braver & Hannes Ruge, Functional Neuroimaging of Executive Function, in HANDBOOK OF FUNCTIONAL NEUROIMAGING OF COGNITION 307, 308 (Roberto Cabeza & Alan Kingstone eds., 2d ed. 2006) (giving an overview of neuroimaging studies of executive function and highlighting the prefrontal cortex region).

(120) S. C. Baker et al., Neural Systems Engaged by Planning: A PET Study of the Tower of London Task, 34 NEUROPSYCHOLOGIA 515, 521 (1996).

(121) Angus W. MacDonald III et al., Dissociating the Role of the Dorsolateral Prefrontal and Anterior Cingulate Cortex in Cognitive Control, 288 SCIENCE 1835, 1837 (2000).

(122) H. Garavan et al., Right Hemispheric Dominance of Inhibitory Control: An Event-Related Functional MRI Study, 96 PROC. NAT'L ACAD. SCI. 8301, 8303 (1999).

(123) V. Menon et al., Error-Related Brain Activation During a Go/NoGo Response Inhibition Task, 12 HUMAN BRAIN MAPPING 131, 136 (2001).

(124) R. L. Buckner et al., Dissociation of Human Prefrontal Cortical Areas Across Different Speech Production Tasks and Gender Groups, 74 J. NEUROPHYSIOLOGY 2163, 2171 (1995).

(125) George Bush et al., Cognitive and Emotional Influences in Anterior Cingulate Cortex, 4 TRENDS COGNITIVE SCI. 215, 216 (2000).

(126) Larry R. Squire & Stuart M. Zola, Episodic Memory, Semantic Memory, and Amnesia, 8 HIPPOCAMPUS 205, 205 (1998).

(127) Endel Tulving & Hans J. Markowitsch, Episodic and Declarative Memory: Role of the Hippocampus, 8 HIPPOCAMPUS 198, 202 (1998). In contrast, the other four memory types are procedural memory (motor skills); perceptual memory (fleeting echoes of stimuli perceived by the senses); semantic memory (knowledge of facts and concepts with which a time or place cannot be associated); and working memory (the short-term storage of information being actively thought about). Daniel L. Schacter & Endel Tulving, What Are the Memory Systems of 1994?, in MEMORY SYSTEMS 1, 26-28 (Daniel L. Schacter & Endel Tulving eds., 1994).

(128) FED. R. EVID. 602 ("A witness may not testify to a matter unless evidence is introduced sufficient to support a finding that the witness has personal knowledge of the matter.").

(129) A related technique using EEG and the P300 signal is currently being promoted commercially. See, e.g., BRAIN FINGERPRINTING LABORATORIES; L.A. Farwell & S.S. Smith, Using Brain MERMER Testing To Detect Knowledge Despite Efforts To Conceal, 46 J. FORENSIC SCI. 135 (2001). The question with the P300 signal is whether it reflects absolute novelty.

(130) Martin Lepage et al., Hippocampal PET Activations of Memory Encoding and Retrieval: The HIPER Model, 8 HIPPOCAMPUS 313 (1998) (outlining a meta-analysis supporting the theory that for episodic memory, encoding predominantly activates the anterior, and retrieval predominantly activates the posterior hippocampus); see also Laura L. Eldridge et al., A Dissociation of Encoding and Retrieval Processes in the Human Hippocampus, 25 J. NEUROSCIENCE 3280, 3284 (2005) (showing one region had increased activity during encoding but not retrieval, whereas a different region was engaged during episodic retrieval, but not encoding); Steven E. Prince et al., Neural Correlates of Relational Memory: Successful Encoding and Retrieval of Semantic and Perceptual Associations, 25 J. NEUROSCIENCE 1203, 1204, 1207 (2005) (explaining that activity during encoding associated with later successful retrieval (i.e., subsequently remembered versus forgotten) was greater in anterior hippocampus, whereas activity during successful retrieval was greater than during forgotten memories in posterior hippocampus).

(131) Craig E.L. Stark & Yoko Okado, Making Memories Without Trying: Medial Temporal Lobe Activity Associated with Incidental Memory Formation During Recognition, 23 J. NEUROSCIENCE 6748, 6748 (2003). That is to say, if you experience an event at Time A, then recall it at Time B, at a subsequent Time C you can recall not only the actual event at Time A, but also the fact that independently, at Time C, you can recall yourself at Time B recalling the event at Time A. This problem is augmented by the fact that the memory at Time B might not be an independent, self-contained memory, duplicating all the content of Time A, but rather a memory that refers to the Time A memory. Nested and referential memory is not at all well understood currently.

(132) Id. at 6752. Here, activity for images during study of novel images was similar to activity for novel test images. In a different study, researchers observed activity in the MTL as subjects gradually learned associations between images and words over an hour. No region was observed to increase in activity while another decreased, suggesting that encoding and retrieval are complementary processes occurring simultaneously in neighboring or monolithic structures. Jon R. Law et al., Functional Magnetic Resonance Imaging Activity During the Gradual Acquisition and Expression of Paired-Associate Memory, 25 J. NEUROSCIENCE 5720, 5728 (2005).

(133) E.g., Norman E. Spear, Extending the Domain of Memory Retrieval, in INFORMATION PROCESSING IN ANIMALS: MEMORY MECHANISMS 341, 365 (Norman E. Spear & Ralph R. Miller eds., 1981).

(134) E.g., Pablo Alvarez & Larry R. Squire, Memory Consolidation and the Medial Temporal Lobe: A Simple Network Model, 91 PROC. NAT'L ACAD. SCI. USA 7041, 7041-44 (1994); James L. McClelland et al., Why There Are Complementary Learning Systems in the Hippocampus and Neocortex: Insights from the Successes and Failures of Connectionist Models of Learning and Memory, 102 PSYCHOL. REV. 419, 424 (1995).

(135) Spear, supra note 133, at 363-73.

(136) Michael D. Greicius et al., Regional Analysis of Hippocampal Activation During Memory Encoding and Retrieval: fMRI Study, 13 HIPPOCAMPUS 164, 171-73 (2003).


(138) See Tatia M.C. Lee et al., Are Errors Differentiable from Deceptive Responses When Feigning Memory Impairment? An fMRI Study, 69 BRAIN AND COGNITION 406, 411 (2009).

(139) Joshua D. Greene & Joseph M. Paxton, Patterns of Neural Activity Associated with Honest and Dishonest Moral Decisions, 106 PROC. NAT'L ACAD. SCI. USA 12506, 12508-10 (2009).

(140) Shawn E. Christ et al., The Contributions of Prefrontal Cortex and Executive Control to Deception: Evidence from Activation Likelihood Estimate Metaanalyses, 19 CEREBRAL CORTEX 1557, 1563 (2009).

(141) Id.


(143) Declan J. McKeefry et al., The Noninvasive Dissection of the Human Visual Cortex: Using fMRI and TMS to Study the Organization of the Visual Brain, 15 NEUROSCIENTIST 489, 490 (2009). TMS temporarily and selectively disrupts the function of local cortical areas, allowing researchers to observe the effect on behavior of "shutting down" part of the brain. Id.

(144) Ahmed A. Karim et al., The Truth about Lying: Inhibition of the Anterior Prefrontal Cortex Improves Deceptive Behavior, 20 CEREBRAL CORTEX 205, 208-11 (2009).

(145) Id. at 210.


(147) Id. See generally KENNETH R. FOSTER & PETER W. HUBER, JUDGING SCIENCE: SCIENTIFIC KNOWLEDGE AND THE FEDERAL COURTS 131-134 (1997) (describing ways to adapt in the legal setting to the fact that science cannot prove negative propositions).

(148) Id.

(149) See THE POLYGRAPH AND LIE DETECTION, supra note 58, at 28 n.6 (defining countermeasures for polygraphs).

(150) See, e.g., Timothy B. Henseler, A Critical Look at the Admissibility of Polygraph Evidence in the Wake of Daubert: The Lie Detector Fails the Test, 46 CATH. U. L. REV. 1247, 1281-83 (1996-97).

(151) Owen Gleiberman, Heist Society: Movie Review, Ocean's Thirteen, ENTM'T WEEKLY, June 15, 2007, at 58 (describing a scene from the film: "How do you fake a lie-detector test ...? Why, you put a tack in his shoe, so he can step on it when he's giving a true answer, thus spiking the bodily-discomfort waves to match his false replies."); MythBusters (Discovery cable television broadcast Dec. 5, 2007) (Ep. 93) (attempting to use pain from a pinprick or biting down on the tongue to artificially elevate the baseline signal).

(152) See supra Part I.A.2.

(153) Id.

(154) E.g., Rachael S. Fullam et al., Psychopathic Traits and Deception: Functional Magnetic Resonance Imaging Study, 194 BRIT. J. PSYCHIATRY 229, 231 tbl.2 (2009); Daniel D. Langleben et al., Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI, 26 HUMAN BRAIN MAPPING 262, 268 tbl.3 (2005).

(155) THE POLYGRAPH AND LIE DETECTION, supra note 58, at 140.

(156) E.g., Fullam et al., supra note 154, at 230 (subjects asked to lie after they were placed in scanner); Greene & Paxton, supra note 139, at 12510 (subjects informed they could cheat immediately before being scanned); Lee et al., supra note 138, at 407-08 (subjects instructed to feign a memory problem and deliberately do badly on the test immediately before being scanned). Contra G. Ganis et al., Neural Correlates of Different Types of Deception: An fMRI Investigation, 13 CEREBRAL CORTEX 830, 831-832 (2003) (subjects were given a false scenario constructed on a real episodic memory and, assisted by researchers to make it internally consistent, rehearsed it and memorized it before scanning).

(157) Giorgio Ganis et al., Lying in the Scanner: Covert Countermeasures Disrupt Deception Detection by Functional Magnetic Resonance Imaging, 55 NEUROIMAGE 312, 315-18 (2011).

(158) Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 592-94 (1993).

(159) Id. at 594.

(160) Michelle Michelson, The Admissibility of Expert Testimony on Battering and Its Effects After Kumho Tire, Recent Development, 79 WASH. U. L. Q. 367, 370; accord Sheila Jasanoff, What Judges Should Know About the Sociology of Science, 32 JURIMETRICS 345, 354 (1992) ("At the same time, the analytic approach ... suggests that Daubert's criteria of testability and falsifiability will in their turn prove difficult to implement in courts of law."); James T. Richardson et al., The Problems of Applying Daubert To Psychological Syndrome Evidence, 79 JUDICATURE 10, 14 (1995).

(161) FAIGMAN ET AL., supra note 56, at [section] 23:14. More precisely, in a statistical test, a "[t]ype I error is the probability that a finding [actually] occurred by chance when it appears to have not, while type II error is the probability that a finding actually occurred as a result of an intervention when it appears to have occurred by chance." Kelly H. Zou et al., Revisiting the p-value: A Comparison of Statistical Evidence in Clinical and Legal Medical Decision Making, 8 LAW, PROBABILITY & RISK 159, 164 (2009).
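
In the lie-detection setting, the two error types in this note can be read as a truthful subject flagged as deceptive (type I, a false positive) and a deceptive subject passed as truthful (type II, a false negative). The sketch below computes both rates from a confusion matrix. The counts are invented purely for illustration and do not come from any study cited in this Article; the function name is likewise hypothetical.

```python
def error_rates(true_pos: int, false_pos: int, true_neg: int, false_neg: int):
    """Return (type I rate, type II rate) for a binary deception test.

    "Positive" here means the test labels the subject deceptive.
    """
    # Type I: share of truthful subjects wrongly flagged as deceptive.
    false_positive_rate = false_pos / (false_pos + true_neg)
    # Type II: share of deceptive subjects wrongly passed as truthful.
    false_negative_rate = false_neg / (false_neg + true_pos)
    return false_positive_rate, false_negative_rate

# Hypothetical validation sample: 100 deceptive and 100 truthful subjects.
type_1, type_2 = error_rates(true_pos=80, false_pos=10, true_neg=90, false_neg=20)

print(f"type I (false positive) rate:  {type_1:.0%}")
print(f"type II (false negative) rate: {type_2:.0%}")
```

Moving the cut-off that separates "deceptive" from "truthful" scores generally trades one rate against the other, which is why a single reported accuracy figure can conceal an unacceptable rate of either kind.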

(162) Kevin W. Greve & Kevin J. Bianchini, Setting Empirical Cut-Offs on Psychometric Indicators of Negative Response Bias: A Methodological Commentary with Recommendations, 19 ARCHIVES OF CLINICAL NEUROPSYCHOLOGY 533, 534 (2004); accord THE POLYGRAPH AND LIE DETECTION, supra note 58, at 39.

(163) Greve & Bianchini, supra note 162, at 534.

(164) Id.

(165) THE POLYGRAPH AND LIE DETECTION, supra note 58, at 61.

(166) Greve & Bianchini, supra note 162, at 536.

(167) Field inhomogeneity surrounding the sinus cavity can make obtaining reliable data from the hippocampal region more difficult. Craig E.L. Stark & Larry R. Squire, Functional Magnetic Resonance Imaging (fMRI) Activity in the Hippocampal Region During Recognition Memory, 20 J. NEUROSCIENCE 7776, 7779 (2000). A perhaps apocryphal story told by fMRI technologists around the campfire involves a particularly skilled researcher who could predict whether a subject had a head cold merely by looking at the artifacts near his or her sinus cavity.

(168) See supra text accompanying notes 119 to 136 (discussing PFC and MTL activation during both lying and mnemonic retrieval).

(169) E.g., F. Andrew Kozel et al., Detecting Deception Using Functional Magnetic Resonance Imaging, 58 BIOLOGICAL PSYCHIATRY 605, 610 (2005).

(170) Leo Kittay, Admissibility of fMRI Lie Detection, 72 BROOK. L. REV. 1351, 1366 (2007).

(171) See supra Part II.C.1.

(172) See Appendix A for a list of articles examined. The list was compiled by combining the publications at CEPHOS CORP., (last visited Feb. 2, 2010) and NO LIE MRI, (last visited Feb. 2, 2010).

(173) THE POLYGRAPH AND LIE DETECTION, supra note 58, at 216.

(174) Contra fMRI Testing & Legal Admissibility, CEPHOS CORP., (last visited Feb. 22, 2010) ("Existence and standards concerning its operation. Because the analysis is performed by a computer, standard operating procedures are maintained.").

(175) Contra Kittay, supra note 170, at 1368 ("To a considerable degree, a computer administers and analyzes the fMRI [sic] such that the same properly developed and tested software can be used to test each new subject.").

(176) Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 600 (1993) (Rehnquist, C.J., concurring in part and dissenting in part).

(177) Sophia I. Gatowski et al., Asking the Gatekeepers: A National Survey of Judges on Judging Expert Evidence in a Post-Daubert World, 25 L. & HUMAN BEHAVIOR 433, 452-54 (2001).

(178) LAWLESS ET AL., supra note 146, at 228.

(179) Id. at 227-231. See generally KARL RAIMUND POPPER, THE LOGIC OF SCIENTIFIC DISCOVERY 57-73 (2d ed. 2002). However, recent fMRI studies have also supported data-driven experimental design using techniques such as independent components analysis, where no a priori assumptions are made before analyzing the data. E.g., Bharat Biswal & John Ulmer, Blind Source Separation of Multiple Signal Sources of fMRI Data Sets Using Independent Component Analysis, 23 J. COMPUTER ASSISTED TOMOGRAPHY 265 (1999).

(180) Daubert, 509 U.S. at 595.

(181) Id. at 590 (emphasis omitted).

(182) See, e.g., FOSTER & HUBER, supra note 147, at 137 (commenting on "the textually curious arrangement of key words" in Daubert).

(183) The Court recognized this, albeit obliquely: "We note that scientists typically distinguish between 'validity' ... and 'reliability'...." Daubert, 509 U.S. at 590.

(184) LAWLESS ET AL., supra note 146, at 42 (2010); accord FOSTER & HUBER, supra note 147, at 131-34.

(185) LAWLESS ET AL., supra note 146, at 36-42; accord FOSTER & HUBER, supra note 147, at 146. A ruler that measured a book to be eight inches long one day and three inches the next is not very reliable; a weighing scale that consistently measured a book to be eight inches long is lacking validity. It is possible that a measure could be reliable but not valid, or valid but not reliable. Note, however, that for a scientific theory to be valid, it must be reliable. How can we reconcile what the Court said with the philosophy of scientific knowledge? The Court could not have meant that the scientific theory upon which the expert has based his testimony was only valid but not also reliable. A theory that is valid but not reliable would create an instrument that, for the same set of facts, predicts one outcome one day but a different outcome the next. This is an absurd result; it could not provide the guarantee of trustworthiness that the Court sought. The only conclusion is that the Court considered reliability to be a factor of validity.

(186) Kittay, supra note 170, at 1351, 1377.

(187) FAIGMAN ET AL., supra note 56, at [section] 1:17.

(188) See supra note 54 regarding the importance of the conjunctive in the phrase "can be and has been tested."

(189) fMRI Testing & Legal Admissibility, CEPHOS CORP, index.php#admissibility (last visited Feb. 10, 2010).

(190) Compare Frank Andrew Kozel et al., A Pilot Study of Functional Magnetic Resonance Imaging Brain Correlates of Deception in Healthy Young Men, 16 J. NEUROPSYCHIATRY CLINICAL NEUROSCIENCE 295, 295 (2004) (finding that the technique "lacks good predictive power for individuals"), with Frank Andrew Kozel et al., A Replication Study of the Neural Correlates of Deception, 118 BEHAVIORAL NEUROSCIENCE 852, 852 (2004) (finding that "individual results were variable" but that "functional MRI is a reasonable tool with which to study deception").

(191) Matthias Gamer et al., fMRI-Activation Patterns in the Detection of Concealed Information Rely on Memory-Related Effects, 4 SOCIAL COGNITIVE AND AFFECTIVE NEUROSCIENCE 1, 9 (2009).

(192) Feroze B. Mohamed et al., Brain Mapping of Deception and Truth Telling About an Ecologically Valid Situation: Functional MR Imaging and Polygraph Investigation Initial Experience, 238 RADIOLOGY 679, 687 (2006) (emphasis added).

(193) FED. R. EVID. 702.

(194) See supra notes 55-56 and accompanying text.

(195) See supra note 55.

(196) Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 590 n.9 (1993) ("[H]earsay exceptions will be recognized only 'under circumstances supposed to furnish guarantees of trustworthiness.'") (citation omitted).

(197) Id.

(198) Id. at 592 (headings omitted) (emphasis added).

(199) Id. at 590 n.9.

(200) See generally supra note 187 and accompanying text. Stages 1 through 3 are effectively equivalent to general causation, whereas stage 4 is specific causation. Cf. Andre A. Moenssens, Admissibility of Scientific Evidence--An Alternative to the Frye Rule, 25 WM. & MARY L. REV. 545, 556 (1984) (describing six stages in the evolution of a scientific technique).

(201) It is solely at this stage that some commentators have applied Daubert's recommended factors. In doing so, they assume that the previous stages have already sustained scientific validity. Normally, the factors of "peer review" and "general acceptance" would satisfy the requirements of scientific validity for the technology and theory. However, an evaluation of the general acceptance of the technology's error rate or standardization will not satisfy an evaluation of the general acceptance of the instrument's error rate or standardization.

(202) In Daubert, the Court originally affirmed the principle that determinations of witness credibility and ability are restricted to the factfinder: "The focus, of course, must be solely on principles and methodology, not on the conclusions they generate." Daubert, 509 U.S. at 595. It therefore considered scientific knowledge to be restricted to the stages through the development of the instrument, but no further.

Four years later, the Court stepped back and allowed the court to throttle testimony where the application of the instrument was too far removed from the method practiced by the scientific community, recognizing that "conclusions and methodology are not entirely distinct from one another." General Elec. Co. v. Joiner, 522 U.S. 136, 146 (1997); see also Lust ex rel. Lust v. Merrell Dow Pharmaceuticals, Inc., 89 F.3d 594, 598 (9th Cir. 1996) ("When a scientist claims to rely on a method practiced by most scientists, yet presents conclusions that are shared by no other scientist, the district court should be wary that the method has not been faithfully applied.").

Federal appellate courts have interpreted Joiner to allow the court to evaluate whether the specific application of the instrument was valid and reliable, in order to ensure that the administrator did not take "any step that renders the analysis unreliable." See, e.g., In re Paoli R.R. Yard PCB Litigation, 35 F.3d 717, 745 (3d Cir. 1994).

(203) See supra Part I.A.

(204) See supra Part I.B.

(205) SMITH & KOSSLYN, supra note 142, at 215.

(206) See Randy L. Buckner et al., Functional--Anatomic Study of Episodic Retrieval using fMRI, 7 NEUROIMAGE 151, 160 (1998) (describing how difficulty of retrieval was manipulated by shallowness of encoding).

(207) See Jane Campbell Moriarty, Visions of Deception: Neuroimages and the Search for Truth, 42 AKRON L. REV. 739, 759 (2009) ("It is, to use the parlance of Joiner, the 'ipse dixit' problem; the gap between the existing data and the opinion about the meaning of such data. And that is a wide gap indeed at this point in time.").

(208) Note that nondisclosure in the interest of trade secrets or intellectual property should not trump judicial fairness and due process. Specific procedures may be evaluated in camera; the statistical thresholds and baseline task, however, should be disclosed so that opposing counsel may adequately argue in opposition.

(209) Frederick Schauer, Can Bad Science Be Good Evidence? Lie Detection, Neuroscience and the Mistaken Conflation of Legal and Scientific Norms, 95 CORNELL L. REV. 1119 (2010), available at; see also Mark Pettit, Jr., fMRI and BF Meet FRE: Brain Imaging and the Federal Rules of Evidence, 33 AM. J. L. & MED. 319, 340 (2007) ("[W]e ... should continue to be open to what is helpful in the pursuit of the goals of our legal system, even if the result is a profound transformation of how that system operates.") (emphasis added).

(210) Schauer, supra note 209, at 34.

(211) Id. at 35.

(212) This standard appears to be applicable only for bench trials, probably because the judge already knows that he or she will not give it much, if any, weight. Compare Oukrop v. Wasserburger, 755 P.2d 233, 239 (Wyo. 1988) ("The 'Let it in for what it's worth' rule of evidence is usually reserved for nonjury trials. The trial judge who invokes this doctrine does so as a sop to the proponent, knowing he is not going to consider it in any event. But, employing this doctrine is dangerous in jury trials. The jury may take the suspect evidence and run with it, as they apparently did here."), with Grubb v. U.S., 887 F.2d 1230, 1235-36 (4th Cir. 1989) (quoting from the lower court's opinion, "Were this being tried to a jury, I don't think I would let it in. However, since it's non-jury, I think I can keep everything in perspective as the factfinder in this case, so I am going to permit the question. I'm going to let it in for what it's worth" and reversing on appeal specifically because the judge did in fact consider the evidence allowed in under this standard).

(213) See Proposals To Eliminate the Prejudicial Effect of the Use of the Word "Expert" Under the Federal Rules of Evidence in Civil and Criminal Jury Trials, 154 F.R.D. 537, 559 (1994) ("Given the state of 'expert' testimony in our society today, it is a matter of fundamental fairness and, increasingly, the duty of the courts and counsel to neutralize the impact and possible prejudicial weight given to such opinions.").

(214) Schauer, supra note 209, at 36-37.

(215) Schauer, supra note 209, at 37 n.100.

(216) The spectrum of scientific knowledge has been described as running from the exploratory primary sources of published papers, where communal scrutiny endeavors to eliminate "error, bias, and dishonesty" over time, to the secondary literature of review articles and graduate-level textbooks which present the general widespread consensus, and finally, to the sources of the most reliable and undisputed scientific knowledge, undergraduate textbooks. FOSTER & HUBER, supra note 147, at 161-62 (1997).

(217) David P. McCabe & Alan D. Castel, Seeing Is Believing: The Effect of Brain Images on Judgments of Scientific Reasoning, 107 COGNITION 343, 349 (2008).

(218) David P. McCabe et al., The Influence of fMRI Lie Detection Evidence on Juror Decision-Making, 29 BEHAV. SCI. LAW 566 (2011) (potential jurors); Deena Skolnick Weisberg et al., The Seductive Allure of Neuroscience Explanations, 20 J. COGNITIVE NEUROSCIENCE 470, 475 (2008).

(219) See, e.g., Joseph Dumit, Objective Brains, Prejudicial Images, 12 SCI. IN CONTEXT 173 (1999).

(220) Schauer, supra note 209, at 28-29.

(221) Judy Illes et al., ELSI [Ethical, Legal and Social Issues] Priorities for Brain Imaging, 6 AM. J. BIOETHICS W24, W24-W31 (2006).

(222) See In Defence of Darwin and Reason, FINANCIAL TIMES, Jan. 16, 2009, at 6 (arguing that scientists must respond quickly to counter popular misconceptions about scientific research because ignorance or false beliefs about science can contribute to dangerous movements, such as anti-vaccination campaigns).

(223) Although one might consider the test administrator impartial and acting in good faith, consider that in 2009 providers of neuroimaging-based lie detection services charged a fee of $4,000 to $5,000 per scan. Henry T. Greely, Law and the Revolution in Neuroscience: An Early Look at the Field, 42 AKRON L. REV. 687, 698 (2009).

(224) See id. at 689.

(225) See generally Sidney A. Shapiro, Divorcing Profit Motivation from New Drug Research: A Consideration of Proposals to Provide the FDA with Reliable Test Data, 1978 DUKE L.J. 154 (1978).

(226) Jack B. Weinstein, Probative Force of Hearsay, 46 IOWA L. REV. 331, 338-39 (1961) ("The concept that admission should depend upon probative force weighed against the possibility of prejudice, unnecessary use of court time, and availability of more satisfactory evidence is an application of the well recognized principle ... giving the court discretion to exclude admissible evidence.") (emphasis added).

(227) FED. R. EVID. 102.

(228) See, e.g., John E. Theuman, Admissibility in Federal Criminal Case of Results of Polygraph (Lie Detector) Test--Post-Daubert Cases, 140 A.L.R. FED. 525 (1997).

(229) Schauer, supra note 209, at 38.

(230) BLACK'S LAW DICTIONARY 635 (9th ed. 2009).

(231) See, e.g., Greely, supra note 223, at 698; Henry T. Greely & Judy Illes, Neuroscience-Based Lie Detection: The Urgent Need for Regulation, 33 AM. J.L. & MED. 377, 413 (2007); Moriarty, supra note 207, at 761; Jane Campbell Moriarty, Flickering Admissibility: Neuroimaging Evidence in the U.S. Courts, 26 BEHAV. SCI. LAW 29, 48-49 (2008); Pettit, supra note 209, at 340. Contra Leo Kittay, Admissibility of fMRI Lie Detection, 72 BROOK. L. REV. 1351, 1398 (2007) ("Therefore, the typically loose Daubert analysis will likely endanger technologies like the [sic] fMRI, because cultural prejudice against new and contentious disciplines can easily, even innocently, color the evidentiary decision. The result: helpful and reliable evidence is excluded....").

(232) Greely & Illes, supra note 231, at Part IV.B.

(233) Greely & Illes suggest that testing any lie detection method could cost $5 million. Id. at 418.

(234) Anand Giridharadas, India's Novel Use of Brain Scans in Courts Is Debated, N.Y. TIMES, Sept. 15, 2008, at A10. Incredibly, the judge had disregarded a yearlong critical review by the committee led by the chief of India's national neuroscience program, who had recommended against admissibility. M. Raghava, Directorate of Forensic Sciences Not to Accept Panel's Findings on Brain Mapping, HINDU, Sept. 8, 2008, available at 09/08/stories/2008090854420400.htm. In the same article, the developer of the technology boasted of its 5% error rate. Id.

(235) FOSTER & HUBER, supra note 147, at 159.

(236) Henry H. Bauer, How Science Really Works, in SCIENTIFIC LITERACY AND THE MYTH OF THE SCIENTIFIC METHOD (1994), reprinted in FOSTER & HUBER, supra note 147, at 161.

(237) FED. R. EVID. 803(18)(B) (stating that "the publication is established as a reliable authority") (emphasis added); see also MCCORMICK ON EVIDENCE § 321 (arguing that "learned treatises ha[ve] sufficient assurances of trustworthiness to justify equating them with the live testimony of an expert").


(239) George Sensabaugh & Cecilia von Beroldingen, The Polymerase Chain Reaction: Application to the Analysis of Biological Evidence, in FORENSIC DNA TECHNOLOGY 63, 63-64 (Mark A. Farley & James J. Harrington eds., 1991).

(240) Andrews v. State, 533 So.2d 841, 850 (Fla. Dist. Ct. App. 1988), cert. denied, 542 So.2d 1332 (Fla. 1989) ("In contrast to evidence derived from hypnosis, truth serum and polygraph, evidence derived from DNA print identification appears based on proven scientific principles."). At the time, Florida courts appeared to apply a more lenient admissibility standard than the Frye doctrine. State v. Anderson, 853 P.2d 135, 142 (N.M. Ct. App. 1993) ("[T]here is a subclass of cases that admit DNA evidence under a standard different than the Frye standard. See, e.g., ... Andrews v. State ... Known as the 'relevancy' standard, this other standard is thought to be more permissive than the Frye standard.").

(241) COMMITTEE ON DNA FORENSIC SCIENCE, NATIONAL RESEARCH COUNCIL, THE EVALUATION OF FORENSIC DNA EVIDENCE 172 n.15 (1996); e.g., People v. Castro, 545 N.Y.S.2d 985, 989 (N.Y. Sup. Ct. 1989). Although these cases were not decided under Daubert, many state courts have subsequently suggested that their evaluation standard would produce essentially the same result for DNA evidence under Daubert. COMMITTEE ON DNA FORENSIC SCIENCE, NATIONAL RESEARCH COUNCIL, supra, at 173.

(242) THE POLYGRAPH AND LIE DETECTION, supra note 58, at 227-28.

(243) See, e.g., Paul S. Appelbaum, The New Lie Detectors: Neuroscience, Deception, and the Courts, 58 PSYCHIATRIC SERVICES 460, 460-62 (2007).

(244) See Yoko Okado & Craig E. L. Stark, Neural Activity During Encoding Predicts False Memories Created by Misinformation, 12 LEARNING & MEMORY 3, 3-4, 6-8 (2005).

(245) U.S. v. Mezvinsky, 206 F. Supp. 2d 661, 674 (E.D. Pa. 2002).

(246) Id. (internal citations omitted).

J.R.H. Law, Litigation associate, Winston & Strawn LLP. The views expressed herein are solely those of the author and should not be attributed to the author's employer or its clients. A.B. (Psychology) Princeton University, 1999; M.A. (Psychology, concentration in cognitive neuroscience) The Johns Hopkins University, 2005; J.D. University of Illinois College of Law, 2010. Law clerk for Judge Michael P. McCuskey, United States District Court for the Central District of Illinois, 2010. I am grateful to Professors Craig Stark and Janice Pea for their invaluable comments on earlier drafts of this article, and to Professor Jennifer Robbennolt and Dr. Craig Bennett for their insight. I would also like to thank my parents and Jasmin Phua for all their support over the many years.

14 YALE J.L. & TECH. 1 (2011)
COPYRIGHT 2011 Yale Journal of Law & Technology

Article Details
Title Annotation:Federal Rules of Evidence
Author:Law, J.R.H.
Publication:Yale Journal of Law & Technology
Date:Sep 22, 2011