Differential effects of nonreinforcement and punishment in humans.
In an associative learning preparation, participants received partial reinforcement (PRF) with two different cues. For one of the cues, the nonreinforced presentations consisted of pairings of the cue with a neutral outcome, whereas for the other cue these presentations consisted of pairings with an aversive outcome. The results showed that PRF training produced strong responding to the cue paired with the neutral outcome on nonreinforced trials. However, responding to the cue paired with the aversive outcome on nonreinforced trials was strongly suppressed. The present results are problematic for current theories of learning (e.g., Rescorla & Wagner, 1972), but can be explained by classical theories that include motivational mechanisms (e.g., Konorski, 1967), as well as by a recently developed model in which expectations of incompatible outcomes compete for expression in behavior (i.e., Pineno & Matute, 2003).
Since Pavlov (1927) performed his original studies on classical conditioning, it has been well known that a conditioned response to a conditioned stimulus (CS), formed through the repeated pairing of the CS with an unconditioned stimulus (US), can be attenuated through either presentations of the CS without the US (i.e., experimental extinction) or presentations of the CS with a motivationally antagonistic US (i.e., counterconditioning). The fact that both experimental procedures result in a decrease in the strength and/or frequency of the response has encouraged many theorists of learning to explain extinction and counterconditioning through common mechanisms. Pavlov (1927; see also Konorski, 1948) explained extinction as due to the formation of an inhibitory CS-US association, different in nature from the excitatory CS-US association, and counterconditioning as due to the development of an excitatory association between the CS and the new US. By contrast, Konorski (1967) proposed that both extinction and counterconditioning are based on the formation of excitatory associations. Specifically, during extinction and counterconditioning, the representation (or gnostic unit, in his terminology) of the CS becomes associated with the representation of the noUS or of the new US, respectively. According to Konorski, the activation of the representation of a US from a given motivational system (i.e., U[S.sub.1]) is incompatible with the activation of the representation of the noU[S.sub.1] or the representation of a US from a different motivational system (i.e., U[S.sub.2]). In other words, these representations are mutually antagonistic and their activations are reciprocally inhibited (see also Rescorla & Solomon, 1967; Solomon & Corbit, 1974).
This view of Konorski (1967) implies that, after pairings of a CS with U[S.sub.1], CS-noU[S.sub.1] trials (extinction) and CS-U[S.sub.2] trials (counterconditioning) can be perceived by the animal as motivationally equivalent. For example, when U[S.sub.1] and U[S.sub.2] consist of food and footshock, respectively, the absence of U[S.sub.1] (just like the presence of U[S.sub.2]) can produce an aversive reaction (e.g., frustration, Amsel, 1958), and the absence of U[S.sub.2] (just like the presence of U[S.sub.1]) can produce an appetitive reaction (e.g., relaxation, Denny, 1971). This functional equivalence of the representations of the noU[S.sub.1] and the U[S.sub.2] regarding their potential to interfere with the activation of the U[S.sub.1] does not necessarily imply that the noU[S.sub.1] and U[S.sub.2] will produce a similar degree of interference. It can be assumed that the impact of CS-U[S.sub.2] trials will always be greater than that of CS-noU[S.sub.1] trials. There are two reasons for this. First, the physical presentation of U[S.sub.2] can be expected to be more salient than the mere absence of U[S.sub.1]. Second, the presentation of CS-U[S.sub.2] trials also implies the presentation of CS-noU[S.sub.1] trials, therefore allowing for learning of both CS-noU[S.sub.1] and CS-U[S.sub.2] associations (Bouton, 1993).
The explanation of extinction and counterconditioning offered by Konorski (1967), therefore, not only explains both phenomena according to a single mechanism (learning of an excitatory CS-antagonistic US association), but also explains why counterconditioning treatment is usually more effective than extinction treatment in suppressing the target response (e.g., Gambrill, 1967; Moore, 1986). Konorski's view had few precedents in the field of associative learning due to its ability to provide an integrated account of many different phenomena of interference between outcomes. For example, both extinction and counterconditioning phenomena are explained as effects that arise from learning of an association between the CS and a different US (i.e., noU[S.sub.1] in extinction, and U[S.sub.2] in counterconditioning). Proactive counterconditioning (i.e., impaired responding during CS-U[S.sub.2] trials due to previous CS-U[S.sub.1] pairings, see e.g., Krank, 1985; Scavio, 1974) can also be seen as analogous to latent inhibition (i.e., impaired responding during CS-U[S.sub.1] trials due to previous CS-noU[S.sub.1] presentations, see e.g., Lubow, 1973). Also, the conditioned suppression suffered by an appetitive instrumental response due to the presentation of an aversive CS (e.g., Annau & Kamin, 1961; Bolles & Fanselow, 1980; Bouton & Bolles, 1980; Church, 1969; Estes & Skinner, 1941) can be explained by this theory as functionally equivalent to the summation test of conditioned inhibition (i.e., decrease of responding to a CS due to the simultaneous presentation of an inhibitor of the US; Rescorla, 1969). More importantly, Konorski's theory encouraged a great amount of research that supported many of its elegant predictions (see, e.g., Goodman & Fowler, 1983; Dickinson, 1977; Dickinson & Dearing, 1979; see Dickinson & Pearce, 1977, for a review).
However, these features of Konorski's (1967) theory have been largely ignored by traditional models of classical conditioning (e.g., Mackintosh, 1975; Pearce & Hall, 1980; Rescorla & Wagner, 1972; Wagner, 1981). First, some of these models (e.g., Mackintosh, 1975; Rescorla & Wagner, 1972) do not acknowledge the possibility of concurrent CS-US and CS-noUS associations. According to these models, excitatory and inhibitory learning consist, respectively, in the increase or decrease of the net strength of an association between the representations of the CS and the US (but see Bouton, 1993; Pearce, 1987; Pearce & Hall, 1980; Wagner, 1981). Second, according to all these models, extinction and counterconditioning are exclusively due to the absence of the US that was previously paired with the CS during the original acquisition phase. This is clearly represented in the learning rule of the Rescorla-Wagner model:
(1) [DELTA][V.sup.n.sub.CS] = [alpha] * [beta] * ([lambda] - [V.sup.n-1.sub.T])
In this equation, [DELTA][V.sup.n.sub.CS] represents the change in associative strength of the CS on trial n. [alpha] and [beta] are learning-rate parameters representing the salience of the CS and the US, respectively. These parameters adopt values between 0 and 1, as a function of their corresponding salience (in the Rescorla-Wagner model, the perceived physical intensity). The parenthetical term (i.e., [lambda] - [V.sup.n-1.sub.T]) represents the discrepancy between the amount of associative strength that can be supported by the US ([lambda]) and the current total associative strength acquired, until trial n-1, by all the CSs present on trial n ([V.sup.n-1.sub.T]). The value of [lambda] will depend on the presence or absence of the US on trial n: when the US is presented, [lambda] adopts a value of 1; when the US is absent, [lambda] adopts a value of 0.
Therefore, the acquisition of a conditioned response to a CS (i.e., CS-US trials) occurs, according to the Rescorla-Wagner model, due to a progressive strengthening (up to 1) of the CS-US association, based on the discrepancy between the expected and actual occurrence of the US (i.e., 1 - [V.sup.n-1.sub.T]). Since this discrepancy will be smaller as the acquisition training proceeds, the increments of the associative strength gained by the CS will be also smaller, resulting in a progressively decelerated curve of acquisition. During extinction training (i.e., CS-noUS trials), the associative strength of the CS decreases (down to 0) due to the existing discrepancy between the expectation of the US and its actual absence (i.e., 0 - [V.sup.n-1.sub.T]). As occurred during acquisition, this discrepancy will be smaller as the extinction training proceeds, resulting in smaller negative increments of the associative strength and, hence, in a progressively decelerated curve of extinction.
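The decelerated acquisition and extinction curves described above can be illustrated with a minimal sketch of Equation 1 for a single CS (so that the total associative strength equals that of the CS). The function names, the learning-rate values ([alpha] = [beta] = 0.5), and the number of trials are illustrative assumptions, not values taken from the present study.

```python
def rw_update(v_total, alpha, beta, lam):
    """Return the change in associative strength given by Equation 1."""
    return alpha * beta * (lam - v_total)


def run_phase(v_start, n_trials, lam, alpha=0.5, beta=0.5):
    """Apply Equation 1 for n_trials with a single CS; return V after each trial."""
    v = v_start
    history = []
    for _ in range(n_trials):
        v += rw_update(v, alpha, beta, lam)
        history.append(v)
    return history


# Acquisition: the US is present on every trial, so lambda = 1.
acquisition = run_phase(v_start=0.0, n_trials=20, lam=1.0)
# Extinction: the US is absent on every trial, so lambda = 0.
extinction = run_phase(v_start=acquisition[-1], n_trials=20, lam=0.0)

print([round(v, 3) for v in acquisition[:5]])
print([round(v, 3) for v in extinction[:5]])
```

Because each change is proportional to the remaining discrepancy, successive increments during acquisition (and successive decrements during extinction) shrink from trial to trial.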
Importantly, in the Rescorla-Wagner (1972) model (see also Mackintosh, 1975; Pearce & Hall, 1980; Wagner, 1981) the value of [lambda] is exclusively determined by the presence or absence of its corresponding US. Therefore, according to the Rescorla-Wagner model, whether an expected US is merely absent (as occurs during extinction) or replaced by another US (as occurs during counterconditioning) is completely irrelevant. Thus, whereas Konorski (1967) viewed extinction as a kind of counterconditioning, the traditional models of learning (e.g., Rescorla & Wagner, 1972) regard counterconditioning as a kind of extinction. As a consequence, the traditional models of learning (contrary to Konorski) are unable to explain why counterconditioning treatment is more effective than extinction treatment in suppressing conditioned behavior (e.g., Gambrill, 1967; Moore, 1986).
The fact that the suppression of behavior to a CS due to its counterconditioning with a different US occurs at a faster rate than the extinction of behavior due to the mere nonreinforcement of the CS might be viewed as unchallenging because in both procedures, regardless of the different rates, responding does decrease. However, a completely different outcome can be expected if the different trial types of extinction and counterconditioning are interspersed during training. In the case of extinction, interspersing the CS-U[S.sub.1] and CS-noU[S.sub.1] trials would result in the typical partial reinforcement (PRF) procedure (e.g., Hartman & Grant, 1960), which is known to produce persistent responding in the face of subsequent extinction (Amsel, 1958). But, in the case of counterconditioning, interspersing the CS-U[S.sub.1] and CS-U[S.sub.2] trials would result in both a PRF procedure and a partial punishment procedure, which is known to yield strong and persistent suppression of the response (e.g., Storms & Boroczi, 1966). In this situation, according to the traditional models of learning (e.g., Rescorla & Wagner, 1972), responding to a CS, A, trained with both U[S.sub.1] and the absence of U[S.sub.1] should be similar to responding to a CS, B, trained with U[S.sub.1] and U[S.sub.2], whereas according to Konorski (1967; see also Rescorla & Solomon, 1967; Solomon & Corbit, 1974) responding to CS B should be more strongly suppressed than responding to CS A.
The present experiment was performed in order to test whether responding to a partially reinforced cue (i.e., analogous to the CS in the terminology of human associative learning) can be affected by the motivational value of the outcome (i.e., analogous to the US in the terminology of human associative learning) presented on the nonreinforced trials. Three motivationally different outcomes were used in this experiment: an appetitive outcome (i.e., [O.sub.Ap]), an aversive outcome (i.e., [O.sub.Av]), and a neutral outcome (i.e., [O.sub.Ne]). The motivational value of these outcomes was given exclusively through instructions: the participants could either gain or lose points by responding on those trials in which the cue was followed by [O.sub.Ap] or [O.sub.Av], respectively. The number of points gained or lost on each trial correlated positively with the number of responses performed during the presentation of the cue. The participants were also instructed about the possibility of neither gaining nor losing points on a given trial, therefore providing a third, neutral, outcome (i.e., [O.sub.Ne]). It is also important to mention that the points accrued by the participants during the task were not exchanged for any good, such as money, after the experiment (e.g., O'Donnell, Crosbie, Williams, & Saunders, 2000). Therefore, the instructions, together with the participants' interest in performing well on the task (i.e., accruing a high number of points), provided the motivational value of the different outcomes used in the experiment.
The critical question in this experiment was: is responding to a partially reinforced cue affected by the motivational value of the outcome presented on the nonreinforced trials? In order to answer this question, all participants were exposed to two different cues, A and B, trained in a PRF schedule: cues A and B were reinforced on 50% of the trials (i.e., A-[O.sub.Ap] or B-[O.sub.Ap] trials) and nonreinforced on the other 50% of the trials. However, for cue A the nonreinforced trials consisted of trials in which the cue was followed by the neutral outcome (i.e., A-[O.sub.Ne] trials), whereas for cue B the nonreinforced trials consisted of trials in which the cue was followed by the aversive outcome (i.e., B-[O.sub.Av] trials).
Participants and Apparatus. The participants were fourteen students (1 man and 13 women, with a mean age of 19.85 years [SEM = 0.36]) from Deusto University who volunteered for the study. The experiment was conducted using personal computers, and participants were run in individual cubicles.
Design and Procedure. The preparation used in this experiment was the same as that previously used by Pineno and colleagues for the study of associative learning with humans (e.g., Pineno, Ortega, & Matute, 2000; Pineno & Matute, 2000) (1). In this preparation, participants were asked to imagine that they were to rescue a group of refugees by helping them escape from a war zone in trucks. A translation of the instructions from Spanish reads as follows:
Screen 1

Imagine that you are a soldier for the United Nations. Your mission consists of rescuing a group of refugees that are hidden in a ramshackle building. The enemy has detected them and has sent forces to destroy the building ... But, fortunately, they rely on your cunning to escape the danger zone before that happens. You have several trucks for rescuing the refugees, and you have to help them get into those trucks. There are two ways of placing people in the trucks:

Pressing the space bar repeatedly, so that one person per press is placed in a truck.

Maintaining the space bar pressed down, so that you will be able to load people very rapidly.

If you rescue a number of persons in a given trip, they will arrive to their destination alive, and you will be rewarded with a point for each person. You must gain as many points as possible!

Screen 2

But ... your mission will not be as simple as it seems. The enemy knows of your movements and could have placed deadly mines on the road. If the truck hits a mine, it will explode, and the passengers will die. Each dead passenger will count as one negative point for you. Fortunately, the colored lights on the SPY-RADIO will tell you about the state of the road. These lights can indicate that:

The road will be free of mines. [right arrow] The occupants of the truck will be liberated. [right arrow] You will gain points.

The road will be mined. [right arrow] The occupants of the truck will die. [right arrow] You will lose points.

There are no mines, but the road is closed. [right arrow] The occupants of the truck will neither die nor be liberated. [right arrow] You will neither gain nor lose points: You will maintain your previous score.

Screen 3

At first, you will not know what each color light of the SPY-RADIO means. However, as you gain experience with them, you will learn to interpret what they mean. Thus, we recommend that you:

Place more people in the truck the more certain you are that the road will be free of mines (keep the space bar continuously pressed down ONLY if you are completely sure that there are no mines, because in this way you will put a lot of people in the truck ...).

Introduce fewer people in the truck the more certain you are that the road is mined.
After these instructions, participants were shown a fourth screen that gave instructions about contextual changes. Although contextual changes were not used in the present experiment, in order to avoid making more changes than necessary between different experimental series conducted with the same preparation, we maintained the fourth instruction screen programmed in the task. A translation of the fourth instruction screen read as follows:
Screen 4

Finally, it is important to know that your mission may take place in several different towns. The colors on the SPY-RADIO can mean the same or a very different thing depending on the town in which you are. Thus, it is important to pay attention to the message that indicates the place in which you are. If you travel to another town, the message indicating the name of the town will change. When a change of destination is occurring, you will read the message 'Traveling to another town', so you will be continuously informed about such changes. Nevertheless, sometimes you might end up returning to the same town even if you have seen the message that indicates that you are traveling. Do not worry if all this looks very complex at this point. Before we start, you will have the opportunity to see the location of everything (radio, town name, messages, scores, etc.) on the screen, and to ask the experimenter about anything that is unclear.
The cues were presented on the spy-radio, which consisted of six panels in which colored lights could be presented. Cues A and B were blue and yellow lights, counterbalanced. All cues were presented for 3 s. On each trial, the termination of the cue coincided with the presentation of an outcome. The appetitive outcome ([O.sub.Ap]) consisted of (a) the message '[n] refugees safe at home!!!' (with [n] being the number of refugees introduced in the truck during the cue presentation) and (b) gaining one point for each refugee who was liberated. The aversive outcome ([O.sub.Av]) consisted of (a) the message '[n] refugees have died!!!' and (b) losing one point for each refugee who died in the truck. The neutral outcome ([O.sub.Ne]) consisted of (a) the message 'Road closed' and (b) maintaining the previous score (2). Outcome messages were presented for 3 s. During the intertrial intervals, the lights were turned off (i.e., gray). The mean duration of the intertrial intervals was 5 s, ranging between 3 and 7 s.
The number of refugees loaded in the truck during the cue was reported in a box on the screen, this number being immediately updated after each response. Although pressing the space bar during the outcome message had no consequences, the number of refugees loaded in the truck during the previous cue presentation remained visible during the presentation of the outcome. Upon outcome termination, the score panel was reset to 0. Responses that occurred during the intertrial intervals had no consequence and were not reflected in the panel.
The number of refugees that participants risked placing in the truck on each trial was our dependent variable. During each cue presentation, each response (i.e., pressing the space bar once) placed one refugee in the truck, whereas holding the space bar down placed up to 30 refugees per second in the truck. Therefore, the number of refugees placed in the truck not only correlated with the number of responses (i.e., pressing the space bar), but also with the intensity of these responses (i.e., holding the space bar down). However, for reasons of simplicity, we will refer to our dependent variable as the number of responses. Alternatively, one could view our dependent variable as reflecting the participants' expectation of the appetitive outcome ([O.sub.Ap]). Presumably, the more certain the participants were that the cue would be followed by [O.sub.Ap], the greater the number of refugees they would place in the truck, whereas the more certain participants were that the truck would explode ([O.sub.Av]) or that the road would be closed ([O.sub.Ne]), the fewer refugees they would place in the truck.
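The relation between responses, outcomes, and points described above can be summarized in a short sketch. The constant and function names and the outcome labels ("Ap", "Av", "Ne") are hypothetical and are introduced only for illustration; the actual task program is not reproduced here.

```python
MAX_RATE_PER_SECOND = 30   # upper bound on refugees per second while the bar is held
CUE_DURATION_S = 3         # duration of each cue presentation, in seconds


def max_refugees_per_trial():
    """Upper bound on refugees loaded in one trial if the bar is held throughout."""
    return MAX_RATE_PER_SECOND * CUE_DURATION_S


def score_change(n_refugees, outcome):
    """Return the change in score for one trial.

    outcome is one of the hypothetical labels "Ap", "Av", or "Ne".
    """
    if outcome == "Ap":      # road free of mines: one point gained per refugee
        return n_refugees
    if outcome == "Av":      # road mined: one point lost per refugee
        return -n_refugees
    if outcome == "Ne":      # road closed: the previous score is maintained
        return 0
    raise ValueError(f"unknown outcome: {outcome!r}")


print(max_refugees_per_trial())
print(score_change(12, "Ap"), score_change(12, "Av"), score_change(12, "Ne"))
```

Under this payoff structure, loading many refugees is advantageous only to the extent that the participant expects the appetitive outcome, which is why the number of refugees loaded can be read as an index of that expectation.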
All participants in the experiment were given 40 trials, 20 trials with each of cues A and B. Half of the presentations of each cue were followed by [O.sub.Ap], and the other half were followed by a nonappetitive outcome. On these nonreinforced trials, cues A and B were paired with [O.sub.Ne] and [O.sub.Av], respectively. Thus, both cues A and B were exposed to a PRF procedure in which responding was reinforced on 50% of the trials, and responding was either nonreinforced (cue A) or punished (cue B) on the other 50% of the trials. The different trial types were presented following a pseudorandom sequence, which was given to the participants twice during the experiment. This sequence was A [right arrow] [O.sub.Ap], A [right arrow] [O.sub.Ap], B [right arrow] [O.sub.Ap], A [right arrow] [O.sub.Ne], A [right arrow] [O.sub.Ne], B [right arrow] [O.sub.Av], B [right arrow] [O.sub.Av], A [right arrow] [O.sub.Ne], B [right arrow] [O.sub.Ap], B [right arrow] [O.sub.Av], A [right arrow] [O.sub.Ap], B [right arrow] [O.sub.Av], B [right arrow] [O.sub.Ap], A [right arrow] [O.sub.Ne], B [right arrow] [O.sub.Ap], B [right arrow] [O.sub.Av], A [right arrow] [O.sub.Ne], A [right arrow] [O.sub.Ap], B [right arrow] [O.sub.Ap], A [right arrow] [O.sub.Ap].
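As a check on the design just described, the following sketch encodes the pseudorandom sequence as a list of (cue, outcome) pairs and tallies the trial types over the two repetitions. The variable names and the string labels are assumptions made for illustration.

```python
from collections import Counter

# One pass through the pseudorandom sequence reported above (20 trials).
BLOCK = [
    ("A", "Ap"), ("A", "Ap"), ("B", "Ap"), ("A", "Ne"), ("A", "Ne"),
    ("B", "Av"), ("B", "Av"), ("A", "Ne"), ("B", "Ap"), ("B", "Av"),
    ("A", "Ap"), ("B", "Av"), ("B", "Ap"), ("A", "Ne"), ("B", "Ap"),
    ("B", "Av"), ("A", "Ne"), ("A", "Ap"), ("B", "Ap"), ("A", "Ap"),
]

# The sequence was given twice, for 40 trials in total.
SCHEDULE = BLOCK * 2

trial_counts = Counter(SCHEDULE)
cue_counts = Counter(cue for cue, _ in SCHEDULE)

for (cue, outcome), n in sorted(trial_counts.items()):
    print(cue, outcome, n)
```

The tally gives 20 trials per cue, with each cue followed by the appetitive outcome on 10 trials and by its nonappetitive outcome on the remaining 10.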
Figure 1 depicts the results of the experiment. As can be appreciated from the figure, responding to both cues A and B (or, from an alternative view, the ratings of these cues as predictors of [O.sub.Ap]) initially increased from Trial 1 to Trial 2. However, after Trial 2 responding to A was stronger than responding to B on most of the trials. This impression was confirmed by a 2 (Cue: A vs. B) x 20 (Trials) ANOVA on the mean number of responses, which showed main effects of both cue, F(1, 13) = 11.81, p < .01, and trials, F(19, 247) = 2.32, p < .01, as well as a Cue x Trials interaction, F(19, 247) = 2.41, p < .01. Also, pairwise comparisons showed that responding to cue A was significantly stronger than responding to B on Trials 3, 4, 7-9, 11-14, and 18-20, all Fs(1, 13) > 5.69, ps < .05. The response elicited by A was also marginally stronger than that elicited by B on Trials 15 and 17 (ps = .06 and .07, respectively). On the rest of the trials, responding to cues A and B did not differ (all ps > .10).
[FIGURE 1 OMITTED]
These results were not influenced by any differential responding during the pre-cue period, as shown by a 2 (Cue: A vs. B) x 20 (Trials) ANOVA on the number of responses during the 3-s period prior to the presentation of the cue, which yielded neither significant main effects nor a significant interaction (all ps > .13).
The results of the present experiment showed that responding to cue B (i.e., a cue paired with both an appetitive and an aversive outcome on different trials) was more strongly impaired than responding to cue A (i.e., a cue paired with both an appetitive and a neutral outcome on different trials). As previously discussed, these results cannot be explained by traditional models of learning (e.g., Mackintosh, 1975; Pearce & Hall, 1980; Rescorla & Wagner, 1972; Wagner, 1981). According to these models, responding to a cue in a PRF schedule is determined only by the ratio of reinforced to nonreinforced trials, independently of the motivational quality of the outcome presented during the nonreinforced trials. This is shown in the simulation (3) of the present experiment following the Rescorla and Wagner model (see top panel of Figure 2). As can be seen in this simulation, this model predicts that both cues A and B will progressively increase their associative strength as training proceeds, reaching an asymptotic level of [V.sub.A] = [V.sub.B] = 0.4. Thus, the associative strength of cues A and B will closely approximate the actual contingency of 0.5 for each cue with [O.sub.Ap] (i.e., [DELTA]p, Allan, 1980), although it will be slightly smaller due to the overshadowing (Pavlov, 1927) produced by the contextual cues (which were included in the simulation).
[FIGURE 2 OMITTED]
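The insensitivity of the Rescorla and Wagner (1972) learning rule to the identity of the outcome presented on nonreinforced trials can be illustrated with a minimal sketch. It is not the simulation program cited in footnote 3 and is not expected to reproduce the values plotted in Figure 2. The sketch assumes that the context is a single cue present on every trial, that no intertrial-interval trials are simulated, and that the saliences of cues A and B are equal (both set to 1); all names in the code are hypothetical.

```python
BLOCK = [
    ("A", "Ap"), ("A", "Ap"), ("B", "Ap"), ("A", "Ne"), ("A", "Ne"),
    ("B", "Av"), ("B", "Av"), ("A", "Ne"), ("B", "Ap"), ("B", "Av"),
    ("A", "Ap"), ("B", "Av"), ("B", "Ap"), ("A", "Ne"), ("B", "Ap"),
    ("B", "Av"), ("A", "Ne"), ("A", "Ap"), ("B", "Ap"), ("A", "Ap"),
]
SCHEDULE = BLOCK * 2  # the sequence was presented twice (40 trials)

# Parameter values follow footnote 3; equal salience for A and B is assumed.
ALPHA = {"A": 1.0, "B": 1.0, "CTX": 0.1}
BETA_PRESENT = 1.0    # salience of the appetitive outcome when presented
BETA_ABSENT = 0.35    # salience of the absence of the appetitive outcome
LAMBDA_PRESENT = 1.0  # asymptote supported by the appetitive outcome


def simulate(schedule):
    """Return the associative strengths after each trial of the schedule."""
    v = {"A": 0.0, "B": 0.0, "CTX": 0.0}
    history = []
    for cue, outcome in schedule:
        present = [cue, "CTX"]  # the context accompanies every cue
        reinforced = outcome == "Ap"
        # Lambda and beta depend only on whether the appetitive outcome occurs.
        lam = LAMBDA_PRESENT if reinforced else 0.0
        beta = BETA_PRESENT if reinforced else BETA_ABSENT
        error = lam - sum(v[s] for s in present)
        for s in present:
            v[s] += ALPHA[s] * beta * error
        history.append(dict(v))
    return history


# Relabel every aversive outcome as neutral: the rule cannot tell the difference.
relabeled = [(cue, "Ne" if outcome == "Av" else outcome) for cue, outcome in SCHEDULE]
original_run = simulate(SCHEDULE)
relabeled_run = simulate(relabeled)
print(original_run == relabeled_run)
print({k: round(val, 3) for k, val in original_run[-1].items()})
```

Because the update consults only the presence or absence of the appetitive outcome, replacing the aversive outcome with the neutral one leaves every trial-by-trial value unchanged, which is the sense in which the model treats nonreinforcement and punishment as equivalent.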
Some post hoc manipulations could be performed on the parameters of the Rescorla and Wagner (1972) model in order to enable this model to explain the present results. For example, on those trials in which [O.sub.Ap] is not present, its salience (i.e., [[beta].sub.noOAp]) could be assumed to be higher due to the presentation of [O.sub.Av], in comparison to its salience when [O.sub.Ne] is presented. Or, alternatively, when [O.sub.Av] is presented, the total amount of associative strength supported by [O.sub.Ap] (i.e., [lambda]) could adopt a negative value (e.g., -1) instead of a null value. Finally, the strength of the appetitive response could be viewed as reflecting the difference between the associative strengths that the cue acquired with [O.sub.Ap] and [O.sub.Av] (i.e., R = [V.sub.Ap] - [V.sub.Av], see Krank, 1985). However, none of these manipulations is free from theoretical problems. First, in the Rescorla and Wagner model an absent cue has a null salience (i.e., [[alpha].sub.CS] = 0), whereas an absent outcome has a positive salience (i.e., [[beta].sub.noOAp] > 0). This assumption, which is necessary in order to allow this model to explain learning in the absence of the outcome (e.g., extinction), implies an asymmetrical processing of the cues and outcomes (but see Wagner, 1981). Although the latter assumption can be acceptable, it is harder to see how this model could assume that the value of [[beta].sub.noOAp] can be greater when an aversive outcome ([O.sub.Av]) is presented than when a neutral outcome ([O.sub.Ne]) is presented. Second, assuming that [lambda] of [O.sub.Ap] adopts a value of 0 and -1 during the presentations of [O.sub.Ne] or [O.sub.Av], respectively, implies that, whereas the extinction procedure would merely result in a loss of the previously acquired positive associative strength (i.e., down to zero), counterconditioning training would result in the learning of an inhibitory association (i.e., an associative strength of -1).
This problem also applies to the view of the appetitive response as reflecting the difference between [V.sub.Ap] and [V.sub.Av]. In this case, counterconditioning would also be expected to yield a net negative or inhibitory appetitive response (i.e., R = [V.sub.Ap] - [V.sub.Av] = 0 - 1 = -1).
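The consequence of the last two manipulations can be made concrete with a small sketch for a single cue trained to asymptote and then given either extinction or counterconditioning. The learning-rate values, the number of trials, and all names are illustrative assumptions rather than parameters of any published simulation.

```python
def train(v, n_trials, lam, alpha=0.5, beta=0.5):
    """Iterate the Rescorla-Wagner rule for a single cue and return the final V."""
    for _ in range(n_trials):
        v += alpha * beta * (lam - v)
    return v


N = 50  # enough trials to approach asymptote with these hypothetical rates

# Acquisition of the cue-appetitive outcome association (lambda = 1).
v_ap_acquired = train(0.0, N, lam=1.0)

# Manipulation 1: lambda is 0 for a neutral outcome but -1 for an aversive one.
v_after_extinction = train(v_ap_acquired, N, lam=0.0)
v_after_counterconditioning = train(v_ap_acquired, N, lam=-1.0)

# Manipulation 2: separate associations, with the response R = V_Ap - V_Av.
v_ap = train(v_ap_acquired, N, lam=0.0)  # appetitive outcome now absent
v_av = train(0.0, N, lam=1.0)            # aversive outcome now present
response = v_ap - v_av

print(round(v_after_extinction, 3))
print(round(v_after_counterconditioning, 3))
print(round(response, 3))
```

In both cases the counterconditioned cue ends with a value near -1 rather than near 0, that is, it would be treated as an inhibitor of the appetitive outcome and not merely as a cue that has lost its excitatory strength.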
These results support the predictions of Konorski's (1967) theory (see also Rescorla & Solomon, 1967; Solomon & Corbit, 1974), which states that the expression of an association between a cue and an appetitive outcome (i.e., cue-[O.sub.Ap]) is more strongly impaired by training the cue with a motivationally incompatible outcome (i.e., [O.sub.Av]) than with a neutral outcome (i.e., [O.sub.Ne]). However, since no real appetitive and aversive outcomes were presented in the present experiment (i.e., the outcomes were endowed with motivational value through instructions), in order for Konorski's theory to explain the present results, it would have to assume that the instructions enabled the presentation and expectation of the outcomes to activate antagonistic central emotional systems. This assumption is questionable, since one of the most prominent features of this kind of preparation for the study of human learning is precisely its use of stimuli of low (or even null) biological significance (see Miller & Matute, 1996).
The results of this experiment can be straightforwardly explained by Pineno and Matute's (2003) integrative model of associative learning (IMAL). According to this model, a cue can become associated with the representation of [O.sub.Ap], as well as with the representations of [O.sub.Av] and [O.sub.Ne]. According to IMAL, the presentation of A-[O.sub.Ne] and B-[O.sub.Av] trials will result in the formation of, not only A-[O.sub.Ap] and B-[O.sub.Ap] inhibitory associations (Konorski, 1948), but also A-[O.sub.Ne] and B-[O.sub.Av] excitatory associations (Konorski, 1967). Thus, the presentation of each cue will elicit the simultaneous expectation of incompatible outcomes (i.e., [O.sub.Ap] and [O.sub.Ne], in the case of cue A, and [O.sub.Ap] and [O.sub.Av] in the case of cue B). As a consequence, the expression of the target outcome ([O.sub.Ap]) will be impaired, not only by the learning of an inhibitory association between each cue and [O.sub.Ap], but also by the expectation of the alternative and incompatible outcome. Moreover, in the framework of IMAL, learning of the B-[O.sub.Av] association will proceed faster than learning of the A-[O.sub.Ne] association due to the high salience of [O.sub.Av] compared to that of [O.sub.Ne] (i.e., because [O.sub.Av] is motivationally more relevant than [O.sub.Ne]). Thus, from the initial trials of training, the expression of the [O.sub.Ap] expectation will be more strongly impaired in the presence of B than in the presence of A, due to the interference caused by the expectation of [O.sub.Av] (produced by the B-[O.sub.Av] association) being stronger than that caused by the expectation of [O.sub.Ne] (produced by the A-[O.sub.Ne] association). This is depicted in the bottom panel of Figure 2, which shows the simulation (4) of the present experiment by IMAL.
There are two important points that were not addressed by the experiment and that deserve consideration. First, based on the previous explanation of the present results by Pineno and Matute's (2003) IMAL, it is evident that this model predicts that, given a sufficient number of A-[O.sub.Ne] trials, this association should completely interfere with the expression of the A-[O.sub.Ap] association (as the B-[O.sub.Av] association did with fewer trials). In other words, this model predicts that, as PRF training proceeds, interference caused by the expectation of [O.sub.Ne] will become more pervasive. Although this prediction apparently contradicts the observation of response persistence in PRF procedures (Amsel, 1958), it receives some indirect support from experiments showing that extinction occurs more rapidly following extensive PRF than following PRF with a moderate number of training trials (see McCain, Lee, & Powell, 1962). However, the small number of trials in the present experiment did not allow us to directly test this prediction. Future experimental work should try to assess whether, as IMAL predicts, extensive PRF training with a neutral outcome produces, in the long run, the same effect produced by PRF training with an alternative aversive outcome in few trials.
Finally, it is important to acknowledge the potential influence of the use of a within-subject design in the present experiment. Because all participants received training with both cues A and B, responding to each cue could be (at least partially) determined, not only by the outcomes directly paired with the cue itself, but also by the outcomes paired with the alternative cue. That is, responding to cue A could depend not only on the interaction between the expectations of [O.sub.Ap] and [O.sub.Ne], but also on the expectation of the absence of [O.sub.Av] (i.e., due to [O.sub.Av] always being presented in the absence of cue A). Symmetrically, responding to cue B could depend not only on the interaction between the expectations of [O.sub.Ap] and [O.sub.Av], but also on the expectation of the absence of [O.sub.Ne] (i.e., due to [O.sub.Ne] always being presented in the absence of cue B). If cues A and B were learned as inhibitors of [O.sub.Av] and [O.sub.Ne], respectively, then cue A (but not cue B) would become a signal for safety and, as a consequence, responding to cue A would be more strongly enhanced than responding to cue B. Moreover, just as the presentations of B [right arrow] [O.sub.Av] trials could have increased responding to cue A, the presentations of A [right arrow] [O.sub.Ap] trials could have enhanced the suppression of responding to cue B. This latter possibility is suggested by studies showing that the availability of an alternative source of reinforcement (i.e., responding to cue A, in the present experiment) increases the effectiveness of punishment treatments (e.g., Herman & Azrin, 1964). In sum, the use of a within-subject design in the present experiment could have determined to some extent the observed differential responding to the cues.
However, according to Pineno and Matute's (2003) IMAL, although weak inhibitory A-[O.sub.Av] and B-[O.sub.Ne] associations could have been formed in the present experiment, responding to cues A and B mainly depended on the interaction between the expectations of the outcomes directly paired with each cue. Therefore, this model predicts that the present results should be replicable using a between-group design in which partial reinforcement and partial punishment treatments are given in different conditions. This prediction of IMAL also deserves, in our opinion, future empirical work.
(1) A demonstration version of this preparation can be downloaded from http://sirio.deusto.es/matute/software.html (see also http://bingweb.binghamton.edu/~learning/task.htm for a new adaptation of this preparation).
(2) The neutral outcome ([O.sub.Ne]) was presented (i.e., instead of presenting no outcome at all) in order to give the participants feedback about the consequences of their behavior, not only on reinforced ([O.sub.Ap]) and punished ([O.sub.Av]) trials, but also on nonreinforced ([O.sub.Ne]) trials. This feedback given on A [right arrow] [O.sub.Ne] trials, as the feedback given on B [right arrow] [O.sub.Av] trials, aimed to make explicit the absence of [O.sub.Ap]. Thus, if anything, the presentation of a neutral outcome increased the effectiveness of nonreinforced trials, while minimizing unnecessary differences between the A [right arrow] [O.sub.Ne] and B [right arrow] [O.sub.Av] trials.
(3) The parameters used in this simulation were: [[alpha].sub.A] = c[[alpha].sub.A] = 1, [[alpha].sub.CTX] = 0.1, [[beta].sub.[O.sub.Ap]] = 1, [[beta].sub.noOAp] = 0.35, [lambda] = 1. This simulation was performed using the program developed by Jason M. Tangen. This software can be downloaded from http://univmail.mcmaster.ca/~tangenjm/main.html
(4) The parameters used in this simulation were the predefined parameters for simulations of experiments with human participants in the simulation program of Pineno and Matute's (2003) model. This program can be downloaded from http://bingweb.binghamton.edu/~learning/model.htm
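The parameter names listed in footnote 3 (cue saliences [alpha], outcome learning rates [beta], and asymptote [lambda]) match those of the Rescorla and Wagner (1972) model. The following is only a minimal sketch of that update rule under the footnote 3 values, not the program used for the reported simulation. The function name, the strictly alternating schedule of reinforced and nonreinforced trials, and the number of trials are all hypothetical choices made for illustration.

```python
def rescorla_wagner(trials, alphas, beta_us, beta_no_us, lam=1.0):
    """Track the associative strength of each cue across trials.

    trials: list of (cues_present, reinforced) pairs.
    alphas: dict mapping each cue to its salience.
    Returns a list with one dict of strengths per trial.
    """
    strengths = {cue: 0.0 for cue in alphas}
    history = []
    for cues, reinforced in trials:
        # Summed prediction of all cues present on this trial.
        total = sum(strengths[cue] for cue in cues)
        # Learning rate and asymptote depend on whether the outcome occurs.
        beta = beta_us if reinforced else beta_no_us
        target = lam if reinforced else 0.0
        error = target - total
        for cue in cues:
            strengths[cue] += alphas[cue] * beta * error
        history.append(dict(strengths))
    return history


# Values taken from footnote 3: alpha_A = 1, alpha_CTX = 0.1,
# beta for O_Ap = 1, beta for no O_Ap = 0.35, lambda = 1.
alphas = {"A": 1.0, "CTX": 0.1}

# Hypothetical schedule (an assumption, not the experiment's actual trial
# order): 20 trials of cue A in the context, alternating reinforced and
# nonreinforced presentations.
trials = [(("A", "CTX"), i % 2 == 0) for i in range(20)]

history = rescorla_wagner(trials, alphas, beta_us=1.0, beta_no_us=0.35)
for number, strengths in enumerate(history, start=1):
    print(number, round(strengths["A"], 4), round(strengths["CTX"], 4))
```

In this sketch, nonreinforced trials lower the strength of cue A only through the shared prediction error, so the nature of the event presented on those trials plays no role in the computation.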
Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgement tasks. Bulletin of the Psychonomic Society, 15, 147-149.
Amsel, A. (1958). The role of frustrative nonreward in noncontinuous reward situations. Psychological Bulletin, 55, 102-119.
Annau, Z., & Kamin, L. J. (1961). The conditioned emotional response as a function of intensity of the US. Journal of Comparative and Physiological Psychology, 54, 428-432.
Bolles, R. C., & Fanselow, M. S. (1980). A perceptual-defensive-recuperative model of fear and pain. Behavioral and Brain Sciences, 3, 291-323.
Bouton, M. E. (1993). Context, time, and memory retrieval in the interference paradigms of Pavlovian learning. Psychological Bulletin, 114, 80-99.
Bouton, M. E., & Bolles, R. C. (1980). Conditioned fear assessed by freezing and by the suppression of three different baselines. Animal Learning & Behavior, 8, 429-434.
Bromage, B. K., & Scavio, M. J. (1978). Effects of an aversive CS+ and CS- under deprivation upon successive classical appetitive and aversive conditioning. Animal Learning & Behavior, 6, 57-65.
Church, R. M. (1969). Response suppression. In B. A. Campbell & R. M. Church (Eds.), Punishment and aversive behavior (pp. 111-156). New York: Appleton-Century-Crofts.
Denny, M. R. (1971). Relaxation theory and experiments. In F. R. Brush (Ed.), Aversive conditioning and learning. New York: Academic Press.
Dickinson, A. (1977). Appetitive-aversive interactions: Superconditioning of fear by an appetitive CS. Quarterly Journal of Experimental Psychology, 29, 71-83.
Dickinson, A., & Burke, J. (1996). Within-compound associations mediate the retrospective revaluation of causality judgements. Quarterly Journal of Experimental Psychology, 49B, 60-80.
Dickinson, A., & Dearing, M. F. (1979). Appetitive-aversive interactions and inhibitory processes. In A. Dickinson & R. A. Boakes (Eds.), Mechanisms of learning and motivation: A memorial volume to Jerzy Konorski (pp. 203-231). Hillsdale, NJ: Erlbaum.
Dickinson, A., & Pearce, J. M. (1977). Inhibitory interactions between appetitive and aversive stimuli. Psychological Bulletin, 84, 690-711.
Estes, W. K., & Skinner, B. F. (1941). Some quantitative properties of anxiety. Journal of Experimental Psychology, 29, 390-400.
Gambrill, E. (1967). Effectiveness of the counterconditioning procedure in eliminating avoidance behavior. Behavior Research and Therapy, 5, 263-273.
Goodman, J. H., & Fowler, H. (1983). Blocking and enhancement of fear conditioning by appetitive CSs. Animal Learning & Behavior, 11, 75-82.
Hartman, T. F., & Grant, D. A. (1960). Effect of intermittent reinforcement on acquisition, extinction, and spontaneous recovery of the conditioned eyelid response. Journal of Experimental Psychology, 60, 89-96.
Herman, R. L., & Azrin, N. H. (1964). Punishment by noise in an alternative response situation. Journal of the Experimental Analysis of Behavior, 7, 185-188.
Konorski, J. (1948). Conditioned reflexes and neuron organization. Cambridge: Cambridge University Press.
Konorski, J. (1967). Integrative activity of the brain. Chicago: University of Chicago Press.
Krank, M. D. (1985). Asymmetrical effects of Pavlovian excitatory and inhibitory aversive transfer on Pavlovian appetitive responding and acquisition. Learning and Motivation, 16, 35-62.
Lubow, R. E. (1973). Latent Inhibition. Psychological Bulletin, 79, 398-407.
Mackintosh, N. J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement. Psychological Review, 82, 276-298.
McCain, G., Lee, P., & Powell, N. (1962). Extinction as a function of partial reinforcement and overtraining. Journal of Comparative and Physiological Psychology, 55, 1004-1006.
Miller, R. R., & Matute, H. (1996). Animal analogues of causal judgment. In D. R. Shanks, K. J. Holyoak, & D. L. Medin (Eds.), The Psychology of Learning and Motivation, Vol. 34 (pp. 133-166). San Diego, CA: Academic Press.
Moore, J. (1986). On the consequences of conditioning. Psychological Record, 36, 39-61.
O'Donnell, J., Crosbie, J., Williams, D. C., & Saunders, K. J. (2000). Stimulus control and generalization of point-loss punishment with humans. Journal of the Experimental Analysis of Behavior, 73, 261-274.
Pavlov, I. P. (1927). Conditioned reflexes. London: Clarendon Press.
Pearce, J. M., & Hall, G. (1980). A model for Pavlovian learning: Variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychological Review, 87, 532-552.
Pearce, J. M. (1987). A model for stimulus generalization in Pavlovian conditioning. Psychological Review, 94, 61-73.
Pineno, O., & Matute, H. (2000). Interference in human predictive learning when associations share a common element. International Journal of Comparative Psychology, 13, 16-33.
Pineno, O., & Matute, H. (2003). The problem of catastrophic forgetting: Proposal of an integrative model of associative learning. Manuscript submitted for publication.
Pineno, O., Ortega, N., & Matute, H. (2000). The relative activation of the associations modulates interference between elementally-trained cues. Learning and Motivation, 31, 128-152.
Rescorla, R. A. (1969). Pavlovian conditioned inhibition. Psychological Bulletin, 72, 77-94.
Rescorla, R. A., & Solomon, R. L. (1967). Two-process learning theory: Relationship between Pavlovian conditioning and instrumental learning. Psychological Review, 74, 151-182.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Scavio, M. J. (1974). Classical-classical transfer: Effects of prior aversive conditioning upon appetitive conditioning in rabbits (oryctolagus cuniculus). Journal of Comparative and Physiological Psychology, 86, 107-115.
Solomon, R. L., & Corbit, J. D. (1974). An opponent-process theory of motivation: I. Temporal dynamics of affect. Psychological Review, 81, 119-145.
Storms, L. H., & Boroczi, G. (1966). Effectiveness of fixed ratio punishment and durability of its effects. Psychonomic Science, 5, 447-448.
Wagner, A. R. (1981). SOP: A model of automatic memory processing in animal behavior. In N. E. Spear & R. R. Miller (Eds.), Information processing in animals: Memory mechanisms (pp. 5-47). Hillsdale, NJ: Erlbaum.
(Manuscript received: 26 May 2003; accepted: 22 Sept 2003)
Oskar Pineno. This research was made possible by a postdoctoral fellowship from the Spanish Ministry of Education (Ref. EX2002-0739). The author was also supported by a F.P.U. fellowship from the Spanish Ministry of Education (Ref. AP98, 44970323) during performance of the experiment, and by grant PI-2000-12 of the Department of Education, Universities and Research of the Basque Government, awarded to Helena Matute. I would like to thank Leyre Castro, Ralph Miller, and Miguel Angel Vadillo for their insightful comments on an earlier version of this manuscript. Correspondence concerning this article should be addressed to Oskar Pineno, Department of Psychology, SUNY-Binghamton, Binghamton, NY 13902-6000, USA; email@example.com
|Date:||Jan 1, 2004|