
Variable-interval schedule control following response acquisition with delayed reinforcement.

The impetus for this experiment was a set of informal observations of pigeons responding on variable-interval (VI) schedules of immediate reinforcement after the key-peck response had been established and maintained with unsignaled delayed reinforcement, in the absence of response shaping or other training involving immediate reinforcement (cf. Lattal & Gleeson, 1990). Some subsequent users of these pigeons in our laboratory suggested that they responded less consistently, and at lower rates, on VI schedules than did pigeons whose key pecking was trained with immediate reinforcement through the differential reinforcement of successive approximations. These informal observations invited a more systematic analysis because previous animal studies of behavioral history effects suggest that fixed-interval (FI), but not variable-interval (VI), schedule performance can be strongly influenced by the previous schedule. If the observations held up under systematic analysis, they would constitute an important exception to these findings.

Weiner (e.g., 1964, 1969) and Urbain, Poling, Millam, and Thompson (1978) found that adult humans and rats, respectively, responded differently under FI schedules as a function of prior training on schedules that generated either high or low response rates: fixed-ratio (FR) and differential-reinforcement-of-low-rate (DRL) schedules, respectively. In general, response rates were higher on equivalent FI schedules when the subjects had a history of reinforcement under FR rather than DRL schedules. Freeman and Lattal (1992, Experiment 1) trained individual pigeons for 60 sessions on both an FR and a DRL schedule, each in the presence of different stimuli in alternating sessions. When both schedules were changed to FI, each pigeon responded for a number of sessions at higher rates in the presence of the stimuli previously correlated with the FR schedule and at lower rates in the presence of the stimuli previously correlated with the DRL schedule. Freeman and Lattal (1992, Experiment 2) replicated this procedure but imposed a VI schedule instead of an FI schedule after DRL and FR training. Responding on the VI schedule in the presence of stimuli previously correlated with either the FR or the DRL schedule became similar after only a few sessions. This latter finding was replicated with a somewhat different procedure by Freeman and Lattal (1992, Experiment 3), and it also has been reported by Nader and Thompson (1987) and by Poling, Kraft, and Chapman (1980), using groups of animals trained on either FR or DRL schedules.

In the present experiment, pigeons' key pecking first was established, without explicit training, under unsignaled resetting delays of reinforcement. The delay duration was 30 s, a longer resetting delay than heretofore has been used with pigeons under the Lattal and Gleeson (1990) procedure. Subsequently, VI schedules were effected to assess the validity of the informal observations described above.

Method

Subjects

Each of 10 experimentally naive White Carneau pigeons was maintained at approximately 70% of its free-feeding weight. Postsession feedings occurred at least 1 hr after a session. Water and health grit were freely available in the home cage, which was kept in a temperature-controlled room with a 12:12 hr light:dark cycle.

Apparatus

A Ralph Gerbrands Co. Model G7105 operant-conditioning chamber was housed in a Gerbrands Model G7210 sound- and light-attenuating enclosure. A 1.5-cm diameter response key, centered on the work panel 24 cm from the chamber floor and operated by a force of about 0.15 N, was transilluminated red by a 28-V DC bulb covered with a colored cap. A clear plastic response-key extension (cf. Lattal & Gleeson, 1990, Experiment 1) protruded 8 mm from the surface of the work panel. The response key was transilluminated red at all times except during food delivery for each pigeon, regardless of the condition in effect. Two 28-V DC bulbs (No. 1819) covered by white caps and located toward the rear of the ceiling provided general illumination throughout the session except during the reinforcement cycle. Reinforcement was 4-s access to mixed grain via a food hopper located behind a 6-cm diameter circular aperture centered on the work panel, with the lower edge 7 cm from the chamber floor. The food aperture was illuminated by two 28-V DC bulbs (No. 1819) during reinforcement. Noise from a ventilation fan located on the back of the enclosure behind the work panel masked extraneous sounds. Control and recording operations were accomplished with a microcomputer (Tandy 1000 Tx) using MedPC experiment-control software and connected to the chambers by a MedPC interfacing system.

Procedure

Each pigeon was magazine trained as follows. When a pigeon first was placed in the chamber, the food aperture was filled with grain. The only illumination was from the food-aperture lights, which were operated continuously until the pigeon had eaten for approximately 20 s. Then the food-aperture lights were extinguished for about 5 s. Thereafter the hopper was raised and illuminated several times. During this time the duration of the reinforcement cycle was reduced gradually and the average time between food presentations was increased. Next, with the houselights illuminated except during the reinforcement cycle, a variable-time (VT) 30-s schedule was effected in which 4-s hopper presentations occurred independently of the pigeon's behavior. The VT schedule was constructed from the distribution suggested by Fleshler and Hoffman (1962) and consisted of 20 intervals. Hopper training continued until the pigeons ate on each of 25 successive food presentations within 2 s of hopper operation. Subsequent sessions throughout the remainder of the experiment lasted for 90 min and occurred 7 days a week.
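
The Fleshler and Hoffman (1962) progression has a commonly cited closed form that yields intervals approximating a constant probability of reinforcement in time. The following minimal Python sketch (our illustration only; the experiment itself was controlled by MedPC software) generates the 20-interval, 30-s-mean distribution used here:

```python
import math
import random

def fleshler_hoffman(n, mean):
    """Generate n interval durations (s) with the given mean, using the
    Fleshler & Hoffman (1962) progression, which approximates a constant
    probability of reinforcement in time."""
    intervals = []
    for k in range(1, n + 1):
        a = n - k
        b = n - k + 1
        # Convention: x * ln(x) = 0 when x = 0 (the k = n term).
        term_a = a * math.log(a) if a > 0 else 0.0
        intervals.append(mean * (1 + math.log(n) + term_a - b * math.log(b)))
    return intervals

# The 20-interval VT 30-s distribution; intervals are used in random order.
vt_intervals = fleshler_hoffman(20, 30.0)
random.shuffle(vt_intervals)
```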

Following the last response-independent food presentation, five pigeons (4824, 2393, 4886, 3825, and 4575) were exposed to a tandem VI 30-s DRO 30-s schedule. This schedule defines a resetting unsignaled delay of reinforcement procedure, hereafter described as the delay procedure or the delay condition. On average every 30 s, based on a distribution of intervals selected using the procedure of Fleshler and Hoffman (1962), a key peck initiated a 30-s unsignaled period that terminated with reinforcement. Pecks during this latter component reset the clock controlling the delay interval, ensuring that exactly 30 s elapsed between the last peck and food delivery. Because pecking by Pigeon 4886 had not been established reliably after 4 sessions, the houselight was extinguished during this pigeon's 5th session, leaving only the key light illuminated. The houselight was re-illuminated during the 6th session. One pigeon (4575) was removed from the experiment because it failed to peck after 15 sessions.
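
To make the resetting contingency concrete, here is a minimal event-driven sketch of the tandem VI 30-s DRO 30-s schedule (our illustration, not the authors' control program; for simplicity it ignores the 4-s hopper time):

```python
def tandem_vi_dro(peck_times, vi_intervals, delay=30.0):
    """Return food-delivery times under a tandem VI DRO (resetting
    unsignaled delay) schedule, given sorted peck times in seconds.
    A VI interval times out; the next peck then arms the delay clock;
    every peck while that clock runs resets it, so food always follows
    the final peck by exactly `delay` s."""
    food_times = []
    pecks = iter(peck_times)
    peck = next(pecks, None)
    clock_start = 0.0                     # when the current VI interval began
    for vi in vi_intervals:
        vi_elapses = clock_start + vi
        # Pecks before the VI component times out have no scheduled effect.
        while peck is not None and peck < vi_elapses:
            peck = next(pecks, None)
        if peck is None:
            break                         # no further pecks this session
        armed = peck                      # first peck after timeout starts the delay
        peck = next(pecks, None)
        while peck is not None and peck < armed + delay:
            armed = peck                  # a peck during the delay resets the clock
            peck = next(pecks, None)
        food_times.append(armed + delay)
        clock_start = armed + delay       # next VI interval times from food
    return food_times

# Example: the first VI interval (30 s) elapses; a peck at 35 s arms the
# delay, pecks at 50 and 55 s reset it, so food is delivered at 85 s.
print(tandem_vi_dro([35.0, 50.0, 55.0], [30.0, 45.0]))  # [85.0]
```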

Condition 1 consisted of 25 sessions of the delay condition. Following this, a tandem VT t-s FI 30-s schedule was effected (Condition 2). Under this schedule, a variable time period averaging t s elapsed independently of responding and was followed by a 30-s period, after which the first response immediately produced the reinforcer. The value of the VT schedule, t, for each pigeon was based on that pigeon's rate of reinforcement during the preceding resetting-delay condition: the mean interreinforcer interval was calculated for each of the last five sessions of the delay condition, and 30 s was subtracted from the overall mean of these five sessions to yield t, the mean value of the VT schedule, which then was constructed from the Fleshler and Hoffman (1962) distribution as described previously. This tandem schedule was effectively a VI schedule (it also can be described as a tandem VT t-s FR 1 schedule), with a minimum interreinforcer interval of 30 s plus the smallest VT interval value, and it is so identified in the title of this paper. It provided immediate reinforcement of responding while maintaining the same distribution and rate of reinforcement as occurred during the first condition. Hereafter this schedule is described as the immediate reinforcement procedure or condition.
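
The yoking arithmetic is straightforward. A brief sketch, using hypothetical session values (illustrative numbers only, not data from the experiment):

```python
# Hypothetical mean interreinforcer intervals (s) from the last five
# delay-condition sessions for one pigeon (illustrative values only).
last_five_iri_means = [92.0, 88.5, 95.0, 90.5, 94.0]

overall_mean = sum(last_five_iri_means) / len(last_five_iri_means)  # 92.0 s
t = overall_mean - 30.0  # mean of the VT component: 62.0 s

# A 20-interval VT distribution with mean t would then be generated as in
# the Fleshler-Hoffman sketch above. The tandem VT t-s FI 30-s schedule
# reinforces the first peck occurring after (VT interval + 30 s), so its
# minimum interreinforcer interval is 30 s plus the smallest VT interval.
```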

The two conditions described above then were replicated. The pigeons were returned to the delay condition (tandem VI DRO schedule; Condition 3) for 20 sessions, followed by a final 20 sessions on the immediate reinforcement condition (tandem VT FI schedule; Condition 4). The interreinforcer intervals in the latter condition were yoked to the replicated delay condition in the manner described previously.

Pigeons 4244 and 2170 were exposed for 35 sessions to a VT t-s schedule in which response-independent food presentations were yoked on a session-by-session basis to the interreinforcer intervals of, respectively, Pigeons 4824 and 2393 during analogous sessions.

Pigeons 1076 and 4250 were exposed to a (yoked) VI schedule in which, beginning with the fifth session, food presentations were yoked on a session-by-session basis to the interreinforcer intervals of, respectively, Pigeons 4886 and 3825. During the first four sessions following magazine training, any key peck that occurred immediately produced a reinforcer. The VI schedule remained in effect for a total of 80 sessions.
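
The two yoking arrangements differ only in whether a peck is required. A minimal sketch, assuming each partner pigeon's interreinforcer intervals are recorded session by session (the function and parameter names are ours, not the authors'):

```python
from itertools import accumulate

def yoked_vt(partner_iris):
    """Yoked VT: food occurs at the partner pigeon's recorded
    interreinforcer times, independently of this pigeon's behavior."""
    return list(accumulate(partner_iris))

def yoked_vi(partner_iris, peck_times):
    """Yoked VI: each recorded interval must elapse, and the next peck
    then produces food immediately; later intervals time from that food."""
    food_times = []
    clock = 0.0
    pecks = iter(sorted(peck_times))
    peck = next(pecks, None)
    for iri in partner_iris:
        arms_at = clock + iri             # interval elapses; food is armed
        while peck is not None and peck < arms_at:
            peck = next(pecks, None)      # earlier pecks go unreinforced
        if peck is None:
            break
        food_times.append(peck)           # immediate reinforcement
        clock = peck
        peck = next(pecks, None)
    return food_times
```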

Results

Figure 1 shows the response rates during each session of the experiment for each of the four pigeons first exposed to the delay of reinforcement procedure. The left panel in each pair of panels shows Conditions 1 and 3 (the delay conditions) and the right panel shows Conditions 2 and 4 (the immediate reinforcement conditions). Figure 2 shows comparable data for the yoked conditions. The top graphs show data for the pigeons receiving yoked response-independent food presentations, and the lower graphs show data for the two pigeons exposed to yoked response-dependent reinforcement. For these latter two pigeons the schedule always was the yoked VI; the condition numbers therefore refer to the equivalent conditions to which their yoked partners, shown in Figure 1, were exposed.

Both immediate and delayed response-dependent reinforcement resulted in key pecking without explicit training, such as shaping through the differential reinforcement of successive approximations. The four pigeons exposed to the delay condition first pecked within 2-5 sessions. Similarly, the pigeons exposed to the yoked schedules first pecked within 1-4 sessions. Following the end of hopper training, there was a pause of several minutes to several hours followed by a single peck. Thereafter responding occurred consistently throughout the remainder of the experiment, at a rate that depended on whether reinforcement was immediate or delayed. The two pigeons exposed to the VT schedule arranging response-independent food presentations pecked a few times initially but failed to peck consistently over successive sessions and eventually ceased responding.

For the two pigeons exposed to the yoked-VI schedule throughout the experiment, the rates of responding were high and tended to increase, up to a point, with continued exposure to the VI schedule. In Figure 2, the filled circles depict Sessions 1-20 (Condition 1) and the filled squares depict Sessions 21-40 (Condition 2). The open circles depict Sessions 41-60 (Condition 3) and the open squares depict Sessions 61-80 (Condition 4). All of the data points for these subjects are continuous with one another but are separated into the four "conditions" corresponding to the four conditions of Pigeons 4886 and 3825 for ease of visual comparison of the data.

For the pigeons initially exposed to the delay condition, the transitions from delayed to immediate reinforcement and from immediate to delayed reinforcement were characterized by rapid changes in response rates as a function of the schedule in effect. During the first session with immediate reinforcement following the first delay condition (filled circles to filled squares; Conditions 1 and 2, respectively), response rates increased rapidly. The transition from high-rate to low-rate responding (immediate to delayed reinforcement, filled squares to open circles; Condition 2 to Condition 3) was not as rapid; within 2-3 sessions, however, three of the four pigeons had returned to low rates similar to those during the first exposure to the delay condition. The second transition from delayed to immediate reinforcement (open circles to open squares; Condition 3 to Condition 4) generally yielded slightly higher response rates during the first few sessions, but thereafter the rates during Condition 4 were indistinguishable from those during Condition 2.

The top cumulative record of Figure 3 shows responding of Pigeon 3825 during the first immediate reinforcement session (Condition 2) after the delay condition. The bottom record shows responding during the first session of the second delay condition (Condition 3) after the second exposure to the immediate reinforcement condition for the same pigeon. The transitions in responding shown by this pigeon are representative of those observed in the other three pigeons exposed to the same conditions. These records illustrate the rapid, within-session transition of responding in accordance with the conditions of the changed-to schedule.

Figure 4 summarizes the speed of the transitions between conditions. It shows, for the first (Condition 2) and second (Condition 4) exposures to the immediate reinforcement condition (the tandem VT FI schedule), or, for Pigeons 4250 and 1076, the corresponding exposures to the yoked-VI schedule, the number of sessions required for a daily session's response rate first to reach the mean response rate of the last five sessions of that condition. For each pigeon, fewer sessions were required to reach the terminal response rate during the second exposure than during the first. That is, control by the current schedule developed more rapidly during the second exposure than during the first.
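
The Figure 4 measure can be stated compactly. A sketch with hypothetical session rates (illustrative values, not the experiment's data):

```python
def sessions_to_terminal_rate(session_rates, n_terminal=5):
    """Number of sessions for a daily response rate first to reach the
    mean rate of the condition's last n_terminal sessions. Always returns,
    because at least one of the last n_terminal rates equals or exceeds
    their mean."""
    terminal_mean = sum(session_rates[-n_terminal:]) / n_terminal
    for session, rate in enumerate(session_rates, start=1):
        if rate >= terminal_mean:
            return session

# Hypothetical Condition 2 rates (responses/min): the terminal mean is
# 65.4, first reached in the 5th session.
print(sessions_to_terminal_rate([5, 22, 48, 60, 66, 64, 65, 67, 66, 65]))  # 5
```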

Figure 5 shows a portion of a cumulative record for Pigeon 3825 (top record) and Pigeon 2393 (bottom record) from the 25th session of the first delayed reinforcement condition (Condition 1). These records show that, in addition to low rates of responding and long pauses between responses, the pigeons also showed occasional, brief "runs" of relatively high-rate responding, often immediately after delivery of a reinforcer. In the first exposure to the resetting unsignaled delay condition, all of the pigeons showed these brief runs of high-rate responding. The runs were most marked with Pigeon 3825 and least so with Pigeon 2393.

Table 1 shows the means and ranges of the rates of reinforcement (reinforcers per minute) for each pigeon during each condition. Each value is an average across all sessions of the condition except for the first delay condition, for which the means and ranges for the last 5 sessions of that condition are shown because programmed reinforcement rates in subsequent tandem VT FI sessions were based on these sessions. Rates of reinforcement were similar between the delayed and immediate reinforcement conditions for Pigeons 4824, 2393, 4886, and 3825 and between the rates of reinforcement during the delayed reinforcement conditions of Pigeons 4886 and 3825 and the yoked-VI condition of Pigeons 1076 and 4250.

[TABULAR DATA FOR TABLE 1 OMITTED]

Discussion

Similar VI response rates were obtained whether responding first was established with immediate reinforcement or through a procedure in which operant responses were established without explicit training and maintained only with unsignaled, delayed reinforcement. These results offer no support for the informal observation that provided the impetus for this experiment: that pigeons with a prior history of response acquisition and maintenance with delayed reinforcement are likely to respond at lower rates, or more erratically, when exposed subsequently to VI schedules than are animals without such a history. Rather, the results are in general agreement with other experiments in which the residual effects of prior reinforcement schedules were relatively small and transient following a transition to a VI schedule of reinforcement (Freeman & Lattal, 1992, Experiments 2 & 3; Nader & Thompson, 1987; Poling et al., 1980).

Responding was established in 4 of 5 subjects with reinforcement delayed by 30 s from the response, and without explicit training. In only 1 of these 4 subjects was a special intervention, turning off the houselight, necessary to establish pecking. By the most conservative standard, then, 3 of 5 pigeons developed and sustained responding in the absence of training when a 30-s resetting delay of reinforcement was scheduled. By contrast, neither of the 2 pigeons exposed to response-independent food at the same rate as occurred in the delay condition pecked reliably. This difference between the behavioral effects of delayed and response-independent food is consistent with several other experiments in suggesting that the response-reinforcer dependency is critical to such response establishment and subsequent maintenance (e.g., Lattal & Gleeson, 1990; Wilkenfeld, Nickel, Blakely, & Poling, 1992).

Response acquisition with delayed reinforcement has been reported routinely using 30-s resetting delays with rats as subjects (e.g., Lattal & Gleeson, 1990; Wilkenfeld et al., 1992), but this is the first report of acquisition with 30-s resetting delays using pigeons. Lattal and Gleeson (1990, Experiment 2) reported response acquisition with nonresetting delays of 30 s with pigeons, but the actual delays between the last response and food delivery were considerably shorter. A key extension of the sort used here may facilitate the occurrence of the first response, much as a protruding lever may make it more likely that a rat will contact it in the course of exploration, but the results obtained with the animals exposed to the VT schedule suggest that it is the contingency, and not the extension alone, that accounts for the sustained responding under the delay of reinforcement procedure.

This experiment followed up on an informal observation that, if valid, was at odds with the extant literature on behavioral history effects in both animals and humans. Although such follow-up is an important function of science (e.g., Sidman, 1960; Skinner, 1956), it is often the case, as here, that informal observations do not hold up under systematic analysis. The possible reasons for their not doing so are myriad. Under these circumstances it is instructive to consider the reliability of both the original observation and its subsequent systematic analysis. Although the effect was observed by at least two separate investigators, little was known about its frequency of occurrence or its magnitude, which was part of the reason for the present experiment. Furthermore, as is often the case with informal observations, various potentially confounding variables that might affect response rates could not be ruled out: the time periods between the establishment of responding with delayed reinforcement and subsequent exposure to the VI schedules, the use of a number of different delay procedures and values with different animals in the original experiments, and many procedural differences among the subsequent experiments in which VI schedules were effected. The consistency of the present results with previous experiments in which VI schedules served as the baseline for assessing behavioral history effects, together with the replicability of the effects within and between subjects in this experiment, suggests that variables other than response acquisition and maintenance with delayed reinforcement were important in determining the informally observed low VI response rates.

The general failure of VI schedules to sustain response rates or patterns consistent with past schedule performance, both in this experiment and in the others cited herein, underlines the importance of the changed-to schedule in both developing and assessing what are called behavioral history effects. That is, any behavioral history effect must be considered in light not only of the past experiences of the organism but also of the present reinforcement contingencies. Such history effects may be considered transition states (Sidman, 1960) and, as such, are controlled largely by the contingencies operative at the time. This suggests that the transition effects identified as behavioral history effects (a) may be determined by uncontrolled variables that can be understood through systematic experimental analysis and (b) can be controlled as precisely by environmental manipulation as can any steady-state performance.

References

FLESHLER, M., & HOFFMAN, H. S. (1962). A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 5, 529-530.

FREEMAN, T. J., & LATTAL, K. A. (1992). Stimulus control of behavioral history. Journal of the Experimental Analysis of Behavior, 57, 5-15.

LATTAL, K. A., & GLEESON, S. (1990). Response acquisition with delayed reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 16, 27-39.

NADER, M. A., & THOMPSON, T. (1987). Interaction of methadone, reinforcement history, and variable-interval performance. Journal of the Experimental Analysis of Behavior, 48, 303-315.

POLING, A., KRAFT, K., & CHAPMAN, L. (1980). d-Amphetamine, operant history, and variable-interval performance. Pharmacology, Biochemistry, & Behavior, 12, 559-562.

RICHARDS, R. W. (1981). A comparison of signaled and unsignaled delay of reinforcement. Journal of the Experimental Analysis of Behavior, 35, 145-152.

SIDMAN, M. (1960). Tactics of scientific research. New York: Basic Books.

SKINNER, B. F. (1956). A case history in scientific method. American Psychologist, 11, 221-233.

URBAIN, C., POLING, A., MILLAM, J., & THOMPSON, T. (1978). d-Amphetamine and fixed-interval performance: Effects of operant history. Journal of the Experimental Analysis of Behavior, 29, 385-392.

WEINER, H. (1964). Conditioning history and human fixed-interval performance. Journal of the Experimental Analysis of Behavior, 7, 383-385.

WEINER, H. (1969). Controlling human fixed-interval performance. Journal of the Experimental Analysis of Behavior, 12, 349-373.

WILKENFELD, J., NICKEL, M., BLAKELY, E., & POLING, A. (1992). Acquisition of lever-press responding in rats with delayed reinforcement: A comparison of three procedures. Journal of the Experimental Analysis of Behavior, 58, 431-443.

Article Details
Authors: Barbara A. Metzger and Kennon A. Lattal
Publication: The Psychological Record
Date: September 22, 1998