Using morphed images to study visual detection of cutaneous melanoma symptom evolution.
Although dermatologists are adept at recognizing problem lesions, visual detection appears to be difficult for nonspecialist physicians who, importantly, are likely to be the first practitioners to encounter problem lesions (Miles & Meehan, 1995). For instance, in one study only 12% of nondermatology physicians and third-year medical students recognized melanoma lesions in photographs (Cassileth et al., 1986). In another study, upon examining a patient complaining of other symptoms, only one of 285 medical students commented on a suspicious lesion that was located prominently on the patient's neck (Robinson & McGaghie, 1996). These results suggest that general medical training is insufficient to produce melanoma-detection skills.
The importance of melanoma detection by nonspecialists is brought into sharp focus by the fact that patient self-inspection typically is the first line of defense in melanoma detection because each individual is likely to be most familiar with the baseline characteristics of his or her skin. Unfortunately, laypersons, even those who have previously sought medical assistance because of skin worries, are inconsistent in recognizing melanoma symptoms (e.g., Branstrom, Hedblad, Krakau, & Ullen, 2002; Liu et al., 2005; Miles & Meehan, 1995).
The symptoms of melanoma usually are detected visually (Abbasi et al., 2004). A lesion is considered to be suspicious if it is laterally asymmetrical in shape, irregular in border, variegated in color, and/or 6 mm or larger in diameter (Bono et al., 1999; Thomas et al., 1998). Because there is wide topographical variation in lesions that have not developed into melanoma, however, the most important visual symptom is thought to be evolution, that is, change over time in any of the characteristics mentioned above.
Among the weaknesses of existing research on melanoma detection is a reliance on static stimuli (i.e., lesions that are or are not clearly symptomatic), which means that little is known about how individuals identify changes that may be indicative of melanoma. The present investigation was designed as a first step toward developing methods for understanding how laypersons visually identify such changes. The study was inspired by a research program that treated breast-cancer detection as a problem in tactile stimulus control and ultimately led to a highly effective technology for teaching breast self-examination called MammaCare® (e.g., Fletcher et al., 1990; Pennypacker et al., 1982). That research employed psychophysical methods (Kling & Riggs, 1971) to identify the smallest degree of abnormality that individuals could identify, by touch, in breast tissue (Adams et al., 1976; Bloom, Criswell, Pennypacker, Catania, & Adams, 1982). A lynchpin of the research program was the development of stimuli (simulated breast models) in which the degree of symptom severity could be systematically varied (Madden et al., 1978), thereby allowing the precise measurement of detection behavior.
The present report describes some early steps in our attempt to do something analogous for melanoma symptoms by creating a series of digital images ranging from asymptomatic to clearly symptomatic. Experiments 1 and 2 employed stimulus-generalization testing and modified psychophysical procedures, respectively, to examine some of the perceptual properties of these images. Experiment 3 provided a first glimpse at how well volunteers detected symptom change in these images under a very rough analogue of self-inspection.
This study describes the creation of the melanoma images that were used in Experiments 2 and 3 and documents the generalization gradients that emerged following training with these stimuli. Images representing differing degrees of symptom severity were developed by using morphing software to create gradations between two benchmark images, one asymptomatic and one clearly symptomatic. According to the publisher, the software produces fixed degrees of change between pairs of adjacent stimuli in a sequence; however, the magnitude of difference between images was impossible to quantify independent of the "degrees of change" specified by the software. This stands in contrast to the usual practice in stimulus control research of quantifying stimulus features in physical units (e.g., Mostofsky, 1965).
In the present study, we sought instead to verify that the images we created functioned similarly to better quantified stimuli in a standard assay for evaluating differential responding to physically related stimuli. In a generalization-testing procedure, one of our images served as S+, and responding was assessed to this and selected other stimuli from our progression of 100 images. The training that preceded this testing involved no exposure to an S- from within the same progression of images. When employed with stimuli of well-quantified characteristics that vary along some physical dimension, such a procedure typically produces a gradient in which responding is (a) occasioned most by S+, (b) less prevalent as stimuli become more different from S+, and (c) distributed symmetrically around S+ (Guttman & Kalish, 1956; Mostofsky, 1965). Similar gradients in the present study would not verify that the images represent a progression in ratio units of symptom severity but would at least lend confidence to the notion that they represent an ordinal progression of symptom severity.
Additionally, generalization gradients were thought to provide an important early indication of how discriminable images in the symptom-severity progression would be from one another. In similar types of tests, the amount of generalization has been found to vary substantially across types of stimuli (e.g., Ghirlanda & Enquist, 2003), so no basis existed for predicting how easy or difficult participants would find it to tell the stimuli apart.
Participants. College students (N = 18; 1 man and 17 women), ages 18 to 21 years, were recruited through a psychology department participant pool at a Midwestern university. According to dictates of the participant pool, all volunteers were enrolled in a psychology class at the time of participation and received documentation of participation that in many such classes could be exchanged for bonus credit. The specific amount of credit per hour of participation reflected the policies of individual course instructors. The participants all self-identified as Caucasian except for four participants, two of whom identified themselves as Black, one as Asian, and one as Latino/Hispanic. They had a variety of academic majors, with a major in education as the most common among them.
Setting and stimuli. Experimental sessions were conducted in two office-size rooms equipped with an IBM-compatible computer running the Microsoft Windows 7® operating system. Participants viewed stimuli on the computer's 15-in. color monitor.
Stimuli were presented against a white background. They consisted of digital photographs, each illustrating one of two primary melanoma symptoms (asymmetry of shape or border irregularity). Each stimulus was circular and measured approximately 5 cm in diameter when displayed on a 15-in. computer screen. A skin lesion occupied about 80% of the image and was surrounded by a small perimeter of Caucasian skin.
For both symptom types, image-morphing software was used to develop stimuli that represented 100 gradations of symptom severity, ranging from symptom absence to symptomatic at a clinically problematic level. From a public service website (http://www.melanoma.com) we obtained benchmark pairs of digital images in which one photograph showed a lesion in which a melanoma symptom was absent and the other pictured a lesion in which the same symptom was prominently present. For one pair, one image showed a lesion with pronounced lateral asymmetry and the other showed a lesion that was mostly symmetrical. For the other pair, one image showed a lesion with pronounced border irregularity while the other showed a lesion with smooth borders. Note that these images were identified as symptomatic and asymptomatic based on the medical expertise of the website's creators; no independent evaluation of face validity was conducted for this investigation. Because of copyright restrictions, neither the benchmark nor intermediate images are reproduced here.
For each pair of benchmark images, 98 intermediate images were created by using software (Morpheus Photo Morpher®) that, by employing a linear algorithm, rendered hybrid images in fixed degrees of change between pairs of adjacent stimuli in a sequence. The resulting 100 images per symptom type thus represented a continuum between asymptomatic and symptomatic lesions. Hereafter, we refer to the images numerically in the range of 1 to 100, with Image 1 (asymptomatic) and Image 100 (clearly symptomatic) being the benchmark images used in the morphing process. The following images were used in the present study: 1, 8, 15, 22, 29, 36, 43, 50, 57, 64, 71, 78, 85, 92, and 99.
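The numbering scheme can be made concrete with a short sketch. It assumes, purely for illustration, that a linear morph gives Image k a blend fraction of (k - 1)/99 between the two benchmarks; the internals of the morphing software are not documented here, and the name `blend_fraction` is hypothetical. The last lines reproduce the 15 image numbers used in this experiment, which are spaced seven steps apart.

```python
# Illustrative sketch only: the blend-fraction formula is an assumption,
# not the documented behavior of the morphing software.

N_IMAGES = 100  # Image 1 (asymptomatic) through Image 100 (clearly symptomatic)


def blend_fraction(image_number: int) -> float:
    """Assumed share of the symptomatic benchmark in a given image (0.0 to 1.0)."""
    if not 1 <= image_number <= N_IMAGES:
        raise ValueError("image_number must be between 1 and 100")
    return (image_number - 1) / (N_IMAGES - 1)


# The 15 test images: every seventh image, starting at Image 1.
TEST_IMAGES = list(range(1, N_IMAGES, 7))

print(TEST_IMAGES)
print(blend_fraction(50))  # slightly below 0.5 under this assumption
```

Under this assumption, Image 50 (the image later used as S+) sits just below the midpoint of the progression.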
Procedure. The study was programmed within the online Blackboard® instructional environment. All participants reported prior experience in this environment and thus required no special instruction regarding its use. The participants worked individually. First they completed a demographic questionnaire, followed by generalization tests for the asymmetry stimulus set and the border-irregularity stimulus set, with the order of those two tests counterbalanced across participants. An additional questionnaire (requiring 2 to 3 min to complete) was administered between the two generalization tests for the purpose of separating them. Results of this questionnaire are not considered here.
Participants were not told the specific nature of the stimuli they would view. They began by reading the following:
This is a study about visual perception. Many everyday experiences require us to use our vision to tell one thing from another. When two objects are very different, this judgment is not difficult. A different kind of perceptual challenge arises when we judge whether an object has remained the same or begun to change. Imagine tending a plant in your garden, cooking a piece of meat on your stove, or deciding whether to paint your house. All of these things involve deciding whether an object has changed from how it used to be. Big changes are easy to notice--for example, if your plant has blossomed, your meat has burned, or your house paint has all peeled off. It is harder to notice the first, small signs of change. Next you will become familiar with an image that we'll call "Image B" [for the asymmetry stimuli, or "Image Y" for the border-irregularity stimuli]. Then you'll see a whole series of other images. Your job is to decide whether each of the images matches Image B [or Y].
Participants were briefly trained to respond to Image 50 before the generalization test began. On the first trial, Image 50 appeared along with the text, "This is Image B [Y]. Please study its features, then click below to show you are ready to go on." Clicking a button marked "I am ready to go on" then produced eight additional training trials. On each of these trials, an image appeared along with the text, "Is this image B [Y]?" On one randomly determined half of the trials, Image 50 was presented, and on the other half, the image was a black circle of identical diameter. Thus, Image 50 served as S+, and there was no S- from the stimulus dimension used in testing. Participants registered a response by clicking "Yes" or "No" plus a button labeled "Save and View Next." After the participant clicked on the button, the next trial began. No feedback was provided after any trial. On the 16 trials (eight per symptom type) distinguishing Image 50 from a black circle, 10 participants registered 16 correct responses; the others registered either 15 (N = 7) or 14 (N = 1) correct responses.
Following each orientation phase was a 135-trial generalization test in the format just described above except that all images were from one of the two symptom-type progressions described above. Each test was divided into three blocks of 45 trials, between which a participant could take a short break if desired. Each block included three trials showing each of the 15 images noted above (thus, nine total trials per participant per image). Order of trials within a block was randomized for each participant.
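The block structure of the generalization test can be sketched as follows. This is a minimal illustration, not the Blackboard implementation used in the study; the function name `build_test_schedule` and the fixed random seed are hypothetical choices made only so the example is reproducible.

```python
import random

TEST_IMAGES = list(range(1, 100, 7))  # the 15 image numbers listed in the text
N_BLOCKS = 3
REPEATS_PER_BLOCK = 3


def build_test_schedule(rng: random.Random) -> list[list[int]]:
    """Return one independently shuffled trial order per block for one participant."""
    blocks = []
    for _ in range(N_BLOCKS):
        trials = TEST_IMAGES * REPEATS_PER_BLOCK  # 15 images x 3 = 45 trials
        rng.shuffle(trials)                       # randomize order within the block
        blocks.append(trials)
    return blocks


schedule = build_test_schedule(random.Random(0))  # seed is arbitrary
for number, block in enumerate(schedule, start=1):
    print(f"Block {number}: {len(block)} trials")
```

Across the three blocks, each of the 15 images appears nine times, which matches the per-image count given in the procedure.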
Results and Discussion
Figure 1 shows the mean number of times that each stimulus was identified as Image B/Y (Image 50) during the generalization tests. For both the asymmetry and border-irregularity stimuli, Image 50 (S+) was endorsed most often, with identification becoming progressively less frequent as stimuli became more different from S+. In this broad respect the generalization gradients were similar to those obtained from stimuli whose physical differences have been more precisely quantified.
[FIGURE 1 OMITTED]
The steepness of the gradients at their center indicates that, under the experiment's viewing conditions, participants responded differentially to stimuli separated by as few as seven steps in the morphing progression (i.e., 50 vs. 43 and 50 vs. 57). In a series of one-tailed t tests for paired scores, comparing S+ to each of the other stimuli within a gradient, the differences were significant (df = 34, p ≤ .001) in all cases according to an adjusted alpha (.05 / 14 comparisons = .0036). The apparent left-right symmetry of the gradients was evaluated in a series of two-tailed t tests for paired scores comparing the results for stimuli that were equidistant from S+ in the morphing progression (e.g., 43 vs. 57, 36 vs. 64). No significant differences were found (in all cases, df = 34, p ≥ .15), indicating that participants treated the relatively more symptomatic stimuli to the right of S+ in Figure 1 as no more or less different from S+ than the relatively less symptomatic stimuli to the left of S+.
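The two families of comparisons can be enumerated directly from the image numbers. The sketch below only lists which stimuli are paired in each family; it does not run the t tests, because the response data are not reproduced here. Variable names are illustrative.

```python
S_PLUS = 50
STEP = 7
TEST_IMAGES = list(range(1, 100, STEP))

# S+ versus each other test image (the one-tailed comparisons).
s_plus_comparisons = [(S_PLUS, img) for img in TEST_IMAGES if img != S_PLUS]

# Images equidistant from S+ on either side (the two-tailed symmetry comparisons).
symmetry_pairs = [(S_PLUS - k * STEP, S_PLUS + k * STEP) for k in range(1, 8)]

print(len(s_plus_comparisons))
print(symmetry_pairs)
```

The first list has one entry per non-S+ image, which is where the 14 comparisons in the adjusted alpha come from; the second list runs from (43, 57) out to (1, 99).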
Most generally, these findings indicate that participants tended to respond differentially to individual stimuli (e.g., they did not respond categorically to clusters of stimuli; see Ghirlanda & Enquist, 2003). Overall, the gradients were similar to those described in previous research based on stimuli whose properties were defined by well-quantified physical dimensions (e.g., Guttman & Kalish, 1956).
The present study employed the stimuli described in Experiment 1 and a procedure based on psychophysical scaling to estimate the smallest amount of melanoma-symptomatic change in skin lesions that could be visually identified under well-controlled laboratory conditions. Specifically, participants viewed a series of images that became gradually more symptomatic. The outcome of interest was how much change occurred before a lesion was identified as possibly problematic. In psychophysical terms, this "smallest change" may be called a difference threshold.
One question of interest was whether the difference threshold varied as a function of symptom type (lateral asymmetry vs. border irregularity). Presumably, some health symptoms are harder to detect than others, as studies on manual breast examination indicate (e.g., Adams et al., 1976). The melanoma literature, however, provides no clear basis on which to predict whether it might be more challenging to detect changes in asymmetry or border irregularity.
A second question concerned whether the difference threshold varied as a function of symptom severity upon initial presentation of a lesion. In clinical situations, patients may begin monitoring their skin when it is entirely asymptomatic or after troublesome symptoms have started to emerge. It is important, therefore, to know how well individuals detect changes from various symptom-severity "starting points." To reflect this reality, participants viewed several different groups, or series, of slides, each of which progressed from less to more symptomatic. The series differed, however, in the degree of symptom severity of the initial lesion. Note that, based on considerable basic psychophysical research, Weber's law states that difference thresholds are a constant proportion of the original stimulus (Kling & Riggs, 1971). In terms of melanoma progression, therefore, Weber's law predicts that the more symptomatic a lesion is when visual monitoring begins, the greater the degree of change must be before change will be detected. Specifically, in terms of the present stimuli, Weber's law predicts that the more symptomatic the "original" stimulus is in a series, the more stimuli (a progression of changing lesions) will have to be viewed before one is determined to be different from the original.
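In its usual textbook form, Weber's law can be written as below, where $I$ is the magnitude of the original stimulus, $\Delta I$ is the difference threshold, and $k$ is a constant (the Weber fraction). The symbols are conventional notation rather than anything defined in the present study, and treating position in the morphing progression as a stand-in for $I$ is an assumption.

```latex
\frac{\Delta I}{I} = k
\quad\Longrightarrow\quad
\Delta I = k\,I
```

Because $k$ is fixed, a larger starting value of $I$ implies a proportionally larger $\Delta I$, which is the basis for the prediction stated above.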
A third question was whether difference threshold varied as a function of instructional set. Many medical interventions to promote better detection are educational (i.e., patients are told about a disease). The guiding assumption of these interventions appears to be that describing the prevalence and consequences of various cancers will enhance motivation and attention (e.g., Graham, Prapavessis, & Cameron, 2006). We compared informational conditions in which participants were told about melanoma specifically or told about cancer generally. It was predicted that information more closely related to the lesions being viewed would promote more careful detection of change.
Participants. College students (N = 40; 34 women and 6 men, all self-identified as Caucasian) were recruited as in Experiment 1. They ranged from 18 to 25 years old and had a variety of academic majors, with a major in psychology most common among them. Five participants reported a family history of melanoma.
Setting and stimuli. Experimental sessions were conducted in an office-size room equipped with an IBM-compatible computer running the Microsoft Windows XP® operating system. Participants viewed stimuli on the computer's 15-in. monitor. A second monitor allowed the experimenter to view the same stimuli. Each participant sat in a semi-enclosed work station, while the experimenter sat at a small table adjacent to this station. The experimenter's monitor was not visible to participants.
Prior to starting the procedure, each participant read a short block of text (about 250 words) that was presented on screen. For a randomly assigned half of the participants, the text described the dangers of melanoma and the benefits of its early detection but not specific melanoma symptoms. For the other half, the text did not address melanoma, instead providing general information about cancer.(1)
In the main procedure, Microsoft PowerPoint 2007® was used to present the stimuli. All 100 stimuli of each type (asymmetry and border irregularity) were used.
Procedure. The procedure was based on the psychophysical method of limits, in which the stimuli that are presented across trials vary monotonically along a perceptual dimension. For example, in evaluating weight discrimination, stimuli might be presented in two sequences, from heaviest to lightest and vice versa. The two sequences are used because any sequential procedure has the capacity to generate response biases in threshold estimates; the mean of the two sequences is presumed to eliminate the bias (Kling & Riggs, 1971). In the present case, however, skin lesions were presented always in the ascending order of least to most symptomatic; this was done to reflect the biological reality that cancerous lesions virtually always get worse, not better. Because it was unidirectional, the procedure could have promoted a bias for participants to respond "same," that is, to identify changed stimuli as being similar to the original (Kling & Riggs, 1971). Normally, such an uncorrected bias would be considered a troublesome artifact in psychophysical research. However, it may accurately represent what occurs in patient reporting of symptoms under naturally occurring conditions as noncancerous lesions gradually develop the signs of melanoma.
After participants read about melanoma or cancer, the researcher presented brief instructions describing the procedure but not the goals of the experiment. To ensure that the procedure made sense, participants completed a brief practice exercise (details available from the second author upon request) involving stimuli that were created by morphing images of two faces and thus illustrated gradual change between one image and another. The instructions specified that each series of trials would contain an original image followed by several other images, some of which would be different from the original, and that on each trial the participant's job was to indicate whether the current image was the same as or different from the original.
On each trial, one digital image was presented for 5 s against a white background. The image remained displayed until the participant spoke the response "same" or "different," which was recorded by an experimenter, who then advanced the screen to a mask consisting of a gray circle displayed in the location formerly occupied by the stimulus. The mask remained on the screen for 1 s before being replaced by the next stimulus, which initiated the next trial.
For each stimulus type, a series of trials was organized around sequential groups of stimuli that incorporated stimuli 1-20, 21-40, 41-60, 61-80, and 81-100. The start of each trial series was preceded by the simultaneous display of an original stimulus--the lowest numbered, or least symptomatic, lesion--and the highest numbered, or most symptomatic, lesion. Participants viewed this slide until they reported being ready to move on. The trial series began with presentation of the original (least symptomatic) lesion on the first three trials; this was done so that participants would not expect every slide to show a change. Subsequent trials proceeded through the remaining 19 stimuli in numerical sequence (from least to most symptomatic) until the first response of "different" occurred, after which four more trials were conducted to conclude the series. Trials continued so that "different" responses were not adventitiously reinforced via block termination; however, the first "different" response constituted the datum for the series. The goal in each series was to determine the least symptomatic lesion that was reported as different from the "original" lesion (i.e., the point at which a participant switched from reporting "same" to reporting "different").
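The trial logic of a single series can be sketched as follows. This is a simplified illustration, not the PowerPoint procedure actually used: the function `run_series` and the `says_different` callback (standing in for the participant's spoken response) are hypothetical, responses on the three initial trials with the original image are ignored, and the series simply ends if fewer than four images remain after the switch point.

```python
from typing import Callable, Optional

N_ORIGINAL_TRIALS = 3  # original image repeated on the first three trials
N_EXTRA_TRIALS = 4     # trials run after the first "different" response


def run_series(
    start: int, says_different: Callable[[int], bool]
) -> tuple[Optional[int], list[int]]:
    """Present one 20-image ascending series.

    Returns the switch point (first image called "different", or None if
    none was) and the image numbers presented, in order.
    """
    presented = [start] * N_ORIGINAL_TRIALS
    switch_point = None
    extra_left = N_EXTRA_TRIALS
    for image in range(start + 1, start + 20):  # the remaining 19 images
        if switch_point is not None and extra_left == 0:
            break  # the four post-switch trials are done
        presented.append(image)
        if switch_point is None:
            if says_different(image):
                switch_point = image  # datum for the series
        else:
            extra_left -= 1
    return switch_point, presented


# Example: a simulated participant who first reports change at Image 30.
switch, shown = run_series(21, lambda image: image >= 30)
print(switch, shown[-1])
```

In this example the series continues for four images past the switch point before ending, so the last image shown is 34 rather than 40.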
The switch point for each of the five stimulus series was determined five times for each participant, with determinations obtained consecutively. Thus, the switch point for one series (e.g., Stimuli 61-80) was determined five times prior to testing with a different series (e.g., 21-40). The order in which difference thresholds were obtained for the five stimulus series was counterbalanced across participants. Once participants completed this task with one stimulus set (e.g., asymmetrical stimuli), the task was repeated with the second stimulus set (e.g., irregular border stimuli); this sequence was also counterbalanced across participants.
After the main procedure was completed, participants completed a brief demographic questionnaire asking about any family history of melanoma and were dismissed.
The analyses focused on what may be called an absolute difference threshold for each of the five stimulus series. A raw threshold was first determined that reflected the image for which the first "different" response was occasioned. For example, if a series incorporated Stimuli 21-40 and a "different" response occurred first for Stimulus 36, the raw difference threshold was 36. The absolute threshold reflected the number of stimuli that had preceded the first "different" response. In the preceding example, the absolute threshold would be 36 (switch point) - 21 (starting, or original, stimulus) = 15. The absolute threshold provided a means of comparing switch points across series of stimuli with different starting images.
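The subtraction that defines the absolute threshold can be expressed as a small helper. The function name is illustrative and not taken from the study.

```python
def absolute_threshold(switch_point: int, series_start: int) -> int:
    """Number of steps between the original image and the first "different" response."""
    if switch_point < series_start:
        raise ValueError("switch point cannot precede the start of the series")
    return switch_point - series_start


# Worked example from the text: series 21-40, first "different" at Stimulus 36.
print(absolute_threshold(36, 21))  # 36 - 21 = 15
```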
Recall that each series of stimuli was presented five times. For purposes of statistical analyses, the median of these five iterations was used for each individual. In Figure 2, the mean of individual medians was used to summarize group trends in absolute threshold. A 2 (reading type; between subject) x 2 (symptom type; within subjects) x 5 (level of symptom severity; within subjects) mixed-model ANOVA was conducted to evaluate the results. Three possible main effects, three possible two-way interactions, and one possible three-way interaction were examined through this analysis. Table 1 summarizes the statistical outcomes, which are described here primarily in qualitative terms.
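The two-stage summary (median within participant, then mean across participants) can be sketched as follows. The participant labels and threshold values are made up for illustration and are not data from the study.

```python
from statistics import mean, median

# Hypothetical absolute thresholds: five determinations per participant
# for one stimulus series (invented numbers, not study data).
determinations = {
    "P01": [5, 7, 6, 6, 8],
    "P02": [4, 5, 5, 9, 5],
    "P03": [7, 7, 6, 8, 10],
}

# Stage 1: median of the five determinations for each individual.
individual_medians = {pid: median(vals) for pid, vals in determinations.items()}

# Stage 2: mean of the individual medians, as plotted for group trends.
group_mean = mean(list(individual_medians.values()))

print(individual_medians)
print(group_mean)
```

Taking the median first limits the influence of a single unusually late or early switch point (such as the 9 for P02) on that participant's score.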
Table 1
Experiment 2: Analysis of Variance for Detection Thresholds

Source                               df    F        R2     p
Between subjects
  Reading (R)                        1     1.40     .036   .243
  Symptom (S)                        1     3.17     .077   .083
  R x S                              1     7.1      .157   .011
  Subjects within-group error        38    (106.6)
Within subject
  Severity (V)                       4     1.64     .158   .186
  V x R                              4     2.06     .191   .107
  V x S                              4     0.37     .040   .830
  R x S x V                          4     1.50     .143   .234
  Subjects x V within-group error    35    (1.64)
Across all factors, the range in mean absolute thresholds was 5.45 to 7.05 on a 20-point scale, and there was no significant main effect of symptom severity or reading type on detection thresholds. There also was no significant main effect of symptom type (asymmetry vs. border irregularity) on absolute detection threshold, although this effect approached significance (speaking descriptively, participants tended to report slightly higher difference thresholds for lesions demonstrating border irregularity than for lesions demonstrating asymmetry).
[FIGURE 2 OMITTED]
The outcomes just described were qualified by a significant interaction in which the effect of reading type on absolute detection thresholds depended upon which symptom participants viewed. An analysis of the simple main effects (outcomes as a function of reading type, with asymmetry and border irregularity stimuli considered separately) indicated that when participants read about melanoma they reported significantly lower absolute detection thresholds when viewing lesions demonstrating asymmetry than when they viewed lesions demonstrating border irregularity. There was no significant difference between absolute detection thresholds for symptom type when participants read about cancer in general. This indicates that participants who read about melanoma reported lower detection thresholds when viewing lesions demonstrating asymmetry.
The left panel of Figure 2 shows the mean detection thresholds for lesions demonstrating asymmetry. It shows that participants who read about melanoma consistently detected change earlier than participants who read general information about cancer; no such effect is evident in the right panel of Figure 2, in which detection thresholds tended to be similar regardless of reading type. There was no significant interaction between reading type and severity of lesions. This means that the effect of reading type on detection thresholds did not depend on what level of severity was viewed. Across all levels of severity, reading type had a similar effect on detection thresholds. Also, there was no significant interaction between symptom type and severity of lesions. This indicates that the effect of symptom type on detection thresholds did not depend on the level of severity, and vice versa. There was no significant three-way interaction involving reading type, symptom, and severity of lesions.
Contrary to the assumption that some symptoms are harder to detect than others, there was no statistically significant main effect of symptom type (asymmetry vs. border irregularity) on absolute difference thresholds. Visual inspection of Figure 2 suggests that participants found it slightly easier to detect change in lesions demonstrating asymmetry than in lesions demonstrating border irregularity. Because the between-symptom main effect approached statistical significance (p = .08), it is possible that this effect would attain significance in a larger sample of participants. Thus, the hypothesis that accuracy in detection depends on the type of symptom may bear revisiting in future studies, particularly because such an effect might be magnified under more challenging everyday circumstances (such as when lesion evolution takes place over a lengthy interval).
Participants who read about melanoma were expected to notice smaller stimulus changes than participants who read general information about cancer. Although there was no main effect for reading type, the hypothesis was partially supported via a significant two-way interaction effect between reading type and symptom type, in which participants who read about melanoma and viewed the asymmetry symptom reported significantly lower detection thresholds than participants who read about cancer. Contrary to predictions, however, no parallel effect emerged for the border-irregularity symptom.
Based on Weber's law, it was predicted that as the stimuli in the present study became more symptomatic of melanoma, absolute difference thresholds would increase. The predicted main effect of symptom severity on absolute detection threshold was not statistically significant. Visual inspection of Figure 2 also suggests that absolute difference thresholds were consistent across severity; that is, across levels of symptom severity participants required approximately the same number of stimuli to detect a change. Apparent exceptions to Weber's law have been reported previously, and their theoretical implications remain a matter of debate (e.g., Bizo, Chu, Sanabria, & Killeen, 2005; Bush & Austin, 1924; Chukova, 1995; Haldane, 1933). In the present case, the most important potential implication is practical. Recall in the case of melanoma that as the abnormality of a lesion increases, the probability that it is cancerous and malignant increases as well. The effects predicted by Weber's law would indicate that as lesions become more symptomatic of melanoma--the very circumstances in which detection is clinically most important--noticing change (presumably by comparing the current lesion with memories of its recent characteristics, as in the present procedure) becomes more difficult. The fact that absolute difference thresholds were stable across levels of symptom severity suggests that the detection challenge may not be magnified in this way.
Experiment 2 suggested that detection of change in melanoma-related images can be orderly and can occur after relatively little change in lesions of interest, but that study probably overestimated everyday detection ability, for four reasons. First, the entire procedure of Experiment 2 took place in an interpersonal context (in providing responses directly to an experimenter, participants may have been more attentive than if viewing lesion images alone). Second, our measurement of threshold focused on a participant's first indication of noticing change (within a trial block, trials ceased after a report of change), but there is no guarantee that an individual who suspects change at one point in time will continue to believe that something is amiss, even upon viewing progressively more symptomatic lesions. It is useful, therefore, to examine detection behavior over a larger progression of changing images. Third, a progression of lesion images was presented over a very brief time span. It is possible that detection behavior would be different when images are separated by considerable time and lesion-irrelevant events that may fill it. Fourth, participants judged lesion change in 25 separate blocks of images. This repeated experience of looking for and finding change (coupled with a unidirectional progression of images, as noted previously) established a bias for reporting change quickly within each block. It would be useful, therefore, to evaluate detection behavior in individuals who lack this history.
Experiment 3 addressed these limitations of Experiment 2. Participants in two groups viewed a single progression of images, allowing the tracking of detection behavior after a first report of lesion change. Because they viewed only one progression, detection behavior could not be biased by a within-experiment history of reporting change in other image progressions. To create a degree of similarity to the everyday process of self-examination, which normally takes place in uncontrolled environments, participants completed the procedures not in a laboratory with an experimenter, but rather on their own from any networked computer of their choosing. In this regard the level of potential distraction was expected to be greater--and the level of attention to the research was expected to be less--than in Experiments 1 and 2. In other words, this feature of the experiment was designed to partially model symptom detection under conditions that may prevail in the natural environment. Finally, whereas one group completed the image series without interruption during a single, brief sitting (similar to Experiment 2), the other group viewed only one image per day over a 28-day period, thereby mimicking to a small extent the gradual evolution of symptoms that is expected in melanoma. We expected the Brief Group to detect smaller changes in symptoms than the 28-Day Group because previous research suggests that a major challenge in skin self-examination is that individuals may not recall what their skin looked like in the past (Hanrahan, Hersey, Menzies, Watson, & D'Este, 1997; for analogous psychophysical findings, see Kemp, 1988).
Participants. College students (N = 20; 16 women and 4 men), ages 18 to 23 years, were recruited as described previously, with the following exceptions. First, because the two experimental groups were to follow different activity schedules, an institutional review board required that participants be recruited separately for each schedule; thus participants were not randomly assigned to groups. Second, because of the extended nature of the procedure that they completed, participants in the 28-Day Group were eligible for a $50 prize drawing if they did not miss two consecutive daily sessions or more than four total visits.
The participants all self-identified as Caucasian except for two who self-identified as Asian and two who self-identified as Latino/Hispanic. They had a variety of academic majors, with majors in education and psychology most common among them. No participant reported a family history of skin cancer; six reported a family history of other cancers.
Setting and stimuli. Stimuli were drawn from the progression of lateral asymmetry images described previously. Used in this study were Image 1, even-numbered images from 2 to 50, and Image 100. Participants worked in the online Blackboard® environment from a networked computer in a setting and at a time of their choosing. To access the experimental tasks, participants had to log on using the same secure password used to access a variety of confidential university functions such as electronic mail, course scheduling, and financial records.
The majority of participants (N = 12) reported viewing the stimuli on 15-in. monitors, with the rest reporting the use of smaller (12- or 13-in.; N = 6) or larger (≥ 17-in.; N = 2) screens. There were no obvious differences between groups in this regard, and upon casual inspection we found no obvious relation between reported screen size and detection behavior.
Procedure. Members of both groups first read this information on the screen:
This study of visual perception focuses on the symptoms of a disease called melanoma. Melanoma is the most serious type of skin cancer, and it affects more than 50,000 people a year in the United States. Melanoma begins in skin cells called melanocytes. These cells make the skin pigment melanin, which in light-skinned people tends to be concentrated in moles and freckles. The first sign of melanoma is often a change in the appearance of an existing mole. Noticing the first signs of melanoma is very important. If melanoma has spread to other parts of the body, it can be difficult to treat and often is deadly. If, however, melanoma is diagnosed early, chances are high for a full recovery after treatment. For these reasons doctors want their patients to know the appearance of their healthy skin and watch for changes that might indicate symptoms of melanoma. Finding these symptoms as soon as possible in their development can be critical to surviving melanoma.
Second, members of both groups completed a brief training exercise to acquaint them with the benchmark images (asymptomatic Image 1 and clearly symptomatic Image 100), beginning with these on-screen instructions:
Today we will show you an example of a "starting" mole that has no melanoma symptoms ("Image H"). We will also show you an example of a mole that exhibits melanoma symptoms ("Image M"). We ask you to imagine that the symptom-free Image H is your own skin, and that symptomatic Image M is what your skin could become if there were a problem. Your job today is to learn to tell Image H and Image M apart. We think that this will be fairly easy to learn.
Next, participants viewed a screen showing the benchmark images accompanied by the instruction, "Study them until you are familiar with their features." When ready, participants clicked a button to remove this screen. This produced two practice trials, one presenting Image 1 (H) and the other presenting Image 100 (M), along with this prompt: "This is IMAGE H [M]. Study it until you are familiar with its features. Then click the correct answer below to show you understand." Below the image and its label were the response options "Image H" and "Image M." Clicking either option advanced the participant to the next trial. All participants answered both questions correctly. The final preliminary activity was an 18-trial discrimination test including nine trials with each of the two benchmark stimuli and the prompt, "The image below is ..." Clicking on either "Image H" or "Image M" advanced the participant to the next trial. No feedback was provided during this test. All participants registered 18 correct responses on this test.
The main assessment in this experiment was a series of 28 trials employing Image 1 on the first two trials, even-numbered images from 2 to 50 on the next 25 trials, and Image 100 on the last trial. The format was similar to that just described with two exceptions. First, each image was accompanied by the question, "Is this the same as Image H?" Second, responses were registered on a 6-point scale (numbers below were used in scoring but were not presented to participants).
(6) I am VERY confident this is the SAME as Picture H.
(5) I am MODERATELY confident this is the SAME as Picture H.
(4) I am WEAKLY confident this is the SAME as Picture H.
(3) I am WEAKLY confident this is DIFFERENT from Picture H.
(2) I am MODERATELY confident this is DIFFERENT from Picture H.
(1) I am VERY confident this is DIFFERENT from Picture H.
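The structure of the main assessment can be summarized in a short sketch. It rebuilds the 28-trial image sequence described above and encodes the 6-point scale so that ratings of 3 or lower count as "different" reports. The function and variable names are hypothetical and are not taken from the software used in the study.

```python
# Sketch of the Experiment 3 assessment layout; all names are illustrative.

def build_trial_images():
    """Return the image number shown on each of the 28 assessment trials.

    Trials 1-2 show Image 1, Trials 3-27 show even-numbered Images 2-50,
    and Trial 28 shows Image 100.
    """
    return [1, 1] + list(range(2, 51, 2)) + [100]


# Scoring values for the six response options (not shown to participants).
RATING_LABELS = {
    6: "VERY confident SAME",
    5: "MODERATELY confident SAME",
    4: "WEAKLY confident SAME",
    3: "WEAKLY confident DIFFERENT",
    2: "MODERATELY confident DIFFERENT",
    1: "VERY confident DIFFERENT",
}


def reports_different(rating):
    """True when a rating falls on the 'different' half of the scale."""
    if rating not in RATING_LABELS:
        raise ValueError(f"rating must be 1-6, got {rating!r}")
    return rating <= 3


trial_images = build_trial_images()
```

For example, `trial_images[5]` gives the image shown on Trial 6, and `reports_different(4)` is false because a rating of 4 still indicates "same."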
Brief Group. The procedure for this group was modeled after that of Experiment 2, in that participants (N = 10) completed all trials in one brief session. Completing each trial led to the next one after a delay of about 0.5 to 1 s (see Footnote 2). Participants typically required between 2 and 7 min to complete the procedure, after which they completed a brief demographic questionnaire.
28-Day Group. Participants in this group (N = 10) completed the training and the first test trial during their first session and then one test trial per day for 27 days thereafter. They could complete each day's visit any time during the scheduled 24-hour period, and actual times of day varied widely across visits both within and between participants. The demographic questionnaire was completed during the final session after the last test trial. Participants in this group all completed their first visit and completed 254 (94%) of 270 scheduled visits thereafter. For the purpose of data analysis, for missed visits a participant's rating was considered to be unchanged from the previous day.
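The handling of missed visits described above amounts to carrying the previous day's rating forward. The following minimal sketch shows one way to do this, assuming that missed visits are coded as `None` and that the first visit is always present (as was the case for all participants). The function name and the example ratings are hypothetical.

```python
# Sketch of carry-forward imputation for missed daily visits.
# The example ratings are invented for illustration, not study data.

def carry_forward(ratings):
    """Replace each missed visit (None) with the most recent observed rating."""
    filled = []
    last = None
    for rating in ratings:
        if rating is None:
            if last is None:
                raise ValueError("the first visit must have a rating")
            rating = last
        filled.append(rating)
        last = rating
    return filled


example = [6, None, 5, None, None, 3]
filled_example = carry_forward(example)

# Completion rate for scheduled visits after the first, as reported above.
completion_percent = round(100 * 254 / 270)
```

Consecutive missed visits all receive the last rating that was actually observed, so a run of absences does not introduce any change in the record.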
Results and Discussion
As expected, the procedure in Experiment 3 registered more variability in detection behavior, both within and between individual participants, than did the highly structured procedure of Experiment 2. Figure 3 (left column) provides relevant examples of individual participants. Some individuals showed gradual changes from reporting "same" to reporting "different" as symptom severity increased (e.g., Panel A, from Brief Group), while in others the change was drastic (e.g., Panel B, from Brief Group). In many cases ratings vacillated considerably across trials (e.g., Panel C, from 28-Day Group). That is, ratings sometimes increased when symptom severity increased (this occurred a median of three times per participant across the 26 trials on which symptom severity increased). In one case from each group (e.g., Panel D, from 28-Day Group), ratings barely changed (i.e., remained at 5 or 6) until the most symptomatic images were presented (e.g., Image 100, which participants were explicitly taught to recognize during preliminary training).
[FIGURE 3 OMITTED]
Figure 4 shows the mean ratings of the two groups. Overall, ratings tended to decrease as symptom severity increased. As suggested in the figure, a 2 (group) x 28 (symptom severity) mixed-model ANOVA (Table 2) revealed significant main effects for symptom severity (i.e., ratings decreased significantly as severity increased; p < .0001) and for group (i.e., ratings tended to be lower for the 28-Day Group; p = .04). There was no significant symptom severity x group interaction. A series of t tests for unpaired scores of heterogeneous variance was conducted to compare group ratings for Trials 3 (Image 2) through 27 (Image 50); Trials 1 and 2 (Image 1) and 28 (Image 100) were excluded because these involved the training images and therefore were not expected to produce group differences (indeed, all participants provided ratings of 6 for Image 1 and of 1 for Image 100). To adjust for multiple comparisons, alpha was set at .05/25 = .002. In Figure 4, trials for which significant group differences occurred are designated with asterisks. Ratings of the groups differed as early as Trial 6 (Image 8) and on 11 of the subsequent 19 trials.
Table 2
Experiment 3: Analysis of Variance for Symptom-Change Reporting

Source                          df    F         R²     p
Between subject
  Group (G)                     1     4.90      .066   .040
Within subject
  Symptom severity (S)          27    30.01     .420   <.0001
  G x S                         1     1.17      .021   .260
Subjects within-group errors    18    (25.87)
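The trial-by-trial group comparisons can be sketched as follows. The sketch assumes that "t tests for unpaired scores of heterogeneous variance" refers to Welch's unequal-variance t test, and it computes the t statistic, the Welch-Satterthwaite degrees of freedom, and the Bonferroni-adjusted alpha. The function names and the two sets of ratings are hypothetical and are not the study's data.

```python
# Sketch of a Welch t test with a Bonferroni-adjusted alpha.
# Assumption: the article's unequal-variance t test is Welch's test.
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    va = variance(a) / len(a)  # squared standard error of mean, sample a
    vb = variance(b) / len(b)  # squared standard error of mean, sample b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha that holds the family-wise rate at family_alpha."""
    return family_alpha / n_comparisons


# Invented ratings for a single trial (not study data).
brief_ratings = [6, 6, 5, 6, 5]
daily_ratings = [4, 3, 4, 5, 3]

t_stat, df = welch_t(brief_ratings, daily_ratings)
alpha = bonferroni_alpha(0.05, 25)  # 25 trials compared (Trials 3 through 27)
```

A p value would then be obtained from a t distribution with `df` degrees of freedom and compared with `alpha`. In practice a statistics library can do both steps at once, for example `scipy.stats.ttest_ind(a, b, equal_var=False)`.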
Figure 3 (right column) shows how the groups compared in reaching selected empirical milestones. For the Brief Group and the 28-Day Group, the median images for which individual participants provided a first rating lower than 6 ("VERY sure this is the SAME as Image H") were 7 and 4, respectively (top right panel). The median images for which individual participants first provided a rating of 3 ("WEAKLY sure this is DIFFERENT from Image H") or lower were 26 and 12, respectively (right middle panel). Similarly, the first image for which the group median rating was as low as 3 was 40 and 24, respectively (not shown). Finally, the median images that occasioned individual participants' final switch from a higher rating to a rating of 3 or lower were 36 and 12, respectively. This switch may be functionally equivalent to the switch between "same" and "different" judgments that occurred in Experiment 2.
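The individual milestones described above can be extracted from a participant's rating series with a few lines of code. The sketch below defines the first image rated at or below a cutoff and the image at the final switch from a rating above 3 to a rating of 3 or lower. The function names and the example rating series are hypothetical and do not represent any participant.

```python
# Sketch of milestone extraction from one participant's 28 ratings.
# The rating series below is invented for illustration.

def first_at_or_below(ratings, images, cutoff):
    """Image on the first trial whose rating is <= cutoff, or None."""
    for image, rating in zip(images, ratings):
        if rating <= cutoff:
            return image
    return None


def final_switch(ratings, images, cutoff=3):
    """Image on the last trial where ratings moved from > cutoff to <= cutoff."""
    switch_image = None
    for i in range(1, len(ratings)):
        if ratings[i - 1] > cutoff and ratings[i] <= cutoff:
            switch_image = images[i]
    return switch_image


images = [1, 1] + list(range(2, 51, 2)) + [100]
ratings = [6, 6, 6, 6, 5, 5, 4, 4, 3, 4, 4, 3, 3, 2,
           2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

first_below_6 = first_at_or_below(ratings, images, 5)   # first rating lower than 6
first_different = first_at_or_below(ratings, images, 3)  # first rating of 3 or lower
last_switch = final_switch(ratings, images)
```

Group medians like those reported above would follow from applying `statistics.median` to these per-participant values. A participant whose ratings vacillate, as in this example, can have a first "different" report that precedes the final switch.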
Overall, the preceding results show two patterns. First, whereas Experiment 2 suggested that the evolution of symptoms was detectable within about seven steps in the 100-image progression, the present results, based on different procedures, suggest somewhat less sensitive detection abilities. This outcome was anticipated, in part due to the fact that in Experiment 3 images were viewed under less well-controlled circumstances than in Experiment 2. Second, participants in the 28-Day Group reported change after consistently less symptom evolution than participants in the Brief Group. This finding ran counter to expectations, as it was thought that delays between images experienced by this group would make symptom evolution more difficult to notice. It is possible, of course, that participant awareness of these delays induced a bias to report symptom change, although such a bias, if it existed, was not evident for the first two trials (see Figure 4), during which symptom severity was not different from the benchmark, asymptomatic image encountered during training. One explanation could be that biasing became magnified across trials; this possibility can be evaluated in future studies via a control group that is identical to the present 28-Day Group except that the images do not change across trials. An additional possibility is that brief procedures like those of Experiment 2 and the Brief Group of Experiment 3 create interference across trials that impairs detection performance. If so, lengthening the intertrial interval and/or adding distracter activities between trials should alleviate the effect.
A general point of possible concern about our research is its focus on young-adult volunteers. Because melanoma may appear many years after excessive sun exposure, middle-aged individuals are the population of greatest interest regarding the dynamics of symptom self-detection. Melanoma is, however, diagnosed in about 1 in 10,000 individuals in their 20s (Lachiewicz, Berwick, Wiggins, & Thomas, 2008). Based on 2010 census data, this implies that approximately 4,300 current United States citizens will be diagnosed with melanoma in their 20s. Moreover, incidence rates in young people are on the rise (Bradford, Anderson, Purdue, Goldstein, & Tucker, 2010). Thus, melanoma detection is not irrelevant to this age group, particularly if early detection is of interest. Perhaps just as important, young adulthood is when many people begin training for medical professions. In this regard, the abilities of young adults to detect melanoma-symptom evolution are important to the health of at-risk individuals of all ages. In one important respect, the participants in the current study were like many individuals of any age, in that they had not personally experienced melanoma. The pertinent question is whether substantially different dynamics govern detection behavior in people of different ages. Given the rarity of age-specific effects in operant stimulus control and psychophysics, we assume that cohorts of adults of various ages are more similar than different where detection behavior is concerned, while acknowledging that this is a valid focus for future research.
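The case estimate in the preceding paragraph follows from multiplying the diagnosis rate by the number of people in their 20s. The text does not state the population figure that was used, so the sketch below backs out the population implied by the two reported numbers. That implied value is an inference from the text and not a figure taken directly from census tables. The names are hypothetical.

```python
# Sketch of the arithmetic behind the estimated number of cases.
# Only the rate (1 in 10,000) and the estimate (4,300) come from the text.

RATE_DENOMINATOR = 10_000   # "about 1 in 10,000 individuals in their 20s"
ESTIMATED_CASES = 4_300     # "approximately 4,300"

# Population in their 20s implied by the two reported figures (an inference).
implied_population = ESTIMATED_CASES * RATE_DENOMINATOR


def expected_cases(population, rate_denominator):
    """Expected diagnoses when 1 in rate_denominator people is diagnosed."""
    return population / rate_denominator


check = expected_cases(implied_population, RATE_DENOMINATOR)
```

The implied population is roughly 43 million, and applying the rate to that figure recovers the reported estimate.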
In the present research, Experiments 1 and 2 showed that digital images depicting varying degrees of melanoma symptom severity often could be distinguished from one another under well-controlled laboratory conditions. Although the degree of symptom severity depicted in these images was not quantified outside of the steps of the morphing process that created them, results of the first two experiments supported the ordinal properties of the images and do not contradict the suggestion, based on the linear algorithm of the morphing software, that the images represent a series in which symptom severity varies in approximately equal units. To be clear, our data do not verify ratio properties of these stimuli; technical analyses analogous to those used in the development of simulated breasts in the MammaCare® research program (Madden et al., 1978) are required and should be the focus of future studies. In the meantime, however, the morphed images appear to be adequate for use in preliminary analyses of visual detection of melanoma symptom evolution.
This is important because most previous research has focused on how people react to the static properties of potentially cancerous skin lesions. Experiment 2 examined detection of lesion change, albeit under highly artificial conditions. In Experiment 3 the circumstances were somewhat more ecologically valid. For one group, this process spanned a period of 4 weeks, which, though brief by the standards of the disease of interest (melanoma symptoms may evolve over a period of years; Kaufman, 2005), is lengthy by comparison to many experiments on melanoma detection. The fact that detection, contrary to expectations, was better in the extended time frame than in a brief, one-sitting procedure suggests that much remains to be learned about this kind of behavior. The fact that detection behavior varied systematically as a function of image-viewing conditions, however, suggests that procedures like those used here can provide a useful assay in which to study the effects of other variables, such as various types of discrimination training, on detection behavior.
(1) The melanoma reading was obtained November 15, 2008, from http://www.skincarephysicians.com/skincancernet/whatis.html. The general cancer reading was obtained from http://www.cancer.org/Cancer/CancerBasics/what-is-cancer. Neither reading is reproduced here due to copyright restrictions.
(2) Thus, unlike in Experiment 2, there was no mask between trials, owing to limitations of the Blackboard environment. In pilot work, an additional group responded to trials that were separated by filler items. Each filler item presented a different picture of an animal, accompanied by the question, "If you were very hungry and had no other source of food, would you be willing to eat this type of animal?" This was presented to participants as a test of personality. Each filler item offered the response options "Yes" and "No"; answers were not analyzed. Filler trials averaged 3 s or more in duration, and participants typically required between 5 and 15 min to complete the overall procedure. Because the filler items required attention to an image in the same screen location as the lesion image of the previous trial, they were expected to serve the same function as the gray circle mask of Experiment 2. Results for this pilot group were similar to those of the Brief Group and thus are not reported here.
ABBASI, N. R., SHAW, H. M., RIGEL, D. S., FRIEDMAN, R. J., MCCARTHY, W. H., OSMAN, I., KOPF, A. W., & POLSKY, D. (2004). Early diagnosis of cutaneous melanoma. Journal of the American Medical Association, 292, 2771-2776.
ADAMS, C. K., HALL, D. C., PENNYPACKER, H. S., GOLDSTEIN, M. K., HENCH, L. L., MADDEN, M. C., & CATANIA, A. C. (1976). Lump detection in simulated human breasts. Perception & Psychophysics, 20, 163-167.
BIZO, L. A., CHU, J. Y. M., SANABRIA, F., & KILLEEN, P. R. (2005). The failure of Weber's law in time perception and production. Behavioral Processes, 71, 201-210.
BLOOM, H. S., CRISWELL, E. L., PENNYPACKER, H. S., CATANIA, A. C., & ADAMS, C. K. (1982). Major stimulus dimensions determining detection of simulated breast lesions. Perception & Psychophysics, 32, 251-260.
BONO, A., TOMATIS, S., BARTOLI, C., TRAGNI, G., RADAELLI, G., MAURICHI, A., & MARCHESINI, R. (1999). The ABCD system of melanoma detection: A spectrophotometric analysis of the asymmetry, border, color, and dimension. Cancer, 85, 72-77.
BRADFORD, P. T., ANDERSON, W. F., PURDUE, M. P., GOLDSTEIN, A. M., & TUCKER, M. A. (2010). Rising melanoma incidence rates of the trunk among younger women in the United States. Cancer Epidemiology, Biomarkers, & Prevention, 19, 2401.
BRANSTROM, R., HEDBLAD, M., KRAKAU, I., & ULLEN, H. (2002). Laypersons' perceptual discrimination of pigmented skin lesions. Journal of the American Academy of Dermatology, 46, 667-673.
BUSH, A. D., & AUSTIN, A. M. (1924). Weber's law as tested by flowing increments. American Journal of Psychology, 35, 230-234.
CASSILETH, B. R., CLARK, W. H., LUSK, E. J., FREDERICK, B. E., THOMPSON, C. J., & WALSH, W. P. (1986). How well do physicians recognize melanoma and other problem lesions? Journal of the American Academy of Dermatology, 14, 555-560.
CHUKOVA, S. V. (1995). Weber's law fails in discrimination of two-dimensional contour images. Sensory Systems, 9, 125-130.
FLETCHER, S. W., O'MALLEY, M. S., EARP, J. L., MORGAN, T. M., SHAO, L., & DEGNAN, D. (1990). How best to teach women breast self-examination. Annals of Internal Medicine, 112, 772-779.
GHIRLANDA, S., & ENQUIST, M. (2003). A century of generalization. Animal Behaviour, 66, 15-36.
GIRGIS, A., CLARKE, P., BURTON, R. C., & SANSON-FISHER, R. W. (1996). Screening for melanoma by primary health care physicians: A cost effectiveness analysis. Journal of Medical Screening, 3, 47-53.
GRAHAM, S. P., PRAPAVESSIS, H., & CAMERON, L. D. (2006). Colon cancer information as a source of exercise motivation. Psychology and Health, 21, 739-755.
GUTTMAN, N., & KALISH, H. I. (1956). Discriminability and stimulus generalization. Journal of Experimental Psychology, 51, 79-88.
HALDANE, J. S. (1933). The physiological significance of Weber's law and color contrast in vision. Journal of Physiology, 79, 121-139.
HANRAHAN, P. F., HERSEY, P., MENZIES, S. W., WATSON, A. B., & D'ESTE, C. A. (1997). Examination of the ability of people to identify early changes of melanoma in computer-altered pigmented skin lesions. Archives of Dermatology, 133, 301-311.
KAUFMAN, H. L. (2005). The melanoma book. New York: Penguin.
KEMP, S. (1988). Memorial psychophysics for visual area: The effect of retention interval. Memory & Cognition, 16, 431-436.
KLING, J. W., & RIGGS, L. A. (1971). Experimental psychology. New York: Holt, Rinehart, and Winston.
LACHIEWICZ, A. M., BERWICK, M., WIGGINS, C. L., & THOMAS, N. E. (2008). Epidemiologic support for melanoma heterogeneity using the surveillance, epidemiology, and end results program. Journal of Investigative Dermatology, 128, 243-245.
LIU, W., HILL, D., GIBBS, A. F., TEMPANY, M., HOWE, C., BORLAND, R., MORAND, M., & KELLY, J. W. (2005). What features do patients notice that help to distinguish between benign pigmented lesions and melanomas? The ABCD(E) rule versus the seven-point checklist. Melanoma Research, 15, 549-554.
MADDEN, M. C., HENCH, L. L., HALL, D. C., PENNYPACKER, H. S., ADAMS, C. K., GOLDSTEIN, M-K., & STEIN, G. H. (1978). Development of a model human breast with tumors for use in teaching breast examination. Journal of Bioengineering, 2, 427-435.
MILES, F., & MEEHAN, J. W. (1995). Visual discrimination of pigmented skin lesions. Health Psychology, 14, 171-177.
MOSTOFSKY, D. I. (1965). Stimulus generalization. Palo Alto, CA: Stanford University Press.
PENNYPACKER, H. S., BLOOM, H. S., CRISWELL, E. L., NEELAKANTAN, P., GOLDSTEIN, M. K., & STEIN, G. H. (1982). Toward an effective technology of instruction in breast self-examination. International Journal of Mental Health, 11, 98-116.
POOLE, C. M., & DUPONT, G. (2005). Melanoma prevention, detection, and treatment (2nd ed.). New Haven, CT: Yale University Press.
ROBINSON, J. K., & MCGAGHIE, W. C. (1996). Skin cancer in a clinical practice examination with standardized patients. Journal of the American Academy of Dermatology, 34, 709-711.
THOMAS, L., TRANCHARD, P., BERARD, F., SECCHI, T., COLIN, C., & MOULIN, G. (1998). Semiological value of ABCDE criteria in the diagnosis of cutaneous pigmented tumors. Dermatology, 197, 11-17.
Experiment 2 is based on a Master's thesis completed at Illinois State University by the first author. We thank Derek Reed for useful discussions that influenced the design of Experiment 3.
Correspondence concerning this article should be addressed to Thomas S. Critchfield, Department of Psychology, Campus Box 4620, Illinois State University, Normal, IL 61790. E-mail: email@example.com
Elizabeth A. Dalianis, Thomas S. Critchfield, Niki L. Howard, and J. Scott Jordan Illinois State University
Adam Derenne University of North Dakota
Author: Dalianis, Elizabeth A.; Critchfield, Thomas S.; Howard, Niki L.; Jordan, J. Scott; Derenne, Adam
Publication: The Psychological Record
Date: Jun 22, 2011