Capuchin (Sapajus [Cebus] apella) Change Detection

Change blindness is a phenomenon in which individuals fail to detect seemingly obvious changes in their visual fields. Like humans, several animal species have also been shown to exhibit change blindness; however, no species of New World monkey has been tested to date. Nine capuchins (Sapajus [Cebus] apella) were trained to select whether or not a stimulus changed on a computerized task. In 4 phases of testing, consisting of full image changes, subtle occlusion changes, and 2 levels of feature location changes, the search display and mask durations were systematically varied to determine whether capuchins experienced change blindness and in what contexts. Only the full-image change test yielded significant results, with subjects detecting changes most accurately with longer search displays and, perplexingly, least accurately when there was no mask. No interactions between search display and mask durations were found in any test phase, suggesting that the relationship between the 2 parameters may not be important to how capuchins perceive changes. Although it is possible that capuchins do not experience change blindness, we suspect that a mix of experimental design, the difficulty of the task, and the inability to verify how closely the subjects attended to each trial contributed to the lack of significant results.

changes in motion pictures and video clips (Angelone, Levin, & Simons, 2003). In one striking study, Simons and Chabris (1999) showed participants a video of several individuals in white or black shirts passing two basketballs. The participants were instructed to mentally count how many times individuals in one of the two colors passed the ball as everyone walked in circles while passing the ball around. Despite the circuitous and intentionally confusing patterns walked by the basketball passers, participants were generally able to keep track of the total number of passes. Remarkably, however, many participants failed to notice a man in a gorilla suit walking through the basketball passers. This form of change blindness, known as inattentional blindness, provides support for the hypothesis that attention is required for changes to be detected (Neisser, 1979). Yet, O'Regan, Deubel, Clark, and Rensink (2000) found that even when fixated on the location of the change, participants still failed to detect the change over 40% of the time. Thus, even seemingly obvious details that might be crucial to our lives, such as the moments preceding a car accident or what a thief looked like or was wearing, can be easily missed, resulting in potentially damaging consequences (for review, see Hyman, 2016). Attention is therefore an important component of change blindness, but clearly attention alone does not explain the phenomenon.
Perhaps most surprising, and unsettling, is the degree to which people are blind to changes occurring in the real world. This was artfully demonstrated by Simons and Levin (1998), who had an experimenter holding a map stop and ask individuals on a college campus for directions. Following a minute or so of discussion, two confederates dressed as construction workers and carrying a door walked between the experimenter and participant. The passing construction workers and door served as a mask, enabling a second experimenter to surreptitiously change places with the first experimenter. Although the two experimenters wore different clothing and differed physically in many ways, 8 out of 15 participants failed to report noticing the change, even though they were now engaged in conversation with a completely different individual. Participants who noticed the change tended to be roughly the same age as the experimenters, implying a potential bias for detecting changes to in-groups. In a second experiment, the experimenters again dressed as construction workers; however, all participants were either graduate or undergraduate students, creating the appearance of an in-group/out-group divide between the experimenters and participants. Here, only one third of the participants noticed the change, providing further support for an in-group change-detection bias. While this was a harmless study occurring on a college campus, it nonetheless suggests that humans may fail to detect changes in more serious situations, such as while driving, resulting in far more serious consequences.
That this perceptual failure readily occurs in the real world, where detecting changes can have life-or-death implications, strongly supports the need for further research and understanding of the phenomenon. In particular, it is important to determine whether change blindness is a result of something to do with human culture (most of these studies have been run in Westernized societies) or is the result of a more basic biological phenomenon, in which case one might expect it to be shared with other species. Indeed, there are differences across cultures; Masuda and Nisbett (2006) found that, compared to American college students, college students from East Asia were more sensitive to contextual changes than focal object changes. Nonetheless, the basic expression of change blindness appears to be highly consistent. Aside from suggesting that change blindness should be present in other species, disambiguating between culture and biology is key to determining how best to address change blindness in situations in which it can have grave consequences. Therefore, beyond studying change blindness in humans from diverse cultures, additional research is also needed to determine how widespread and consistent it is among nonhuman animals and whether it shares the same cognitive foundations.
It is also important to better understand the parameters that influence change blindness in order to predict when it will occur and determine strategies for mitigating it. One feature that has received much attention is the duration that the search display is visible prior to the mask in experimental tasks. This parameter has been varied extensively, typically depending on the specific questions being asked. For instance, the longer the search display, the more time subjects have to attend to and encode the stimuli, enabling a trace of the item to be stored in visual working memory and then recalled at the test display. Conversely, subjects must rely solely on attentional capture for the shortest search displays. Varying the duration of the search display can thus provide insights into how executive control and memory consolidation function with respect to how long subjects are able to attend to stimuli. Unsurprisingly, increasing the duration of the search display has led to improved retention of items in change detection tasks with human participants (Alvarez & Cavanagh, 2004; Eng, Chen, & Jiang, 2005).
However, these findings come from studies in which multiple items from within a category, such as colored squares, letters, or faces, are presented, as opposed to a single item that then may or may not change in conjunction with the mask. Longer search displays may therefore simply be associated with more time to encode additional items into working memory or even long-term memory. On the other hand, when presented with a relatively simple single stimulus, such as a line drawing, which may or may not change, subjects need not rely on a large visual working memory capacity but can attend solely to whether or not any change occurred. Thus, using a simple single stimulus and a change/no-change design reduces the role of visual working memory capacity, which has been found to vary depending on the type of stimuli used and human participants' familiarity with those stimuli (Alvarez & Cavanagh, 2004; Cowan, 2001; Eng et al., 2005; Luck & Vogel, 1997; Luria, Sessa, Gotler, Jolicoeur, & Dell'Acqua, 2009; Sørensen & Kyllingsbaek, 2012). This paradigm seems most appropriate for nonhuman primates (hereafter primates), whose visual working memory capacities appear to be both smaller and more variable than those of humans, albeit depending on the type of stimuli used (Elmore et al., 2011; Elmore & Wright, 2015; Leising et al., 2013).
Perhaps the most common method for inducing change blindness is referred to as the gap-contingent technique, in which the change occurs during a gap (often a blank screen, though sometimes a patterned mask) between the original and altered stimuli (e.g., Pashler, 1988; Phillips, 1974; Rensink et al., 1997; Simons, 1996). This technique has been found to induce relatively robust levels of change blindness, as the gap mimics a long eye blink or a cut from one scene to another in a film. Moreover, increasing the duration of the mask resulted in significantly more errors by human participants (Pashler, 1988), a pattern that appears to remain relatively consistent across multiple species, including rhesus macaques (Macaca mulatta) and pigeons (Columba livia; Elmore, Magnotti, Katz, & Wright, 2012; Eng et al., 2005; Leising et al., 2013). In humans, recognition and recall of images improve with extended viewing time, at least up to 5 s (Tversky & Sherman, 1975), implying that a mask following an extremely short search display may disrupt memory consolidation. However, Rensink, O'Regan, and Clark (2000) found no significant differences in trials in which human participants were presented with an 8-s uninterrupted preview of the original stimuli prior to a rapid flicker sequence compared to trials without the extended preview, suggesting that the appearance of the mask alone does not actually disrupt memory consolidation. Thus, varying the duration of the mask offers insights into the mechanisms directly causing change blindness.
Like humans, primates also seem more susceptible to change blindness as the complexity or number of items in the search display increases. Heyselaar, Johnston, and Paré (2011) presented two laboratory-housed female macaques with a color change detection task using arrays of two to five colored squares presented for 500 ms followed by a 1,000-ms mask and found that performance at all set sizes was significantly above chance, though, as predicted, performance declined gradually as set size increased. Similarly, Chau, Murphy, Rosenbaum, Ryan, and Hoffman (2011) used a flicker change detection task alternating between a 500-ms search display and a 50-ms mask to test object-in-scene memory in 12 university students and 4 female macaques, finding that both species had similar search time patterns even with these more complex stimuli, suggesting a common underlying memory process. In an effort to measure visual short-term memory capacity in 6 humans and 2 individually housed laboratory macaques, Elmore et al. (2011) adjusted the display size (i.e., 2, 4, 6, 8, or 10 stimuli) on a change detection task, using colored squares in one experiment and clip art in a second experiment. The macaques viewed 5-s search displays followed by 50-ms masks and performed slightly better in the clip-art condition than the colored-square condition, regardless of display size; this same effect was only noticeable among humans with the larger display sizes of 8 and 10. The humans, however, viewed their search display for 1 s followed by a 900-ms mask. Interestingly, these results were replicated with no significant differences when the two macaques were retested using the same stimuli and display sizes but with a 1,000-ms mask, as opposed to just 50 ms, in contrast to the decline in performance associated with longer masks in human research (Elmore & Wright, 2015; Pashler, 1988; Phillips, 1974).
Besides macaques, there is evidence for change blindness in chimpanzees (Pan troglodytes). Tomonaga and Imura (2015) presented three socially housed chimpanzees with a varied search display lasting either 90 ms or 320 ms followed by a mask lasting either 90 ms or 180 ms. Regardless of search display and mask duration, they found that the chimpanzees detected changes significantly less accurately when a mask was inserted between search and test displays. Recently, pigeons (Columba livia) have also been found to exhibit change blindness. Herbranson and Davis (2016) varied the duration of the search display between 15 ms and 125 ms, with a 30-ms mask used on half of all trials. They found that pigeons performed better with no mask and longer search displays. Similarly, Herbranson et al. (2014) found that pigeons performed worse on a change detection task when the mask was longer and the change was repeated fewer times.
Prior to the present study, the change blindness phenomenon had yet to be studied in any species of New World monkey, a lineage that split off from that of humans 32-36 million years ago (Glazko & Nei, 2003; Schrago & Russo, 2003). It is important to look at the phenomenon across the entire primate order, as well as in nonprimates, to determine whether there are differences in how change blindness manifests in different taxa and, if so, whether these differences correlate with each species' evolutionary history. Such understanding may provide insight into the evolutionary causes of change blindness, which would help determine the situations in which it is most likely to occur. The relatively similar patterns of change blindness seen across primate species when variables such as search display and mask duration are adjusted suggest similarity in the underlying mechanisms responsible for change detection across the primate order, but this must be verified in New World monkeys to be sure.

Capuchins (Sapajus [Cebus] apella) were chosen for the present study as they are perhaps the most appropriate New World monkey species to compare with humans and Old World primates given several similarities between these taxa, including capuchins' apparent convergent evolution with humans with respect to brain size and several social behaviors, including cooperation. Capuchins, who are frequently used in cognitive and behavioral research, live in complex social groups in which they are known to cooperate (Brosnan, 2011; de Waal & Davis, 2003; Hattori, Kuroshima, & Fujita, 2005; Perry, Manson, Dower, & Wikberg, 2003), share food (de Waal, 1997, 2000), and exhibit prosocial behavior under some circumstances (Lakshminarayanan & Santos, 2008). Part of living in a social group involves monitoring the location and activities of group mates, and this is particularly true in species like capuchins that have dominance hierarchies in which relationships vary from one individual to another. As such, detecting changes to the location or activity of a group mate is important for social decision-making. Additionally, capuchins are highly intelligent monkeys, capable of using tools (Fragaszy, Visalberghi, & Fedigan, 2004; Ottoni & Mannu, 2001; Visalberghi & Trinca, 1989; Westergaard & Fragaszy, 1987) and possessing aspects of metacognition (Beran & Smith, 2011; Fujita, 2009) and numerosity (Judge, Evans, & Vyas, 2005). Capuchins also boast an impressive brain-to-body-size ratio, which is equivalent to that of chimpanzees (Dunbar, 1992; Gibson, 1986). Accordingly, capuchins have been successfully trained to use computerized touch screens (e.g., McGonigle, Chalmers, & Dickinson, 2003) or joysticks enabling the monkeys to control a cursor on the screen (Evans, Beran, Chan, Klein, & Menzel, 2008) to complete an array of computerized cognitive tasks (e.g., Beran, 2008; Fragaszy, Johnson-Pynn, Hirsh, & Brakke, 2003; Leighty & Fragaszy, 2003).
Computerized testing provides an added benefit to the present study, as unlike earlier studies that relied on just two or three subjects, the design of our facility and our subjects' experience with computerized testing enabled us to test a larger number (nine), shedding light on the individual differences seen in change blindness.
Given the range of study designs and variation in the category and complexity of stimuli used in change detection tasks, a single task is unlikely to yield generalizable results. Accordingly, using identical methodology to present subjects with stimuli of differing complexities and utilizing multiple types of changes is essential to understanding the mechanisms underlying change blindness. Black and white line drawings or experimenter-made checkerboard designs were therefore used as stimuli in the test phases. Likewise, the type of change varied from one test to another, ranging from full stimulus changes (Test 1) to subtle occlusion changes (Test 2) to both difficult (Test 3) and relatively easier (Test 4) location changes of a single feature within the stimuli.
In the present study, subjects were presented with systematically varied durations of a search display (i.e., original stimulus) and mask (i.e., blank screen) to determine if capuchins experienced change blindness comparably to the rest of the primate order. Aside from providing information on capuchins' propensity for change blindness, these results may prove useful in determining the ideal duration of these two parameters in future research into primate change detection and visual working memory, as well as provide reference points to compare the capuchins' performance with that of other species. Due to the increased time to attend to and encode the stimulus (Alvarez & Cavanagh, 2004; Eng et al., 2005; Pashler, 1988), we predicted that as the duration of the search display increased, change detection accuracy would also increase. In line with previous research demonstrating that longer masks result in impaired change detection performance (Elmore et al., 2012; Pashler, 1988), we also predicted that as the duration of the mask increased, the monkeys' change detection accuracy would decrease. This pattern was expected for each phase of testing; however, given the increasing difficulty of the phases, subjects were expected to detect changes most accurately in the same/different phase, followed by the subtler occlusion phase, and, finally, struggle the most with the feature location changes in a checkerboard design.

Method

Subjects
Twenty-two capuchin monkeys at Georgia State University (GSU) participated in the training phase of the study; however, only nine subjects successfully passed training and moved on to testing. All nine monkeys completed Tests 1 and 2; however, only six completed Test 3 and five completed Test 4 (subjects had to complete one test to move on to the next, and some subjects stopped participating in the more challenging conditions). Six of these monkeys were classified as Language Research Center (LRC) monkeys, as they had lived at GSU for at least 9 years (or their entire lives) and had considerable experience with computerized testing. The other three subjects who passed training were classified as National Institutes of Health (NIH) monkeys. They had arrived at GSU approximately one year prior to the study. In addition, while they had roughly comparable experience with cognitive and behavioral testing, they had significantly less experience with our computerized testing procedure, having been introduced to computerized testing less than one year prior to the start of this study when they first arrived at GSU. All subjects (both LRC and NIH) were mother-reared in captivity and had lived in mixed-sex social groups for their entire lives, providing them with species-typical social exposure (see Table 1). All were housed in large, stable, mixed-sex and mixed-age social groups in indoor/outdoor enclosures with extensive environmental enrichment (climbing structures, ropes, toys, etc.). However, two months prior to the start of the study, two male capuchins (Liam and Albert) had to be removed from their larger social group for husbandry reasons and were pair-housed together in the colony room for the duration of the study. Outdoors, the monkeys had visual and auditory access to neighboring groups. Indoors, two groups (Nkima's and Griffin's groups) had visual and auditory access to each other.
The monkeys were never food deprived (except for veterinary necessity for their own well-being) and received chow, fresh fruits, and vegetables throughout the day on the same schedule, in addition to any food rewards during research and regardless of whether or not they chose to participate in testing. All monkeys had ad libitum access to water, including during test sessions, and subjects were trained to voluntarily separate for short periods of time from their group for cognitive and behavioral testing. Monkeys were never restricted from food, water, social contact, or outdoor access to encourage participation in the study. The LRC is fully accredited by the Association for Assessment and Accreditation of Laboratory Animal Care, and all procedures were approved by the Institutional Animal Care and Use Committee (IACUC) of GSU and were in accordance with the Association for the Study of Animal Behaviour/Animal Behavior Society's guidelines for the use of animals in research and the American Society of Primatologists' guidelines for the use of non-human primates.

Materials
The monkeys were tested using the LRC's Computerized Test System, comprising a personal computer, digital joystick, 17-inch color monitor, and pellet dispenser. The test program was written in Python. Contacting the appropriate stimulus with the joystick-controlled cursor resulted in a food reward of a single 45-mg banana-flavored pellet (Bio-Serv, Frenchtown, NJ). Auditory feedback was also provided for all responses (details of the testing system can be found in Evans et al., 2008). The six LRC subjects had extensive experience (i.e., at least five years) with computerized tasks requiring the use of a joystick to manipulate a cursor on screen, while the other three NIH subjects had significantly less experience (i.e., less than one year) completing computerized tasks.

Stimuli
Unlike all other species tested on change blindness tasks to date, male capuchins and many females are dichromats, unable to discriminate between red and green, limiting the types of stimuli that may be appropriately used (Gomes, Pessoa, Tomaz, & Pessoa, 2002). Thus, although they can see colors, they do not perceive them the way that humans do, complicating the experimental design because it is unclear how different these colors appear to capuchins and therefore how large an effect a change in color is for them. Accordingly, whereas macaques have been tested using arrays of colored squares and clip art (e.g., Elmore et al., 2011, 2012; Elmore & Wright, 2015; Heyselaar et al., 2011), subjects here were presented with only black and white stimuli for testing (i.e., Snodgrass line drawings; Snodgrass & Vanderwart, 1980). These line drawings have been regularly used in psychological testing and, importantly, have been normed on visual complexity, as well as familiarity, name, and image agreement for human memory research (although of course familiarity would be different for the capuchins). During testing, subjects were tested with different sets of stimuli than during training to avoid an experience effect; however, given that all image sets are black and white line drawings, training performance was expected to transfer to the novel stimuli used for each testing phase. Snodgrass and Vanderwart (1980) line drawings were used as the stimuli for same/different training. Subjects were trained to select the "change" icon when the stimuli were different or the "no change" icon when the stimuli remained the same. The stimuli for same/different testing, Test 1, involving variable search display and mask durations, included Nishimoto, Ueda, Miyawaki, Une, and Takahashi's (2012) set of 360 line drawings that, like the Snodgrass drawings, were also normed for numerous variables, including visual complexity.
The first phase of change detection testing, Test 2, in which small sections of line drawings were occluded, utilized Bonin, Peereman, Malardier, Méot, and Chalard's (2003) set of 299 line drawings, which have also been normed for numerous variables, including visual complexity. The changes here were fairly subtle and differed from one line drawing to the next, resulting in somewhat limited experimental control. In the remaining change detection tasks, subjects were tested on feature location changes. Here, a 4-by-4 checkerboard design was presented, initially with 8 black "checkers" (i.e., black circles) randomly placed in 8 of the 16 possible squares on the grid (Test 3). Test 4 was identical except only 2 black checkers were randomly placed on the grid rather than 8. Following the search display and ensuing mask, on half of all trials, one of the checkers changed location to an available adjacent square. Thus, although the change was still relatively subtle in the final two phases, the nature of the change was extremely well controlled.
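The feature location change described above can be illustrated with a minimal sketch. This is not the authors' test program: the function names (`place_checkers`, `adjacent_empty`, `move_one_checker`) are hypothetical, and "adjacent" is assumed here to mean the orthogonally neighboring squares, which the text does not specify.

```python
import random

GRID = 4  # 4-by-4 checkerboard, 16 possible squares


def place_checkers(n_checkers, rng):
    """Randomly occupy n_checkers of the 16 squares (8 in Test 3, 2 in Test 4)."""
    cells = [(r, c) for r in range(GRID) for c in range(GRID)]
    return set(rng.sample(cells, n_checkers))


def adjacent_empty(cell, occupied):
    """Empty squares next to cell.

    Assumption: only up/down/left/right neighbours count as adjacent.
    """
    r, c = cell
    neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [
        (nr, nc)
        for nr, nc in neighbours
        if 0 <= nr < GRID and 0 <= nc < GRID and (nr, nc) not in occupied
    ]


def move_one_checker(occupied, rng):
    """Return a new board in which exactly one checker moved to an adjacent empty square."""
    movable = [(cell, adjacent_empty(cell, occupied)) for cell in sorted(occupied)]
    movable = [(cell, free) for cell, free in movable if free]
    if not movable:
        raise ValueError("no checker has an available adjacent square")
    cell, free = rng.choice(movable)
    target = rng.choice(free)
    return (occupied - {cell}) | {target}


rng = random.Random(0)
search_display = place_checkers(8, rng)                # Test 3 style board
test_display = move_one_checker(search_display, rng)   # a "change" trial
```

On a no-change trial the test display would simply reuse `search_display`. Because checkers are only moved, never added or removed, the number of checkers is the same before and after the mask.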
The two types of change detection testing (i.e., occlusion change or location change) were intended to complement one another with regard to internal validity to present the first investigation of capuchins' ability to detect multiple types of changes under varying levels of experimental control. In the occlusion phase, Test 2, the potential change was either an addition or subtraction to a line drawing. The change in the feature-location-change phases, Tests 3 and 4, never added or removed parts of the checkerboard, but instead one feature of the checkerboard (i.e., one checker) changed location. These tasks were chosen based on the types of situations primates encounter and monitor in the wild, such as the appearance or disappearance of a predator or group member or a predator or group member moving nearby.

Same/Different Training Procedure
Prior to testing, monkeys were trained to indicate whether or not a stimulus changed (Figure 1). Each trial began once the subject used the joystick to move the cursor to a start box in the center of the screen, at which point the cursor disappeared and one Snodgrass line drawing appeared in the center of the screen. The stimulus remained visible for 5 s, at which point it was either immediately replaced by a different line drawing from the image set or no change occurred, and the original drawing remained visible. As the monkeys could not be forced to attend to the computer screen, the 5-s search display was chosen to provide the monkeys ample time to view the stimulus during the training phase. This search display duration is also the same as what was used in training for previous change detection tasks with another primate species, rhesus macaques (Elmore et al., 2012; Leising et al., 2013).
At this point, the cursor reappeared between two distinct icons that indicated "change" and "no change." The change icon was a dotted blue square that always appeared on the subject's left side of the screen, while the no change icon was a hashed yellow circle that always appeared on the right (the sides were not counterbalanced so that location could be another cue for each icon's meaning). Subjects had up to 5 s to make a selection. Correctly selecting the blue square when a change had occurred or the yellow circle when no change had occurred resulted in a food reward (pellet) and auditory feedback (ding), followed immediately by the start screen. Choosing incorrectly resulted in no food reward, negative auditory feedback (buzz), and a 20-s timeout (gray screen) before the start screen reappeared. If no selection was made within the 5-s window, the program reverted to the start screen. Each day, subjects received an unlimited number of session blocks, each consisting of 120 trials, until criterion was met. Subjects were required to achieve 80% accuracy on the final completed session block on two consecutive training days to move on to testing.
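The training criterion can be expressed as a short check. The sketch below is illustrative only; `block_accuracy` and `met_criterion` are hypothetical names, the input is assumed to list the accuracy of the final completed session block for each training day in order, and the threshold is treated as "at least 80%" (an assumption about how the boundary case was handled).

```python
def block_accuracy(outcomes):
    """Proportion correct in one session block; outcomes is a sequence of booleans."""
    if not outcomes:
        raise ValueError("a session block needs at least one trial")
    return sum(outcomes) / len(outcomes)


def met_criterion(final_block_accuracy_by_day, threshold=0.80):
    """True if the final completed block reached threshold on two consecutive training days.

    Assumption: the list holds one value per training day, in chronological order,
    so neighbouring entries are consecutive training days.
    """
    days = list(final_block_accuracy_by_day)
    return any(a >= threshold and b >= threshold for a, b in zip(days, days[1:]))


# Example: a 120-trial block with 96 correct responses
example_block = [True] * 96 + [False] * 24
example_accuracy = block_accuracy(example_block)
passed = met_criterion([0.72, example_accuracy, 0.85])
```

A single high-accuracy day, or two high-accuracy days separated by a poorer one, would not satisfy this check.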

Same/Different Supplemental Training Procedure
Eighteen of the 22 subjects either exhibited a persistent side bias or struggled to learn the task, and so were switched to a simpler training task (Figure 2). This supplemental training used the same methods as the initial training; however, rather than using randomized Snodgrass line drawings, a total of six differently colored geometric shapes were used as stimuli. These shapes differed in color despite capuchins' typical dichromatism to enhance their distinctness, as capuchins do see color, although we do not necessarily know how different colors appear to them (Gomes et al., 2002). Criterion remained at greater than 80% accuracy on the final completed session blocks on two consecutive days. Once criterion was met, subjects were still required to meet criterion on the initial training with line drawings before moving on to testing. Subjects who continued to struggle with these much simpler six stimuli received a further modification to the training module in which the display time was reduced from 5 s to 1 s (during training only) in an attempt to improve the monkeys' attentiveness to the screen. As the subjects were able to complete as many trials as they chose each day and did not all run on the task the same number of times per week, subjects were given approximately four months from when they began supplemental training to meet training criterion rather than a set number of trials before being dropped from the study. Accordingly, the number of training trials completed varied considerably both between subjects and from day to day within subjects. Throughout this time, subjects ran on the training module at least two days per week. In all, nine subjects (five of whom required supplemental training) ultimately passed the training phase and moved on to testing. All four subjects who did not require additional training were LRC monkeys and required 2,384-6,288 training trials to meet criterion.
The two LRC and three NIH subjects who did require additional training completed 35,506-87,428 total training trials to meet criterion. The remaining 13 subjects never met criterion, even in the supplemental training phase, despite extensive training sessions (comprising 14,156-73,889 total training trials), first with a 5-s search display and line drawing stimuli, then with a 5-s search display and colored geometric shape stimuli, and finally with a 1-s search display also with colored geometric shape stimuli.

Test 1: Same/Different Change Detection Testing Procedure
Testing relied on a procedure nearly identical to that used in training; however, a blank screen of different durations was inserted as a mask between the search display and test display (Figure 3). Search display duration (i.e., length of time the initial stimulus was visible) and the duration of the mask between search and test displays were varied systematically. Search display lengths were selected based on the range of times used in previous change detection research and included 250 ms, 500 ms, 1,000 ms, 2,500 ms, and 5,000 ms. This combination was chosen given that extremely short search displays may rely solely on attentional capture, whereas longer search displays may primarily rely on visual short-term memory. Accordingly, making use of a range of search display lengths in conjunction with varying the duration of the mask should help establish under which conditions change blindness may be attenuated (i.e., attentional capture or short-term memory). Furthermore, varying the duration of the search display helps determine if capuchins exhibit patterns of reduced change blindness as the duration of the search display increases, as has been seen in some human change blindness research (e.g., Eng et al., 2005).
Similarly, the duration of the mask was also varied within the range of times typically used in previous research, consisting of 0 ms, 50 ms, 100 ms, 250 ms, 500 ms, and 1,000 ms, with the 0 ms condition serving as the control. This variation was done to determine if capuchins exhibit change blindness similarly to humans, macaques, and pigeons, all of whom show change detection accuracy that decreases as the duration of the mask increases (Elmore et al., 2012; Eng et al., 2005; Leising et al., 2013; Pashler, 1988). Additionally, determining the mask durations that both maximize and minimize change blindness may provide insight into the role of executive control and attention in change detection.
Subjects completed 120-trial session blocks consisting of 4 trials of each possible combination of search display and mask duration. Trials occurred in a randomized order as determined by the computer program. Subjects were able to complete as many sessions as they chose to per day, and data from incomplete sessions were discarded. The monkeys thus needed to complete at least one entire 120-trial session block per day for their data to be analyzed. Subjects completed a total of 40 session blocks over as many test days as they required. This resulted in 4,800 total trials or 160 trials of each possible combination of search display and mask duration per subject.
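The trial counts in this design follow directly from crossing the five search display durations with the six mask durations. The sketch below rebuilds one session block and checks the arithmetic; the function and variable names are hypothetical, and the shuffle is only an assumption about how the randomized order was produced, since the original program is not described beyond that.

```python
import itertools
import random

# Durations taken from the procedure described above (milliseconds).
SEARCH_MS = [250, 500, 1000, 2500, 5000]
MASK_MS = [0, 50, 100, 250, 500, 1000]
REPS_PER_BLOCK = 4        # trials of each combination per session block
BLOCKS_PER_SUBJECT = 40   # session blocks completed per subject


def build_session_block(rng):
    """Return one shuffled block of (search_ms, mask_ms) trials.

    Hypothetical helper: a simple shuffle stands in for whatever
    randomization the actual testing program used.
    """
    trials = [
        combo
        for combo in itertools.product(SEARCH_MS, MASK_MS)
        for _ in range(REPS_PER_BLOCK)
    ]
    rng.shuffle(trials)
    return trials


block = build_session_block(random.Random(0))
trials_per_block = len(block)                              # 5 * 6 * 4
total_trials = trials_per_block * BLOCKS_PER_SUBJECT       # per subject
trials_per_condition = REPS_PER_BLOCK * BLOCKS_PER_SUBJECT
print(trials_per_block, total_trials, trials_per_condition)
```

Whether change and no-change trials were balanced within each combination is not stated in this section, so the sketch leaves that factor out.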
As with training, subjects first needed to move the cursor to a start box prior to each trial to initiate the trial and focus their attention on the trial. Once the start box was contacted, a line drawing would appear in its place and remain visible for a predetermined duration (i.e., the search display length), at which point the screen would go blank for a predetermined duration (i.e., the mask length). Following the mask, either the same line drawing or a new line drawing appeared where the previous stimulus had been, while the "change" and "no change" icons also appeared on either side of the lower half of the screen, with the cursor reappearing between them. Subjects then had five seconds to move the cursor to indicate whether a change occurred or not, with the dotted blue square still signifying that a change had occurred and the hashed yellow circle still signifying that no change occurred. As before, correct responses resulted in a food reward (pellet) and auditory feedback (ding), followed immediately by the start screen. Choosing incorrectly resulted in no food reward, negative auditory feedback (buzz), and a 20-s timeout (gray screen) before the start screen reappeared. Accuracy (i.e., correctly indicating a change did or did not occur) was collected on each trial to analyze with respect to search display duration and mask duration.

Test 2: Occlusion Change Detection Testing Procedure
The next task, Test 2, involved occlusion changes and consisted of far subtler changes than the entire stimulus change that occurred in same/different testing in Test 1 (Figure 4). Subjects were initially presented with a black and white line drawing from the Bonin et al. (2003) stimulus set. The stimulus appeared as originally drawn on half of all trials, while on the other half of trials, the stimulus appeared with a small section occluded (i.e., whited out).
The occluded sections were chosen by the experimenter based on the nature of each line drawing and thus lacked a considerable degree of internal validity between stimuli, though all subjects were presented with the same original and changed stimuli. Following the predetermined search display duration and mask duration, the same stimulus would reappear. If subjects were initially presented with an unaltered stimulus, then, following the mask, a small section of the stimulus would appear occluded (i.e., subtraction change) on half of the trials, while the stimulus would reappear unaltered on the other half of these trials (subtraction no-change). If, however, subjects were initially presented with a partially occluded stimulus, then, on half of the trials, the same occluded stimulus reappeared following the mask (addition no-change), while on the other half of trials, the same stimulus reappeared but no longer occluded (i.e., addition change). At this point, the subjects once again selected either the "change" or "no change" icon. If the monkeys failed to make a selection within the 5-s window, then the program returned to the start screen. Subjects again completed a total of forty 120-trial session blocks resulting in 4,800 total trials. Thus, subjects completed a total of 1,200 trials each (40 of each combination of search display and mask durations) of subtraction change, subtraction no-change, addition change, and addition no-change.
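The four trial types in Test 2 form a 2 x 2 design: whether the search display was occluded, crossed with whether the test display was occluded. As a minimal sketch (the function and constant names are hypothetical, not taken from the testing program), the labels and the per-type trial counts can be derived as follows.

```python
def classify_trial(search_occluded: bool, test_occluded: bool) -> str:
    """Label a Test 2 trial from the occlusion state of each display."""
    if not search_occluded:
        # Unaltered search display: a section can only be taken away.
        return "subtraction change" if test_occluded else "subtraction no-change"
    # Occluded search display: the missing section can only be restored.
    return "addition no-change" if test_occluded else "addition change"


TOTAL_TRIALS = 40 * 120   # forty 120-trial session blocks
N_TRIAL_TYPES = 4         # the four labels returned above
N_CONDITIONS = 5 * 6      # search display durations x mask durations

trials_per_type = TOTAL_TRIALS // N_TRIAL_TYPES
trials_per_type_and_condition = trials_per_type // N_CONDITIONS
print(trials_per_type, trials_per_type_and_condition)
```

This reproduces the 1,200 trials per trial type and 40 trials per trial type within each combination of search display and mask duration reported above.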

Tests 3 and 4: Feature Location Change Detection Testing Procedure
The final change detection tests utilized identical procedures as in both the same/different testing (Test 1) and the occlusion change detection testing (Test 2). However, rather than an entirely new stimulus or a portion of the stimulus being occluded, a feature of the stimulus (i.e., one checker on a checkerboard) changed location in half of the trials (Test 3; Figure 5). In Test 3, 8 checkers appeared randomly placed on 8 of the 16 possible checkerboard squares. Following the predetermined search display and mask durations, the same checkerboard reappeared. On half of the trials, following the mask, the 8 checkers remained in the same locations as during the search display. On the other half of trials, following the mask, 1 of the 8 checkers moved to one of the empty squares adjacent to it (i.e., above, below, left, or right). Subjects then indicated whether or not a change had occurred by moving the cursor to the dotted blue square if a change had occurred or the hashed yellow circle if no change had occurred.
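The feature location change can be illustrated with a short sketch: squares of the four-by-four board are numbered 0 to 15, eight are occupied at random, and on change trials one checker moves to an empty square directly above, below, left, or right of it. All names here are hypothetical, and the random sampling is an assumption for illustration; in the study, fixed sets of checkerboards were prepared in advance.

```python
import random

GRID = 4  # four-by-four board, squares numbered 0..15 row by row


def neighbors(square):
    """Squares directly above, below, left, and right of `square`."""
    row, col = divmod(square, GRID)
    out = []
    if row > 0:
        out.append(square - GRID)
    if row < GRID - 1:
        out.append(square + GRID)
    if col > 0:
        out.append(square - 1)
    if col < GRID - 1:
        out.append(square + 1)
    return out


def possible_changes(board):
    """All (from_square, to_square) moves of one checker to an adjacent empty square."""
    return [
        (src, dst)
        for src in sorted(board)
        for dst in neighbors(src)
        if dst not in board
    ]


def make_trial(rng, n_checkers=8, change=True):
    """Return (search_board, test_board) as frozensets of occupied squares."""
    board = frozenset(rng.sample(range(GRID * GRID), n_checkers))
    if not change:
        return board, board
    # A legal move always exists when the board is neither empty nor full,
    # because the grid is connected.
    src, dst = rng.choice(possible_changes(board))
    return board, (board - {src}) | {dst}


search_board, test_board = make_trial(random.Random(1))
```

On a change trial the two boards differ in exactly two squares: the one the checker left and the adjacent one it moved to.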
Subjects again received 40 session blocks of 120 trials, with a change randomly occurring on half of them and no change occurring on the other half. If the monkeys failed to make a selection within the 5-s window, the program returned to the start screen. Each session entailed presenting the same set of 120 unique checkerboards consisting of eight randomly placed checkers. For each of the 120 checkerboards used as the search display, five test-display checkerboards were created, each identical to the original except that one checker had moved to an adjacent empty square. Given the difficulty of detecting a change to one of eight possible checkers, subjects were subsequently retested using identical procedures; however, this time, two checkers were used rather than eight (Test 4; Figure 6). One hundred and twenty unique checkerboards, comprising every possible combination of two checker locations on the four-by-four grid, were used. Similar to the eight-checker version, the potential change (i.e., which checker moves and where it moves) was different across sessions, such that there were three different possible changes for each original checkerboard presented. Three possible changes were used here rather than the five used in the eight-checker task because several potential configurations allow only three possible changes.
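Both numbers in the two-checker design can be checked by enumeration: the count of distinct boards, and the smallest number of single-checker moves any board allows. The sketch below uses hypothetical names and numbers the squares 0 to 15 row by row.

```python
import itertools
import math

GRID = 4  # four-by-four board, squares numbered 0..15 row by row


def neighbors(square):
    """Squares directly above, below, left, and right of `square`."""
    row, col = divmod(square, GRID)
    out = []
    if row > 0:
        out.append(square - GRID)
    if row < GRID - 1:
        out.append(square + GRID)
    if col > 0:
        out.append(square - 1)
    if col < GRID - 1:
        out.append(square + 1)
    return out


def count_changes(board):
    """Number of ways one checker can move to an adjacent empty square."""
    return sum(1 for src in board for dst in neighbors(src) if dst not in board)


boards = list(itertools.combinations(range(GRID * GRID), 2))
n_boards = len(boards)                    # equals C(16, 2)
change_counts = [count_changes(set(b)) for b in boards]
min_changes = min(change_counts)
n_minimal_boards = change_counts.count(min_changes)
print(n_boards, math.comb(16, 2), min_changes, n_minimal_boards)
```

The enumeration gives 120 boards, and the most constrained boards (a corner checker with the second checker on a neighboring edge square) allow exactly three moves, which is consistent with the choice of three possible changes per checkerboard.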

Group Results
To explore performance at the group level, we combined all subjects' data for each test phase, respectively. We then conducted a generalized linear mixed model for each test phase after transforming the variables search display duration and mask duration to comparable scales. The five search display durations ranging from 250 ms to 5,000 ms were recoded as 1 to 5, and the six mask durations ranging from 0 ms to 1,000 ms were recoded as 1 to 6. Search display duration, mask duration, and the interaction between the two were used as fixed effects, and subject was used as a random effect in the model to predict the binary outcome of whether the subject chose correctly (i.e., whether a change, or lack thereof, was accurately detected).
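The recoding step can be sketched as a rank transform of each duration, with the interaction entering as the product of the two codes. This is only an illustration of how the predictors might be prepared; the names are hypothetical, the product form of the interaction is an assumption, and the model fitting itself (with subject as a random effect) is not shown because the software used is not stated here.

```python
SEARCH_MS = [250, 500, 1000, 2500, 5000]
MASK_MS = [0, 50, 100, 250, 500, 1000]


def ordinal_codes(levels):
    """Map each duration to its rank (1 = shortest)."""
    return {ms: rank for rank, ms in enumerate(sorted(levels), start=1)}


SEARCH_CODE = ordinal_codes(SEARCH_MS)  # 250 -> 1, ..., 5000 -> 5
MASK_CODE = ordinal_codes(MASK_MS)      # 0 -> 1, ..., 1000 -> 6


def fixed_effects(search_ms, mask_ms):
    """Fixed-effect predictors for one trial (assumed product interaction)."""
    search = SEARCH_CODE[search_ms]
    mask = MASK_CODE[mask_ms]
    return {"search": search, "mask": mask, "search_x_mask": search * mask}


print(fixed_effects(250, 0))
print(fixed_effects(5000, 1000))
```

One consequence of this recoding is that the unequal gaps between durations (e.g., 250 ms to 500 ms versus 2,500 ms to 5,000 ms) are treated as equal steps.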
Test 1: Same/different change detection. For the nine subjects who completed Test 1, the overall model predicted change-detection accuracy significantly better than the null hypothesis, χ²(3) = 274.37, p < 0.01. Both search display duration, β = 0.15, z = 8.27, p < 0.01, and mask duration, β = 0.05, z = 3.31, p < 0.01, were significant predictors of accuracy; however, the interaction between the two was not (Table 2). Comparing means (Figure 7), subjects performed best (80% correct) when the search display was 5,000 ms and the mask was any duration other than 0 ms. Subjects performed worst (65% correct) when the search display was 250 ms and the mask was 0 ms; however, they nonetheless consistently performed above chance levels (50%) across conditions. Overall, no learning effects were found, nor was there any evidence for age or sex differences; however, Test 1 results did reveal intriguing differences between sets of subjects. The six subjects (LRC) who had extensive computerized testing experience performed considerably better than the three other subjects (NIH), who had significantly less computerized testing experience. Specifically, these three NIH monkeys correctly detected whether or not a change occurred on 62.19% of trials, regardless of condition (see Figure 8 for mean accuracy by condition), whereas the LRC monkeys' mean accuracy was 80.21% (see Figure 9 for mean accuracy by condition). Additionally, unlike that of the LRC monkeys, the three NIH subjects' change-detection accuracy did not decline significantly when there was a 0 ms mask and a short search display (250 ms, 500 ms, 1,000 ms). We therefore decided to rerun the analyses, this time excluding the three subjects whose performance seemed to remain relatively consistent regardless of search display and mask duration.
It is important to note that all nine subjects received extensive pretraining, completing 2,384-87,428 total training trials over the course of several months before meeting criterion, with two LRC monkeys and three NIH monkeys requiring supplemental training. Considering only the six LRC monkeys who completed this task, the overall model predicted change detection accuracy significantly better than the null hypothesis, χ²(3) = 409.56, p < 0.01. Both search display duration, β = 0.24, z = 9.97, p < 0.01, and mask duration, β = 0.10, z = 4.92, p < 0.01, were significant predictors of accuracy; however, the interaction between the two was not (Table 3). Excluding the three NIH subjects from the analysis resulted in an improved model (AIC = 28032, BIC = 28073) compared to when all subjects were included (AIC = 47117, BIC = 47160).
Test 2: Occlusion change detection. For the nine subjects who completed Test 2, the overall model did not predict change detection accuracy significantly better than the null hypothesis, and neither predictor variable nor their interaction was a significant predictor of change detection accuracy (Table 4). Collectively, subjects were most accurate (55%) when the search display was 2,500 ms and the mask was 250 ms and least accurate (51%) when the search display was 500 ms and the mask was 0 ms (Figure 10). Additionally, unlike with Test 1, there were no differences between the six LRC monkeys and three NIH monkeys in Test 2.
Test 3: Feature location change detection. For the six subjects who completed Test 3, the overall model did not predict change detection accuracy significantly better than the null hypothesis, and neither predictor variable nor their interaction was a significant predictor of change detection accuracy (Table 5). Subjects performed at approximately chance (50%) levels across all combinations of search display and mask durations (Figure 11).
Test 4: Simple feature location change detection. For the five subjects who completed Test 4, the overall model did not predict change detection accuracy significantly better than the null hypothesis, and neither predictor variable nor their interaction was a significant predictor of change detection accuracy (Table 6). Subjects performed at approximately chance (50%) levels across all combinations of search display and mask durations (Figure 12).

Individual Results
To explore results at the individual level, we ran a binary logistic regression for each subject to determine the effects of search display duration and mask duration on change detection accuracy. Search display duration, mask duration, and their interaction were included in the model. The longest search display duration (5,000 ms) was used as the reference contrast, as we predicted that subjects would detect changes most accurately when they had longer to view the stimulus. The shortest mask duration (0 ms) was chosen as the reference contrast, as, based on previous work, we predicted subjects would perform their best when the change was not masked (Rensink et al., 1997). As significant effects were only found for Test 1, individual results for Tests 2, 3, and 4 are reported in the Supplemental files. Number and percentage of correct trials per subject, test, and condition are also included at the end of the Supplemental files (Tables 36-40).
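Using a reference contrast implies that each duration was treated as a category and compared against the reference level. One common way to do this is treatment (dummy) coding, sketched below under that assumption; the function and column names are hypothetical, and the interaction terms are omitted for brevity.

```python
SEARCH_MS = [250, 500, 1000, 2500, 5000]
MASK_MS = [0, 50, 100, 250, 500, 1000]
SEARCH_REFERENCE = 5000  # longest search display, per the text above
MASK_REFERENCE = 0       # no mask, per the text above


def treatment_code(value, levels, reference, prefix):
    """Dummy-code `value`: one 0/1 indicator per non-reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value}")
    return {
        f"{prefix}_{level}": int(value == level)
        for level in levels
        if level != reference
    }


def trial_predictors(search_ms, mask_ms):
    """Main-effect indicators for one trial (interaction terms omitted)."""
    row = {}
    row.update(treatment_code(search_ms, SEARCH_MS, SEARCH_REFERENCE, "search"))
    row.update(treatment_code(mask_ms, MASK_MS, MASK_REFERENCE, "mask"))
    return row


reference_row = trial_predictors(5000, 0)   # every indicator is 0
example_row = trial_predictors(250, 100)    # search_250 and mask_100 are 1
print(reference_row)
print(example_row)
```

Under this coding, each coefficient expresses the difference in log-odds of a correct response between that duration and the reference duration.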

Discussion
To explore the role of search display duration, mask duration, and their interaction on change blindness and change detection, 22 capuchin monkeys were presented with a change detection training task. Nine monkeys successfully passed training and completed a same/different change detection task (Test 1) and an occlusion change detection task (Test 2). Of these nine, six also completed a feature location change detection task (Test 3), and five completed an additional feature location change detection task (Test 4). Whereas there were significant effects of both search display and mask durations on accuracy in the relatively simple same/different test, there was no interaction between the two. Moreover, we found no significant results on the three more complex tasks that involved within-stimulus changes (feature additions, feature subtractions, or feature location changes) to individual features within the display. We first consider the results of the same/different task, as well as the group difference we found, and then discuss why we suspect the monkeys struggled on the subsequent tasks.
In the simplest task, the same/different change (Test 1), subjects had to correctly indicate whether or not a stimulus changed to an entirely new stimulus (the alternative was that it remained the same) following the presentation of a mask. Overall, subjects detected changes significantly more accurately with the longest search display (5,000 ms), followed closely by the second longest search display (2,500 ms), than the shorter search displays (250 ms, 500 ms, 1,000 ms). These results are in line with change detection findings from humans (Pashler, 1988), chimpanzees (Tomonaga & Imura, 2015), and pigeons (Herbranson & Davis, 2016) and suggest that the monkeys performed better when they had longer to encode the stimulus into their visual working memory, likely resulting in a stronger memory trace of the stimulus than when presented with shorter search displays.
Mask duration also had a significant effect; however, these results are much harder to interpret and are counter to what we predicted, which was based on what others have found. Across individuals, change detection accuracy was significantly worse when the mask was 0 ms, which is to say that subjects performed better when there was a mask compared to when there was no mask. Excluding the 0 ms mask, there were no significant differences among the other mask durations. These findings directly contradict previous research on change blindness, in which subjects struggled when a mask was inserted to hide the change, not when the change was unmasked (humans: Eng et al., 2005; Phillips, 1974; Rensink et al., 1997; chimpanzees: Tomonaga & Imura, 2015; macaques: Elmore et al., 2012; Leising et al., 2013; pigeons: Herbranson et al., 2014; Leising et al., 2013). Indeed, the purpose of the mask is to mimic an eye blink or saccade during which a change may transpire without being detected. As such, the mask obscures the change as it occurs so that subjects cannot rely solely on where they detect movement to detect the change. Instead, subjects must attend to and encode the stimulus, then maintain a trace of the stimulus throughout the duration of the mask, and finally decide whether or not the stimulus changed based on their memory trace and the test display. Longer masks are therefore more difficult as they require subjects to retain the trace in their visual working memory for longer, during which time the trace may decay. Moreover, the training should have biased subjects towards performing better with no mask, as there was no mask in any of the training phases subjects completed. Accordingly, there was no need to generalize or learn new contingencies in the trials without a mask.
We do not know why this is the case but have several thoughts. First, and most obviously, this finding suggests that we did not actually induce change blindness in the monkeys. In addition, even if we did not induce change blindness, it still seems intuitive that trials should be more difficult with a mask than without one. An alternative explanation could therefore be that, compared to the 5,000 ms search display used in training, the usage of shorter search display times and no mask meant that these trials occurred too quickly for the monkeys to adequately attend to, encode, and retain a trace of the search stimulus in order to make an informed selection when prompted with the test display. Subjects may thus have learned from training that they did not need to instantly attend to the search stimulus when it appeared as they had 5,000 ms to do so. Then, in test trials that occurred more quickly, the monkeys may have failed to sufficiently attend to and encode the search stimulus when it was visible for shorter durations. Moreover, it is possible that subjects may not have realized that there even was a change occurring in the 0-ms mask condition. Thus, subjects may not have realized the stimulus changed because they failed to notice the appearance of a new stimulus, having instead attended only to the test display. Another possibility is that either the mask itself or the flicker effect created by alternating from search display to mask to test display was more attention-catching to the monkeys than the stimuli themselves. In this case, subjects' performance could theoretically have been due to a failure to attend to the appropriate stimulus rather than a failure to encode, retain a memory trace, and make a decision.
Subjects also did very poorly on the three tests following the same/different task, regardless of search display time or mask duration (including the 0-ms mask condition). Given the absence of significant results beyond the same/different test and individual and group change detection accuracies that were functionally at chance on the next three tasks, we think it is likely that the monkeys found these three tasks too difficult and were guessing on these trials. Task difficulty may also help explain why some subjects stopped participating in Tests 3 and 4. Considering that there were only two possible options, guessing was both less cognitively taxing and nearly as effective a strategy as attending to the task and recalling the details of the search display when the change that may have occurred was not obvious. This strategy seems plausible as capuchins have been shown to rarely, if ever, make use of uncertainty responses when presented with difficult trials (Beran, Smith, Coutinho, Couchman, & Boomer, 2009) and appear to be more tolerant of the risk of guessing and getting a trial incorrect than apes and macaques in at least some situations (Beran, Perdue, & Smith, 2014). It is also worth noting that, in the wild, capuchins may be more attuned to global changes, such as the appearance of a predator, than changes to small details of their environment, such as a once straight twig bent in half. Accordingly, it is possible that the global changes that occurred in Test 1, in which the entire stimuli did or did not change, were theoretically a more species-appropriate task for the monkeys than the ensuing tests that relied on smaller, local changes.
Moreover, if subjects are metacognitively aware that they do not know the answer, guessing could be viewed as a superior strategy as it requires less energy than attending to the task and retaining a memory trace of the stimulus. Evidence for metacognition in capuchins is extremely variable. Studies rarely find evidence for all subjects, and there is typically substantial variation within individuals as well (Beran, Perdue, Church, & Smith, 2016; Beran & Smith, 2011). While these results remain inconclusive, it has nonetheless been argued that capuchins do indeed possess at least a rudimentary form of metacognition (Vining & Marsh, 2015). It is at least possible that they were aware that the task was hard and so chose not to learn it, given their high probability of reward without having to try (Schubiger, Kissling, & Burkart, 2016). However, the extent to which this ability was used in the present study is unknown, as there is no way of knowing whether any guesses were actually a result of uncertainty monitoring and metacognition.
Another possibility is that the stimuli that were used in Tests 2-4 were too complex for the monkeys to encode sufficiently in order to then detect a change, especially one as subtle as occluding just a small portion of the stimulus or moving only one of eight checkers on a checkerboard. Though capuchins are typically able to perform relatively well on delayed match-to-sample tasks (Truppa, Mortari, Garofoli, Privitera, & Visalberghi, 2011; Truppa, De Simone, Mortari, & De Lillo, 2014), one recent study found no evidence that capuchins monitor detailed contents of their memory traces (Takagi & Fujita, 2018). Accordingly, it may have been worthwhile to first conduct a delayed match-to-sample task using the stimuli used here to ascertain whether or not the capuchins were able to recall enough details of the sample stimulus to then match it with one of the match stimuli. If the capuchins were unable to do so, then that would have been evidence that less complex stimuli were needed.
Importantly, it has been argued that focused attention is required to see change (Rensink et al., 1997), and there is no way of knowing whether the subjects were reliably attending to the task, let alone focusing their attention on the potentially changing stimuli. This limitation is particularly troublesome when the test stimuli are complex, as these stimuli have more details to encode. Thus, if the subjects failed to focus their attention on both the search display stimulus and the possibly changed test display stimulus, they would not be expected to detect whether or not a change occurred at greater than chance levels. Moreover, subjects may have overcome any failure to adequately attend to the task in the same/different test, as they only needed to encode and recall minor details of the test stimulus to then determine if a change occurred. However, when the change became more complex in Tests 2-4, a similar failure to focus their attention may have resulted in subjects guessing whether a change occurred, as they were unable to encode sufficient details of the search display stimulus to then ascertain if the test display stimulus included a change.
We also anticipated that the capuchins would generalize from training to Tests 1-2, but it is possible they were unable to generalize to the occlusion phases or the checkerboard phases despite the continued presence of the change and no-change icons. In both cases, the change went from the entire stimulus changing to a relatively subtle change within the stimulus. Accordingly, if the monkeys were expecting the entire stimulus to change, they may have failed to carefully attend to the details of the stimulus (because they had not previously needed to do so) and ultimately became frustrated when they could not figure out why "no change" was not the correct response half of the time. In particular, if this were combined with low working memory or difficulty in remembering details with precision, the subjects may never have even realized that changes were occurring. Depending on the research question, future studies may also benefit from including a training phase in which the stimuli are relatively simple geometric shapes in which only a small component of the shape changes rather than the entire shape.
An additional, albeit we believe unlikely, potential explanation for these results is that capuchins do not experience change blindness. The visual systems of New World monkeys are known to vary from species to species and between New and Old World monkeys (Gomes et al., 2002). Accordingly, while it seems improbable based on previous nonhuman change detection studies, the visual systems of capuchins may function in such a way that they do not experience change blindness as other species do, if they even experience it at all. Clearly, while this cannot be excluded, it is also not a conclusion that we feel should be drawn from the current data.
As always, additional research is needed to understand if and how capuchins experience change blindness. Given these results, future studies should ensure that a sufficiently long search display is used so that subjects have enough time to encode and recall memory traces of the stimulus or stimuli, ideally pre-testing the stimuli with a delayed match-to-sample task. Future studies may also utilize a different paradigm, such as one item in an array of three or four stimuli changing or using an eye tracker to record search paths and training subjects to fixate on the location of the change as has been done with macaques (Chau et al., 2011). The flicker paradigm should also be tried in addition to the one-shot paradigm used here to determine if providing subjects with multiple viewings of the change improves change detection accuracy in capuchins, as it does in other species. Further modifications to the type of change occurring (i.e., addition, subtraction, movement, etc.), the type of mask (i.e., blank screen, distractor images, etc.), and the type of stimuli (i.e., clip art, faces, etc.) may provide further insight into how capuchins experience change blindness. In particular, less complicated stimuli should be used first, and it may also be useful to require subjects to pass multiple training sessions, for instance, slowly building up the complexity of the stimulus or number of stimuli in an array.
Finally, our data suggest a note of caution when considering the role of experience in shaping subjects' cognitive performance. Our subjects had remarkably similar histories, yet still showed differences in performance. Specifically, all subjects were from long-term, mixed-sex, stable social groups composed of species-typical arrangements (i.e., matrilines), and all had been exposed to cognitive and behavioral testing throughout their lives (albeit different studies using different procedures). All had lived in the same facility for more than a year, experiencing the same husbandry and enrichment as well as the same testing regimes and, importantly, the same computerized testing that was used in the current study. All underwent extensive training, comprising thousands of trials over six or more months, and, based on which monkeys required supplemental training and which failed to complete Tests 3 and 4, it appears that they responded largely similarly to the training (i.e., both LRC and NIH monkeys required additional training, although only LRC monkeys met criterion without supplemental training). Nonetheless, they performed differently, suggesting that the LRC monkeys experienced the task differently than the NIH monkeys. We cannot determine what this difference was based on given these data, but possibilities include that the LRC monkeys were more attentive to the task, were more experienced in determining which features of the trial were important, and/or were more motivated to maximize their outcomes. It will be important in future work to discriminate among these possibilities, not only to determine how context influences cognition, but also to help interpret situations in which different results emerge for the same test run in different labs (which likely differ on far more features than the two groups of monkeys in our study).
In sum, despite very few significant results, 9 out of 22 of the capuchins were able to learn how to indicate whether or not a change occurred, suggesting that additional change blindness research with capuchins is feasible. As expected, the capuchins' change detection accuracy improved in the longest search display conditions in the present study. However, the capuchins' change detection accuracy was unexpectedly poorest on trials without a mask when subjects were predicted to be most accurate. Given these findings, replicating the paradigm used here with other species as well as presenting capuchins with different change detection paradigms is necessary to better understand these results and the evolution of change blindness more broadly.