Rats’ Choice in a Coordination Task

We designed a free-operant choice procedure that represents a technical improvement for assessing the control of mutual reinforcement contingencies over the choice of coordinated behavior. We demonstrate the advantages of the new procedure with 8 rats that were trained to continuously move a steel ball from end to end of a gutter. Subjects were assigned to pairs and had to choose between 2 response options: 1 in which reinforcement was contingent upon an individual response and another in which reinforcement depended on the coordination of intra-pair behavior. We evaluated (a) the effect of reinforcement magnitude on the distribution of responses and (b) the role of behavioral cues in the rats’ coordinated actions by dividing the experimental chamber into 2 compartments with a clear or opaque partition. Coordinated actions were more likely when the larger reinforcer was initially associated with the mutual reinforcement option. Visual interaction between subjects did not affect their coordinated actions. The possibility of controlling organisms’ preference for social or nonsocial alternatives opens potential lines of research, such as identifying how the coordination of activities combines with the future value of outcomes to produce stable cooperative equilibria.

Cooperative behavior has been defined as "joint action for mutual benefit" (Dugatkin, 1997; Mesterton-Gibbons & Dugatkin, 1992). After decades of research, two approaches seem to predominate. The first centers on the economic outcomes that each participant obtains from cooperating with others, the "mutual benefit" (e.g., Axelrod & Hamilton, 1981; Clements & Stephens, 1995), and the second focuses on the behavioral patterns that occur during a cooperative episode, the "joint action" (e.g., Boesch & Boesch, 1989; Roberts, 1997; Scheel & Packer, 1991). The two approaches also differ in their methodologies. Researchers who follow an economic perspective have mostly used experimental choice procedures, such as the prisoner's dilemma game, in which the subjects are separated in adjacent chambers and each one receives one of four possible rewards by choosing cooperation or defection. The payoff matrix is typically manipulated, focusing on strategy selection, such as cooperating, defecting, or competing. Researchers' efforts have been mainly oriented towards the construction of formal models, such as those based on game theory (Baker & Rachlin, 2002; Green, Price, & Hamburger, 1995; Skinner, 1962; Stephens, McLinn, & Stevens, 2002). In contrast, researchers interested in behavioral social patterns have used the cooperative problem-solving task, in which both subjects are required to respond in coordination by pulling a string to reach a receptacle that holds mutual reinforcers (Crawford, 1937). Researchers who follow this perspective have focused on how subjects coordinate their actions in tasks that cannot be solved individually, in which constraints require each subject to control the behavior of its partner (Chalmeau, Lardeux, Brandibas, & Gallo, 1997; Drea & Carter, 2009; Łopuch & Popik, 2011; Petit, Desportes, & Thierry, 1992; Seed, Clayton, & Emery, 2008; Tan & Hackenberg, 2016).
Economic and behavioral social approaches have generally analyzed cooperation from dissimilar perspectives and studied it differently. We developed a task that aims to integrate the strengths of both perspectives. Our procedure allows identification of the trajectory of the behavioral adjustments that lead to stable cooperative equilibria and exploration of the situations in which stability is not reached. We assessed (a) the effect of reinforcement magnitude on the individual and coordinated rates of responses (as economic researchers propose) and (b) the role of behavioral cues in the subjects' coordinated actions (as researchers of behavioral patterns suggest) by dividing the experimental chamber into two compartments with a clear or opaque partition. In our task, pairs of rats had two concurrent response options. For one option, access to consequences depended solely on individual behavior (i.e., independent of the behavior of the other subject). For the other option, access to mutual consequences depended on the coordinated actions of both subjects. This is the first attempt to understand how mutual outcomes select cooperative strategies (i.e., why organisms cooperate) and how organisms allocate their behavior in the presence of other, nonsocial sources of reinforcement (i.e., independent of the behavior of others). As Schuster and Perelberg (2004) suggested, the ideal procedure is one that dissects the social and nonsocial components of behavior to analyze their effects on preferences separately.
We manipulated the amount of reinforcers obtained for both response options to assess the hypothesis that cooperation arises when an individual acting alone would not obtain as many reinforcers as acting in coordination (Roberts, 1997; Visalberghi, Quarantotti, & Tranchida, 2000). Some studies have reported that direct interaction among subjects promotes the coordination of activities (Łopuch & Popik, 2011; Schuster, 2002; Segura & Bouzas, 2013). However, other studies have not found a clear relationship between different levels of interaction and coordinated behavior (Schuster & Perelberg, 2004; Tan & Hackenberg, 2016). To explore the impact of the partner's response-produced cues over the coordinated behavior, the apparatus was designed to allow exposure of the subjects to different barrier types (clear and opaque) to allow or restrict them from observing their partner's behavior. Furthermore, unlike studies in which the target response is instantaneous and discrete, such as pressing a button or a key (Baker & Rachlin, 2002; Green et al., 1995), or the response is continuous but lacks an operandum, such as the back-and-forth shuttling task (Schuster, 2002; Segura & Bouzas, 2013), or in which the temporal window is manipulated to reinforce synchronized responses (Łopuch & Popik, 2011; Tan & Hackenberg, 2016), we propose a free-choice task that minimizes spatiotemporal restrictions and involves subjects rolling a stainless steel ball continuously from end to end of a gutter, a response that can be performed either individually or coordinately. In the present study, we tested the task and, in doing so, demonstrated its feasibility and potential to advance the study of coordinated behavior under contingencies of mutual reinforcement.

Method

Subjects
Eight 70-day-old experimentally-naïve male Wistar rats (Rattus norvegicus) served as subjects. Their weight ranged between 228 g and 264 g, and the rats were kept at 85 percent of their free-feeding weight. Deprivation level was maintained by providing postsession feeding when necessary. The rats were housed in individual home cages with continuous access to water. Room temperature was 21±2 °C, and relative humidity oscillated between 55 and 65 percent. A 12-hr light/dark cycle (light phase beginning at 07:00) was maintained throughout the entirety of the study. This research complied with Colombian laws and Animal Behavior Guidelines for Laboratory Animal Research.

Apparatus and Task
We designed an experimental chamber to study rats' choices between individual and mutual alternatives as a function of the reinforcers obtained. The apparatus was a rectangular box made of waterproofed wood measuring 80 cm long × 60 cm wide × 30 cm high, consisting of a floor base, two lateral black walls, and three clear acrylic panels measuring 80 cm long × 30 cm high. These panels divided the chamber into two compartments. An opaque panel could replace the center panel to obstruct visibility between compartments when needed. The experimental chamber had a symmetrical design; therefore, all elements in the compartments maintained exact correspondence in size, shape, and position. Each compartment was equipped with two 40-cm-long × 2.4-cm-wide aluminum gutters (response options) laid horizontally on the floor without slant and diagonally opposed to each other. Two food dispensers were available, each located 10 cm horizontally from Point A of the gutter, at a height of 4 cm from the floor. One of the food dispensers was associated with the individual option, while the second was associated with the mutual option. Two houselights were mounted on the wall, each situated 20 cm above a feeder. Reinforcers were 40-mg food pellets custom-molded from pulverized rat food (rodent Laboratory Chow, Purina LabDiet®).
A stainless-steel ball (operandum) of 2.25 cm diameter and 55 g weight could be rolled by the rats from end to end of a gutter. The mutual gutters of both compartments were interconnected at their inner ends at the operandum transfer point, allowing the ball to circulate between compartments. The mutual gutter was twice as long as the individual gutter; however, because each subject only had access to half of this distance, the response requirement in the coordinated and the individual response options was identical (i.e., 40 cm).

Procedure
The study consisted of a shaping phase with four training conditions and three experimental phases that differed in the type of partition between compartments (clear/opaque). Each experimental phase had three different combinations of reinforcement magnitudes for each response option. Subjects' partnering and assignment to a compartment were randomly established and remained constant throughout the study. Shaping and experimental sessions were conducted seven days a week at approximately the same hour of the day (10:00 AM).
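To make the layout of the experimental phases concrete, the following sketch enumerates the nine conditions each pair would experience. It draws on the pair assignments, reinforcement magnitudes, and A-B-A' design reported in the Experimental conditions subsection. The names `Condition` and `schedule_for_pair` are hypothetical, and the assumption that Phase A' returns to the partition used in Phase A is an inference from the A-B-A' label, not something stated explicitly in the text.

```python
from typing import NamedTuple

# Pellets per correct response: L = large (4), S = small (1).
PELLETS = {"L": 4, "S": 1}

# Magnitude sequences as (mutual, individual) for the three conditions of a phase.
SEQUENCES = {
    "a": [("L", "S"), ("S", "S"), ("S", "L")],
    "b": [("S", "L"), ("S", "S"), ("L", "S")],
}
PAIR_SEQUENCE = {1: "a", 2: "a", 3: "b", 4: "b"}
FIRST_PARTITION = {1: "clear", 2: "opaque", 3: "clear", 4: "opaque"}


class Condition(NamedTuple):
    phase: str
    partition: str
    mutual_pellets: int
    individual_pellets: int


def schedule_for_pair(pair: int) -> list[Condition]:
    """List the nine conditions (3 phases x 3 magnitude combinations) for a pair."""
    first = FIRST_PARTITION[pair]
    other = "opaque" if first == "clear" else "clear"
    # Assumption: A' reinstates the partition used in Phase A.
    phases = [("A", first), ("B", other), ("A'", first)]
    return [
        Condition(phase, partition, PELLETS[m], PELLETS[i])
        for phase, partition in phases
        for m, i in SEQUENCES[PAIR_SEQUENCE[pair]]
    ]
```

For example, `schedule_for_pair(1)` starts with the clear partition and 4 pellets for the mutual option versus 1 pellet for the individual option.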
Before the shaping phase, all subjects received two feeder-training sessions. The first session focused on training the approach response to the individual feeder. The second session focused on training the approach response to the feeder associated with the mutual reinforcement option. Thirty reinforcers were delivered on a VT 20-s schedule per session. At the end of the two sessions, all subjects approached the corresponding feeder when a sound signaled a pellet falling from the food dispenser.
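As an illustration of the feeder-training schedule, the sketch below draws 30 inter-reinforcer intervals with a mean of 20 s. The text does not report how the variable-time intervals were generated, so the exponential distribution used here is purely an assumption, and `vt_intervals` is a hypothetical helper name.

```python
import random


def vt_intervals(n_reinforcers=30, mean_s=20.0, seed=None):
    """Return inter-reinforcer intervals (in seconds) for a variable-time schedule.

    Assumption: intervals are exponentially distributed around the mean;
    the source only states "VT 20-s" with 30 reinforcers per session.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_s) for _ in range(n_reinforcers)]
```

Passing a fixed `seed` makes the sequence reproducible across sessions, which can help when the same schedule must be delivered to several subjects.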
Shaping of rolling-ball response. The target response consisted of the rat rolling the ball from one end of the gutter to the other end, continuously and without interruptions, and exclusively making contact with the ball with its two front legs throughout the entire journey (40 cm). As Timberlake (1983) has noted, "rats are predatory animals that may respond to small moving prey with a response sequence of digging, chasing, seizing, and various killing and/or food-handling behaviors" (p. 309). We chose this rolling-ball response because we assumed that it promoted hunting-related actions in this species.
Shaping consisted of four training conditions designed by Segura and Gutiérrez (2006); namely, (a) taking the ball from Point A to Point B in the individual gutter (see Figure 1B); (b) taking the ball in the opposite direction (B→A); (c) continuously displacing the ball from end to end of the individual gutter and vice versa (A→B and B→A); and (d) continuously displacing the ball in the mutual gutter (A→ intersection point and vice versa). Each displacement (e.g., A→B) was reinforced with one pellet (fixed-ratio 1, FR1). Throughout the shaping phase (37 sessions), compartments were separated by the opaque partition and subjects were trained individually. That is, coordinated actions were never trained. Each session lasted 20 minutes, during which the houselights constantly remained on.
The method of successive approximations was used to shape the target response (see Boakes, Poli, Lockwood, & Goodall, 1978). Training began by placing the steel ball at the end of the gutter (i.e., Point B). At first, any contact or approximation towards the steel ball by the rat was reinforced. Then, the experimenter placed the ball 2 cm from Point B, and the subject returned it to Point B to obtain the reinforcer. To begin a new trial, the experimenter returned the ball manually to the starting point. The response criterion (starting point) was either maintained, increased, or decreased individually as a function of the displacement performed by each subject (see Table 1). For example, at the end of Session 1, Rat 9 was displacing the ball 10 cm to Point B. The distance required increased to 17 cm in Session 2, decreased to 16 cm in Session 3, and then increased again and remained constant at 37 cm for three consecutive sessions. To change conditions, it was necessary to complete the entire journey (target criterion, 40 cm) without mistakes for two consecutive sessions.
The responses were considered mistakes if a subject pushed or handled the ball with any body part other than its front legs or if a response was temporally or spatially segmented by interrupting or stopping the journey or by reversing or abandoning the ball. In those cases, the experimenter returned the ball to the starting point, the rat was not reinforced, and a new trial began. Table 1 summarizes the order of training conditions and the number of sessions and responses to reach the successive criteria for each rat.
Experimental sessions. Once all subjects learned the target response (i.e., rolling the ball the entire journey continuously and without mistakes), the mutual reinforcement contingency was established, in which rolling the ball from one end (i.e., starting point) of the mutual gutter to the other end, from one compartment to the other, produced the delivery of the reinforcer. To accomplish this, both subjects had to coordinate their actions in time and space: one subject had to begin the journey, rolling the ball from Point A to the transfer point (B), where the other subject received the ball and rolled it to the other Point A (end point). It is important to mention that this intra-pair delivery/acceptance of the ball was not trained. Reinforcement was contingent on both subjects completing each journey (rolling the ball) with their front legs, continuously and without interruptions, and concurring in time and space at the operandum transfer point. When a journey was successfully completed (i.e., 80 cm), subjects simultaneously received the pellets programmed in their corresponding mutual feeder (Video 1 in the supplementary material illustrates the performance of the rats in the coordination task).
Given the continuous nature of responses in this task, subjects could abandon the option they were exploiting at any time (e.g., after completing a full journey, obtaining a reinforcer, or at any point in a journey). A switching response (from individual to coordinated or from coordinated to individual) was registered when subjects abandoned an option, headed towards the other alternative and touched the gutter and/or the corresponding operandum with their nose and/or front legs, or started a journey, failing to complete it. To register responses, we used continuous observation recording (i.e., observation procedures in which all target responses can be detected during observation periods; see Johnston & Pennypacker, 2009). The delivery of pellets was performed manually in the feeders associated with each response option.
Each experimental session comprised four blocks: two forced-choice and two free-choice. Blocks were separated by a 10-s blackout, during which the experimenter placed the balls at the center of the gutters. Free-choice blocks began with one ball in the middle of the gutter in each individual option (20 cm from either end) and with another ball in the mutual option at the transfer point (see Figure 1B). Thus, before any successful journey, the subjects had to take the ball to a starting point (i.e., Point A or B in the individual gutter and either Point A in the mutual gutter). This was done to eliminate the experimenter's intervention. This constraint operated at all times (e.g., when subjects failed to complete a journey, left the ball at any place other than the end of a gutter, or performed a switching response).

Table 1 note. d (cm) = distance in centimeters; M = target criterion with mistakes; R = number of correct responses.
Each forced-choice block (4-min long) was divided into two parts; in each part, only the coordinated or the individual response operandum was available. In individual forced-choice trials, we used two balls, each one located at the midpoint of the individual gutters. In mutual forced-choice trials, we used only one ball at the transfer point. In each session, random selection (without replacement) determined which alternative (individual or mutual) began the first block. This selection was reversed in the second block and applied for all subject pairs. Forced-choice blocks were meant to expose subjects to the contingencies of reinforcement.
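The ordering rule for the forced-choice blocks can be sketched as follows. This is a minimal illustration under stated assumptions: `forced_choice_order` is a hypothetical helper, and the draw is made independently for each session, which simplifies the "without replacement" selection mentioned above (its exact scope across sessions is not specified).

```python
import random


def forced_choice_order(rng):
    """Return the order of alternatives in the two forced-choice blocks.

    Each block has two parts; the alternative that opens the first block
    is drawn at random, and the order is reversed in the second block.
    """
    first = rng.choice(["individual", "mutual"])
    second = "mutual" if first == "individual" else "individual"
    return [(first, second), (second, first)]
```

Because the same order applied to all pairs within a session, a single call per session (e.g., `forced_choice_order(random.Random())`) would be enough under these assumptions.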
Experimental conditions. An A-B-A' intra-subject/pair design was used. Table 2 shows the order of conditions and experimental phases for each pair of rats. Experimental phases differed in the type of partition between compartments (i.e., opaque/clear). Pairs 1 and 3 were assigned the clear panel first, while Pairs 2 and 4 were initially assigned the opaque one. Each pair experienced one of two reinforcement magnitude sequences three times (i.e., a total of nine conditions): (a) LMutual-SIndividual → SMutual-SIndividual → SMutual-LIndividual, and (b) SMutual-LIndividual → SMutual-SIndividual → LMutual-SIndividual. Pairs 1 and 2 experienced the first sequence, while Pairs 3 and 4 encountered the second sequence. The letters L ("large") and S ("small") represent the amount of food obtained for each correct response (4 pellets or 1 pellet, respectively). The first letter represents the reinforcement given by the mutual option, while the second letter represents the one provided by the individual option. Due to the exploratory nature of the task, the criterion used to change from one condition to another was visual inspection of the data.

Results

Figure 2 presents, for each pair of subjects, the mean proportion of coordinated responses from the total number of responses (individual and coordinated) per session for each experimental condition. Changes in the relative frequency of coordination among pairs differed depending on the sequence of exposure to the reinforcer ratio between options (experimental conditions) but not across experimental phases (i.e., type of barrier). When the larger reinforcer was initially associated with the mutual option (LMutual-SIndividual → SMutual-SIndividual → SMutual-LIndividual sequence, Pairs 1 and 2), subjects' choice allocation varied across experimental sessions.
During Phase A, in the first experimental condition (LMutual-SIndividual, grey circles), subjects showed a low proportion of responses for the mutual option (less than 0.2) and stabilized towards indifference at the end of this condition. The relative frequency of coordinated responses decreased abruptly from the beginning of the second condition (SMutual-SIndividual, white circles), when the amount of reinforcers provided by the mutual option decreased. Furthermore, coordinated responses reached values close to zero in the third condition, in which the larger reinforcer was associated with the individual option (SMutual-LIndividual, black circles). These pairs of rats replicated this choice pattern in subsequent phases (B and A'), in which they exhibited a faster behavioral adjustment to the different conditions. Subjects exposed to the opposite reinforcement sequence (SMutual-LIndividual → SMutual-SIndividual → LMutual-SIndividual, Pairs 3 and 4) chose the individual option consistently, even when contingencies favored the choice of the mutual option (LMutual-SIndividual). The only exception was a slight increase in coordinated responses (in Pair 3) in the last experimental condition (see Figure 2).

Figure 3 shows the response rates to each alternative for each rat of each pair, per session and across conditions. The pattern of the data displayed in this figure replicates the choice pattern observed in Figure 2 and provides additional information about the individual performance of subjects across experimental conditions, including the shaping phase. Unlike the intra-pair discrepancy in response rates that has been reported in coordination tasks in rats (Łopuch & Popik, 2011; Tan & Hackenberg, 2016), in our task, the two rats of each pair responded at approximately equal rates across conditions.
Likewise, intra-pair individual response rates were similar when the larger amount of reinforcers was associated with the individual option (around 2 responses per minute in SMutual-LIndividual conditions) and when the food ratio was equal between options (around 5 responses per minute in SMutual-SIndividual conditions). The difference in response rates observed across conditions can be explained by the amount of time required for the consumption of the reinforcers obtained in the individual option (4 pellets or 1 pellet, respectively).

Figure 4 shows switching between alternatives (as a ratio) for each subject across sessions and for each condition. In general, switching was a function of the difference in reinforcement rates between options. Switching patterns were overall similar within and between pairs, with relatively high ratios when the proportion of reinforcers favored the mutual option (i.e., LMutual-SIndividual conditions) and a decrease to low ratios during the other conditions. Unlike the choice patterns observed in Figures 2 and 3, the order of exposure to the sequence of reinforcement did not affect the switching pattern between alternatives. Furthermore, the type of partition (clear/opaque) had no impact on switching responses.
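The choice measure summarized in Figure 2 (the proportion of coordinated responses out of all responses per session, averaged within a condition) can be sketched as follows. The function names are hypothetical, the counts in the usage note are invented for illustration rather than taken from the data, and how a coordinated journey was tallied per pair versus per subject is not specified in the text, so the inputs are simply treated as counts.

```python
def coordinated_proportion(coordinated, individual):
    """Proportion of coordinated responses out of all responses in one session.

    Returns None when no responses were recorded, so empty sessions
    are not mistaken for a proportion of zero.
    """
    total = coordinated + individual
    if total == 0:
        return None
    return coordinated / total


def mean_proportion(sessions):
    """Mean session proportion for a condition.

    `sessions` is a list of (coordinated, individual) count pairs;
    sessions without responses are skipped.
    """
    values = [
        p for c, i in sessions
        if (p := coordinated_proportion(c, i)) is not None
    ]
    return sum(values) / len(values) if values else None
```

With illustrative counts, `coordinated_proportion(5, 5)` gives 0.5, the indifference point referred to in the text, and `mean_proportion([(1, 3), (3, 1)])` averages 0.25 and 0.75 to 0.5.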

Discussion
The overall pattern of results shows that the experimental protocol presented is a useful tool to assess the control of mutual reinforcement contingencies over coordinated behavior, as well as to identify the change in preferences for mutual options in a choice context that provides individual sources of reinforcement (i.e., independent of the behavior of the other subject). Our procedure permitted evaluation of the hypothesis that cooperative strategies arise in situations in which an individual acting alone would not be as successful as two or more individuals acting jointly, provided the benefits obtained offset the cost of coordinating (Roberts, 1997; Visalberghi et al., 2000). Rats from Pairs 1 and 2 coordinated their activities only when the reinforcement ratio favored the mutual option (4:1). Coordinated behavior was replicated with this proportion of reinforcers across experimental phases (see Figure 3, LMutual-SIndividual conditions). Under the same reinforcement ratio (4:1), Pairs 3 and 4 failed to coordinate their actions. Although they showed a switching pattern similar to that observed in Pairs 1 and 2, it occurred at a lower ratio (see Figure 4). These findings suggest that differences among pairs in preference for the mutual reinforcement option might be due to interference in the learning of coordinated actions when the reinforcement ratio initially favored the individual option (i.e., an order effect) and not to insensitivity to the consequences associated with the available options.
Similar to what Tan and Hackenberg (2016) reported, we observed that the level of visual restriction between subjects did not impact their coordinated actions. In our study, this variable was controlled systematically by counterbalancing the exposure to different types of barrier, namely clear and opaque. Łopuch and Popik (2011) showed that the type of barrier influenced behavior when subjects had less experience with the task and, consequently, when they had not established a coordination pattern. However, we did not observe a substantial change in coordinated responses by decreasing/increasing visual restriction between subjects, which suggests that the control of behavioral cues over coordination does not rely exclusively on the visual dimension of interaction. As Tan and Hackenberg (2016) argued, other sensory inputs accompany the target response (e.g., the sound of the ball being displaced); therefore, interindividual behavioral adjustment is flexible and sensitive to changes in nonvisual sensory inputs that allow subjects to succeed, even in the absence of visual cues. It should be noted that we never shaped the coordinated response; instead, it originated from exposure to the contingencies of reinforcement. This allowed us to assess the effects of visual interaction over the learning of joint actions with greater precision.
To our knowledge, the present task is the first choice protocol that entails a continuous free-operant response to study coordination in a mutualistic cooperation setting. The spatiotemporal aspects of our target response are consistent with the Boesch and Boesch (1989) definition of coordination, namely "each individual focuses similar actions in the same object and tries to relate in time and space to each other's actions" (p. 550). In this case, both subjects rolled a ball from end to end of the mutual gutter, continuously and without interruptions.
An additional relevant feature of our task is that the topography (including the continuity) of the target response is highly similar between options (individual and mutual), which allows for isolation and comparison of the social-nonsocial components of coordination. In the individual option, each subject coordinates its own behavior by manipulating and rolling the ball alone from end to end of the gutter. In the mutual option, both subjects adjust their actions jointly in time and space to successfully take the ball from end to end. This aspect of the task is an important contribution that we expect will foster the understanding of the costs of coordinating.
Even though two pairs of subjects learned to coordinate their actions, the choice of the mutual alternative reached, at most, levels close to indifference, which suggests that the intra-pair coordination of activities was costly. As Killeen and Snowberry (1982) have noted, "when access to mutual benefits requires the action of another organism, outcomes are probabilistic, and the existence of cooperation is more remarkable" (p. 359). In these situations, the future value of cooperating must be weighted (i.e., discounted) by the probability that coordination will occur. Furthermore, our task, like a relay race, requires sequential coordination. This imposes a delay on the coordinated response (the time each rat of the pair takes to roll the ball). A technical limitation of our experimental chamber is that it does not allow a quantitative measure of the cost of cooperating. Developing a system that allows for real-time tracking of the rats' behavior might be useful to evaluate the degree of discounting in mutual reinforcement contingencies.

In summary, the task described here allows for replicating coordination patterns in controlled settings as a product of the adjustment to mutual reinforcement contingencies, while requiring organisms to choose between social and nonsocial response options. Our preliminary findings illustrate the technical advantages of the proposed procedure and show promise for its future refinement and systematization. Ultimately, this protocol shows potential to identify the factors that affect organisms' sensitivity to the uncertain future benefits of coordination. For instance, further research could identify how the coordination of activities combines with the future value of outcomes to produce stable cooperative equilibria.