It Takes More Than Fish: The Psychology of Marine Mammal Training

The training of marine mammals is based on findings from comparative psychology, particularly those associated with the psychology of learning. In this paper, we examine the manner in which principles that were originally discovered in laboratory settings are now used in the training of marine mammals. These principles are used in a variety of training contexts, including teaching show behaviors at entertainment parks, husbandry, military applications, and research on cetacean cognition and communication. We also suggest future areas of research that would advance our understanding of marine mammal cognition and enhance the efficacy of existing training procedures.

The success of marine mammal trainers and marine mammal researchers in teaching a wide range of behaviors to a variety of marine mammals is based in large part on basic psychological principles, many of which are derived from the literature on learning. Publications concerning marine mammal training contain many terms and ideas familiar to comparative psychologists, including approximation, association, classical conditioning, discrimination, extinction, generalization, habituation, observational learning, operant conditioning, orienting response, schedules of reinforcement, secondary reinforcer, and shaping (de Groot, 1990; Kastelein, 1990; Pryor, 1975, 1995; Ramirez, 1999; Turner, 2002). In this paper, we will highlight the role of basic psychological principles in marine mammal training, focusing on the ways in which principles discovered in laboratory settings are applied in less controlled situations. We will also consider directions that future research might take in order to increase our understanding of marine mammal cognition and further improve the quality of marine mammal training techniques.

Habituation and Desensitization
Dolphins sometimes strand on beaches and are rescued and transported to a local marine mammal facility for rehabilitation. Stranded animals have most likely never eaten dead fish and are also likely to be leery of humans. One of the first problems human caregivers face in such cases is teaching the animal to consume dead fish. If the rescued animal does not eat, it will certainly die. One technique that is used to teach new arrivals to eat dead fish is to slap the fish on the surface of the water. This often results in an orienting response by the dolphin, and possible interest in the dead fish that the trainer is either dangling in the water or has left in the water for the animal to investigate. The orienting response produced when the trainer slaps the water results in the dolphin investigating the area where the sound was produced, which in turn provides the dolphin with the opportunity to examine the dead fish. As this process is repeated, the animal learns to eat dead fish. Once the animal has learned to consume the fish, the animal's interest in the area of slapped water can be reinforced by providing a fish or some other form of reinforcement. Consequently, the dolphin learns to approach the area indicated by the trainer slapping the water. In addition, the fact that the trainer provides reinforcement for this behavior helps to desensitize the animal to humans, and results in the dolphin readily approaching humans rather than avoiding them. Over time, then, this procedure teaches the dolphin a variety of things: (1) it learns to come to the site where a trainer has slapped the water; (2) it learns to consume dead fish; and (3) it learns to interact with humans.
The above example illustrates how the psychological principles that underlie the orienting response, habituation, and desensitization are used in marine mammal training. In fact, habituation and desensitization are used in many aspects of marine mammal training (Hurley & Holmes, 1998). Habituation is used to familiarize an animal with objects and situations that might otherwise distract it during shows, test sessions, or husbandry procedures. Desensitization is used to lessen animals' negative reactions to a variety of procedures, such as swimming through a gate into another pool (which many marine mammals hesitate to do when they first encounter an open gate) or allowing a decayed tooth to be drilled. In addition, the orienting response and habituation play important roles in the successes and failures of environmental enrichment programs (Kuczaj, Lacinak, & Turner, 1998; Lacinak, Turner, & Kuczaj, 1997). For example, the timing of enriching events is as important as the nature of the events. Simply placing and leaving objects in an animal's environment quickly results in habituation, and so the objects lose their enriching qualities.

Operant Conditioning and Marine Mammal Training
Although the strict behaviorist approach has been abandoned by much of psychology, including some contemporary learning theorists (e.g., Gallistel, 1990), there is no doubt that behavior can be modified by experience. In fact, this characteristic of behavior was the theoretical foundation for the principles advocated by learning theorists such as Watson (1930) and Skinner (1938).
Many marine mammal trainers consider themselves behaviorists. Part of the reason for this is the simple fact that much of the literature available to novice trainers emphasizes the principles of operant conditioning (de Groot, 1990; Kastelein, 1990; Pryor, 1975, 1995; Ramirez, 1999; Turner, 2002). More importantly, trainers learn through their own experience that operant conditioning works. The general principles used to gradually shape behavior in marine mammals are the same that Skinner used to train pigeons to play ping-pong or turn in complete circles. Nonetheless, there are important differences between marine mammal training and the traditional operant conditioning experiment.

Food Deprivation
One difference between marine mammal training and the traditional operant conditioning study concerns the use of food deprivation as a means to motivate animals to produce behaviors that result in food. It was common in many traditional learning laboratories to maintain animals at less than their normal body weight, and to use food as a primary reinforcer to shape a hungry animal's behavior, be it playing ping-pong, turning in a complete circle, or pecking a key. In contrast, contemporary marine mammal facilities make every effort to maintain the optimal weights and health of the marine mammals under their care. Most marine mammals are fed their allotted amount of food each day regardless of their performance during training sessions or presentations. Thus, food is less likely to be as important a primary reinforcer for marine mammals as it is for a food-deprived animal in a learning laboratory.

Reinforcement and Punishment
Another difference between marine mammal training and some operant conditioning studies involves the use of punishment. In general, operant conditioning uses two main techniques to change behavior. Reinforcement is used to increase the frequency of desired behaviors (Skinner, 1938). Punishment is used to decrease the frequency of undesired behaviors (Walters & Grusec, 1977), although severe forms of punishment may produce unexpected and even unwanted results, such as the suppression of overall behavioral activity or an increase in aggressive behavior (Azrin, 1960). Contemporary marine mammal training relies on reinforcement alone (Ramirez, 1999; Turner, 2002). The use of reinforcement and avoidance of the use of punishment is based on a number of factors. First, the use of reinforcement alone is effective. As a result, punishment is not needed to train new behaviors. Second, trainers typically form close relationships with individual animals, and would be reluctant to do anything to harm an animal. Third, the use of punishment can lead to dangerous situations, particularly when animals and trainers work in close proximity in unprotected scenarios, as is often the case in marine mammal training (Turner & Tompkins, 1990). Finally, marine mammals are admired by the public and protected by law. Federal regulations that enforce the Animal Welfare Act specifically prohibit "unnecessary discomfort" in the handling of marine mammals. Thus, the use of punishment has negative legal and public affairs implications.

Discrimination and Discriminative Stimuli
Although there are some important differences between marine mammal training and traditional operant conditioning experiments, there are also a number of similarities. In both settings, animals may learn stimuli that signal that certain behaviors, if performed, will be rewarded. Such stimuli are called discriminative stimuli, the effectiveness of which depends on both the animal's ability to readily distinguish different discriminative stimuli and the animal's ability to associate the discriminative stimulus with the desired behavior. This process is facilitated in marine mammal training when animals are cued to attend to upcoming discriminative stimuli. These cues are beneficial to both the animals and the trainers (or experimenters). Cues free animals from the need to be on the alert for discriminative stimuli at all times. Instead, they learn to look for discriminative stimuli after they have been asked to attend. In the case of marine mammals, this might involve calling the animal to station in front of the trainer and to attend to the trainer. At this point, the trainer can present a discriminative stimulus (e.g., a gesture or a tone) that informs the animal that a specific behavior has been requested (if the animal has learned this association through previous pairings of the discriminative stimulus and the desired behavior). If the animal produces the desired response, it will subsequently be rewarded.
In addition to making the task of learning easier for the animals, the use of discriminative stimuli in marine mammal training provides an opportunity to advance our understanding of these animals' cognitive abilities and at the same time further the use of discriminative stimuli. In order for discriminative stimuli to be effective, animals must be able to distinguish them from one another. At the same time, some differences may not be sufficiently salient to yield discrimination. For example, a dolphin trainer might use one gesture to denote a slow swim and another gesture to denote a high jump. For these two gestures to function as discriminative stimuli, the dolphin must be able to discriminate them. But at the same time, the dolphin must be able to ignore irrelevant differences. The trainer might produce the slow swim gesture slightly differently each time she produces it, and different trainers may also produce slightly different forms of the gesture. The dolphin must learn that these subtle differences are not significant. Investigations of the manner in which marine mammals decide whether discriminative stimuli are the same or different would increase our understanding of the way in which these animals categorize their world, and also help to determine the types of discriminative stimuli that are most useful in particular training contexts. For example, gestures produced by trainers seem to be particularly salient to marine mammals. It is possible that this type of information is particularly easy for dolphins to process, resulting in increased salience of gestures as discriminative stimuli. It is also possible that gestures are significant discriminative stimuli because the animal is focusing on the trainer as a source of information and reinforcement. Research is needed to tease apart these two possibilities. Previous research has shown that dolphins pay particular attention to the initial position of a gesture (Shyan, 1985) and that exposure to gestures as symbols may affect the types of processing strategies employed by dolphins (Shyan & Wright, 1993). Additional work is needed to further specify the salient information and processing strategies used when marine mammals are presented with different sorts of discriminative stimuli in different contexts.

Shaping
Both marine mammal training and learning laboratories use shaping to help animals learn correct responses. This involves teaching a new behavior by shaping it through the reinforcement of successive approximations to the desired behavior and the nonreinforcement of earlier approximations. For example, the early stages of dolphin training typically involve a dolphin learning to approach and touch a target pole. Initially, the dolphin is rewarded for approaching the general area of the pole. The criterion then becomes more stringent, and the dolphin eventually must touch the target in order to be reinforced. Once this has been accomplished, the dolphin learns to follow the pole by maintaining physical contact with the target as it moves. As this latter behavior is being trained, simply approaching and touching the target no longer results in reinforcement. The criterion for reinforcement has become more stringent, and now requires the animal to follow the target. Once the dolphin has learned to follow the target, it can be used to guide the animal to perform a specific jump by following the target through the air. Of course, each of these accomplishments occurs gradually, and involves increasingly stringent criteria in order for reinforcement to occur. In a very real sense, then, a giant leap that a dolphin produces on cue is based on many small steps.
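The logic of successive approximations described above can be sketched as a small simulation. The Python sketch below is illustrative only: the stage names and the rule for tightening the criterion (advance after three consecutive reinforced trials) are assumptions made for the example, not procedures reported in the training literature cited here.

```python
# Hypothetical shaping stages for target training, ordered from the loosest
# criterion to the most stringent. The names are illustrative assumptions.
STAGES = ["approach_area", "touch_target", "follow_target", "follow_through_air"]


def is_reinforced(behavior: str, current_stage: int) -> bool:
    """Reinforce only behavior that meets or exceeds the current criterion.

    Earlier approximations go unreinforced once the criterion has moved on.
    """
    return STAGES.index(behavior) >= current_stage


def run_shaping(observed, successes_to_advance=3):
    """Step through observed behaviors, tightening the criterion over time.

    Returns a list of (behavior, reinforced) pairs.
    """
    stage, streak, log = 0, 0, []
    for behavior in observed:
        reinforced = is_reinforced(behavior, stage)
        log.append((behavior, reinforced))
        streak = streak + 1 if reinforced else 0
        # Assumed rule: raise the criterion after a run of reinforced trials.
        if streak >= successes_to_advance and stage < len(STAGES) - 1:
            stage, streak = stage + 1, 0
    return log
```

In this sketch, a dolphin that approaches the target area three times is reinforced each time; on the fourth trial the same approach no longer earns reinforcement, whereas touching the target does.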

Behavior Chains
It is possible to train more complex behaviors by chaining simple behaviors together. A homogeneous chain involves a sequence of identical responses, as in the case of a dolphin producing a series of seven identical leaps above the surface of the water. This sort of chain is relatively easy to train if an animal has learned that a particular conditioned reinforcer, such as a whistle, indicates the end of a successful behavior. Once the dolphin has learned to produce a single leap in response to a specific cue, the number of leaps can be increased by combining two techniques: (1) using a target pole to cue the animal to perform another leap, and (2) delaying the use of the whistle until the required number of leaps has been performed within a specified period of time. In this way, a sequence of identical behaviors can be shaped.
Heterogeneous chains involve sequences of different behaviors. For example, one might wish for a sea lion to dive into a pool of water, swim the length of the pool, prop itself up on the edge of the pool, and roar at the crowd. In this case, the sea lion is first trained to dive into the water in response to a specific cue. By shaping the animal's behavior to swim across the pool after entering the water, diving into the water becomes the stimulus for the response to swim across the pool. Subsequent training results in the sea lion learning to associate swimming across the pool with propping itself up on the edge of the pool, and learning to associate propping itself on the edge with roaring at the crowd. Once this sequence has been learned, completing the entire sequence by roaring at the crowd results in some sort of reinforcement. This example involves what is known as forward chaining. The first behavior in the sequence is trained first, the second behavior is trained second, and so on.
It is also possible to train sequences using backward chaining. In our example, roaring at the crowd would be trained first, propping oneself on the edge of the pool trained second, and so on. Although backward chaining might seem counterintuitive, the rationale is that the first behavior to be trained is the one closest to the delivery of the reinforcer, and so is an easy association for an animal to remember as the sequence leading to the reinforcer increases. Contrast this with forward chaining, in which the association between a behavior and reinforcement changes as behaviors are added to the sequence. Early laboratory work suggested that backward chaining was the most effective technique (Ferster & Perrot, 1968). However, more recent work suggests that forward chaining can be as effective as backward chaining (Sulzer-Azaroff & Mayer, 1991). In marine mammal training, the most common way to train complex behaviors is to first train individual behaviors and then to combine them through backward chaining (Ramirez, 1999; Turner, 2002). Less is known about the effectiveness of forward chaining in marine mammal training. Research that compared the success of forward and backward chaining in marine mammal training would help to determine the sorts of information that marine mammals find easiest to learn.
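The difference between the two chaining orders can be made concrete by listing the partial sequences an animal would practice at each stage of training. The following Python sketch is a hypothetical illustration; the behavior labels are shorthand for the sea lion example, and the function name `training_stages` is an assumption of this sketch rather than established terminology.

```python
def training_stages(chain, direction="forward"):
    """Return the partial sequence practiced at each stage of training.

    Reinforcement is assumed to follow the last behavior of each partial
    sequence.
    """
    n = len(chain)
    if direction == "forward":
        # Stage i adds the next behavior to the END of the sequence, so the
        # behavior that immediately precedes reinforcement changes each stage.
        return [chain[: i + 1] for i in range(n)]
    if direction == "backward":
        # Stage i adds the next behavior to the START of the sequence, so the
        # final behavior always immediately precedes reinforcement.
        return [chain[n - 1 - i:] for i in range(n)]
    raise ValueError("direction must be 'forward' or 'backward'")


# Shorthand labels for the sea lion example in the text.
SEA_LION_CHAIN = ["dive", "swim", "prop", "roar"]
```

With backward chaining, every stage ends in "roar", the behavior closest to the reinforcer; with forward chaining, the behavior that directly precedes reinforcement shifts from "dive" to "swim" to "prop" to "roar" as the chain grows.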

What Makes Something Reinforcing?
The Nature of Reinforcement
Our discussion so far has emphasized the role of reinforcement in marine mammal training. There are two types of reinforcers: primary and secondary. Simply put, a primary reinforcer is one that meets some basic need and that works as a reinforcer without experience. Although this seems relatively straightforward, the notion of "basic" need can be problematic. For most mammals, basic needs certainly include oxygen, food, water, and warmth. Oxygen is reinforcing to an organism that needs to breathe, heat is reinforcing to an organism that is cold, water is reinforcing to a thirsty animal, and food is reinforcing to a hungry organism.
The need to reproduce might also be basic, and so sexual intercourse could be reinforcing. It is also possible that social animals need to be with others. If companionship is a basic need for social animals, then providing opportunities for social interactions would also be reinforcing. What all this means is that the correct use of a primary reinforcer requires an appreciation of the needs of individual animals at particular times. An animal that is hungry will find food quite reinforcing, but food will not be reinforcing for an animal that has recently eaten its fill. An animal may find it rewarding to be paired with another animal at certain times but not others. In order to be reinforcing, a reinforcer must fill a current need.
For example, Premack (1971) observed that thirsty and non-thirsty rats behaved differently when put in a cage with a running wheel and a drinking tube. As one might expect, the thirsty rats were more likely to drink than run. However, the non-thirsty rats were more likely to run than drink. The thirsty and non-thirsty rats also reacted differently when the running wheel only worked after the drinking tube had been licked. The thirsty rats licked the tube but did not run. The non-thirsty rats licked the tube, ran until the wheel stopped, licked the tube, ran until the wheel stopped, licked the tube, and so on. For the thirsty rats, licking the drinking tube was reinforcing in and of itself. But for the non-thirsty rats, running reinforced the licking of the drinking tube, rather than vice versa. Water, a potential primary reinforcer, was reinforcing only if a rat was thirsty. Otherwise, the opportunity to run on the wheel was a primary reinforcer. According to Premack (1965), an activity is reinforcing only if it is more rewarding than the behavior to be reinforced. For a thirsty rat, drinking is more rewarding than running, and so drinking reinforces the rat's running behavior. But for a non-thirsty rat, running is more reinforcing than drinking, and so running reinforces drinking.
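Premack's rule amounts to a comparison of the current value of two activities. The Python sketch below is a minimal illustration under stated assumptions: the numeric preference values are invented for the example (they are not data from Premack's studies), and the function name `reinforcing_relation` is hypothetical.

```python
def reinforcing_relation(preference, a, b):
    """Say which of two activities can reinforce the other.

    `preference` maps each activity to its current value for the animal;
    the more valued activity can reinforce the less valued one.
    """
    if preference[a] > preference[b]:
        return f"{a} reinforces {b}"
    if preference[b] > preference[a]:
        return f"{b} reinforces {a}"
    return "neither reinforces the other"


# Invented preference values: the same two activities, two motivational states.
THIRSTY = {"drinking": 0.8, "running": 0.2}
NON_THIRSTY = {"drinking": 0.2, "running": 0.8}
```

The same pair of activities yields opposite relations in the two states, which is the point of the example: what counts as a reinforcer depends on the animal's current state rather than on the event itself.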
If primary reinforcers meet basic needs, then what are secondary reinforcers? Strictly defined, secondary reinforcers are stimuli that were once neutral but have acquired reinforcing qualities because of their association with other reinforcers. For example, the word "good" can become a secondary reinforcer if a trainer consistently pairs the word "good" with a fish given to a hungry sea lion after the sea lion has produced the desired response. At this point, the word itself will reinforce desired behavior, and so provide reinforcement in the absence of a primary reinforcer (Ramirez, 1999; Turner, 2002).
It is important to remember that secondary reinforcers acquire their reinforcing qualities because of their association with other reinforcers. Touch (e.g., rubbing an animal's skin) and toys (e.g., a ball) are often used as reinforcers with marine mammals, and are oftentimes considered secondary reinforcers. But they are not secondary reinforcers unless they have acquired their reinforcing qualities by being paired with other reinforcement. If touch is reinforcing to an animal because it satisfies a need of the animal and does so without being paired with other reinforcement, it is a primary reinforcer. The same can be said for toys and other forms of stimulation (see Bekoff & Byers, 1998; Kuczaj et al., 2002).
Of course, it is possible for any reinforcer to lose its reinforcing properties if the organism becomes satiated. Certainly, food loses its reinforcing quality once an animal has eaten its fill. Similarly, animals that value tactile contact might be reinforced by touch, but too much tactile contact could cause touch to lose its reinforcing status. It would be valuable for trainers to know the changing reinforcement values of particular reinforcers during a session, a day, a week, or even a month, but there is precious little work on this with any species of marine mammal.

The Timing of Reinforcement
Reinforcement is most effective if it immediately follows the response to be reinforced (Skinner, 1938). This is not always possible, particularly in a training context. As noted above, secondary reinforcers are once-neutral stimuli that have become associated with other reinforcers. This allows them to be effective substitutes for primary reinforcers. The significance of tokens as secondary reinforcers was demonstrated by Cowles (1937) and Wolfe (1936). Chimpanzees in these studies were as likely to produce behaviors that resulted in tokens as they were to produce behaviors that resulted in food, and produced behaviors for tokens even if there was a delay between obtaining the token and exchanging it for food. If chimpanzees housed together were given tokens, begging for tokens and stealing tokens were both observed.
The effectiveness of secondary reinforcers was also demonstrated by Kelleher (1958), who required chimpanzees to press a key 125 times in order to receive one token. When a chimpanzee had obtained 50 tokens and put them into a slot, it received a food reward. The chimpanzees therefore had to produce 6250 key presses (125 × 50) in order to receive a reward. Their willingness to do so was facilitated by the tokens that functioned as secondary reinforcers.
The natural tendencies of animals to behave in certain ways must be taken into account when designing secondary reinforcers. Breland & Breland (1961) reported a phenomenon they called instinctive drift, which occurs when animals produce behaviors normally associated with a specific activity such as foraging. For example, Breland & Breland attempted to train a raccoon to pick up two coins and insert them into a container, after which the raccoon would receive a food reinforcer. This proved difficult because the raccoon became less and less willing to put the coins into the box. As the raccoon learned to associate the coins with food, the coins became a substitute for food (a conditioned stimulus). Raccoons like to rub the food they are holding, and so once the coin became a conditioned stimulus for food, the raccoon rubbed the coins together rather than letting them go. In this case, the raccoon's natural tendency to rub its food resulted in learning that was counterproductive to the goals of the trainers. Similar results were obtained with rats that needed to deposit a metal ball into a slot in order to obtain food (Boakes et al., 1978). Instead of placing the ball in the slot, the rats manipulated the ball with their paws and gnawed on it, behaviors they would have performed with food. Strikingly, the longer the animals went without food, the more likely they were to treat the ball as if it were food rather than as an object they could use to obtain food. Timberlake, Wahl, and King (1982) demonstrated that instinctive drift occurred in both instrumental and classical conditioning procedures. They placed rats in a chamber in which a metal ball rolled across the floor during test trials. Regardless of whether the rats had to intercept the ball in order to obtain a food reward (instrumental conditioning) or simply witness the ball roll across the floor before receiving food (classical conditioning), the rats persisted in handling the ball and treating it as food.
The behavior described in the above paragraph is related to the phenomenon of autoshaping. Although the word "shaping" suggests operant conditioning, autoshaping is actually related to classical conditioning. Autoshaping occurs when an animal alters its own behavior in response to a stimulus. For example, consider a hungry pigeon that is in a cage where a response key is illuminated for a brief period and then turned off, after which food is dropped into a feeding tray (Brown & Jenkins, 1968). The pigeon need not respond in any way to the lit response key. The food will appear regardless of what the pigeon does. Despite this, the pigeon will peck the key. The illuminated key predicts that food will appear. The unconditioned response to food is to peck (and eat) it. The conditioned response to the conditioned stimulus (the illuminated key) is to peck at the key because it is a substitute for the unconditioned stimulus. Similarly, in our earlier examples of the raccoon and the rats, the coin and ball became conditioned stimuli for the unconditioned stimulus of food, and so were consequently treated as if they were food. One example of autoshaping in marine mammals involved a dolphin that heard an artificial whistle and was given a fish (Sigurdson, 1993). This dolphin imitated the artificial whistle whenever it heard the whistle, even though such mimicry was not necessary in order to receive the fish.
Despite the potential problems that particular objects or events might cause in various learning contexts due to instinctive drift, the appropriate use of secondary reinforcers is essential in marine mammal training. Secondary reinforcers are effective because they provide feedback that correct responses are being made, sometimes act as cues for the next responses to be performed, and help to maintain the association between the behaviors being performed and the reinforcement to be received (Domjan, 1996). For example, dolphin training typically involves dolphins learning that a particular type of whistle means to return to the trainer and receive a reward (Ramirez, 1999; Turner, 2002). In such cases, the whistle indicates that the correct response has been made, and helps the dolphin to associate the reward it will receive from the trainer with the behavior that is being reinforced. This procedure can be used to increase the interval between the desired behaviors and reinforcement without hampering the effectiveness of the reinforcement. In marine mammal training, the use of a whistle to indicate a correct response is often called a bridge because it helps to bridge the gap between a behavior and reinforcement (Pryor, 1975; Ramirez, 1999; Turner, 2002).

The Significance of Reinforcement Expectations
As a result of their experiences, animals come to expect certain things to happen in certain contexts. The importance of expectation in learning has been demonstrated in a number of studies. Tinklepaugh (1928) had monkeys choose one of two containers in order to obtain a food reward. If the monkeys chose correctly, they received a piece of banana that was located in the container. As one might expect, they quickly learned to choose the correct container. Tinklepaugh then substituted a less desirable food (a lettuce leaf) for the banana. When they looked in the container, the monkeys acted as if they were surprised and angry. They certainly seemed to have expected a specific reward. More recently, Watanabe (1996) reported data consistent with the notion that monkeys learning an operant conditioning task come to expect specific types of food rewards.
Elliot (1928) investigated how a change in the type of reward affected rats' maze running performance. One group of rats was reinforced with a low quality food as they learned to run the maze. Another group of rats was reinforced with a high quality food as they learned to run the maze. The rats that received the high quality food learned to run the maze more quickly than did the rats in the other group. Thus, the nature of the reinforcement affected learning. Elliot then changed the food reward for the high quality group to the same low quality food as the other group. The results were dramatic. The performance of the rats whose reward type had remained constant continued to improve (albeit only slightly). However, the performance of the rats whose food reward had been lowered in quality deteriorated significantly. They had learned to expect a higher quality reward, and when this expectation was violated their behavior was adversely affected. Results such as these led Tolman (1932) to conclude that animals form expectations that particular responses will be followed by particular outcomes. If these expectations are violated, the animal's behavior is affected.
It is possible for violated expectations to lead to improved performance. This was demonstrated by Mellgren (1972). In this study, the amount of food rats received once they reached the end of a runway was changed after the rats had become accustomed to receiving either two food pellets or 22 food pellets. If the change involved a positive behavioral contrast (an increase from two food pellets to 22 food pellets), the rats increased their running speed and so reached the end of the runway more quickly. However, if the change involved a negative behavioral contrast (a decrease from 22 food pellets to two food pellets), the rats decreased their running speed, and so took longer to reach the end of the runway.
The important point of all this is that the effectiveness of a reinforcer depends at least in part on the organism's previous experience with reinforcers (Flaherty, 1996; McSweeney & Melville, 1993; Williams, 1997). If an organism expects more than it receives, performance suffers. But if an organism receives more than it expects, performance is enhanced.
One way to reduce the possibility that expectations will influence behavior in undesirable ways is to vary the schedule of reinforcement (Ferster & Skinner, 1957). Rather than reward an animal following every correct behavior, a variable schedule might reward the animal following two correct behaviors, next reward the animal following five correct behaviors, and then reward the animal for one correct behavior. Variable schedules are commonly advocated for use in marine mammal training in order to reduce the predictability of the training situation and maintain the animal's interest (e.g., Ramirez, 1999; Turner, 2002). However, the frequent use of a bridge during training sessions may produce a more continuous schedule of reinforcement than is currently believed. Given that a bridge is a secondary reinforcer, its occurrence is in fact reinforcing to the animal, and so results in less variable schedules of reinforcement than might be intended. If the whistle were used as a marking stimulus rather than a secondary reinforcer, it would not affect the variable status of a reinforcement schedule. Marking stimuli are not reinforcing, but instead help to make the "marked" behavior more memorable during the delay between its occurrence and the reinforcement (Lieberman, McIntosh & Thomas, 1979; Thomas & Lieberman, 1990). Investigations of the relative effectiveness of whistles as secondary reinforcers and as marking stimuli would provide valuable information for the next generation of training paradigms in marine mammal training.
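The variable schedule described in this section (reward after two correct behaviors, then five, then one) can be illustrated with a short Python sketch. This is a simplified illustration, not a description of any trainer's actual procedure: the function names are hypothetical, and drawing each requirement uniformly between 1 and 2 × mean − 1 is just one assumed way of producing a variable-ratio schedule with a given mean.

```python
import random


def variable_ratio_requirements(mean_ratio, n_rewards, seed=None):
    """Draw how many correct responses are required before each reward.

    Assumption: requirements are uniform on 1 .. 2 * mean_ratio - 1, which
    averages to mean_ratio.
    """
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(n_rewards)]


def rewarded_trials(requirements, n_correct):
    """Return the 1-based correct responses that earn a reward."""
    rewarded, count = [], 0
    pending = iter(requirements)
    need = next(pending, None)
    for trial in range(1, n_correct + 1):
        if need is None:  # schedule exhausted
            break
        count += 1
        if count == need:
            rewarded.append(trial)
            count = 0
            need = next(pending, None)
    return rewarded
```

For the schedule in the text, `rewarded_trials([2, 5, 1], 10)` rewards the second, seventh, and eighth correct behaviors. If a bridge that functions as a secondary reinforcer follows every correct behavior, the effective schedule is closer to `[1, 1, 1, ...]`, which is the concern raised above.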
In addition to trying to use a variable schedule of reinforcement, many trainers use a variety of reinforcers in order to further reduce the predictability of training sessions and other forms of animal-human interaction. Variable reinforcement schedules and variable reinforcers help to maintain an animal's interest and increase the possibility that learned behaviors will remain in the animal's behavioral repertoire. Persistence of behavioral responses is more likely to occur following a variable reinforcement schedule than a continuous reinforcement schedule. This is called the partial reinforcement extinction effect (Humphreys, 1939), and has been explained in terms of the memories of rewarded and nonrewarded trials (Capaldi, 1967; Capaldi, Alptekin & Birmingham, 1996) and the frustration associated with an unpredictable schedule (Amsel, 1958, 1992). Note that this means that associative strength per se does not determine extinction rate (Domjan, 1996). Otherwise, we would expect continuous reinforcement to result in more persistence because of stronger associations. But this does not happen.

What is Reinforcing to Marine Mammals?
Although marine mammals are often given fish following a correct response, they are also given other types of reinforcement. These might include secondary reinforcers such as verbal praise, a whistle, or a click produced by a clicker. Other reinforcers include tactile stimulation and toy objects that the animals can manipulate (such as balls or seaweed). Whether these objects are primary or secondary reinforcers depends on the animal's history and current state.
Given that marine mammals are not food deprived, the reinforcing qualities of fish are likely to change throughout the course of a day as an animal becomes satiated, then becomes hungry again, eats its fill again, and so on. It seems clear, then, that food is not always important as a primary reinforcer for marine mammals. If this is so, why do they continue to "work" for fish? One possibility is that fish have become secondary reinforcers by virtue of being associated with other reinforcers (e.g., tactile stimulation). Recall that in Premack (1971) the nonthirsty rats licked the water tube in order to be able to run, a demonstration that the reinforcing quality of an event depends on the animal's state, not on the event itself. Marine mammals that are not hungry may learn that fish indicate a correct response, and so accept fish as they would a whistle "bridge." If this is true, then food sometimes functions as a secondary reinforcer. If such is the case, marine mammals cooperate during training sessions for reasons other than food. These may include the opportunity to interact with the trainer, and the physical and mental stimulation provided in training sessions (see Kuczaj, Lacinak & Turner, 1998, for a discussion of how training and research sessions may enrich a captive animal's life).
In fact, it is possible to train marine mammals without using food at all. A young killer whale was trained to perform a variety of show and husbandry procedures during a six-month period without the use of food (Lacinak & Kuczaj, 2003). The killer whale was reinforced with tactile stimulation (e.g., rubbing), interactions with trainers, and a variety of objects. The killer whale was fed on a random schedule throughout the day, but food was not provided during shows or training sessions. This animal learned the required behaviors as quickly as did other whales, demonstrating that food is not necessary to train marine mammals. Such training places a tremendous burden on trainers to design sessions that are rewarding and interesting, and so is unlikely to replace training that uses fish as the primary source of reinforcement. Nonetheless, as noted earlier, marine mammals are fed regardless of what they do during training sessions, and so it is unlikely that food is as significant a reinforcer as many trainers believe. Additional research is needed to determine the types of rewards that individual mammals prefer, the conditions under which these preferences occur, and the effects of these preferences on training.

Conclusions
We have seen that marine mammal training is based on a number of psychological principles. Nonetheless, there are a number of areas in which additional information would improve our understanding of the marine mammal training process and perhaps increase the overall quality of training. In particular, information that would increase animals' ability to learn new behaviors and to remember what they have learned would benefit both animals and trainers, and also add to our understanding of marine mammal cognition. Training procedures that keep animals interested in the learning process and pique their natural curiosity will be the most successful, and a better appreciation of marine mammal motivation and cognition would enhance our ability to create training paradigms that incorporate animals' natural propensities to learn.
We have noted several areas for future research in earlier parts of this paper, including the reinforcing qualities of food, the significance of other types of reinforcers, the relative effectiveness of secondary reinforcers and marking stimuli, and the processes used to categorize and differentiate discriminative stimuli. We would like to end by emphasizing the need for additional research on four topics: failure, individual differences, observational learning, and reinforcement schedules.
In any learning situation, the organism is likely to produce some incorrect responses. What are the effects of these failures on learning? We assume that they are inconsequential if they are relatively infrequent, but what if the animal is learning a difficult task and so produces many incorrect responses in early training sessions? How does this affect the animal's willingness to learn the task? Information concerning the number of errors that marine mammals can tolerate without disrupting their motivation to learn would allow trainers to design flexible training sessions that maintain an animal's interest and thus optimize learning.
There are individual differences among marine mammals and among marine mammal trainers. For example, some trainers are more enthusiastic than others. Little is known about the influence of trainer style and attitude on marine mammal learning, but we suspect that individual differences among trainers interact with those among animals to produce different learning outcomes. Most marine mammal facilities emphasize the significance of the relationship between animal and trainer for animal training and well-being. It seems likely that individual differences affect this relationship. Determining how this influences learning outcomes is a worthy topic for future research.
Marine mammals can learn via observation (Kuczaj et al., 2002; Turner, 2002; Xitco, 1988). Young animals seem more likely to imitate the behavior of others than do older animals, but animals of all ages seem capable of observational learning. Some animals are more likely to be imitated than others. For example, dolphin calves are likely to mimic behaviors of their mothers but are even more likely to imitate the behaviors of older calves (Kuczaj et al., 2002). A better understanding of who and what are likely to be imitated would add to our understanding of marine mammal cognition, and facilitate the incorporation of observational learning into training paradigms.
Finally, additional research is needed on the effectiveness of different schedules of reinforcement for marine mammal training. Comparisons of the typical variable schedule used in marine mammal training (variable use of fish and other reinforcers, but frequent use of the most common secondary reinforcer, a bridge), an identical schedule that used marking stimuli rather than a bridge, and a straightforward variable ratio schedule that did not use secondary reinforcers at all would help to determine the most effective type of reinforcement schedule to use with marine mammals. If the results of such research were combined with those on the relative effectiveness of different sorts of primary and secondary reinforcers, the resulting training procedures would be more likely to maintain an animal's interest and result in the desired learning objectives. At the same time, these results would provide additional peeks into the minds of marine mammals.
Throughout this paper, we have emphasized the contributions that findings from comparative psychology have made to the training of marine mammals. We have also suggested additional research on marine mammal learning that could benefit both training and science. We look forward to learning more about training and marine mammal cognition, and hope that a volume on applied comparative psychology produced five years from now will contain answers to some of the questions we have posed.