Comparisons of Animal “Smart” Using The First Four Stages of the Model of Hierarchical Complexity

The Model of Hierarchical Complexity is a behavioral model of the development and evolution of the complexity of behavior. It is based on task analysis. Tasks are ordered in terms of their hierarchical complexity, which is an ordinal scale that measures difficulty. The hierarchical difficulty of tasks is categorized as the order of hierarchical complexity. Successful performance on a task is called the behavioral stage. This model can be applied to nonhuman animals and humans. Using data from some of the simplest animals and also somewhat more complex ones, this analysis describes the four lowest behavioral stages and illustrates them using the behaviors of a range of simple organisms. For example, Stage 1 tasks are addressed with automatic unconditioned responses. Behavior at this stage includes sensing, tropisms, habituation, and other automatic behaviors. Single-celled organisms operate at this stage. Stage 2 tasks include these earlier behaviors and add respondent conditioning but not operant conditioning. Some simple invertebrates have shown respondent conditioning but not operant conditioning. Stage 3 tasks coordinate five instances of these earlier tasks to make operant conditioning possible. These Stage 3 performances are similar to those of some invertebrates. At Stage 4, organisms coordinate two or more circular sensory-motor task actions into a superordinate concept. This explanation of the early stages of the Model of Hierarchical Complexity may help future research in animal behavior and comparative psychology.

... respondently conditioned by Thompson and McConnell (1955). The NS used in this study was light from a light bulb. The unconditioned response to the light (UCR1) was a low (10–30%) rate of turn responses and a very low (<5%) contraction rate. The UCS used in this study was shock. The unconditioned responses to the shock (UCR2) were a sharp turning of the cephalic (head) region to one side or the other and a longitudinal contraction of the entire body.

How did complex behavior evolve? Adaptation to different circumstances has a strong behavioral component. This paper uses a behavioral-developmental model called the Model of Hierarchical Complexity to address the complexity of behavior required for the first four major behavioral adaptations. Here, complexity is a measure of the difficulty of tasks that animals successfully address. The difficulty of tasks that an animal solves is used here as an approximation of how "smart" an animal is.
The field of comparative cognition presents results from studies that have included mechanisms and origins of cognition in a large range of animal species. The comparisons made between animals often consist of (a) comparisons among performances of a small number of species on an identical or largely equivalent task, (b) less direct comparisons between species on tasks that are in the same domain but in which the procedures were not identical or even necessarily equivalent, and (c) the many illustrations of unique adaptations of different species to particular ecological niches. Based on these previous types of studies, it has been difficult to form any kind of comprehensive view of the sequence of development and evolution of thought and behavior in animals. These approaches have the limitation of not providing any way to sequence animals quantitatively or even in general. The Model of Hierarchical Complexity provides a cross-species sequence of task difficulty, which can be used to make predictions about which species will perform which tasks. This model is elaborated on in detail later in this paper.
Comparative psychology (i.e., the study of similarities and differences in capacities for environment adjustments and for behavioral organization among all living beings) came about in steps. Pavlov's experiments with dog salivation led to the discovery of classical conditioning (1897, 1927). The empirical stimulus generalization of classical conditioning helped researchers understand the origin of novel responses. Thorndike (1898, 1913) argued that learning comes from the process of trial-and-error selection and connection of reactions through a mechanical law of effect. Thorndike had a broad view in that he believed that the most rudimentary techniques could lead to the most intricate learning. As such, consciousness and other nonmechanistic explanations were excluded from Thorndike's theories. Thorndike's mechanistic view supported further work in behavioral modeling, and his broad view was a precursor to developmental task analysis. Yerkes' theories differed from Thorndike's in that he treated consciousness as relevant to comparative psychology. In his view, consciousness exists in all life to varying degrees across five distinct ascending levels. He attempted to correlate these levels with bodily and neural evolution, and he diagrammed them as a psychophylogenetic tree (Yerkes, 1911). The animals he studied for the project included earthworms, crawfish, crabs, frogs, turtles, crows, dancing mice, pigs, monkeys, gorillas, chimpanzees, and orangutans (Razran, 1971). Yerkes' psychophylogenetic tree and the cross-species empirical work he did to demonstrate it supported further work in comparative psychology.
After Yerkes' work, there were multiple advances in theories of conditioning. Skinner differentiated operant conditioning from classical (respondent) conditioning. In the Thorndikean tradition, he replaced salivation with lever presses. The discovery of operant conditioning allowed for models that demonstrated the origin of novel behaviors. Clark Hull formalized behavioral research by modeling behavior as a system of intervening variables (1937). Hull's work allowed the field of the quantitative analysis of behavior to thrive. All of this work is valuable to comparative psychology, as well as to task analysis generally, but there are limitations.
Attempts to coordinate the disparate findings of different comparative psychology researchers have been met with varying success. Razran (1971) wrote a guide to the research on the early steps in learning conditioning. Schneirla wrote many papers on the methodological concerns in comparative psychology to help ensure that researchers are measuring what they claim to be measuring (Aronson, Tobach, Rosenblatt, & Lehrman, 1972). Such work was an important starting point, but it did not provide a general model for comparing the tasks that different species perform.
In comparative cognition, historically, there has been success in recording tasks that animals successfully address. However, these findings tended to be specific to the species or the small number of species being studied. For example, hypothetically, an ape and a parrot solve the same puzzle box. This finding might be interesting and may support further work. However, with a quantitative general model of task difficulty, one could make predictions about tasks that either species could successfully perform. Without quantitative general modeling, the field of comparative cognition is left with taxonomic ways of addressing comparisons. The taxonomic approach has limited usefulness in the prediction and control of behavior.
With the addition of stages of evolution and development, it is possible to have a mathematically clear basis of what variables are operative in changing the difficulty of tasks that animals can successfully address. Evolutionarily informed mathematical modeling allows for the identification of more general variables that predict success or failure on particular tasks.
Another problem with many comparative cognition studies is that they usually address a limited number of species, most often two or three. The problem with having so few species is that the findings are not generalizable beyond the species included in these small comparative studies. The combination of few species and the lack of a general model makes it difficult to apply such research toward a general understanding of behavior.
The issue with the research on an individual species or a small number of species is that there is no systematic and general way to relate the different tasks. It also becomes difficult to understand precisely what is required to successfully perform a task and in what ways different animals might actually differ. The lack of a general scale that would allow for task comparison makes it difficult to make sense of behavior or development within a species, as well as evolutionarily-based differences between species.

The Model of Hierarchical Complexity (MHC)
The Model of Hierarchical Complexity was formally proposed as a general model of development in 1984 (Commons, Richards, & Armon, 1984) as an expansion of observations first outlined by Inhelder and Piaget (1958). The model's specific application to nonhuman animals started in 1999, with an analysis of the behavior of great apes (Miller, 1999). Since then, several analyses of nonhuman animals using the MHC have been performed. These analyses range from specific examinations of the behavior of single-celled organisms (Commons & Jiang, 2014) to addressing questions of cross-species general intelligence (Commons & Ross, 2008). The Model of Hierarchical Complexity was expressed with formal mathematics by Commons and Pekker (2004). This formalization was refined by Commons, Gane-McCalla, Barker, and Li (2014). The goal of the current paper is to give a more detailed account of the first four stages of the Model of Hierarchical Complexity.
This model allows for a scaling of tasks in terms of their difficulty (Commons & Pekker, 2004; Commons & Richards, 2002; Commons, Trudeau, Stein, Richards, & Krause, 1998). The paper describes the basic features of the model first and then presents some of the supporting research for this model. It then moves to an application of the model toward understanding some of the less hierarchically complex tasks that animals have been observed performing. The model assesses a general, unidimensional, evolutionary developmental measure of task difficulty across all tasks and domains. The core of the model is task analysis. Tasks in different domains form a sequence from simpler tasks to more complex tasks. This difficulty of task is operationalized based on what is called the order of hierarchical complexity (OHC) (Commons et al., 1998).
The success of animals in addressing these tasks will be discussed. The model proposes the basis for the often observed stage-like nature of development. Using the MHC, a hierarchically organized task sequence can be shown from simpler to more hierarchically complex, that is, from less difficult to more difficult. It eliminates the need to appeal to notions such as mental structures or cognition. The model can also be used to assess performance on any kind of task. Therefore, it is context-free and can be used for general comparisons across all species (Commons & Pekker, 2004; Commons & Richards, 2002; Commons et al., 1998).
The hierarchical difficulty of tasks is categorized as the Order of Hierarchical Complexity. The OHC scales task difficulty by seeing if a higher-order task is defined in terms of two or more adjacent lower order ones and organizes them in a non-arbitrary way.
The stage of performance of an animal is specified by the most hierarchically complex task that the animal successfully addresses. Using a more general theory, such as the MHC, should allow for a comparison of a far wider range of species. It should also allow for comparisons of the difficulty of the tasks that an animal has performed.
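The definition of stage of performance can be sketched in a few lines of code. The following is a minimal illustration, assuming that each observation is recorded as a pair of task order and success; the record format and the function name `stage_of_performance` are hypothetical and are not taken from the MHC literature.

```python
from typing import Iterable, Optional, Tuple


def stage_of_performance(observations: Iterable[Tuple[int, bool]]) -> Optional[int]:
    """Return the highest task order the animal addressed successfully.

    Each observation is (order_of_task, succeeded). Returns None when no
    task was addressed successfully. Hypothetical helper for illustration.
    """
    successful_orders = [order for order, succeeded in observations if succeeded]
    return max(successful_orders) if successful_orders else None


# Hypothetical records: success on Order 1 and Order 2 tasks, failure on Order 3.
records = [(1, True), (2, True), (3, False)]
print(stage_of_performance(records))  # 2
```

Under these assumed records, the animal would be described as performing at Stage 2, because the Order 3 task was not successfully addressed.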
The mathematical measurement of the complexity of behavior possibly starts with Claude Shannon (1948). Shannon's conception of complexity can be loosely translated as the total number of bits. In the MHC, this is termed horizontal complexity. An example of horizontal complexity is that in humans, children first learn to add together single digits. While they gradually add larger and larger numbers, they continue to engage in addition (Commons et al., 1998). They simply execute it for larger numbers. An example of a transition between stages of hierarchical complexity is when a child goes from counting to performing addition (Commons, Miller, Goodheart, & Danaher-Gilpin, 2005). The successful completion of a task of a given Order of Hierarchical Complexity is referred to as stage. The model shows that stage-like performances are highly predicted by the OHC of the task, with r's of up to .98.
The MHC is a model used for the sequencing of the difficulty of tasks. In order to understand the model, it is first useful to think of the environments in which animals live as consisting of a series of tasks. The tasks generally are organized around preferred outcomes, such as tasks that are food related, those that are related to producing the next generation, those related to maintaining one's own life, those related to successfully rearing one's young, etc. Not all preferred outcomes will be relevant for all animals. One other important observation about tasks is that there is almost always a sequence of tasks required to attain a preferred outcome. That is, one can observe that initially young organisms either do not carry out certain tasks at all or they carry them out imperfectly. For many animals, one can see the difficulty of tasks that an organism undertakes as developing over time. The different steps in a hierarchical sequence of task complexity are referred to as orders. These orders form an equally spaced unidimensional ordinal scale, which is called the Order of Hierarchical Complexity.
The MHC provides a way of determining the difficulty of a task a priori. The difficulty includes what constitutes a task and how they are related (i.e., environmental stimuli, responses, and reinforcers or punishers). Measures of difficulty may be confirmed by the probability that a task is completed. For example, a rat in a Skinner box is presented with the environmental stimulus of the lever and responds by pressing it. Pressing the bar may be reinforced by a food pellet. All of these elements are required for consistent bar pressing. The bar must be there, the rat must press it, and it will not press it consistently without a reinforcement. What the model adds to this analysis is the coordination of measurable lower order tasks.

Three Definitions of the Order of Hierarchical Complexity
1. Higher order actions are defined in terms of two or more adjacent lower order actions. This forces the hierarchical nature of the relations and makes the higher order task include the lower ones. In mathematical terms, this is the same as a set being formed out of elements. This creates the hierarchy. A = {a, b}, in which a and b are "lower" than A and compose set A. A ≠ {A, ...}, in which the set A cannot contain itself. This means that higher order tasks cannot be reduced to lower order ones.
2. Higher order of complexity actions organize those adjacent lower order actions. This makes them more powerful. In the simplest mathematical terms, this is a relation on actions. The relations are order relations: A = (a, b) = {a, {b}}, an ordered pair.
3. Higher order of complexity actions organize those lower order actions in a nonarbitrary way. This means that there is a match between the model-designated orders and the real-world orders. This can be written as Not P(a, b), or not all permutations are allowed (Commons & Pekker, 2004).
Order n+1 is a more difficult task than Order n and a less difficult task than Order n+2.
As shown in Figure 1, a higher order action is defined as follows: (1) it is defined in terms of the adjacent task actions from the next lower order of hierarchical complexity, (2) the higher order task action organizes two or more adjacent task actions at the next lower order of hierarchical complexity, and (3) the ordering of the lower task actions has to be carried out in a nonarbitrary way.
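The three definitions can be illustrated with a small data structure. The sketch below is one possible reading under stated assumptions, not the formalization of Commons and Pekker (2004): the `Task` class, the `order_of` function, and the placeholder task names are hypothetical, a task with no components is assumed to be Order 1, and "nonarbitrary" is approximated by treating the components as an ordered tuple so that a permuted arrangement counts as a different task.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Task:
    """A task action; `components` is an ordered tuple of lower order task actions."""
    name: str
    components: Tuple["Task", ...] = ()


def order_of(task: Task) -> int:
    """Order of hierarchical complexity under the three definitions (sketch)."""
    if not task.components:
        return 1  # assumption: a task with no components is treated as Order 1
    if len(task.components) < 2:
        raise ValueError("a higher order task must organize two or more actions")
    component_orders = {order_of(c) for c in task.components}
    if len(component_orders) != 1:
        raise ValueError("components must all come from the same adjacent lower order")
    return component_orders.pop() + 1


# Hypothetical Order 1 actions (names are placeholders).
a = Task("a")
b = Task("b")
A = Task("A", (a, b))          # Definitions 1 and 2: A is defined by, and organizes, a and b
A_swapped = Task("A", (b, a))  # same elements, different arrangement

print(order_of(A))         # 2
print(A == A_swapped)      # False: the arrangement of components matters (Definition 3)
```

Because the instances are immutable and built from already existing components, a task in this sketch cannot contain itself, which mirrors the condition A ≠ {A, ...}.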
Differences in the physiology of the researched organism and the apparatus used to stimulate the organisms are important to the validity of comparison studies. This paper focuses on hierarchical complexity of tasks and behavior that successfully addresses them as a means to compare task learning in different organisms and research paradigms. The effects of physiology and research paradigm unrelated to hierarchical complexity of behavior are beyond the scope of this paper.
An example of the MHC being applied to animal behavior is the finding that hierarchical complexity of tasks performed correlated with the number of neurons a member of a species has with an r of .87. Harrigan and Commons (2014) used this finding to argue that the rate of reinforcement received for higher stage behavior is the reason that brains with more neurons evolved.
What remains to be further explored is the extent to which this model might apply to tasks at the lower orders of complexity. Ultimately, having researchers create sequences of tasks and test animals on those tasks would provide the highest standard of evidence. A not unreasonable first step, however, is to examine the behaviors of animals as reported by many different researchers to see to what extent they conform to the predicted orders of complexity. The examples in this paper use only the behavior of animals that are limited to correctly completing tasks of a given order and that do not show any higher stage action.
The MHC is rooted in 20th century developmental psychology research. There are extensive studies showing the utility of applications of stage theories in human beings (e.g., Inhelder & Piaget, 1958; Kohlberg, 1981). Ideas from these and other researchers were applied to construct the MHC.
This paper applies a relatively new measurement (Commons & Pekker, 2004; Commons & Richards, 2002; Commons et al., 1998) to all tasks of all animals. Any task that is sufficiently well-described can be analyzed and broken down into the coordination of its component lower-order tasks using the MHC. To do so, the process and tools of the measurement theory of the MHC are illustrated. This illustration is done by analyzing examples from other studies that have not previously been considered in this way. The measurement is not dependent on the content of the task or on the organism, so the examples can be generalized to all animals and all task content.
Many of the ideas in this paper have been well-established in previous work but not applied to animals (Commons & Pekker, 2004; Commons & Richards, 2002; Commons et al., 1998). The MHC's application to comparative psychology is relatively new but has appeared in other work (Commons & Giri, 2016; Commons & Jiang, 2014; Commons & Ross, 2008). The five respondent steps of operant conditioning are original to this paper and another contemporaneous paper (Commons, Miller, Malhotra, & Wei, 2019). These ideas are derived in part from Commons and Giri (2016) but have been developed more in the current paper. For example, two steps have been added in the construction of Stage 3. These are Steps 4 and 5 of Stage 3. Stage 3 describes the creation of operant conditioning out of respondent conditioning.
Step 4 introduces incentives to address why animals engage with operant contingencies.
Step 5 addresses the relationship between drives and where operant contingencies occur. These steps are addressed in more detail in the Stage 3 section of this paper.
To understand the first four stages, it is important to know how an OHC is determined. When determining the OHC of a task, the most important things to remember are the three definitions: (1) higher order actions are defined in terms of adjacent lower order actions, (2) higher order of complexity actions organize those lower order actions, and (3) higher order of complexity actions organize those lower order actions in a nonarbitrary way.

First Four Stages of the Model of Hierarchical Complexity
The following section explains the first four stages of the MHC. It gives examples of animals performing at the highest stage they can. Animals at that stage do not perform any higher stage actions. The section elaborates on Table 1, in particular on Stages 1, 2, 3, and 4, using specific examples. These examples will make clear what kinds of tasks characterize each stage of the MHC.

Stage 1 Automatic
For most of evolutionary time, there were only single-celled organisms. For this review, we assume that single-celled organisms in the evolutionary past only had hard-wired responses, including taxis, tropisms, and phagocytosis, much like today's single-celled organisms (Commons & White, 2006). The criterion for Automatic Stage 1 is that the organism engages in a single action at a time, and the action is "hard wired" into the organism. Examples of such built-in or automatic actions include taxis, tropisms, phagocytosis, and unconditionable reflexes (Commons & White, 2006). Obviously, single-celled animals do not have nervous systems. Here, conditionable and unconditionable reflexes are distinguished. Unconditionable reflexes are a Stage 1 behavior. A reflex is a nearly instantaneous movement in response to a stimulus (Purves, 2004).
Order 1 task actions only occur in response to changes in those specific stimuli to which those behaviors generally respond. These task actions are described by the modification of unconditionable reflexes. Reflexes that are not respondently conditioned are Automatic Stage 1 responses. They will be referred to as unconditionable reflexes. In an unconditionable reflex, the stimulus and the response are organized, and the coordination is totally automatic. The term reflex is used here, as opposed to tropism or taxis, because the term reflex is traditionally used for fast responses that do not have long durations. Reflexes that are respondently conditioned will be referred to as conditionable reflexes, which are Sensory or Motor Stage 2 responses. Forms of learning that affect the response rate for unconditionable reflexes, such as behavioral habituation and sensitization, are also Automatic Stage 1 actions.
Behavioral habituation and sensitization are two forms of nonassociative learning. These are behavioral processes that may have evolved to deal with stimuli that occur iteratively in the environment (Eisenstein, Eisenstein, & Smith, 2001). Behavioral habituation is a decrease in magnitude of a response to an iterative stimulus. On the other hand, behavioral sensitization is an increase in magnitude of a response to an iterative stimulus. These forms of learning are distinct from later forms of respondent conditioning, sometimes called associative learning. Single-celled organisms at Stage 1 have limited sensors and effectors. There are no uncontroversial reports of such organisms responding in actions above Stage 1.
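The contrast between behavioral habituation and sensitization can be sketched numerically. The following is a toy illustration only, assuming a fixed proportional change in response magnitude per presentation of the stimulus; this rule, the function name `response_series`, and all parameter values are hypothetical and are not fitted to any data discussed in this paper.

```python
from typing import List


def response_series(initial: float, change_per_trial: float, trials: int) -> List[float]:
    """Response magnitude over repeated presentations of one stimulus.

    Assumes a fixed proportional change per presentation (an illustrative
    assumption, not a fitted model): a factor below 1 gives behavioral
    habituation, a factor above 1 gives behavioral sensitization.
    """
    magnitudes = []
    current = initial
    for _ in range(trials):
        magnitudes.append(current)
        current *= change_per_trial
    return magnitudes


habituation = response_series(initial=1.0, change_per_trial=0.5, trials=4)
sensitization = response_series(initial=1.0, change_per_trial=2.0, trials=4)
print(habituation)     # [1.0, 0.5, 0.25, 0.125]
print(sensitization)   # [1.0, 2.0, 4.0, 8.0]
```

In both cases only one stimulus is involved and nothing is paired with it, which is what keeps these forms of learning at Stage 1.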
Here, the terms behavioral habituation and sensitization are used to distinguish them from the more specific uses of the terms habituation and sensitization found in the neuroscience literature (e.g., the neurological habituation research of Groves & Thompson, 1973). Behavioral habituation and sensitization are not descriptors of neural processes but rather descriptions of the behavior of whole organisms.
Each particular action responds primarily to a single kind of stimulus. Examples of the environmental stimulus could be a chemical emitted by possible food, light, heat, or electricity. There may be generalization gradients associated with that stimulus. For example, responses will occur with some probability given varying intensities of light. Nevertheless, because the eliciting stimulus is still a light, no learning mechanism needs to be applied in order to explain the behavior. At Stage 1, the environmental stimulus that leads to the behavior is not paired with any other stimulus either before or after the occurrence of the behavior.
Example 1. The unicellular amoeba Physarum polycephalum has been shown to adapt its behavior in response to patterns of periodic environmental change. Saigusa, Tero, Nakagaki, and Kuramoto (2008) exposed Physarum to three spikes of cold temperature, each of which elicited a reflexive slowing of the amoeba's movement. The temperature spikes occurred at regular intervals. Eventually, a spike was not administered at the time that would follow the pattern, but the Physarum still slowed down at the appropriate time. This pattern shows alteration of behavior due to past events.
Example 2. This is an example of an unconditionable reflex and habituation as Automatic Stage 1 behavior in the protozoan Vorticella convallaria, reported by Patterson (1973).
S1. Electric stimulation of different intensities was administered every 10 s for 5 min.
R1. The response to S1 was contraction of the body and stalk. S1 eliciting R1 is an example of an unconditionable reflex, which is an Automatic Stage 1 behavior.
S2. A mechanical stimulus was administered by dropping different weights on the microscope stage every 10 s for 5 min.
R2. The response to S2 was contraction of the body and stalk. S2 eliciting R2 is also an example of an unconditionable reflex, which is an Automatic Stage 1 behavior.
S3. Mechanical stimulus was administered by modifying the media of the organism.
R3. The response to S3 was contraction of the body and stalk. S3 eliciting R3 is also an example of an unconditionable reflex, which is an Automatic Stage 1 behavior.
Habituation occurred with administration of all three stimuli. The longer the organisms were exposed to the stimuli, the longer the periods in which the organisms were nonresponsive became.
Example 3. Paramecia are Automatic Stage 1 animals. This is shown by their failure to respondently condition (Mingee, 2013) and to operantly condition (Mingee & Armus, 2009). They show behaviors of sensitization in the following example.
S1. One of the stimuli used in the study by Mingee (2013) was level of illumination.
R1. The response to S1, level of illumination, was moving away from light (in most paramecia, with the exception of Paramecium bursaria). S1 eliciting R1 is an example of taxis, which is an Automatic Stage 1 behavior.
S2. The other stimulus used was shock in the cathode side of the trough.
R2. The response to S2 was swimming to the non-cathode side (i.e., moving away from the shock). S2 eliciting R2 is also an example of taxis, which is an Automatic Stage 1 behavior.
When S1 and S2 were paired to investigate whether S1 would elicit the same response as S2 after the pairing (i.e., checking for the presence of respondent conditioning), it was found that S1 no longer elicited R2 after 1 min of the first testing trial. Thus, pairing of the two stimuli was unsuccessful, and respondent conditioning did not occur, suggesting that paramecia behave at Automatic Stage 1.
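The reasoning used in Example 3 to assign a stage from conditioning outcomes can be written as a simple decision rule. This is a minimal sketch that covers only Stages 1 through 3 and ignores the Stage 4 criterion; the function name `classify_stage` and its boolean inputs are hypothetical, and the rule assumes that each kind of conditioning has actually been attempted with the organism.

```python
def classify_stage(respondent_conditioning: bool, operant_conditioning: bool) -> int:
    """Map observed conditioning outcomes to Stage 1, 2, or 3 (sketch; ignores Stage 4).

    Operant conditioning indicates at least Stage 3, respondent conditioning
    without operant conditioning indicates Stage 2, and neither indicates
    Automatic Stage 1.
    """
    if operant_conditioning:
        return 3
    if respondent_conditioning:
        return 2
    return 1


# Paramecia in Example 3: neither respondent nor operant conditioning was obtained.
print(classify_stage(respondent_conditioning=False, operant_conditioning=False))  # 1
```

A failure to condition is only as informative as the attempt behind it, so an unattempted procedure should not be entered as a failure.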

Stage 2 Sensory or Motor
At Sensory or Motor Stage 2, organisms coordinate two stimulus response pairs from the lower Automatic Stage 1. An example of this is respondent conditioning. The criterion for classifying something as Sensory or Motor Stage 2 is that the pairing of stimuli leads to conditioning (Commons, Miller, Commons-Miller, & Chen, 2012). Unlike at Stage 1, the responses begin to be more flexibly associated with stimuli with which they have been paired. Either the detection of stimuli or the production of responses is somewhat flexible.
Respondent conditioning at Stage 2 of hierarchical complexity organizes two stimulus response pairs from the lower Automatic Stage 1. Three characteristics of this stage are as follows: (a) Two stimuli are paired, either in a naturalistic environment or by an experimenter. In other words, an unconditioned stimulus that already elicits an unconditioned response is paired with another salient stimulus. (b) The organism's behavior does not directly cause the consequential reinforcing stimuli in this situation, as it does in operant conditioning. Reflexes that are conditioned are also Stage 2 behaviors. (c) The organism does not temporally or in some other way organize or coordinate more than one action in order to more adequately accomplish this task.
Order 2 conditioned reflexes are defined in terms of Order 1 automatic reflexes. This forces the hierarchical nature of the relations and makes the higher order task include the lower ones. In mathematical terms, this is the same as a set being formed out of elements. This creates the hierarchy. A2 = {a1, b1}, in which a1 and b1 are of lower order than A2 and compose set A2. When A2 is organized with other tasks, it is at least transitional to a higher order task and therefore beyond the scale of Stage 2 behavior. A2 ≠ {A2, ...}. Automatic Stage 1 behaviors such as behavioral habituation and sensitization are organized to make Stage 2 respondent conditioning behaviors. The salience of the unconditioned stimulus is determined by the Automatic Stage 1 processes of behavioral habituation and sensitization. The organization of salience extended between the unconditioned and conditioned stimulus is the organization of two or more Automatic Stage 1 tasks. This creates a Stage 2 respondent conditioning task action, which transfers the salience between an automatically salient (unconditioned) stimulus and a previously nonsalient (conditioned) stimulus.
For organisms performing at Sensory or Motor Stage 2, the important forms of behavior for the account being presented here are reflexes, and the most complex process is respondent conditioning. A reflex procedurally links stimulus to response (Pavlov, 1927). Reflexes can be mediated by a reflex arc only a few neurons long (Palkovits & Záborszky, 1977). In a reflex, the stimulus and the response are organized, but the coordination is automatic. An example of an unconditioned reflex is that when water moves, mollusks open their shells reflexively (Palkovits & Záborszky, 1977). If something touches their membrane, the shells close. There is very little variability in these responses.
For a respondent conditioning procedure, a Sensory or Motor Stage 2 task action is the pairing of two eliciting stimuli: an environmental stimulus (S) and an unconditioned stimulus (UCS). A salient UCS and the S already exist before the pairing, and the endogenously salient UCS automatically elicits the unconditioned response (UCR). After a sufficient number of occurrences, such pairings transform the neutral stimulus (S) into a conditioned stimulus (CS). The CS becomes more salient by having acquired most of its saliency from being paired with the endogenously salient UCS (Lawrence, Klein, & LoLordo, 2009). This CS then elicits the conditioned response (CR), which is a variation of the UCR (Pavlov, 1927). In respondent conditioning, there is the organization of stimulus-elicited actions by organizing the stimuli.
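The gradual transfer of salience from the UCS to the initially neutral stimulus can be sketched with a simple update rule. The paper does not specify a quantitative rule, so the sketch below swaps in a Rescorla-Wagner-style update as an assumption: on each pairing, the salience of S moves a fixed fraction of the remaining distance toward the salience of the UCS. The function name, the learning rate, and the starting value of zero are all hypothetical.

```python
from typing import List


def cs_salience_over_pairings(pairings: int, learning_rate: float = 0.5,
                              ucs_salience: float = 1.0) -> List[float]:
    """Salience of the initially neutral stimulus after each S-UCS pairing.

    Rescorla-Wagner-style update, used here purely as an illustrative
    assumption rather than as the model proposed in the text.
    """
    salience = 0.0  # assumption: S starts with no acquired salience
    history = []
    for _ in range(pairings):
        # Move a fixed fraction of the remaining distance toward the UCS salience.
        salience += learning_rate * (ucs_salience - salience)
        history.append(salience)
    return history


print(cs_salience_over_pairings(3))  # [0.5, 0.75, 0.875]
```

Under this assumed rule, the acquired salience approaches but never exceeds that of the UCS, which is consistent with the CS drawing its saliency from the pairing.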
Every organism for which the authors have found records of Stage 2 performance also has neurons. From this it is inferred that to perform Sensory or Motor Stage 2 task actions, organisms have to have networks of neurons to organize the conditioning of reflexes.
Stage 2 actions will be illustrated using examples from three studies. Finding animals that respondently condition but do not operantly condition is a difficult task. That is partly because many people who have been studying invertebrates in particular, who are candidates for being this kind of animal, have been primarily interested in doing neuronal studies of these relatively simple animals as they are undergoing respondent conditioning (Abramson, 1994). For most of the instances of respondent conditioning that we have come across, it is unclear whether operant conditioning of that organism has even been attempted. In most cases, no published reports have been found. Of course that does not mean that attempts have not been made.
Example 1. The first example comes from the study done by Mpitsos and Davis (1973) on the marine gastropod Pleurobranchaea (sea slugs). In the study, they successfully respondently conditioned sea slugs. The neutral stimulus (NS) used in this study was tactile stimulation of the oral veil using a sterile glass probe. The unconditioned response to the NS, tactile stimulation of the oral veil, was a withdrawal and bite-strike response (UCR1). The UCS used in this study was food chemicals (homogenized squid). The unconditioned response to the food chemicals (UCR2) was feeding behavior. Here, Definition 1 is demonstrated by identifying the adjacent Stage 1 actions. The NS was paired with the UCS, food chemicals: the glass probe was coated with the food chemicals, and the oral veil was stroked with it for 10 s. After the NS and UCS pairing, tactile stimulation of the oral veil became the conditioned stimulus.
After the tactile stimulation of the oral veil became a conditioned stimulus, it elicited the same response that the UCS had elicited (UCR2): feeding behavior, which now occurred during the CS but before the UCS. This pairing demonstrates the organization of two task actions, as shown in Definition 2. Thus, feeding behavior became the conditioned response, and tactile stimulation of the oral veil no longer elicited UCR1. The fact that UCR1 is no longer elicited after conditioning shows that the organization of the task actions was nonarbitrary. The fact that the conditioning is nonarbitrary demonstrates Definition 3.
Example 2. The second example comes from the study by Henderson and Strong (1972) on Macrobdella ditetra (leech). In the study, they successfully respondently conditioned leeches. The NS used in this study was light from a light bulb. The unconditioned response to the light was a head-turning response (UCR1). The UCS used in this study was shock. The unconditioned response to the shock (UCR2) was an anteroposterior contraction. During the pairing trials, the light was presented for 3 s, and the shock was presented for the last 0.1 s of that 3-s period. After 25 trials per day for 10 days, presentation of the light without shock was followed by an anteroposterior contraction, similar to that elicited by shock, more than 50% of the time. Here, Definition 1 is demonstrated by identifying the adjacent Stage 1 actions. The first Automatic Stage 1 action is the shock (UCS) eliciting the anteroposterior contraction (UCR2), which is an untrained response to shock. The second Automatic Stage 1 action is the light (NS) eliciting a cephalic-turning response (UCR1).
The NS, light, was paired with shock (UCS). After the NS and UCS pairing, light became the conditioned stimulus. As a conditioned stimulus, light elicited the same response as shock did, which was anteroposterior contraction. Thus, anteroposterior contraction became the CR and the light no longer elicited UCR1. Here, two Stage 1 actions are organized around the same response.
After the conditioning, the light consistently elicits full body contractions (UCR2) and not a head turning response (UCR1). The fact that the neutral response is no longer elicited after conditioning shows the organization of the task actions was nonarbitrary. The fact that the conditioning is nonarbitrary demonstrates Definition 3.
Example 3. The third example involves planarians (flatworms), Dugesia dorotocephala, that were respondently conditioned by Thompson and McConnell (1955). The NS used in this study was light from a light bulb. The unconditioned response to the light (UCR1) was a low (10-30%) rate of turn responses and a very low (<5%) contraction rate. The UCS used in this study was shock. The unconditioned responses to the shock (UCR2) were a sharp turning of the cephalic (head) region to one side or the other and a longitudinal contraction of the entire body. Here, Definition 1 is demonstrated by identifying the adjacent Stage 1 actions. In this example, the light (NS) eliciting a low probability of turning or a weak contracting response (UCR1) in planarians is one Automatic Stage 1 action. The second Automatic Stage 1 action is the shock (UCS) eliciting a higher probability of turning or a strong contracting response (UCR2).
The NS of light was presented for 3 s and then the UCS of shock was presented during the last 1 s of the NS. Through this process, the NS, light, was paired with the UCS, shock. After the light became a CS, it elicited the same responses as the UCR2 did, which were sharp turnings of the cephalic (head) region to one side or the other and a longitudinal contraction of the entire body. The above organization of lower order task actions demonstrates Definition 2.
After the conditioning, the light consistently elicits full body contractions (UCR2) and not a head-turning response (UCR1). The fact that the neutral response is no longer elicited after conditioning shows that the organization of the task actions was nonarbitrary. The fact that the conditioning is nonarbitrary demonstrates Definition 3.

Previous Accounts of the Relationship between Operant and Respondent Conditioning
The relationship between operant (instrumental) and respondent (classical) conditioning procedures has been a concern in the field of learning and conditioning since the 1930s (Skinner, 1938). Since then, two types of theories have addressed whether there might be a relationship between operant and respondent conditioning. Single-factor theorists (Hull, 1943, 1952; Pavlov, 1927, 1955) presupposed that all conditioning requires the reinforcement of stimulus-response associations. Some of these theorists (Hull, 1952) did not distinguish between reinforcing stimuli that follow an NS, as in respondent conditioning, and reinforcing stimuli (S R+) that follow responses, as in operant conditioning. Because single-factor theories do not make this distinction, they may not adequately account for differences found between the two conditioning processes. Two-factor theories have focused on the differences between the two conditioning processes.
In particular, the operant accounts do not have a clear mechanism for strengthening operant behaviors. Whereas single-factor theorists such as Pavlov (1927) argued that conditioning requires the reinforcement of stimulus-response associations, two-factor theorists such as Skinner (1938) argued that associations between R and S R+ get strengthened. In our respondent account of operant conditioning, by contrast, all responses, including operant responses, have to be elicited.
Up to this point, past literature on a possible relationship between operant and respondent conditioning has been reviewed. The major unexplained part of the mechanism is why the operant behavior occurs in the first place. In traditional operant accounts (Herrnstein, 1970), possible roles for internal events that occur before the behavior have been largely neglected. To address this problem of incomplete mechanistic accounts, a procedural model of operant conditioning is proposed here: the five-step model of the respondent basis of operant conditioning, in which operant conditioning is based on five procedural steps of respondent conditioning.

The case for Operant Conditioning being 5 Steps of Respondent Conditioning
This section presents an argument that Circular Sensory-Motor Stage 3 action (operant conditioning) may be accounted for by the 5 steps of procedural respondent conditioning.
As per Definition 1, Order 3 operant behaviors are defined in terms of the five Sensory or Motor Stage 2 conditioned reflexes. Each conditioned reflex is the result of a respondent conditioning Stage 2 action. This forces the hierarchical nature of the relations and makes the higher order task include the lower ones. In mathematical terms, this is the same as a set being formed out of elements. The relationship between sets and subsets demonstrates the hierarchy: A3 = {a2, b2}, where a2 and b2 are lower order than A3 and compose set A3. When A3 is organized with other tasks, it is at least transitional to a higher order task and therefore beyond the scale of Stage 3 behavior: A3 ≠ {A3, ...}. Operant conditioning is defined by five steps of respondent conditioning. All five respondent conditioning steps have a fixed order from 1 to 5. This makes the ordering nonarbitrary. There was an earlier version of this model with three respondent steps creating operant conditioning (Commons & Giri, 2016).
Steps 4 and 5, which are also presented in Commons et al. (2019), are new to this version of the model.
These steps are as follows: Step 1, "What is the value of doing it?"; Step 2, "What to do?"; Step 3, "When to do it?"; Step 4, "Why to do it?"; Step 5, "Where to do it?" As per Definition 2, operant conditioning is defined in terms of the 5 respondent conditioning steps. Each step represents a pairing of two stimuli. All the respondent steps have the same form. The difference between the steps consists of which stimuli get paired in each step. The outcome of each pairing forms the basis for the next steps. This results in an absolute ordering of the steps. The steps must be done in the order described, or operant conditioning does not occur. Hence, as in Definition 3, the ordering is nonarbitrary. For example, in Step 1, the drive stimulus is paired with the consequence, changing the value of the consequence and making it a reinforcer or a punisher. The newly valued consequence is then paired with the cause of the operant behavior.
Each explanation of a step in the five respondent steps of operant conditioning is followed by an example from the study by Andrew and Savage (2000) on Lymnaea (pond snail). In Step 2, an example from the human literature is used in addition to Andrew and Savage (2000). These authors' methods for studying Lymnaea were as follows. A pond snail was placed in a glass gutter. The gutter was placed within a white surround, 30 cm high. Halfway along the gutter and visible through its sides, two panels, either black or white, were placed on either side of the gutter. A snail was reinforced with sucrose when its head reached the level of the panels. Lymnaea learned to reach the level of the panels, either black or white.
In this example, Definition 1 is demonstrated. Definition 1 states that an action at a higher order of hierarchical complexity is defined in terms of two or more adjacent lower order actions. Definition 1 is demonstrated by identifying the adjacent Stage 2 actions. Here, the adjacent lower order actions are the five respondent conditioning steps.
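Definition 1 can be read as a rule for computing the order of a task from its parts. The following Python sketch illustrates that reading under stated assumptions: the `Task` class and the `order` and `conditioned_reflex` functions are hypothetical names introduced here, a task with no subtasks is treated as Order 1, and the task labels are shorthand rather than terms from the studies cited.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task action, optionally built out of lower order task actions."""
    name: str
    subtasks: list = field(default_factory=list)


def order(task):
    """Return the order of a task under the reading of Definition 1 above.

    Assumption: a task with no subtasks is Order 1. Otherwise the task must
    organize two or more subtasks that all share the adjacent lower order.
    """
    if not task.subtasks:
        return 1
    if len(task.subtasks) < 2:
        raise ValueError(f"{task.name}: needs two or more lower order actions")
    orders = {order(sub) for sub in task.subtasks}
    if len(orders) != 1:
        raise ValueError(f"{task.name}: subtasks are not of one adjacent order")
    return orders.pop() + 1


def conditioned_reflex(label):
    """Build a Stage 2 pairing out of two Stage 1 stimulus-response actions."""
    return Task(label, [Task(f"{label}: eliciting stimulus -> response"),
                        Task(f"{label}: paired stimulus -> response")])


# Five Stage 2 pairings, one per respondent step, organized into one operant.
steps = [conditioned_reflex(f"Step {n}") for n in range(1, 6)]
operant = Task("operant conditioning", steps)

print(order(steps[0]))  # 2
print(order(operant))   # 3
```

The sketch only checks the structural requirement of Definition 1. It does not model the nonarbitrary ordering required by Definition 3.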

In Step 1, from respondent conditioning, the activation of the drive stimulus (SDrive) gives value to the consequence (SConsequence). That is, the nonarbitrary pairing of a drive stimulus with a consequence simultaneously changes the value of the consequence. The result is that SConsequence changes to S R+, a reinforcing consequence. This can also be referred to as the US/S R+. The association of a drive stimulus with the consequence of a response changes the immediate value of that consequence.
An example for Step 1 is as follows: The drive ("hunger") fits the consequence (sucrose). Thus, the value of responding to consequence (sucrose) is established.

In Step 2, from respondent conditioning, a particular internal representation of behavior (srb) is paired with the S R+. This internal representation elicits the operant response. The association of the internal representation of behavior (srb) with the US/S R+ makes the internal representation of behavior more salient and thereby helps to strengthen the operant response. The response potential from Step 1 is nonarbitrarily organized with an internal representation of the behavior.

An example for Step 2 is as follows: Support for the existence of the internal representation of the behavior (srb) that elicits the operant is given by Sutton, Braren, Zubin, and John (1965). They observed heightened EEG (evoked potential) readings as a discrimination was being acquired by human participants. Differences were found in the evoked potential as a function of whether or not the sensory modality of the stimulus was anticipated correctly (Sutton et al., 1965).
In pond snails, there is an inferred representation of behavior (rb) that elicits moving towards the level of the black and white panels (UCR/R). That representation of behavior (rb) becomes salient by being paired with the sucrose (UCS/S R+ ). This pairing, [rb → UCR/R] -(UCS/S R+ ), is a Sensory or Motor Stage 2 action.

In Step 3, the stimulus control process is the pairing of the now more salient neural stimulus srb (along with the operant response) with the environmental event SEnvironment. This is a "when" pairing because the cue or cues in the environment elicit the internal representation of behavior, srb, determining when it occurs. When the relevant cues are not present, the neural stimulus does not occur. In operant terms, this pairing changes the environmental stimulus into a discriminative stimulus. Both SEnvironment and srb have to be salient in order for learning to take place (Rescorla & Wagner, 1972). The internal stimulus becomes controlled by the occurrence of the environmental stimulus no matter what the time difference is.
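The dependence of learning on salience can be illustrated with the standard single-cue form of the Rescorla-Wagner update, V ← V + αβ(λ − V). The sketch below is only an illustration: the function name `rescorla_wagner` and the parameter values are hypothetical, and treating α as the salience of SEnvironment and β as the salience of srb is an assumption made here, not a mapping given in the studies cited.

```python
def rescorla_wagner(alpha, beta, lam=1.0, trials=10):
    """Return the associative strength V after each pairing trial.

    Standard single-cue update: V <- V + alpha * beta * (lam - V).
    Assumed mapping (illustrative only): alpha stands for the salience of
    the environmental stimulus, beta for the salience of the internal
    representation of behavior it is paired with.
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must lie in [0, 1]")
    strength = 0.0
    history = []
    for _ in range(trials):
        strength += alpha * beta * (lam - strength)
        history.append(strength)
    return history


both_salient = rescorla_wagner(alpha=0.5, beta=0.5)
cue_not_salient = rescorla_wagner(alpha=0.0, beta=0.5)

print(both_salient[-1] > both_salient[0])  # True: strength grows over trials
print(max(cue_not_salient))                # 0.0: no salience, no learning
```

Because the two salience terms multiply, setting either one to zero stops the update, which is the sense in which both stimuli have to be salient.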

An example for Step 3 is as follows: Andrew and Savage (2000) indirectly showed that the now salient representation of behavior (rb), which elicited the operant behavior (R), was paired with the prior environmental stimulus, the visible black and white panel. Here operant behavior (R) is the pond snail moving towards the level of the black and white panels to get the sucrose. The pairing of salient rb and environmental stimulus is a Sensory or Motor Stage 2 action. This is represented as S o [rb → UCR/R].

In Step 4, from respondent conditioning, the environmental stimulus is paired with the S R+, making SEnvironment more salient and valued. Pairing the environmental stimulus with the reinforcing stimulus establishes the environmental S as an incentive (see Killeen, 1982a). The incentive value means that there is an increase in the salience and value of the representation of a reinforcement rate relative to the representation of other behaviors that are not associated with reinforcement.
When drive-consequence-neural event pairing is organized with the environment, the environment becomes predictive of the valued consequence. Environmental stimuli that elicit this operant response lead to increased arousal in the organism.

An example for Step 4 is as follows: In Step 4, the environmental S, the visible black and white panel, is paired with sucrose (UCS/S R+). This pairing makes the S more salient and valuable. This pairing acts to produce an incentive (Killeen, 1982a, 1982b, 1984, 1985). The environmental S takes on the eliciting properties of UCS/S R+. This is represented as S o UCS/S R+.

In Step 5, from respondent conditioning, the environmental stimulus gets paired with SDrive. After multiple trials of this type of pairing, the properties of the environment or a similar environment are paired with the drive stimulus SDrive. The organism will then react to an environment where there is a drive associated with it.
An example for Step 5 is as follows: In Step 5, the environmental stimulus of the tank water and clean water is organized with SDrive to feed. This is observed by the snails rasping for food in the environment when they are immersed in clean water 18 hr after the second trial. This is represented as [SEnvironment o SDrive → REnvironment].
In this example, Definition 2 is demonstrated. Definition 2 states that an action at a higher Order of Hierarchical Complexity organizes the adjacent lower order actions. Here, the organization is demonstrated by the fact that all five steps were completed.
In this example, Definition 3 is demonstrated. Definition 3 states an action at a higher OHC organizes those adjacent lower order actions in a nonarbitrary way. The organization of the lower order tasks is nonarbitrary because the tasks were completed in the fixed order required for the 5 respondent steps of operant conditioning to work.
Organisms that solve Circular Sensory Motor Stage 3 tasks are multicelled with some sort of more complex nervous system than what is seen in Sensory or Motor Stage 2 animals.
Operant conditioning is a Stage 3 action. Operant conditioning is built out of the nonarbitrary coordination of five Sensory or Motor Stage 2 task actions or steps. These steps are Step 1, "What is the value of doing it?"; Step 2, "What to do?"; Step 3, "When to do it?"; Step 4, "Why to do it?"; and Step 5, "Where to do it?", as shown in Figure 3 (derived from Commons & Giri, 2016). The 5 steps of respondent conditioning are from Stage 2, but they are not organized until Stage 3. The 5 steps are very different cases of procedural pairing in respondent conditioning. The only commonality between them is the basic respondent conditioning procedure of pairing a stimulus with an already eliciting stimulus.
[Figure 3. Recoverable entries: Step 3, "When to do it?": SEnvironment o srb-R "Operant" → RConditioned Reflex; Step 4, "Why to do it?": SEnvironment o S R+ → RIncentive.]
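The five pairings and their fixed order can be summarized in a small data structure. The following Python sketch is only an illustration of the ordering claim, not part of the model's formal statement: the names `Pairing`, `FIVE_STEPS`, and `is_operant_sequence` are hypothetical, and the stimulus labels are shorthand for the pairings described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    """One respondent conditioning step: a pairing of two stimuli."""
    step: int
    question: str
    first: str
    second: str


# Stimulus labels follow the text; the data structure itself is an assumption.
FIVE_STEPS = (
    Pairing(1, "What is the value of doing it?", "S_Drive", "S_Consequence"),
    Pairing(2, "What to do?", "s_rb", "S_R+"),
    Pairing(3, "When to do it?", "S_Environment", "s_rb"),
    Pairing(4, "Why to do it?", "S_Environment", "S_R+"),
    Pairing(5, "Where to do it?", "S_Environment", "S_Drive"),
)


def is_operant_sequence(completed):
    """Return True only if all five pairings occurred in the fixed order 1..5."""
    return [pairing.step for pairing in completed] == [1, 2, 3, 4, 5]


print(is_operant_sequence(FIVE_STEPS))        # True
print(is_operant_sequence(FIVE_STEPS[::-1]))  # False: wrong order
print(is_operant_sequence(FIVE_STEPS[:4]))    # False: Step 5 missing
```

Every entry has the same form, a pairing of two stimuli, and the entries differ only in which stimuli are paired, which mirrors the commonality noted above.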

Stage 4 Sensory-Motor
At Sensory-Motor Stage 4, organisms coordinate two or more circular sensory-motor subtask actions into a superordinate concept. In the tradition of Hull (1920), concepts are defined as several stimuli organized by a common response. For example, a participant could be tasked with sorting stimuli by color. Alternatively, they could sort by shape. Color and shape are each superordinate concepts organizing the operant selection behaviors. New and untrained instances of the concept are responded to correctly. These correct responses do not depend on simple stimulus generalization, as concepts like shape and color are superordinate properties of stimuli and not stimuli themselves. This analysis argues that nonarbitrary coordination of multiple operant tasks is required for concept learning to be successfully demonstrated with novel stimuli.
Order 4 concepts are defined in terms of Order 3 operant behaviors. This forces the hierarchical nature of the relations and makes the higher order task include the lower ones. In mathematical terms, this is the same as a set being formed out of elements. This creates the hierarchy: A4 = {a3, b3}, where a3 and b3 are lower order than A4 and compose set A4. When A4 is organized with other tasks, it is at least transitional to a higher order task and therefore beyond the scale of Stage 4 behavior: A4 ≠ {A4, ...}.
Example 1. Bhatt, Wasserman, Reynolds, and Knauss (1988) trained four pigeons individually in a Skinner box with a slide projector. The pigeons were pretrained to peck one of four keys when they were illuminated. They were also trained to peck one of the keys when the screen was illuminated the same color as the key. For the acquisition part of the study, the pigeons were presented with one slide, after which all the keys lit up. Each slide was selected from a pool of 2,000 different slides. There were four categories: person, flower, car, and chair. Each category had 500 slides. Each category was associated with one of the four keys. The pigeons' key pecks were reinforced when they pecked the key that corresponded to the category of object on the slide. During training each day, a pigeon was presented with 10 randomly selected slides from each category, for a total of 40 slides. The pigeons never saw the same slide twice. The pigeons' performance rose from 24% correct in the first block of five training sessions to 70% in the last block. While the pigeons did not achieve perfect performance, the progress from chance level to 70% correct responses shows growing proficiency in discriminations between four different concepts.
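The phrase "chance level" can be made concrete with a short calculation. With four keys, chance is 1/4, and a block of five sessions of 40 slides gives 200 trials. The Python sketch below computes how likely each reported percentage would be under pure guessing. It is a rough illustration only: the function name `binomial_upper_tail` is hypothetical, and treating the trials as independent and converting the reported percentages to whole counts out of 200 are assumptions made here, not analyses reported by Bhatt et al. (1988).

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


trials_per_block = 5 * 40   # five sessions of 40 slides each
chance = 1 / 4              # four keys, one correct

# Assumed conversion of the reported percentages into whole counts.
first_block_correct = round(0.24 * trials_per_block)
last_block_correct = round(0.70 * trials_per_block)

p_first = binomial_upper_tail(first_block_correct, trials_per_block, chance)
p_last = binomial_upper_tail(last_block_correct, trials_per_block, chance)

print(f"P(at least {first_block_correct} correct by guessing) = {p_first:.3f}")
print(f"P(at least {last_block_correct} correct by guessing) = {p_last:.3e}")
```

Under these assumptions, the first block is what guessing would typically produce, while the last block would be extremely unlikely under guessing.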
In this example, Definition 1 is demonstrated. Definition 1 states that an action at a higher OHC is defined in terms of two or more adjacent lower order actions. Here, Definition 1 is demonstrated by identifying the adjacent Stage 3 actions. The first Stage 3 operant task is pecking the keys based on the color of the light. The second Stage 3 operant task is selecting the keys based on the category the slide depicts.
In this example, Definition 2 is demonstrated. Definition 2 states that an action at a higher OHC organizes the adjacent lower order actions. Here, Definition 2 is demonstrated by describing the organization of the adjacent Stage 3 actions. The Stage 3 task of pecking based on the color of the light is organized with the Stage 3 task of pecking based on the category of the slide. These two tasks together allow the pigeon to select the correct illuminated key for the category the slide came from.
In this example, Definition 3 is demonstrated. Definition 3 states that an action at a higher OHC organizes those adjacent lower order actions in a nonarbitrary way. The organization of the two Stage 3 operant tasks is nonarbitrary because the pigeons must be trained on selecting keys before they can apply this key selection proficiency to the representations on the slides.
Example 2. The following is a description of Stage 4 behavior in rats. Bailey and Thomas (1998) performed smell-based oddity tests in rats. First, pushing ping-pong balls was reinforced. Then the rats were repeatedly presented with three scented stimulus balls. Two were always of identical scent, while the third was always different from the other two. The scents were different in every trial. The behavior of selecting the third stimulus, which was scented differently from the other two, was reinforced. The rats selected the differently scented ball 45% of the time in the first trial and 82% of the time in the third trial, learning to select it reliably. Thus, same/different concept learning has been demonstrated in rats.
In this example, Definition 1 is demonstrated. Definition 1 states that an action at a higher OHC is defined in terms of two or more adjacent lower order actions. Here, Definition 1 is demonstrated by identifying the adjacent Stage 3 actions. The first Stage 3 task action is pushing the balls. The second Stage 3 task action is pushing only the ball with the odd scent. Pushing the balls trains the operant behavior to the sight of the balls. Pushing only the ball with the odd scent, and only after viewing the balls, is responding to a set of stimuli.
In this example, Definition 2 is demonstrated. Definition 2 states that an action at a higher OHC organizes the adjacent lower order actions. Here, Definition 2 is demonstrated by describing the organization of the adjacent Stage 3 actions. The task of pushing a ball is organized with the task of pushing only the ball with the odd scent. This organization yielded the result of reliably selecting the ball with the odd scent.
In this example, Definition 3 is demonstrated. Definition 3 states an action at a higher OHC organizes those adjacent lower order actions in a nonarbitrary way. The Stage 3 task of pushing a ball at all must precede the Stage 3 task of only pushing the ball with the odd scent. Therefore, the organization is nonarbitrary. A literature search did not find any more hierarchically complex tasks than this performed by rats or pigeons; therefore, it is inferred that rats and pigeons are operating at Stage 4.

An Example of Cross-Species Difference in Stage
Bitterman (1965) compared the behavior of rats and fish (African mouthbreeders) given the same spatial reversal task. Each subject from each species was placed in a box with identically colored illuminated panels. The boxes for rats and fish were similar. When a subject pressed the panel in the correct position, it received a food reward. For both species, there were 20 trials per day to the criterion of 17 out of 20 correct choices, with positive and negative positions being reversed for each animal whenever it met that criterion. The rats eventually improved to the point where they would respond correctly to the new panel after a single error. The fish did not show significant improvement.
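The reversal rule described above can be written out as a short procedure. The Python sketch below is a hedged illustration of that rule only: the function name `reversal_schedule`, the "left" and "right" position labels, and the daily scores in the usage example are hypothetical and are not data from Bitterman (1965).

```python
def reversal_schedule(daily_correct, criterion=17, trials_per_day=20,
                      start="left"):
    """Return the rewarded panel position in force on each day.

    The rewarded position is reversed for the next day whenever the animal
    reaches the criterion (17 of 20 correct choices in the text above).
    The "left"/"right" labels are assumptions made for illustration.
    """
    position = start
    schedule = []
    for correct in daily_correct:
        if not 0 <= correct <= trials_per_day:
            raise ValueError("correct count must lie between 0 and 20")
        schedule.append(position)
        if correct >= criterion:
            # Criterion met: positive and negative positions are reversed.
            position = "right" if position == "left" else "left"
    return schedule


# Hypothetical daily scores: criterion is met on days 2, 4, and 5.
print(reversal_schedule([10, 17, 12, 18, 19]))
# ['left', 'left', 'right', 'right', 'left']
```

An animal that meets the criterion quickly after each reversal passes through more reversals in the same number of days, which is the kind of improvement reported for the rats and not for the fish.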
Each panel press demonstrates a neutral stimulus becoming salient due to reinforcement. In order to receive the reinforcement, they must complete the Stage 3 operant task of pressing the panel.
Organizing the operant task to respond correctly to the alternating pattern is an Order 4 task. Therefore, this study demonstrates a difference in Stage between rats and African mouthbreeder fish.

Limitations
The mathematical work on the MHC shows that the nonempirical aspects of the model are well defined (Commons & Pekker, 2004). Application of the process of finding correspondence between the MHC and human problem-solving and perspective-taking has been demonstrated several times (Commons, 2007; Commons & Chen, 2014). However, issues do occur, particularly when interpreting work performed for other purposes.
One issue with any model that makes use of concepts like behavior and conditioning is the lack of correspondence in what people mean when they use these terms (Abramson & Wells, 2018). Fundamentally, this analysis is more concerned with the organization of adjacent lower order tasks into the next order task than with meeting the requirements of the various definitions, such as those of classical and operant conditioning. Attempts were nevertheless made to demonstrate sufficient evidence for popular definitions of these terms to apply.
Until animal research that is specifically designed to test predictions of the MHC is performed, claims about animal stages will not be statistically supported. Investigations that interpret pre-existing work according to a new model are subject to definition problems between the author of the preexisting work and the new interpretation. To ameliorate this problem, this paper focuses on descriptions of the procedures the animals went through. This reduces the potential for the error of asserting that the original authors share the same definitions the authors of this paper do.
Other forms of definitional confusion are still present. For example, a reader may disagree that a particular nonarbitrary organization of Stage 1 behaviors fits a particular definition of classical conditioning. More important to this paper is the question: Were the lower order task actions organized nonarbitrarily?

Conclusion
In this paper, the MHC is applied to the evolution and development of the first four stages of behavior and conditioning. The OHC of a task completed has also been shown to predict real-world outcomes, such as earnings by street peddlers, r(44) = .70 (e.g., Miller et al., 2015); income of high-level salespeople (Goodheart, Commons, & Chen, 2015); and number of neurons in animal brains, r(17) = .87. Most importantly in this context, the model also provides a framework for a more precise understanding of differences in successful task completions across different animals. As a mathematical model, the MHC's validity is not particularly dependent on a set of studies' evidence. However, the applicability and usefulness may be illustrated with examples. The evidence in this paper was collected from the work of researchers with other goals. As the authors did not design this research, some of the model's claims are better supported than others.
This paper shows how the MHC may be applied to the difficulty of tasks that different species of animals can successfully address. Evidence in the form of examples was provided about nonhuman animals. Each claim has an example in which these species performed at a particular stage in the MHC and did not operate in any of the higher stages.
Suggestions for ways to improve the field of comparative cognition were proposed by Beran, Parrish, Perdue, and Washburn (2014). One point they made is that the current field of comparative cognition says what an animal can or cannot do. They were still trying to look for ways to show why an animal performs in a certain way. The generality of the mathematics behind the MHC allows for species to be classified at different points along a behavioral developmental and evolutionary scale based on the tasks they perform.
As a content-free measure of difficulty, the MHC facilitates novel predictions. Giri, Commons, and Harrigan (2014) found that there is only one domain of the MHC. This implies that given proper reinforcement and bodily form, an animal may perform any task at or below the OHC at which the animal operates.
The MHC is a tool for scaling where a task falls in evolution and development. The OHC of a task may be used as an independent variable to predict the number of neurons that an animal has, the number of stacked neural networks the brain has, and the limits of difficulty that an animal may be trained to correctly address (Commons et al., 2019; Commons & Ross, 2008). There has been a decline in interest in comparative psychology in both textbooks and the Annual Review of Psychology (Abramson, 2018). This may be addressed in part by providing a different perspective on comparative research. The a priori basis for task comparison across species that the MHC provides has the potential not only to focus research questions better but also to invigorate the discussion of comparative psychology.
While it is largely accepted that our behaviors developed by an evolutionary process, the role of successfully addressing increasingly more difficult tasks has been largely mysterious. Use of the MHC, particularly on the early stages, could shed light on the origin of complex behavior. The methodologies employed could be diverse. There are certainly events in the brain that predict and control behavior. It is also important to study behavioral contingencies directly. Possibly the largest contribution of the MHC to such efforts is to supply questions that researchers are not currently asking.