About
The annual meeting of the Cognitive Science Society is aimed at basic and applied cognitive science research. The conference showcases the latest theories and data from the world's best cognitive science researchers. Each year, in addition to submitted papers, researchers are invited to highlight some aspect of cognitive science.
Volume 19, 1997
Long Papers
Distinguishing Name Centrality From Conceptual Centrality
The features of a concept differ in their centrality. Having a seat is more central to the concept of chair than is having arms. This paper claims that centrality is not a homogeneous phenomenon in that it has at least two aspects, conceptual and naming. We propose that a feature is central to naming in proportion to the feature's category validity, the probability of the feature given the category. In contrast, a feature is conceptually central (immutable) to the extent the feature is depended on by other features. We predict that conceptual and naming centrality diverge as categories become more specific. An experiment is reported that provides corroborating evidence. Increasing the specificity of object categories increased the judged mutability of representative features without affecting their judged appropriateness for determining names.
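As a compact restatement of the measure proposed above (notation ours, not the authors'): a feature f is central to naming a category C in proportion to its category validity P(f | C), the probability that an instance of C has f, whereas f is conceptually central (immutable) to the extent that other features of C depend on it.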
Tracking the Time Course of Lexical Activation in Continuous Speech
Eye-movements to pictures of four objects on a screen were monitored as participants heard progressively larger gates and tried to identify the object (Experiment 1) or followed a spoken instruction to move one of the objects (Experiment 2), e.g., "Pick up the beaker; now put it below the diamond". The distractor objects included a cohort competitor whose name shared the initial onset and vowel with the target object's name (e.g., beetle), a rhyme competitor (e.g., speaker) and an unrelated competitor (e.g., carriage). In the gating task, which emphasizes word-initial information, there was clear evidence for multiple activation of cohort members, as measured by judgments and eye-movements. With continuous speech there was clear evidence for both cohort and rhyme activation, as predicted by continuous activation models such as TRACE (Elman and McClelland, 1988). Moreover, the time course and probabilities of eye-movements closely corresponded to simulations generated from TRACE.
Claim Strength and Burden of Proof in Interactive Arguments
Previous research shows an anti-primacy effect (Bailenson & Rips, 1996), in that the first speaker in a conversational argument incurs more Burden of Proof (BOP) than the second speaker. In addition, claims may be encoded differently when they are embedded in a structured dialogue than when processed outside the context of the argument. We were interested in determining how the strength of specific claims in the argument depends on their location in the structure as a whole, and whether anti-primacy would persist in disputes where the claims offered by the two speakers were equally convincing. Subjects read interactive arguments between two speakers having a conversation, rated the convincingness and support levels of the individual claims both when they were embedded in the dialogue and when they were removed from the dialogue, and judged overall burden of proof. Different groups of subjects saw the same arguments, the only difference being which speaker (first or second) made the particular groups of claims. The anti-primacy effect occurred even though the strength of the claims did not change as a function of which speaker presented them. In addition, there was no difference between convincingness and support ratings, although the results demonstrated that the level of both types of ratings was partly a function of where in the argument structure the claims were situated. Specifically, subjects perceived claims occurring in the initial position in the dialogue as less convincing than the same claims when removed from the context of the argument. Furthermore, these initial claims correlated less with BOP than did the later claims.
Modeling Embodied Lexical Development
This paper presents an implemented computational model of lexical development for the case of action verbs. A simulated agent is trained by an informant giving labels to the agent's actions (here, hand motions) and the system learns both to label and to carry out similar actions. Computationally, the system employs a novel form of active representation and is explicitly intended to be neurally plausible. The learning methodology is a version of Bayesian model merging (Omohundro, 1992). The verb learning model is placed in the broader context of the L0 project on embodied natural language and its acquisition.
A Cortical Model of Cognitive 40 Hz Attentional Streams, Rhythmic Expectation, and Auditory Stream Segregation
We have developed a neural network architecture that implements a theory of attention, learning, and trans-cortical communication based on adaptive synchronization of 5-15 Hz and 30-80 Hz oscillations between cortical areas. Here we present a specific higher order cortical model of attentional networks, rhythmic expectancy, and the interaction of higher-order and primary cortical levels of processing. It accounts for the "mismatch negativity" of the auditory ERP and the results of psychological experiments of Jones showing that auditory stream segregation depends on the rhythmic structure of inputs. The timing mechanisms of the model allow us to explain how relative timing information such as the relative order of events between streams is lost when streams are formed. The model suggests how the theories of auditory perception and attention of Jones and Bregman may be reconciled.
Nonverbal Factors in Understanding and Remembering Indirect Requests
The present studies investigated the degree to which a speaker's nonverbal behavior, specifically eye gaze and hand gesture, influences how people understand and remember indirect requests. In the first study, we examined whether people consider a speaker's eye gaze and/or gesture toward an object in the environment when deciding if a particular utterance is indirect or not. We presented a sequence of short, videotaped scenarios to participants in which two characters produced speech which could potentially be construed as indirect. We found that respondents took nonverbal behavior into consideration when making their judgments. In a second study, we investigated whether nonverbal information intrudes upon people's memory for speech. Results from a cued recall study suggest that nonverbal information is occasionally incorporated into memory for speech.
A Neural Network Model of Visual Tilt Aftereffects
RF-LISSOM, a self-organizing model of laterally connected orientation maps in the primary visual cortex, was used to study the psychological phenomenon known as the tilt aftereffect. The same self-organizing processes that are responsible for the long-term development of the map and its lateral connections are shown to result in tilt aftereffects over short time scales in the adult. The model allows observing large numbers of neurons and connections simultaneously, making it possible to relate higher-level phenomena to low-level events, which is difficult to do experimentally. The results give computational support for the idea that direct tilt aftereffects arise from adaptive lateral interactions between feature detectors, as has long been surmised. They also suggest that indirect effects could result from the conservation of synaptic resources during this process. The model thus provides a unified computational explanation of self-organization and both direct and indirect tilt aftereffects in the primary visual cortex.
Categorization by Elimination: A Fast and Frugal Approach to Categorization
People and other animals are very adept at categorizing stimuli even when many features cannot be perceived. Many psychological models of categorization, on the other hand, assume that an entire set of features is known. We present a new model of categorization, called Categorization by Elimination, that uses as few features as possible to make an accurate category assignment. This algorithm demonstrates that it is possible to have a categorization process that is fast and frugal--using fewer features than other categorization methods--yet still highly accurate in its judgments. We show that Categorization by Elimination does as well as human subjects on a multi-feature categorization task, judging intention from animate motion, and that it does as well as other categorization algorithms on data sets from machine learning. Specific predictions of the Categorization by Elimination algorithm, such as the order of cue use during categorization and the time-course of these decisions, still need to be tested against human performance.
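The following sketch illustrates the general elimination scheme the abstract describes; the cue ordering, the per-category cue-value ranges ("bins"), and the stopping rule are illustrative assumptions rather than the authors' published algorithm.

    # Schematic sketch of a categorization-by-elimination style procedure (Python).
    # Cue order, the per-category value ranges, and the tie handling are assumptions.
    def categorize_by_elimination(stimulus, categories, cue_order, bins):
        """stimulus: dict cue -> value; bins[cue][category] -> (low, high) range."""
        candidates = set(categories)
        for cue in cue_order:                       # consult the most useful cues first
            value = stimulus.get(cue)
            if value is None:                       # cue not perceived: move on
                continue
            surviving = {c for c in candidates
                         if bins[cue][c][0] <= value <= bins[cue][c][1]}
            if surviving:                           # drop categories the cue rules out
                candidates = surviving
            if len(candidates) == 1:                # stop as soon as one category remains
                break
        return sorted(candidates)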
Cue-based learners in parametric language systems: application of general results to a recently proposed learning algorithm based on unambiguous 'superparsing'
Cue-based learners have often been proposed as models of language acquisition by linguists working within the Principles and Parameters framework. Drawing on a general theory of cue-based learners described in detail elsewhere (Bertolo et al., 1997), we show here that a recently proposed learning algorithm (Fodor's (1997) Structural Triggers Learner) is an instance of a cue-based learner and that it is therefore unable to learn systems of linguistic parameters that have been proved to be beyond the reach of any cue-based learner. We demonstrate this analytically, by investigating the behavior of the STL on a linguistically plausible space of syntactic parameters.
Causal Induction: The Power PC Theory versus the Rescorla-Wagner Model
Two experiments compared the influence of the probability of the effect given the absence of the candidate cause on the causal judgments of candidate causes with the same ΔP, defined as the difference between the probability of the effect in the presence of a candidate cause and that in its absence. Our results strongly support the power PC theory (Cheng, 1997) but contradict the Rescorla-Wagner model (1972) and the traditional ΔP model.
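For reference, the quantities contrasted here can be written out as follows: ΔP = P(e | c) − P(e | ¬c), the difference between the probability of the effect in the presence and in the absence of the candidate cause, while the causal power of a generative candidate under the power PC theory is p = ΔP / (1 − P(e | ¬c)) (Cheng, 1997). Candidates matched on ΔP can therefore differ in causal power whenever P(e | ¬c) differs, which is what the judgment comparison exploits.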
Representing Abstract Words and Emotional Connotation in a High-dimensional Memory Space
A challenging problem in the computational modeling of meaning is representing abstract words and emotional connotations. Three simulations are presented that demonstrate that the Hyperspace Analogue to Language (HAL) model of memory encodes the meaning of abstract and emotional words in a cognitively plausible fashion. In this paper, HAL's representations are used to predict human judgements from word meaning norms for concreteness, pleasantness, and imageability. The results of a single-word priming experiment that utilized emotional and abstract words were replicated. These results suggest that it is unnecessary to posit separate lexicons to account for dissociations in priming results. HAL uses global co-occurrence information from a large corpus of text to develop word meaning representations. Representations of words that are abstract or emotional are formed no differently than those of concrete words.
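As a rough illustration of the kind of global co-occurrence learning HAL performs, the sketch below builds distance-weighted co-occurrence counts over a sliding window and concatenates each word's row and column into a meaning vector; the window size and the linear weighting scheme are illustrative choices, not parameters taken from this paper.

    # Schematic sketch of a HAL-style co-occurrence space (Python).
    from collections import defaultdict

    def hal_vectors(tokens, window=10):
        cooc = defaultdict(lambda: defaultdict(float))
        for i, target in enumerate(tokens):
            for d in range(1, window + 1):          # look back over the preceding window
                j = i - d
                if j < 0:
                    break
                cooc[target][tokens[j]] += window - d + 1   # nearer words weigh more
        vocab = sorted(set(tokens))
        # A word's vector: its row (preceding contexts) concatenated with its
        # column (following contexts) over the whole vocabulary.
        return {w: [cooc[w][v] for v in vocab] + [cooc[v][w] for v in vocab]
                for w in vocab}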
A Least-Action Model for Dyskinesias in Parkinson's Disease
A model of motor planning is proposed that relies on energy regulation. The system to be controlled is treated as a point mass, and its motion is governed in part by an artificial (or internal) potential. In this case, the energy to be regulated is also artificial, since it is the sum of real kinetic energy and artificial potential energy. Energy regulation is achieved by enforcing Hamilton's principle of least action to drive the motion. By regulating the energy of the point mass, straight-line reaches or circular orbits can be planned. An extension of a previous model for the striatum is summarized in terms of energy-based control. Finally, this extension is discussed in the context of hypokinetic symptoms seen in Parkinson's disease.
Cognitive Processes in Regret for Actions and Inactions
Reasoning about matters of fact and reasoning about matters of possibility and impossibility may depend on the same sorts of mental representations and processes. We illustrate a mental model theory of counterfactual thinking with reference to the action effect (the tendency to regret actions more than inactions) and we describe an experiment which examines the effects of short-term and long-term perspectives on regret for actions and inactions.
Generating Coherent Messages in Real-time Decision Support: Exploiting Discourse Theory for Discourse Practice
This paper presents a message planner, TraumaGEN, that draws on rhetorical structure and discourse theory to address the problem of producing integrated messages from individual critiques, each of which is designed to achieve its own communicative goal. TraumaGEN takes into account the purpose of the messages, the situation in which the messages will be received, and the social role of the system.
Organizational Adaptation and Cognition
A view of organizations as complex, computational and adaptive systems in which knowledge and learning are embedded in multiple levels is presented. According to this perspective, activity at one level can interfere with or support activity at other levels. As such, organizational adaptation requires finding a balance between these levels. These ideas are illustrated using results from a computational model of organizational performance. Results suggest that organizations can trade knowledge and learning at one level for knowledge and learning at another. Thus, for the organization, performance becomes a balancing act between levels.
Reinvestigating the Effects of Surface and Structural Features on Analogical Access
Competing theories of analogical reasoning have disagreed on the relative contributions of surface and structural features to the access of analogs. The present experiment attempted to systematically assess how access is affected by the number of surface and structural matches between a currently-read story and one that is presumably in memory. The results suggest that both surface and structural features affected access about equally.
Rationality the Fast and Frugal Way
In a major theoretical paper, Gigerenzer and Goldstein (1996a) argue that classical rationality should be rejected as a norm of good reasoning, and that this thesis undermines both rational models of human thought and the alternative heuristics-and-biases program. They illustrate their argument by proposing that a specific cognitive estimation problem may be carried out by the "Take the Best" algorithm, which is "fast and frugal," but not rational. We argue: (1) that "fast and frugal" cognitive algorithms may approximate rational norms, and only in this way can their success be explained; and (2) that new computer simulations, and considerations of speed and generality, suggest that other algorithms are at least as psychologically plausible as Take the Best.
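For readers unfamiliar with the heuristic at issue, Take the Best examines cues one at a time in order of validity and decides on the first cue that discriminates; the sketch below shows that search-and-stop structure (the recognition step of the full heuristic is omitted, and the cue coding is an illustrative assumption).

    # Schematic sketch of the Take the Best stopping rule for a paired comparison (Python).
    # Cue values are coded 1 (positive), 0 (negative), or None (unknown); the
    # recognition principle that precedes cue search is omitted here.
    def take_the_best(cues_a, cues_b, cues_by_validity):
        for cue in cues_by_validity:                # search cues in order of validity
            a, b = cues_a.get(cue), cues_b.get(cue)
            if a == 1 and b != 1:                   # first discriminating cue decides
                return "a"
            if b == 1 and a != 1:
                return "b"
        return "guess"                              # no cue discriminates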
Reaction Time Analyses of Repetition Blindness
Repetition blindness (RB) usually refers to the inability to detect or recall a repeated item as opposed to an unrepeated item in rapid serial visual presentation (RSVP). Using a category counting task (i.e., counting how many times a given category appears in an RSVP list), Experiment 1 found RB for repeated Chinese characters in RSVP lists. In Experiment 2, subjects were required to respond only to the second occurrence of a given category in RSVP lists. RB occurred under the fast display rate (117 ms/item) but not under the slow rate (200 ms/item). Moreover, longer response latencies were found in the repeated condition relative to the unrepeated condition under the fast rate, whereas the reverse pattern was shown under the slow rate. Implications of the present methodology and findings for the processing of repeatedly presented stimuli are discussed in the paper.
Identifying Dual-Task Executive Process Knowledge using EPIC-Soar
In this paper, we present a lineage of models that is used to identify the additional knowledge required to perform two tasks concurrently at an expert level. The underlying architecture used for this modeling is EPIC-Soar, a combination of the sensory and motor modules of EPIC and the cognitive processing of Soar. Within EPIC-Soar, we build models for the Wickens task, a combination of tracking and choice-reaction time tasks. A key product of the models is an identification of the knowledge required to combine these two tasks: the executive process knowledge. We also demonstrate that it is possible to learn some of this knowledge through experience. We achieve performance comparable, in terms of error rates and reaction times, to human data and an EPIC model.
Recursive Inconsistencies Are Hard to Learn: A Connectionist Perspective on Universal Word Order Correlations
Across the languages of the world there is a high degree of consistency with respect to the ordering of heads of phrases. Within the generative approach to language these correlational universals have been taken to support the idea of innate linguistic constraints on word order. In contrast, we suggest that the tendency towards word order consistency may emerge from non-linguistic constraints on the learning of highly structured temporal sequences, of which human languages are prime examples. First, an analysis of recursive consistency within phrase-structure rules is provided, showing how inconsistency may impede learning. Results are then presented from connectionist simulations involving simple recurrent networks without linguistic biases, demonstrating that recursive inconsistencies directly affect the learnability of a language. Finally, typological language data are presented, suggesting that the word order patterns which are infrequent among the world's languages are the ones which are recursively inconsistent as well as being the patterns which are hard for the nets to learn. We therefore conclude that innate linguistic knowledge may not be necessary to explain word order universals.
Incremental Sequence Learning
As linguistic competence so clearly illustrates, processing sequences of events is a fundamental aspect of human cognition. For this reason perhaps, sequence learning behavior currently attracts considerable attention in both cognitive psychology and computational theory. In typical sequence learning situations, participants are asked to react to each element of sequentially structured visual sequences of events. An important issue in this context is to determine whether essentially associative processes are sufficient to understand human performance, or whether more powerful learning mechanisms are necessary. To address this issue, we explore how well human participants and connectionist models are capable of learning sequential material that involves complex, disjoint, long-distance contingencies. We show that the popular Simple Recurrent Network model (Elman, 1990), which has otherwise been shown to account for a variety of empirical findings (Cleeremans, 1993), fails to account for human performance in several experimental situations meant to test the model's specific predictions. In previous research (Cleeremans, 1993) briefly described in this paper, the structure of center-embedded sequential structures was manipulated to be strictly identical or probabilistically different as a function of the elements surrounding the embedding. While the SRN could only learn in the second case, human subjects were found to be insensitive to the manipulation. In the new experiment described in this paper, we tested the idea that performance benefits from "starting small effects" (Elman, 1993) by contrasting two conditions in which the training regimen was either incremental or not. Again, while the SRN is only capable of learning in the first case, human subjects were able to learn in both. We suggest an alternative model based on Maskara & Noetzel's (1991) Auto-Associative Recurrent Network as a way to overcome the SRN model's failure to account for the empirical findings.
Learning to Make Decisions Under Uncertainty: The Contribution of Qualitative Reasoning
The majority of work in the field of human judgement and decision making under uncertainty is based on the use and development of algebraic approaches, in which judgement is modelled in terms of mathematical choice functions. Such approaches provide no account of the mental processes underlying decision making. In this paper we explore a cognitive model (implemented within COGENT) of decision making developed in order to account for subject performance on a simulated medical diagnosis task. Our primary concern is with learning, and empirical results on human learning in the modelled task are also reported. Learning in the computational model shares many qualitative features with the human data. The results provide further support for cognitive (i.e., non-algebraic) approaches to decision making under uncertainty.
Modelling the Selection of Routine Action: Exploring the Criticality of Parameter Values
Several authors have distinguished automatic behaviour of routine or well-learnt action sequences from controlled behaviour of novel actions. In this paper we present an interactive activation model of routine action selection based on the Contention Scheduling theory of Norman & Shallice (1986). The model, developed in the specific domain of coffee preparation, provides a good account of normal behaviour in a complex yet routine task. In addition, we report lesioning studies which show breakdown of action selection qualitatively similar to that seen in a variety of neurological patients (action disorganisation syndrome, utilisation behaviour, and Parkinson's disease). These lesioning studies are based on the systematic variation of critical system parameters. Such parameters, which are implicit in all interactive activation models, raise complex methodological issues relating to the criticality of their values. We address these issues by reporting results of a detailed exploration of the parameter space.
Polysemy in Conceptual Combination: Testing the Constraint Theory of Combination
Most novel noun-noun combinations are polysemous in that they tend to suggest several possible meanings. A finger cup can be a cup in which fingers are washed, a cup shaped like a finger, a narrow cup and so on. In this paper, we present a new theory of concept combination, the constraint theory, that accounts for the polysemy of noun-noun combinations. Constraint theory, which uses three constraints (of inclusion, plausibility and informativeness) acting over a unitary mechanism that generates candidate interpretations, makes certain predictions about the polysemy of different combinations. In particular, it predicts that combinations involving artifact terms should be more polysemous than those involving natural kinds because the former have functional models that promote multiple interpretations. In a single experiment, this prediction is confirmed along with other predictions about the types of interpretation that tend to be produced.
Instructional Effects on Spatial and Temporal Memory for Videotaped Events in a Large-scale Environment
The separability of spatial and sequential mental representations was examined through the use of sketch-maps and ordered event-lists generated by subjects following the viewing of a videotape depicting movement through a natural space. Prior to viewing, subjects were instructed that they would either a) draw a map of the region depicted and place events on the map (map group), b) make a list of the events they saw in the order they saw them (list group), or c) answer some unspecified set of questions following the video (control group). In fact, subjects did all of the above. Although most measures of spatial and sequential accuracy were unaffected by the instructional manipulation, subjects who expected to draw maps were more likely to correctly indicate that the camera had negotiated the space in a figure-eight path, while subjects in the other groups predominantly indicated circular path shapes. None of our analyses provide any strong evidence that an independent spatial representation exists prior to map-drawing. In fact, the similarity between groups suggests that all subjects utilized similar encoding strategies, but that map subjects specifically attended to features of the film which constrain the overall layout of the space. This research raises specific questions about the mechanisms which allow path segments to be integrated into coherent spatial reference frames.
Communication in a Collaborative Health Care Team: Coordinating Tasks and Attaining Goals
Decisions are being made by groups with increasing frequency, requiring that individuals collaborate within teams. In order to do so, the team must create a shared mental model of its goals and processes. Communication has been shown to play a fundamental role in the development and evolution of this model as well as in the achievement of team goals. Previous research has established that roles within teams are well-defined and that each team member is familiar with them, that communication is most frequent among those whose tasks are most interdependent and interrelated, and that communication centers around attaining team goals. This study addresses the structure of team collaboration and the role of communication in maintaining the structure of an out-patient primary care unit at Beth Israel Deaconess Medical Center in Boston, Massachusetts. A work and activity analysis showed that individual roles are clear and distinct and part of the shared mental model of the team, reducing redundancy and omission of goal-directed tasks. Communication was found to be more frequent among team members with related tasks and with more similar models of practice. Communication topics were found to be related to team goals. The importance of the shared mental model and of communication in the collaborative process is emphasized. Different domain experts working together in a collaborative way complement each other through this shared understanding, maximizing the efficiency and the effectiveness of the process and outcome.
A Mixture of Experts Model Exhibiting Prosopagnosia
A considerable body of evidence from prosopagnosia, a deficit in face recognition dissociable from nonface object recognition, indicates that the visual system devotes a specialized functional area to mechanisms appropriate for face processing. We present a modular neural network composed of two "expert" networks and one mediating "gate" network with the task of learning to recognize the faces of 12 individuals and classifying 36 nonface objects as members of one of three classes. While learning the task, the network tends to divide labor between the two expert modules, with one expert specializing in face processing and the other specializing in nonface object processing. After training, we observe the network's performance on a test set as one of the experts is progressively damaged. The results roughly agree with data reported for prosopagnosic patients: as damage to the "face" expert increases, the network's face recognition performance decreases dramatically while its object classification performance drops slowly. We conclude that data-driven competitive learning between two unbiased functional units can give rise to localized face processing, and that selective damage in such a system could underlie prosopagnosia.
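The modular architecture described above follows the general mixture-of-experts pattern; the sketch below shows one hedged reading of the forward pass, with a softmax gate weighting the experts' outputs (single-layer experts, softmax gating, and the layer sizes are illustrative assumptions, not the paper's exact network).

    # Schematic mixture-of-experts forward pass (Python/numpy).
    import numpy as np

    def softmax(z):
        z = z - z.max()                             # numerical stability
        e = np.exp(z)
        return e / e.sum()

    def moe_forward(x, expert_weights, gate_weights):
        expert_outputs = [W @ x for W in expert_weights]    # one output per expert
        gate = softmax(gate_weights @ x)                    # mediating "gate" network
        return sum(g * out for g, out in zip(gate, expert_outputs))

    # Damaging one expert (e.g. zeroing a growing fraction of its weights) then
    # selectively degrades whichever task that expert has specialized in.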
Recent Work in Computational Scientific Discovery
This paper reviews work in computational scientific discovery. After a brief discussion of its history, the focus will be on work since 1990. The second half of the paper discusses the author's use of three methods for studying reasoning strategies in scientific change: historical-philosophical vs. live-in-the-lab vs. computational, pointing out advantages and disadvantages of the computational method.
Ambiguity and Competition in Lexical Segmentation
Earlier research has suggested that left embedded words (e.g. cat in catalog) present a problem for spoken word recognition since it is potentially unclear whether there is a word boundary at the offset of cat. Models of spoken word recognition have incorporated processes of competition so that the identification of embedded words can be delayed until longer interpretations have been ruled out. However, evidence from acoustic phonetics has previously shown that there are differences in acoustic duration between the syllables of embedded words and the onsets of longer competitors. The research reported here used gating and cross-modal priming to investigate the recognition of embedded words. Results indicate that subjects use these acoustic differences to discriminate between monosyllabic words and the onset of longer words. We therefore suggest that on-line processes of lexical segmentation and word recognition are sensitive to acoustic information, such as syllable duration, that may only be contrastive with reference to prior spoken context.
Task Environment Centered Design of Organization
The central idea, which is not new for those who study human organizations but which is sometimes forgotten by computer agent researchers, is that the design of coordination mechanisms cannot rely on the principled construction of the agents alone, but must rely on the structure and other characteristics of the task environment. Such dependencies include the structure of the environment (the particular kinds and patterns of interrelationships that occur between tasks) and the uncertainty in the environment (both in the a priori structure of any episode within an environment and in the outcomes of an agent's actions). In this talk, I will briefly describe a modeling framework, TAEMS, for representing abstract task environments. TAEMS has been used both for environment modeling/simulation and as an internal representation for computer agents to plan, schedule, and coordinate their activities with other agents (human or computer). I'll describe examples of both of these uses. This written summary provides a background bibliography and pointers to the work discussed in the talk.
Disambiguation with Verb-predictability: Evidence from Japanese Garden-path Phenomena
This paper proposes a new model for human sentence processing which makes use of the predictability of verbs from nouns for ambiguity resolution. The main claim is that the verb distribution given a subject noun and an object noun varies depending on the animacy of the object noun, and that this variance influences the garden-path (GP) effect in Japanese. First, we report experimental results showing an object-animacy asymmetry in the GP effect that cannot be explained in terms of semantic fitness, the factor that is essential in constraint-based models. Then, we show, on the basis of a corpus analysis, that the difference in object animacy is related not to semantic fitness between nouns and verbs but to the predictability of verbs from nouns. Finally, we propose our model of disambiguation using verb predictability, and, based on this model, explain the object-animacy asymmetry observed in our experiment.
When Pseudowords Become Words - Effects of Learning on Orthographic Similarity Priming
This paper investigates empirical predictions of a connectionist model of word learning. The model predicts that, although the mapping between word form and meaning is arbitrary (thus rendering words symbols in the semiotic sense), novel pseudowords will be able to prime the concepts corresponding to word forms that are orthographically similar. If, however, pseudowords acquire meaning through an arbitrary mapping, this priming should be reduced. Two experiments support this hypothesis. Pseudowords, derived from and thus orthographically similar to English words, primed a categorization task involving those similar words. After a subsequent learning phase, in which subjects were asked to learn meanings for the pseudowords, this priming disappeared. This interplay between iconic and symbolic use of words is proposed to emerge from connectionist learning procedures.
"On-Line" inductive Reasoning In Scientific Laboratories: What It Reveals About the Nature of induction and Scientific Discovery
"On-line" data of scientists thinking and reasoning in their laboratories were collected and analyzed, providing a rare glimpse into the day-to-day use of induction by scientists at work. Analyses reveal that scientists use different types of induction in specific orders and cycle through such types in ways that are dictated by their current goal and context. Further, the processes involved in major conceptual changes are identical to those involved in minor conceptual changes. Finally, first time analyses of women and men scientists reasoning in laboratories show that women and men scientists reason in a virtually identical manner.
The distinctiveness of form and function in category structure: A connectionist model
We present a new account of category structure derived from neuropsychological and developmental data. The account places theoretical emphasis on functional information. We claim that i) the distinctiveness of functional features correlated with perceptual features varies across semantic domains, and ii) the perceptual features representing specific functional mechanisms are strongly correlated with their function. The representational assumptions which follow from these claims make strong predictions about what types of semantic information are preserved in patients showing category-specific deficits following brain damage. We present a connectionist simulation which, when damaged, shows patterns of preservation of distinctive and shared functional and perceptual information varying across semantic domains. The simulation captures both the classic dissociations between knowledge of artefacts and of living things and recent neuropsychological evidence concerning the robustness of functional information.
Learning as formation of low-dimensional representation spaces
Psychophysical findings accumulated over the past several decades indicate that perceptual tasks such as similarity judgment tend to be performed on a low-dimensional representation of the sensory data. Low dimensionality is especially important for learning, as the number of examples required for attaining a given level of performance grows exponentially with the dimensionality of the underlying representation space. Because of this curse of dimensionality, in shape categorization the high initial dimensionality of the sensory data must be reduced by a nontrivial computational process, which, ideally, should capture the intrinsic low-dimensional nature of families of visual shapes. We show how to make a connectionist system use class labels to learn a representation that fulfills this requirement, thereby facilitating shape categorization. Our results indicate that low-dimensional representations are best extracted in a learning task that combines discrimination and generalization constraints.
A Computational Theory of Vocabulary Expansion
As part of an interdisciplinary project to develop a computational cognitive model of a reader of narrative text, we are developing a computational theory of how natural-language-understanding systems can automatically expand their vocabulary by determining from context the meaning of words that are unknown, misunderstood, or used in a new sense. 'Context' includes surrounding text, grammatical information, and background knowledge, but no external sources. Our thesis is that the meaning of such a word can be determined from context, can be revised upon further encounters with the word, "converges" to a dictionary-like definition if enough context has been provided and there have been enough exposures to the word, and eventually "settles down" to a "steady state" that is always subject to revision upon further encounters with the word. The system is being implemented in the SNePS knowledge-representation and reasoning system.
What to Believe When Inferences are Contradicted: The Impact of Knowledge Type and Inference Rule
Simple belief-revision tasks were defined by giving subjects a conditional premise (p—>q), a categorical premise (p, for a modus ponens belief-set, or ~q, for a modus tollens belief-set), and the associated inference (q or ~p, respectively). "New" information contradicted the initial inference (~q or p, respectively). Subjects indicated their degree of belief in the conditional premise and the categorical premise, given the contradiction. Results indicated that the choice was a function of the knowledge type expressed in the conditional form; when that knowledge type was causal, the choice was affected by the number of disabling factors associated with the causal relationship. A "possible worlds" interpretation of the data is related to formal notions such as epistemic entrenchment, used in normative models of belief revision, and to reasoning from uncertain premises, from the human deduction literature.
Probabilities, Utilities and Hypothesis Testing
This paper considers the class of hypothesis testing tasks purporting to demonstrate pseudodiagnosticity. It argues that, as has recently been done with other hypothesis testing tasks, pseudodiagnosticity tasks may be re-analysed in terms of people's background beliefs about the probability of their evidential items and the utility of their various test outcomes. A sample analysis of a simplified task is presented along with the results of an experiment which demonstrate that subjects' behaviour corresponds to the prescriptions of the analysis. How the sample analysis might be applied to the standard pseudodiagnosticity task is discussed as are the implications of the results for current accounts of the effects of subjective probability on human hypothesis testing.
Combining Visual Cues to Depth and Shape: A Comparison of Three Models
Performance in estimating the depth and shape of an ellipse on the basis of stereo, motion, and vergence angle information was compared for three models of visual depth cue combination. The three models were a weak model (strict modularity, with no interaction between motion and stereo cues), a modified weak model (restricted interaction allowed between motion and stereo cues), and a strong model (unconstrained interaction between all visual cues). The modified weak model performed best overall, indicating that its structure, which contains both modular and interactive features, has advantages over both the extreme modular organization of the weak model and the extreme interactive organization of the strong model. In addition, the different weighting of motion and stereo cues by the modified weak model in the depth and shape judgment tasks provides a motivation for multiple visual representations of three-dimensional space.
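As a hedged illustration of the contrast between the models (not the authors' exact formulation): in the purely modular weak scheme each cue delivers its own estimate and the final depth is a late weighted combination, d = w_stereo·d_stereo + w_motion·d_motion with the weights summing to one, whereas the modified weak and strong models permit increasing amounts of interaction between the cues before or instead of such a late combination.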
Towards a Computational Model of Evaluating and Using Analogical Inferences
Reasoning by analogy is a central phenomenon in cognition. Existing computational models of analogy provide accounts of how analogical inferences are generated, but do not specify how they might be evaluated or integrated with other methods of reasoning. This paper extends the model of analogical inference in structure-mapping theory in two ways. First, we propose techniques for the structural evaluation of analogical inferences, to model one of the factors people appear to use in evaluating the plausibility of arguments based on comparisons. Second, we propose an information-level model of analogical inferences that supports reasoning about correspondences and mappings. We describe how this model fits with existing psychological evidence and illustrate its operation on several examples, using a computer simulation. These examples include evaluating the validity of a qualitative mental model and a prototype case-based coach that is being added to an already-fielded intelligent learning environment.
The Functions of Finite Support: a Canonical Learning Problem
The functions of finite support have played a ubiquitous role in the study of inductive inference since its inception. In addition to providing a clear and simple example of a learnable class, the functions of finite support are employed in many proofs that distinguish various types and features of learning. Recent results show that this ostensibly simple class requires as much space to learn as any other learnable set and, furthermore, is as intrinsically difficult as any other learnable set. This makes the functions of finite support a candidate for being a canonical learning problem. We argue for this point in the paper and discuss the ramifications.
Homographic Self-Inhibition and the Disappearance of Priming: More Evidence for an Interactive-Activation Model of Bilingual Memory
This paper presents two experiments providing strong support for an interactive-activation interpretation of bilingual memory. In both experiments French-English interlexical noncognate homographs were used, i.e., words like fin (= "end" in French), pain (= "bread" in French), that have a distinct meaning in each language. An All-English condition, in which participants saw only English items (words and non-words), and a Mixed condition, with half English and half French items, were used. For a set of English target words that were strongly primed by the homographs in the All-English condition (e.g., shark, primed by the homograph fin), this priming was found to disappear in the Mixed condition. We suggest that this is because the English "component" of the homograph is inhibited by the French component, which only becomes active in the Mixed condition. Further, recognition times for these homographs as words in English were significantly longer in the Mixed condition, and the amount of this increase was related to the relative strength (in terms of printed-word frequency) of the French meaning of the homograph. We see no reasonable independent-access dual-lexicon explanation of these results, whereas they fit easily into an interactive-activation framework.
Discriminating Local and Distributed Models of Competition in Spoken Word Recognition
Local and distributed theories of representation make different predictions regarding the simultaneous activation of multiple lexical entries during speech perception. We report three experiments that use the cross-modal priming technique with fragments of spoken words to explore competition effects in the activation of multiple lexical representations. The experiments suggest that lexical activation is inversely related to the number of words being activated. This competition effect is stronger at the semantic than the phonological level of representation, supporting a model of speech perception in which sensory information is mapped directly onto distributed representations of both the form and the meanings of words.
The Dynamics of Prefrontal Cortico-Thalamo-Basal Ganglionic Loops and Short-Term Memory Interference Phenomena
We present computer simulations of a model of the brain mechanisms operating in short-term memory tasks that are consistent with the anatomy and physiology of prefrontal cortex and associated subcortical structures. These simulations include dynamical processes in thalamocortical loops which are used to generate short-term persistent responses in prefrontal cortex. We discuss this model in terms of the representation of input stimuli in cortical association areas and prefrontal short-term memory areas. We report on interference phenomena that result from the interaction of these dynamical processes and lateral projections within cortical columns. These interference phenomena can be used to elucidate the representational organization of short-term memory.
An Objective Approach to Trajectory Mapping through Simulated Annealing
Trajectory Mapping (TM) was introduced in 1995 as a new experimental paradigm and scaling technique. Because only a manual heuristic for processing the data was included, we offer an algorithm based on simulated annealing that combines a computational approach to processing TM data with a model of the human heuristic used by Richards and Koenderink (1995). We briefly compare the TM approach with MDS and clustering, and then describe the details of the algorithm itself and present several relevant diagnostic measures.
Modelling the acquisition of syntactic categories
This research represents an attempt to model the child's acquisition of syntactic categories. A computational model, based on the EPAM theory of perception and learning, is developed. The basic assumptions are that (1) syntactic categories are actively constructed by the child using distributional learning abilities; and (2) cognitive constraints in learning rate and memory capacity limit these learning abilities. We present simulations of the syntax acquisition of a single subject, where the model learns to build up multi-word utterances by scanning a sample of the speech addressed to the subject by his mother.
A Cognitive Model of Learning to Navigate
Our goal is to develop a cognitive model of how humans acquire skills on complex cognitive tasks. We are pursuing this goal by designing computational architectures for the NRL Navigation task, which requires competent sensorimotor coordination. In this paper, we analyze the NRL Navigation task in depth. We then use data from experiments with human subjects learning this task to guide us in constructing a cognitive model of skill acquisition for the task. Verbal protocol data augments the black box view provided by execution traces of inputs and outputs. Computational experiments allow us to explore a space of alternative architectures for the task, guided by the quality of fit to human performance data.
Debunking the Basic Level
The goal of this paper is to introduce a new measure of basic-level performance that we will call the "category attentional slip." The idea behind it is very simple: The attentional mechanisms of an ideally rational categorizer are made to "slip" once in a while. We provide a formalization of attentional slip that specifies what an "ideally rational categorizer" is and how its attention "slips." We then compare its predictive capabilities with those of two established basic-level measures: category feature-possession (Jones, 1983) and category utility (Corter & Gluck, 1992). The empirical data used for the comparisons are drawn from eight classical experiments from Murphy and Smith (1982), Murphy (1991), and Tanaka and Taylor (1991).
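For reference, the category utility baseline can be written as CU(C) = P(C) [ Σ_k P(f_k | C)² − Σ_k P(f_k)² ] (Corter & Gluck, 1992): the expected increase in feature-prediction accuracy gained by knowing that an item belongs to category C.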
The Precis of Project Nemo, Phase 1: Subgoaling and Subschemas for Submariners
Project Nemo examines the cognitive processes and representational structures used by submarine Commanders while attempting to locate an enemy submarine hiding in deep water. This report provides a precis of the first phase of this effort. Protocol data, collected from commanders with 20 years of submarine experience, have been transcribed and analyzed. The data suggest a shallow goal structure with a basic level of subgoals that are used by all Commanders throughout the task. Relatively few operators are required for each subgoal. The results are congruent with a schema theory interpretation in which the process of schema instantiation provides the control of cognition.
In Support of the Equal Rights Movement for Literal and Figurative Language - A Parallel Search and Preferential Choice Model
We challenge the commonly held view that the interpretation of metonymies should proceed from a literal-meaning-first approach and argue for an equally balanced treatment of literal and figurative language use. Resulting ambiguities are handled by a combination of two techniques. First, we incorporate discourse constraints into metonymy resolution, reflecting the systematic interaction patterns between the resolution of nominal anaphora and metonymies. Second, we impose constraints on metonymies that are based on pragmatic criteria.
Selecting Past-tense Forms for New Words: What's Meaning Got to Do With It?
When irregular verbs are semantically extended or used in novel ways, speakers often find the -ed past tense more natural than the irregular past tense, as in Ross Perot thought he couldn't be sound-bited. Speakers' preference for -ed with denominal verbs like sound-bited is consistent with the predictions of formal grammatical theory. Many theorists regard this as support for the relevance of the constructs of formal grammatical theory. We present data from two experiments supporting the predictions of an alternative view, the Shared Meaning Hypothesis. The data suggest that speakers' feelings of naturalness reflect how readily the two possible forms (soundbitten, soundbited) can be connected to the intended meaning. Our approach doesn't require formal constructs, and helps illuminate speakers' sensitivity to factors which facilitate error-free communication.
Expertise or Expert-ese? The Emergence of Task-Oriented Sub-Languages
This paper reports an experiment which demonstrates the emergence of group-specific sublanguages or 'expert-ese' within groups engaged in a series of task-oriented dialogues. Extending the findings of Garrod and Doherty (1994), it is argued that neither a simple appeal to task expertise nor the collaborative establishment of mutual beliefs can adequately account for these results. An alternative proposal, which identifies repair as the critical locus of semantic coordination, is sketched.
The Composition Effect in Symbolizing: The Role of Symbol Production vs. Text Comprehension
A person's ability to translate a mathematical problem into symbols is an increasingly important skill as computational devices play a growing role in academia and the workplace. Thus it is important to better understand this "symbolization" skill and how it develops. We are working toward a model of the acquisition of skill at symbolizing and scaffolding strategies for assisting that acquisition. We are using a difficulty factors assessment as an efficient methodology for identifying the critical cognitive factors that distinguish competent from less competent symbolizers. The current study indicates there is more to symbolizing than translating individual phrases into symbols and using long-term schematic knowledge to fill in implied information. In particular, students must be able to compose these individual translation operations into a complete symbolic sentence. We provide evidence that, in contrast to many prior models of word problem solving, which address story comprehension skills, a critical element of student competence is symbolic production skill.
Designing for Understanding: Children's Lung Models
Complex systems are commonly found in natural and physical science. Understanding such systems is often difficult because they may be viewed from multiple perspectives and their analysis may conflict with or extend beyond the range of everyday experience. There are many complex structural, behavioral, and functional (SBF) relationships to understand as well. Design activities, which allow exploration of the way a system works and which eventually require deep understanding of that system for success, can be an excellent way to help children acquire a deeper, more systemic understanding of such complex domains. We report on a design experiment in which sixth grade children learned about the human respiratory system by designing and building artificial lungs. Students were interviewed pre- and postinstruction. Results of these interviews were analyzed using an SBF model for describing their understanding of the respiratory system. We consider the results in light of the children's actual activity and discuss some of the lessons learned.
Neuronal Mechanism of Memory Maintenance
We address the question of memory maintenance in a neuronal system whose synapses undergo continuous metabolic turnover. Our solution is based on neuronal regulation mechanisms. We develop this concept and demonstrate it within the framework of a neural model of associative memory. It operates in conjunction with random activation of the memory system, and is able to counterbalance degradation of synaptic weights, and to normalize the basins of attraction of all memories. Over long time periods, when the variance of the degradation process becomes important, synapses are no longer maintained at their original values. Nonetheless, memories can be maintained provided there exist appropriate bounds on synaptic growth. The remnant memory system is obtained by a dynamic process of synaptic selection and growth driven by neuronal regulatory mechanisms.
Attention and U-Shaped Learning in the Acquisition of the Past Tense
Plunkett & Marchman (1993) showed that a neural network trained on an incrementally expanded training set was able to master the past tense and show the U-shaped learning pattern characteristic of children. In Jackson, Constandse & Cottrell (1996) we argued that Plunkett & Marchman's restriction of the training set was unrealistic and proposed a model of selective attention that enabled our network to master the past tense without external restrictions on its training set. Analysis in the present paper shows that the network in Jackson, Constandse & Cottrell (1996) does not exhibit appropriate U-shaped learning, however. We propose a modified model of selective attention that results in the mastery of the past tense as well as the kind of U-shaped learning observed in children.
[i e a u] and Sometimes [o]: Perceptual Computational Constraints on Vowel Inventories
Common vowel inventories of languages tend to be better dispersed in the space of possible vowels than less common or unattested inventories. The present research explored the hypothesis that functional factors underlie this preference. Connectionist models were trained on different inventories of spoken vowels, taken from a naturalistic corpus. The first experiment showed that networks trained on well-dispersed five-vowel sets like [i e a o u] learned the inventory more quickly and generalized better to novel stimuli, compared to those trained on less dispersed vowel sets. Experiments 2-3 examined how effects due to ease of perception are modulated by factors related to production. Languages tend to prefer front vowel contrasts over back vowels because the latter tend to be produced with more variability. This caused networks trained on an [i e a u] inventory to perform better than those trained on [i a o u]. Thus both acoustic separation of vowels and variability in how they are realized in speech affect ease of learning and generalization. The results suggest that acoustic and articulatory factors can explain apparent phonological universals.
Strategy use while learning to perform the Kanfer-Ackerman Air Traffic Controller task
People choose different strategies for performing tasks, and that choice often plays a key role in performance. We investigate the use and evolution of strategic behavior in the Kanfer-Ackerman Air Traffic Controller© task, a fast-paced, dynamic task. We present strategies in two dimensions for one aspect of the task, examine how people use them and switch between them, and how their use relates to final performance. We also discuss the implications that the observed variety of strategic behavior has for cognitive modeling.
Control in Act-R and Soar
This paper compares the Act-R and Soar cognitive architectures, focusing on their theories of control. Act-R treats control (conflict resolution) as an automatic process, whereas Soar treats it as a potentially deliberate, knowledge-based process. The comparison reveals that Soar can model extremely flexible control, but has difficulty accounting for probabilistic operator selection and the independent effects of history and distance to goal on the likelihood of selecting an operator. In contrast, Act-R's control is well supported by empirical data, but the architecture has difficulty modeling task-switching, multiple interleaved tasks, and dynamic abandoning of subgoals. The comparison also reveals that many of the justifications for each architecture's control structure, such as some forms of flexible control and satisficing, are just as easily handled by both.
A Model Theory of Modal Reasoning
This paper presents a new theory of modal reasoning, i.e. reasoning about what may or may not be the case, and what must or must not be the case. A conclusion is possible if it holds in at least one mental model, whereas it is necessary if it holds in all the models. The theory makes a crucial prediction, which we corroborated experimentally. There is a key interaction: it is easier to infer that a situation is possible as opposed to impossible, whereas it is easier to infer that a situation is not necessary as opposed to necessary.
How to Make the Impossible Seem Possible
The mental model theory postulates that reasoners build models of the situations described in premises. A conclusion is possible if it occurs in at least one model; and it is impossible if it occurs in no models. According to the theory, reasoners can cope with what is true, but not with what is false. A computer implementation predicted that certain inferences should yield cognitive illusions, i.e. they have conclusions that should seem highly plausible but that are in reality gross errors. Experiment 1 showed that, as predicted, participants erroneously inferred that impossible situations were possible, and that possible situations were impossible, but they performed well with control problems. Experiment 2 replicated these results, using the same premises for both the illusory and the control inferences: the participants were susceptible both to illusions of possibility and to illusions of impossibility, but they coped with the control problems.
Constraints on the Design of a High-Level Model of Cognition
The TacAir-Soar system is a computer program that generates human-like behavior flying simulated aircraft in tactical air combat training scenarios. The design of the system has been driven by functional concerns, allowing the system to generate a wide range of appropriate behaviors in severely time-limited situations. The combination of constraints from the complexity and dynamics of the domain with the overall goal of human-like behavior led to a system that can be viewed as a model of cognition for high-level, complex tasks. This paper analyzes the system in such a light, and describes how the functional design constraints map on to cognitively plausible representations and mechanisms, sometimes in surprising ways.
Recognition Model with Narrow and Broad Extension Fields
A recognition model which defines a measure of shape similarity on the direct output of multiscale and multiorientation Gabor filters does not manifest qualitative aspects of human object recognition of contour-deleted images in that: a) it recognizes recoverable and nonrecoverable contour-deleted images equally well whereas humans recognize recoverable images much better, b) it distinguishes complementary feature-deleted images whereas humans do not. Adding some of the known connectivity pattern of the primary visual cortex to the model in the form of extension fields (connections between collinear and curvilinear units) among filters increased the overall recognition performance of the model and: a) boosted the recognition rate of the recoverable images far more than the nonrecoverable ones, b) increased the similarity of complementary feature-deleted images, but not part-deleted ones, more closely corresponding to human psychophysical results. Interestingly, performance was approximately equivalent for narrow (±15°) and broad (±90°) extension fields.
The processing of negatives during discourse comprehension
This paper investigates the effects of negation in discourse comprehension. The paper is based on the finding by MacDonald and Just (1989) that after reading sentences such as Elizabeth bakes some bread but no cookies, subjects are faster to respond to the probe bread than to the probe cookies. The question arises whether this differential availability of the relevant concepts is due to negation, or whether it reflects the fact that bread is present in the described situation, whereas cookies are not. In order to decide between these alternatives two experiments were conducted. In Experiment 1 negated entities that are absent from the described situation were compared with non-negated entities that are present, whereas in Experiment 2 negated entities that are present in the situation were compared with non-negated entities that are absent. The results of the two experiments indicate that both factors, namely 'negation' and 'absence from situation', affect the availability of concepts during discourse processing.
Reasoning with Multiple Diagrams: Focusing on the Cognitive Integration Process
In order to understand diagrammatic reasoning where multiple diagrams are involved, this study proposes a theoretical framework that focuses on the cognitive process of perceptual and conceptual integration. The perceptual integration process involves establishing interdependencies between the relevant data that have been dispersed across multiple diagrams, while the conceptual integration process involves generating and refining hypotheses by combining the individual data inferred from the diagrams. An experiment within the domain of business systems engineering was conducted where verbal protocols were collected. The results of the experimental study reveal that understanding a system represented by multiple diagrams involves a tedious process of visually searching for related information and of conceptually developing hypotheses about the target system. The results also showed that these perceptual and conceptual processes could be facilitated by providing visual cues that indicate where elements in one diagram are related to elements in other diagrams, and contextual information that indicates how the individual datum in one diagram is related to the overall hypothesis about the entire system.
Implicit Strategies and Errors in an Improved Model of Early Algebra Problem Solving
We have been refining a cognitive model, written in ACT-R, of student performance in early algebra problem solving. "Early algebra" refers to a class of problems and competencies at the boundary between arithmetic and algebra. Our empirical studies in this domain establish a striking contrast between students' difficulties with symbolic algebra and their relative success with certain kinds of "intuitive" algebraic reasoning. To better understand this contrast, we analyzed student solutions to identify the strategies and errors exhibited and then set out to account for this detailed process data with the utility-based choice mechanism of ACT-R. Our first model contained production rules for explicitly selecting strategies and for making certain systematic errors or bugs. It provided a good quantitative fit to student performance data (R² = .90); however, it had two qualitative shortcomings: 1) the productions for strategy selection appeared to serve no computational purpose and 2) the model systematically underpredicted the frequency of non-trivial errors on more complex problems. We created a new model in which explicit strategy selection was eliminated (strategic behavior is emergent) and in which failure to fire a production (an implicit, non-buggy error) is an option at every model choice point. Compared to the first model, this model achieved an equivalent quantitative fit with fewer productions and without the systematic deviations from the error data. We consider the implications of implicit strategies and errors for instruction.
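For readers unfamiliar with utility-based conflict resolution, the sketch below illustrates the general idea in Python: candidate productions are chosen stochastically as a function of their utilities, and an explicit "fail to fire" option yields occasional silent errors. The production names, utilities, and noise scheme are hypothetical illustrations, not the fitted model reported above.

import math, random

# Illustrative softmax choice among productions, in the spirit of
# utility-based conflict resolution; all values are made up.

def choose(utilities, temperature=1.0):
    """Pick one option with probability proportional to exp(utility / temperature)."""
    weights = {name: math.exp(u / temperature) for name, u in utilities.items()}
    total = sum(weights.values())
    r = random.uniform(0.0, total)
    upto = 0.0
    for name, w in weights.items():
        upto += w
        if r <= upto:
            return name
    return name  # numerical fallback

options = {
    "solve-correctly": 2.0,   # production for a correct step (illustrative utility)
    "buggy-step": 0.5,        # a systematic error ("bug")
    "fail-to-fire": 0.0,      # the implicit, non-buggy error
}
counts = {name: 0 for name in options}
for _ in range(10000):
    counts[choose(options)] += 1
print(counts)  # a mixture of correct steps, bugs, and silent failures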
Adding Spaces to Thai and English: Effects on Reading
Most research on reading has used Western languages, which have the property of being spaced. This paper examines how spacing and meaning affect reading in Thai, a modern, alphabetic and unspaced language. Results show that subjects were faster in reading and made fewer errors when spaces were added. Meaning facilitates reading as well, and does not interact with spacing. Finally, ability to read unspaced texts in Thai does not transfer to English. The results support the hypothesis that spaces, when present at all, offer perceptual cues that facilitate reading. Efficiency considerations raise the question of whether Thai should follow the example of Western languages and incorporate spaces and punctuation.
Informational Potentials of Dynamic Speech Rate in Dialogue
We examine five spontaneous dialogues conducted in Japanese and analyze the potential of speech rate change to signal the structure of information being exchanged in dialogue. We found (1) a bi-directional correlation between speech decelerations and the openings of new information, and (2) another bi-directional correlation between speech accelerations and the absence of information openings. Our data show that the correlations hold not only in the case of a single speaker's speech, but also in the case of multiple speakers' sequential utterances, with or without turn shifts. We also study possible disturbances to these default correlations and identify the limitation on speakers' cognitive resources as one major constraint that interferes with the accurate signaling of information opening by decelerated speech.
A Cognitive Model of Argumentation
In order to argue effectively one must have a grasp of both the normative strength of the inferences that come into play and the effect that the proposed inferences will have on the audience. In this paper we describe a program, NAG (Nice Argument Generator), that attempts to generate arguments that are both persuasive and correct. To do so NAG incorporates two models: a normative model, for judging the normative correctness of an argument, and a user model, for judging the persuasive effect of the same argument upon the user. The user model incorporates some of the common errors humans make when reasoning. In order to limit the scope of its reasoning during argument evaluation and generation NAG explicitly simulates attentional processes in both the user and the normative models.
What It Means to Be "the Same": The Impact of Relational Complexity on Processing Efficiency
The fundamental relations that underlie cognitive comparisons -- "same" and "different" -- can be defined at multiple levels of abstraction, which vary in relational complexity. We compared reaction times to decide whether or not two sequentially-presented perceptual displays were the same at three levels: perceptual, relational, and system (higher-order relations). For both 150 msec and 5 sec interstimulus intervals, decision time increased with level of abstraction. Sameness at lower complexity levels contributed to decisions based on the higher levels. Relations at multiple levels of complexity can be abstracted and compared in working memory, with higher complexity levels requiring more processing time. Multiple levels can cooperate to reach a decision.
How Well Can Passage Meaning be Derived without Using Word Order? A Comparison of Latent Semantic Analysis and Humans
How much of the meaning of a naturally occurring English passage is derivable from its combination of words without considering their order? An exploratory approach to this question was provided by asking humans to judge the quality and quantity of knowledge conveyed by short student essays on scientific topics and comparing the inter-rater reliability and predictive accuracy of their estimates with the performance of a corpus-based statistical model that takes no account of word order within an essay. There was surprisingly little difference between the human judges and the model.
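As a toy illustration of what an order-free comparison looks like, the Python sketch below scores two short texts by the cosine of their word-count vectors. It is only a stand-in for the corpus-based model discussed above (which derives word representations from a large training corpus); the example sentences are invented.

from collections import Counter
import math

# Order-free comparison of two short texts: each is reduced to a bag of
# words and compared by cosine similarity.  Toy illustration only.

def bag(text):
    return Counter(text.lower().split())

def cosine(a, b):
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

essay = "the heart pumps blood through arteries to the body"
model_answer = "blood is pumped by the heart through the arteries"
scrambled = " ".join(reversed(essay.split()))

print(cosine(bag(essay), bag(model_answer)))  # fairly high: shared vocabulary
print(cosine(bag(essay), bag(scrambled)))     # 1.0: word order ignored entirely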
Learning to Act: Acquisition and Optimization of Procedural Skill
People become highly reactive when performing dynamic, real-time tasks like driving a car, playing a video game, or controlling air traffic. However, people also go through more deliberate stages in which they spend time reasoning about constraints and actions. In this paper, we argue that these are the end points on a learning continuum, and we discuss one potential mechanism for bridging these endpoints within the ACT-R (Anderson, 1993) framework.
Learning and Awareness in the Serial Reaction Time Task
This study examined evidence for implicit rule-based learning in the serial reaction time task and investigated the effect of explicit knowledge on performance. Participants responded to visual stimuli appearing in one of six locations. In each run, six stimuli were presented, with a stimulus appearing in each and every position exactly once in a random order. Participants implicitly learned the pattern as indicated by better performance on the sixth trials than on the first trials. Yet none of the three measures of explicit knowledge -- verbalization, free generation, and recognition -- were able to detect participants' awareness of the pattern. Explicit knowledge of the pattern improved performance, whereas active search for the pattern hurt performance if the pattern was not found. A possible learning mechanism is proposed to account for serial learning.
Logical and Diagrammatic Reasoning: the Complexity of Conceptual Space
Researchers currently seek to explain the observed tractability of diagrammatic reasoning (DR) via the notions of "limited abstraction" and inexpressivity (Stenning and Oberlander, 1995; Stenning and Inder, 1995). We point out that these explanations are inadequate, in that they assume that each structure to be represented (i.e. each model) has a corresponding diagram. We show that inefficacy (in the sense of incorrectness) arises in DR because some (logically possible) models fail to have corresponding diagrams, due to non-trivial spatial constraints. Further, there are good explanations of why certain restricted languages are tractable, and we look to complexity theory to establish such results. The idea is that graphical representation systems may be fruitfully analysed as certain restricted quantifier fragments of first-order logic, similar to modal logics and vivid knowledge bases (Levesque, 1986; Levesque, 1988). This focus raises some problems for the expressive power of graphical systems, related to their topological and geometrical properties. A simple case study is carried out, which pinpoints the inexpressiveness of Euler's Circles and its variants. We conclude that there is little mileage in spatial (i.e. diagrammatic) approaches to abstract reasoning, except perhaps in relation to studies of human performance. Moreover, these results have ramifications for certain claims about mental representations, and the recent trend in cognitive semantics, where "meanings" and "concepts" are to be explicated spatially. We show that there should be combinations of "concepts" or "meanings" which are prohibited by the structure of the spaces they supposedly inhabit. The formal results thus suggest an empirical programme.
Mediated Priming in High-dimensional Meaning Space: What is "Mediated" in Mediated Priming?
Four experiments are presented that demonstrate that mediated priming (e.g., lion-stripes) does not rely on weak, although direct, semantic relationships or lexical co-occurrence as suggested by McKoon and Ratcliff (1992). A view of mediation in priming consistent with a distributed view of memory is presented that relies on shared contexts between the prime and target. Not all mediated items appear to share contexts, and ones that do not also do not show mediated priming. The focus on contextual mediation is consistent with how word meanings are acquired as modeled by the HAL memory model.
Improving Associative Memory Capacity: One-Shot Learning in Multilayer Hopfield Networks
Our brains have an extraordinarily large capacity to store and recognize complex patterns after only one or a very few exposures to each item. Existing computational learning algorithms fall short of accounting for these properties of human memory; they either require a great many learning iterations, or they can do one-shot learning but suffer from very poor capacity. In this paper, we explore one approach to improving the capacity of simple Hebbian pattern associators: adding hidden units. We propose a deterministic algorithm for choosing good target states for the hidden layer. In assessing performance of the model, we argue that it is critical to examine both increased stability and increased basin size of the attractor around each stored pattern. Our algorithm achieves both, thereby improving the network's capacity to recall noisy patterns. Further, the hidden layer helps to cushion the network from interference effects as the memory is overloaded. Another technique, almost as effective, is to "soft-clamp" the input layer during retrieval. Finally, we discuss other approaches to improving memory capacity, as well the relation between our model and extant models of the hippocampal system.
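The baseline the paper sets out to improve, a simple Hebbian pattern associator, can be sketched in a few lines of Python. The sketch below is a generic Hopfield-style autoassociator with one-shot Hebbian storage and arbitrary pattern counts and noise levels; it is not the multilayer model proposed in the paper.

import numpy as np

# Minimal Hopfield-style autoassociator: each pattern is stored in a single
# outer-product (one-shot Hebbian) update and recalled from a noisy cue.

rng = np.random.default_rng(0)
n, n_patterns = 100, 10
patterns = rng.choice([-1.0, 1.0], size=(n_patterns, n))

W = np.zeros((n, n))
for p in patterns:
    W += np.outer(p, p) / n          # one-shot Hebbian storage
np.fill_diagonal(W, 0.0)

def recall(cue, steps=10):
    state = cue.copy()
    for _ in range(steps):
        state = np.sign(W @ state)   # synchronous update for simplicity
        state[state == 0] = 1.0
    return state

noisy = patterns[0].copy()
flip = rng.choice(n, size=10, replace=False)
noisy[flip] *= -1                    # corrupt 10% of the cue
print(np.mean(recall(noisy) == patterns[0]))  # fraction of bits recovered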
Incremental processing and infinite local ambiguity
In incremental parsing, infinite local ambiguity occurs when the input word can be combined with the syntactic structure built so far in an infinite number of ways. A common example is left recursion (e.g. "railway station clock" or "his sister's boyfriend's shirt"), where local information cannot tell us the depth of embedding of the left descendent chain of nodes. From the processing point of view, infinite local ambiguity causes a technical problem, which a model must solve in order to implement incrementality fully. This paper provides a general solution to the problem of infinite local ambiguity, by introducing the concept of Minimal Recursive Structure. We give two examples of parsers in which the solution is used.
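The problem is easy to visualise: with a left-recursive possessive rule, the words read so far are compatible with partial structures of every embedding depth. The Python sketch below simply enumerates those candidates for illustration; it is not the Minimal Recursive Structure solution itself, and the bracketing notation is an informal convenience.

# Sketch of "infinite local ambiguity": with a left-recursive rule
# NP -> NP 's N, the prefix of "his sister's boyfriend's shirt" is
# consistent with partial structures of every embedding depth.

def partial_structures(word, max_depth):
    """Yield candidate partial NPs for the prefix, one per embedding depth."""
    structure = word
    for depth in range(max_depth + 1):
        yield depth, structure
        structure = f"[NP {structure} 's _N]"   # left-recursive wrapping

for depth, s in partial_structures("his sister", 3):
    print(depth, s)
# 0 his sister
# 1 [NP his sister 's _N]
# 2 [NP [NP his sister 's _N] 's _N]
# ...without a bound of some kind, this list of candidates never ends.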
When a Word is Worth a Thousand Pictures: A Connectionist Account of the Percept to Label Shift in Children's Reasoning
We present a connectionist model of children's developing reliance on object labels as opposed to superficial appearance when making inductive inferences. The model learns to infer a fact about an object based on the object's label (and not percept) even though that fact has never been previously associated with the label. The shift in reliance from perceptual to label information is found to depend on: (a) the presence of a pre-linguistic ability to categorize perceptual information, and (b) the greater variability of percepts than labels. The model predicts that children will shift their inductive basis at different ages depending on the perceptual variability of the test categories. This prediction is discussed with respect to studies of children's induction and with particular reference to conflicting results reported in the literature concerning the onset of label use.
Modeling Individual Differences in a Digit Working Memory Task
Individual differences in working memory are an important source of information for refining theories of memory and cognition. Computational modeling is an effective tool for studying individual differences because it allows researchers to maintain the basic structure of a theory while perturbing a particular component. This paper presents a computational model for a digit working memory task and demonstrates that varying a single parameter captures individual differences in that task. The model is developed within the framework of the ACT-R theory (Anderson, 1993), and the continuous parameter manipulated represents attentional capacity for the current goal.
On the Trail of Information Searchers
In this paper, we sketch a model of how people search for information on the World Wide Web. Our interest lies in the cognitive properties and internal representations used in the search for information. We first collected behavioral data from individuals searching for answers to specific questions on the web, and we then analyzed these data to learn what searchers were doing and thinking. One finding was that individuals focus on key nodes when recalling their searches, and that these key nodes help structure memory. A second finding was that people tend to use the same search patterns over and over, and that they recall their searches in terms of their standard patterns—regardless of what they actually did. Overall, our results suggest that people form cognitive maps of web space in much the same way that they form cognitive maps of physical space.
Does Complex Behavior Require Complex Representations?
Models in cognitive science often postulate that individuals maintain complex representations of their environment when simpler explanations, based on simple behaviors interacting with each other and environmental constraints, would suffice. As an example, I consider representational approaches to animal behavior (e.g., Gallistel, 1990; Myerson and Miezin, 1980), which posit that complex group behavior results from complex representations of events within the central nervous systems of individual animals. For example, ducks feeding from two food sources distribute themselves proportionately to the density of food available at each source. This phenomenon, probability matching, is typically explained by attributing representations of the density of food available at each source within the central nervous system (CNS) of each duck. Are such complex representations required to explain this phenomenon? I will compare the results of two simulations of probability matching in groups. In one, individuals maintain and update representations of food available at each source. Although probability matching emerges, the organisms exhibit various unrealistic behaviors. In the second, each individual follows simple behavioral rules but has no representation of the food density at each source. Probability matching emerges and the behavior observed is more realistic than that in the first simulation. This adds to demonstrations in other domains that complexity at one level of analysis need not result from complexity at lower levels (e.g., Resnick, 1994; Sigmund, 1993).
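The flavour of the second, representation-free simulation can be conveyed with a toy group model: each agent follows a purely local rule (stay where you were just fed, otherwise sometimes wander to the other source), and the group distribution comes to match the food ratio. All numbers below are illustrative and the rule is a simplification, not the simulation reported in the paper.

import random

# Agents carry no representation of food density; a simple local rule
# (unfed agents sometimes drift to the other source) suffices here.

random.seed(0)
n_agents = 60
food = {"A": 40, "B": 20}            # food items delivered per step at each source
loc = ["A"] * n_agents               # all agents start at the same source

for _ in range(200):
    for source in ("A", "B"):
        here = [i for i, w in enumerate(loc) if w == source]
        random.shuffle(here)
        unfed = here[food[source]:]   # only food[source] agents get fed this step
        for i in unfed:               # unfed agents drift to the other source
            if random.random() < 0.2:
                loc[i] = "B" if source == "A" else "A"

print(loc.count("A") / n_agents)              # about 0.67 of the flock ends up at A
print(food["A"] / (food["A"] + food["B"]))    # which matches the food ratio of 0.67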
People's Folk Theory of Behavior
The folk theory of behavior is a conceptual framework that guides all of people's dealings with behavior, including attention, explanation, and control. Philosophy of action and developmental research into children's "theory of mind" have relied heavily on plausible but speculative assumptions about this folk theory. The present paper describes empirical research on three key elements of the theory, as found in the adult social perceiver: (a) how people conceptualize intentionality and differentiate intentional from unintentional behavior; (b) which types of behavior (intentional vs. unintentional, observable vs. unobservable) they attend to and choose to explain; and (c) how they explain these behaviors.
A Connectionist Account of Interference Effects in Early Infant Memory and Categorization
An unusual asymmetry has been observed in natural category formation in infants (Quinn, Eimas, and Rosenkrantz, 1993). Infants who are initially exposed to a series of pictures of cats and then are shown a dog and a novel cat show significantly more interest in the dog than in the cat. However, when the order of presentation is reversed — dogs are seen first, then a cat and a novel dog — the cat attracts no more attention than the dog. We show that a simple connectionist network can model this unexpected learning asymmetry and propose that this asymmetry arises naturally from the asymmetric overlaps of the feature distributions of the two categories. The values of the cat features are subsumed by those of dog features, but not vice-versa. The autoencoder used for the experiments presented in this paper also reproduces exclusivity effects in the two categories as well as the reported effect of catastrophic interference of dogs on previously learned cats, but not vice-versa. The results of the modeling suggest connectionist methods are ideal for exploring early infant knowledge acquisition.
Modality Specificities in Lexical Architecture?
This paper argues for asymmetries in lexical architecture and function, based on a series of repetition priming experiments examining the representation and access of morphologically complex forms in English. These results point to modality differences in representation at the level of the lexical entry, and to marked differences in access from speech and from text. We argue that speech inputs can map directly onto abstract morphemic representations, while input from text seems to involve mediated access, via intervening orthographic representations of word form.
From Image to Word: A Computational Model of Word Recognition in Reading
This paper describes a working, computational model of word recognition that combines a letter classification component with a component that segments the string of classified letters into words and uses a dynamic programming method for matching the words against a lexicon of over 2,800 words. The letter classification component is a neural network trained to classify, in parallel, inputs corresponding to 20x188 pixel array images of letter sequences, 14 or more letters long. Consistent with human capabilities, the system can classify all 14 letters at a level above chance, and on average, classifies the first 7 or 8 letters in the sequence correctly. Dictionary lookup improves classification accuracy by 1 character per image. The model is robust, having been trained and tested on the entire text of the book The Wonderful Wizard of Oz, printed in multiple fonts and in both mixed and upper-case letters. It provides a computation-level understanding of word recognition capabilities, in which errors are attributable to the theoretically inevitable difficulties associated with learning to classify large input patterns. The model mimics human capabilities for circumventing some of these difficulties by imposing constraints on fixation positions that reduce image variability.
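The dictionary-lookup stage can be illustrated with a standard dynamic-programming string match: the noisy output of a letter classifier is compared against each lexicon entry by edit distance, and the closest word wins. The tiny lexicon and error string below are invented for illustration and do not reproduce the model's actual matching procedure or lexicon.

# Sketch of dictionary lookup by dynamic programming (Levenshtein distance).

def edit_distance(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

lexicon = ["dorothy", "scarecrow", "emerald", "wizard", "cyclone"]
classified = "wlzord"                # letter classifier output with two errors

best = min(lexicon, key=lambda word: edit_distance(classified, word))
print(best)                          # "wizard": lookup corrects the misclassified letters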
Systematicity and Specialization in Semantics: A Computational Account of Optic Aphasia
Optic aphasic patients are selectively impaired at naming visually presented objects but demonstrate relatively intact comprehension of those objects (e.g., by gesturing or categorization) and are able to name them when presented in other modalities (e.g., via tactile input). This and other modality-specific naming deficits have been taken as evidence that semantics is organized into distinct modality-specific subsystems. We adopt an alternative view in which semantics is a set of learned, internal representations within a parallel distributed processing system that maps between multiple input and output modalities. We account for the critical aspects of optic aphasia in terms of the effects of damage to such a system, despite its lack of modality-specific specialization. We show that the robustness of a task in such a system depends critically on its systematicity, and that modality-specific naming deficits can arise because naming is an unsystematic task.
Comprehension Skill: A Knowledge-Based Account
Gernsbacher (e.g., 1990) has proposed that comprehension skill is a function of the ability to suppress inappropriate or irrelevant information. This hypothesis is based on the finding that the inappropriate meaning of an ambiguous word loses activation for skilled comprehenders after a delay, but remains activated and slows comprehension for less-skilled comprehenders. It is hypothesized here that comprehension skill is not due to the suppression of information, but rather is enhanced by the activation of more knowledge. Simulations based on the Construction-Integration model of comprehension (Kintsch, 1988) show that the activation of more knowledge leads to an initial activation of an inappropriate meaning of a concept which quickly decays. Without the activation of the knowledge, the inappropriate meaning remains activated. This account thus predicts and explains Gernsbacher's empirical data.
The Source and Character of Graded Performance in a Symbolic, Rule-based Model
This paper presents ongoing work that demonstrates how a discrete rule-based model may appropriately manifest graded performance and investigates the source contributing to graded performance of a particular rule-based model called SCA. Previous results have demonstrated that SCA produces appropriate graded performance as a function of learning experience, instance typicality, and other similarity-dependent properties. However, the source of its graded behavior has been somewhat obscured by the presence of continuous components in some aspects of the model. Fully symbolic alternates are presented here, and the qualitative predictions from previous work are replicated, thereby suggesting that explicit gradient representations are not necessary for producing graded behavior. In addition to replicating previous results, the results presented here clarify a peculiar character of the model, namely, that the model's typicality differences disappear after extended learning.
A Sublexical Locus for Repetition Blindness: Evidence from Illusory Words
When words containing an orthographically similar segment (rock, shock) are rapidly displayed in word lists and immediately reported by subjects, the second critical word (W2) is frequently omitted, a deficit known as repetition blindness (Kanwisher, 1987). Three experiments used an illusory words paradigm to demonstrate a sublexical locus for repetition blindness in orthographically overlapping words. In Experiment 1, we constructed RSVP streams of words and word fragments which would allow the W2's unique letter clusters to combine with a word fragment to create a word, as in rock shock ell. The illusory word shell was produced 36% of the time in the RB condition, compared to 16% of the time for letter migration control trials (rock shoeu ell) and 16% of trials containing sequential presentation of the illusory word's fragments (rock sh ell). Experiment 2 demonstrated the same superiority for the RB condition over a letter migration control using nonword stimuli (riwu shiwu ell). Experiment 3 showed that the unique letters left over after RB are marked for position. Implications for models of repetition blindness are discussed.
Body Schemas
Two studies investigated the existence and properties of the body schema, people's mental representation of the space of their bodies. Participants verified whether a named and a depicted body part were the same or different either when presented a picture of a whole body or when presented the body part alone. Part significance accounted for verification times better than part size or part discontinuity, suggesting that mental representations of the body reflect proprioceptive as well as visual knowledge.
Context-dependent Recognition in a Self-organizing Recurrent Network
Cognition of an object depends not only upon the sensory information of the object but also upon the context in which it occurs, as demonstrated in many psychology experiments. Although there has been a considerable amount of research in cognitive science that demonstrates the importance of context, seldom has this research concerned specific computational mechanisms for learning and encoding of context. As context is largely an integration of the past up to the present, some form of information about the past stimuli must be abstracted and stored for a certain period of time so as to be used in the interpretation of the present stimulus. In this modelling approach we explore such mechanisms. In particular, we describe an unsupervised, sparsely connected, recurrent network that creates its own codings of input stimuli on ensembles of network units. Moreover, it also self-organizes into a short-term memory system that stores such codings. Simulations demonstrate the context-dependent recognition performance of the network.
Evolution of a Rapidly Learned Representation for Speech
Newly born infants are able to finely discriminate almost all human speech contrasts and their phonemic category boundaries are initially identical, even for phonemes outside their target language. A connectionist model is described which accounts for this ability. The approach taken has been to develop a model of innately guided learning in which an artificial neural network (ANN) is stored in a "genome" which encodes its architecture and learning rules. The space of possible ANNs is searched with a genetic algorithm for networks that can learn to discriminate human speech sounds. These networks perform equally well having been trained on speech spectra from any human language so far tested (English, Cantonese, Swahili, Farsi, Czech, Hindi, Hungarian, Korean, Polish, Russian, Slovak, Spanish, Ukrainian and Urdu). Training the feature detectors requires exposure to just one minute of speech in any of these languages. Categorisation of speech sounds based on the network representations showed the hallmarks of categorical perception, as found in human infants and adults.
Ways of Locating Events
This paper argues that the basic modes of spatial cognition can be best identified in terms of argument/participant location, and shows that natural language uses "simple" types of semantic denotations to encode spatial cognition. First we review event-based approaches to spatial location, and point out that spatial expressions should be interpreted not as locating an event/state as a whole but as locating arguments/participants of the sentence/event. Section 2 identifies the ways of locating events/states in terms of "argument orientation", which indicates the ways of interpreting locative expressions. We identify four patterns of argument orientation which reveal substantial modes of spatial cognition—spatial properties and relations. Section 3 illustrates various classes of English transitive verbs with which spatial expressions induce argument orientation. We consider four types of locative prepositional phrases and show that the argument orientation pattern of a sentence is not determined by the type of spatial expressions but mostly by the type of the verb, i.e., the event type of the sentence. Section 4 concludes that semantic denotations of locative prepositional phrases are restricted to the "intersecting" functions mapping relations to relations, which are "basic and familiar" semantic objects out of the "heterogeneous" field of functions from relations to relations.
Talking the Talk is Like Walking the Walk: A Computational Model of Verbal Aspect
I describe an implemented computational model of verbal aspect that supports the proposition that the semantics of aspect is grounded in sensory-motor primitives. In this theory, aspectual expressions refer to schematized processes that recur in sensory-motor control (such as goal, periodicity, iteration, final state, duration, and parameters such as force and effort). This active model of aspect grounded in sensory-motor primitives is able to model cross-linguistic variation in aspectual expressions while avoiding some paradoxes and problems in model-theoretic and other traditional accounts.
Teachers' and Researchers' Beliefs of Early Algebra Development
Mathematics teachers and mathematics educational researchers were asked to rank order arithmetic and algebra problems for their predicted problem-solving difficulty for students. It was discovered that these predictions matched closely the view presented implicitly by common mathematics textbooks, but they deviated systematically from actual algebra students' performances in important ways. The Textbook view of early algebra development was contrasted with the Verbal Precedence (VP) model of development. The latter was found to provide a better fit of students' performance data. Implications for student and teacher cognition are discussed in light of these findings.
A Cognitive Model of Agents in a Commons Dilemma
KIS (knowledge and intentions in social dilemmas) is a process model of a cognitive-motivational theory of acting in a three person commons dilemma. The model provides an experimental tool to study how ecologically harmful actions evolve in commons problems by having differently parameterized variants of KIS interact with each other and with human subjects. KIS models the application and acquisition of ecological, social, and practical knowledge using a motive-driven decision procedure. To test this model, 42 subjects played a commons dilemma game in an unselfish or greedy social environment. Both environments were realized by pairs of appropriately configured KIS variants. Subjects did not recognize these co-players as being artificial and judged their motives accurately. Subjects' behavior in the unselfish environment was well predicted; in the greedy environment, however, subjects based their decisions more on the state of the resource than was expected. To further test the model, we constructed a KIS variant for each subject with respect to the assessed individual motive structure and knowledge. These variants played the same game in the same environments. Their actions were compared to the subjects' on both an aggregate and individual level. We obtained good fits in the unselfish environment. Systematic deviations in the greedy environment revealed that under this condition behavior was more determined by ecological aspects than by social comparison.
Modelling Physics Knowledge Acquisition in Children with Machine Learning
A computational approach to the simulation of cognitive modelling of children learning elementary physics is presented. The goal of the simulation is to support the cognitive scientist's investigation of learning in humans. The Machine Learning system WHY, able to handle domain knowledge (including a causal model of the domain), has been chosen as the tool for the simulation of the cognitive development. In this paper the focus will be on knowledge representation schemes, useful to support further modelling of conceptual change.
Populations of Learners: the Case of Portuguese
We present new results of a novel computational approach to the interaction of two important cognitive-linguistic phenomena: (1) language learning, long regarded as central to modern synchronic linguistics; and (2) language change over time, diachronic linguistics. We exploit the insight that while language learning takes place at the level of the individual, language change is more properly regarded as an ensemble property that takes place at the level of populations of language learners — while the former has been the subject of much explicit computer modeling, the latter has been less extensively treated. We show by analytical and computer simulation methods that language learning can be regarded as the driving force behind a dynamical systems account of language change. We apply this model to the specific (and cognitively relevant) case of the historical change from Classical Portuguese (CP) to European Portuguese (EP), demonstrating how a particular language learning model (for instance, a maximum-likelihood model akin to many statistically-based language approaches), coupled with data on the differences between CP and EP, leads to specific predictions for possible language-change envelopes, as well as delimiting the class of possible language-learning mechanisms and linguistic theories compatible with a given class of changes. The main investigative message of this paper is to show how this methodology can be applied to a specific case, that of Portuguese. The main moral underscores the individual/population difference, and demonstrates the potential subtlety of language change: we show that simply because an individual child will, with high probability, choose a particular grammar (European Portuguese) does not mean that all other grammars (e.g., Classical Portuguese) will come to be eliminated; rather, contrary to surface intuition, that is a property of the dynamical system and the population ensemble itself.
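The individual-to-population logic can be illustrated with a one-line dynamical map: if q(p) is the probability that a single learner converges on grammar G1 when a fraction p of its input comes from G1 speakers, then iterating p over generations traces a population-level trajectory. The learning curve q used below is a hypothetical placeholder, not the maximum-likelihood learner analysed in the paper.

# Sketch of population-level dynamics driven by individual learning.
# p is the fraction of the population using grammar G1 (say, EP-like);
# q(p) is a made-up learning curve for how likely a child is to acquire G1.

def q(p, bias=1.3):
    """Probability a learner acquires G1 given a fraction p of G1 input."""
    favoured = bias * p
    return favoured / (favoured + (1.0 - p))

p = 0.05                      # G1 starts as a small minority
trajectory = [p]
for generation in range(30):
    p = q(p)                  # each new generation consists of fresh learners
    trajectory.append(round(p, 3))

print(trajectory)             # G1 spreads, but G2 only disappears in the limit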
The Role of Semantic Similarity in the Comprehension of Metaphor
According to the comparison view, preexisting similarities between the constituent terms of a metaphorical sentence are an important source of information for generating a figurative meaning. The interaction approach, by contrast, claims that similarity is not an antecedent but a product of comprehension. We shall argue, however, that each of these approaches is too narrow to provide a complete and exhaustive account of metaphor comprehension. Instead, both theories point to two different but complementary cognitive processes. We present three experiments that support the theoretical distinction between analysis-based vs. synthesis-based processes in the comprehension of metaphor.
Simulation Models and the Power Law of Learning
The power law of learning has frequently been used as a benchmark against which models of skill acquisition should be measured. However, in this paper we show that comparisons between model behavior and the power law phenomenon are uninformative. Qualitatively different assumptions about learning can yield equally good fit to the power law. Also, parameter variations can transform a model with very good fit into a model with bad fit. Empirical tests of learning theories require both comparative evaluation of alternative theories and sensitivity analyses, simulation experiments designed to reveal the region of parameter space within which the model successfully reproduces the empirical phenomenon. Abstract simulation models are better suited for these purposes than either symbolic or connectionist models.
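The benchmark at issue is the power law of practice, T = a * N^(-b), where T is the time to perform the task on trial N. The Python sketch below shows the standard log-log fitting procedure on synthetic power-law data; the paper's point is precisely that a good fit of this kind is weak evidence, since qualitatively different learning mechanisms can pass the same test. All parameter values are arbitrary.

import numpy as np

# Fit T = a * N**(-b) by linear regression in log-log coordinates.

rng = np.random.default_rng(0)
trials = np.arange(1, 101)
latency = 8.0 * trials ** (-0.4) * rng.lognormal(0.0, 0.05, size=trials.size)

slope, intercept = np.polyfit(np.log(trials), np.log(latency), 1)
print(f"estimated b = {-slope:.2f}, a = {np.exp(intercept):.2f}")  # close to 0.4 and 8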
The Structure of the Verb Lexicon: Evidence from a Structural Alignment Approach to Similarity
Two different views of the organization of verbs in the mental lexicon have been formulated in recent years: the matrix view and the cluster view. The matrix view suggests that a verb shares as many features with verbs from other clusters as it shares with verbs from its own cluster. Thus, instead of being organized, like concrete nouns, into well-defined hierarchies, verbs in the mental lexicon form a matrix-like structure. While admitting differences between the organization of verb and noun lexicons, the cluster view claims that verbs form hierarchically organized clusters that resemble noun hierarchies in many ways. We report one study that extends research on similarity of nouns to verbs in order to shed light on these accounts. Subjects were presented with pairs of verbs and asked to list their commonalities or differences. The obtained patterns of commonalities, alignable and nonalignable differences are similar to the patterns obtained for hierarchies of nouns and are consistent with the cluster view of verb organization.
Comprehensible Knowledge-Discovery in Databases
Large databases are routinely being collected in science, business and medicine. A variety of techniques from statistics, signal processing, pattern recognition, machine learning, and neural networks have been proposed to understand the data by discovering useful categories. However, to date research in data mining has not paid attention to the cognitive factors that make learned categories comprehensible. We show that one factor which influences the comprehensibility of learned models is consistency with existing knowledge and describe a learning algorithm that creates concepts with this goal in mind.
Modeling a Functional Explanation of the Subitizing Limit
We present a model of enumeration that demonstrates one possible explanation for the limited capacity of subitizing. This analytical approach can be contrasted with most previous research on subitizing which has been primarily descriptive in nature, and which has tended to assume a structural limitation on the phenomenon. Our simulation results suggest instead that the limitation may arise from the functional constraints of learning to optimize among enumeration strategies for a space whose combinatorics increase greatly with number.
Simulations with a Connectionist Model for Implicit and Explicit Memory Tasks
A connectionist model incorporating activation and elaboration learning was investigated in five simulations of dissociation effects between implicit and explicit memory tasks. The first two simulations concerned the word frequency effect, revealing a high-frequency advantage in free recall and a low-frequency advantage in word completion. The third and fourth simulations were of the interference effect, which appeared to depend upon the amount of overlap between experimental material and intervening material. The last simulation addressed the focused vs. divided attention dissociation effect. Free recall performance was primarily affected by divided attention, but under conditions of high load word completion performance was also reduced. It is argued that a full model will probably not only implement activation/elaboration learning, but will also incorporate elements of the two other accounts available.
Systematicity: Psychological evidence with connectionist implications
At root, the systematicity debate over classical versus connectionist explanations for cognitive architecture turns on quantifying the degree to which human cognition is systematic. We introduce into the debate recent psychological data that provides strong support for the purely structure-based generalizations claimed by Fodor and Pylyshyn (1988). We then show, via simulation, that two widely used connectionist models (feedforward and simple recurrent networks) do not capture the same degree of generalization as human subjects. However, we show that this limitation is overcome by tensor networks that support relational processing.
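As background for the tensor-network result, the sketch below shows generic tensor-product role/filler binding (in the style of Smolensky): a relational fact is encoded as a sum of role-filler outer products, and fillers can be recovered by unbinding with the role vectors. It illustrates the representational idea only; the role and filler names and dimensions are invented, and this is not the simulation reported above.

import numpy as np

# Generic tensor-product binding of roles and fillers, with approximate unbinding.

rng = np.random.default_rng(0)
dim = 50
role = {r: rng.standard_normal(dim) for r in ("lover", "beloved")}
filler = {f: rng.standard_normal(dim) for f in ("John", "Mary")}

# Encode loves(John, Mary) as a sum of role-filler outer products.
loves_jm = np.outer(role["lover"], filler["John"]) + \
           np.outer(role["beloved"], filler["Mary"])

def who_fills(binding, r):
    """Approximately unbind the filler of role r (random roles are nearly orthogonal)."""
    probe = role[r] @ binding
    scores = {name: probe @ vec / (np.linalg.norm(probe) * np.linalg.norm(vec))
              for name, vec in filler.items()}
    return max(scores, key=scores.get)

print(who_fills(loves_jm, "lover"))    # John
print(who_fills(loves_jm, "beloved"))  # Mary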
Classification and Prior Assumptions about Category "Shape": New Evidence Concerning Prototype and Exemplar Theories of Categorization
According to prototype theories of categorization, the cognitive system makes the default assumption that a category, C, is a roughly convex region in an internal space. This suggests that the default assumption for the "negative" category, not-C, should be the complement of this region—i.e., the internal space, minus a convex "hole." These different prior assumptions suggest potentially radically different patterns of generalization in category learning. We show experimentally that such effects do occur. These results are compatible with prototype accounts of categorization, but seem incompatible with exemplar accounts. We consider potential empirical extensions of this research, and its wider theoretical implications.
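The contrast the experiment trades on, generalization from a roughly convex region around a prototype versus generalization from stored exemplars, can be seen in a toy two-dimensional example. The points and probe below are invented; the sketch illustrates the generic prototype/exemplar difference rather than the negative-category design used in the paper.

import numpy as np

# Prototype classifier (distance to the category mean) versus exemplar
# classifier (distance to the nearest stored instance) on toy 2-D stimuli.

category_c = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [3.0, 3.0]])
probe = np.array([1.9, 1.9])          # falls between the cluster and the outlier

prototype = category_c.mean(axis=0)
proto_dist = np.linalg.norm(probe - prototype)
exemplar_dist = np.linalg.norm(category_c - probe, axis=1).min()

print(round(proto_dist, 2), round(exemplar_dist, 2))
# The prototype account places the probe near the centre of C's convex region;
# the exemplar account finds it relatively far from every stored instance.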
Subjective Confidence and the Belief Bias Effect in Syllogistic Reasoning
An experiment is reported in which participants were asked to record how confident they felt about the correctness of their responses as they assessed the validity of deductive arguments whose conclusions varied in prior believability. The results showed that participants were more confident of their responses to valid problems than invalid problems irrespective of believability status, providing support for the idea that invalid problems are more demanding to process than valid problems. Effects of belief and logic on conclusion acceptance rates and a logic × belief interaction are also demonstrated, and evidence is provided to suggest that belief bias principally reflects a tendency to reject unbelievable arguments. A theory is proposed in which belief bias effects are accounted for by the variations in the processing demands of valid and invalid syllogisms.
Is there a Place for Semantic Similarity in the Analogical Mapping Process?
Ramscar & Pain (1996) argued that the analogical process cannot be easily distinguished from the categorisation process at a cognitive level. In light of the absence of any distinction between analogy and categorisation, we have argued that analogy is supervenient upon an important part of the classification process, and that as such 'analogical' models are capable of illuminating some categorisation tasks, for instance, the way in which structural systematicity can determine not only analogical judgements, but also category decisions. Our scepticism regarding the cognitive distinction between these two processes has implications for both analogy and categorisation research: in this paper we consider two leading analogical theories, Gentner's Structure Mapping Theory and Holyoak's Multi-Constraint Theory, and argue that results from our use of analogical modeling techniques in categorisation tasks offer some important insights into exactly which elements should be included in a theory of analogical mapping.
Symmetries of Model Construction in Spatial Relational Inference
This article studies spatial relational inference within the framework of mental model theory. It focuses on the phase of model construction, for which two cognitive models currently exist (Berendt, 1996; Schlieder, 1995). Both refer to the aggregated results of a former experiment (Knauff, Rauh & Schlieder, 1995). However, conflicting evidence exists with respect to symmetry properties of model construction that makes the assessment of the cognitive adequacy of certain explanations impossible. We therefore conducted an experiment using computational tools provided by AI research on Qualitative Spatial Reasoning (QSR) to investigate whether the model construction process works the same from left to right and vice versa (symmetry of reorientation), and whether the processing of spatial relations depends on what was already processed (symmetry of transposition). Experimental results clearly indicate that the symmetry of transposition cannot be found in subjects' answers to indeterminate spatial four-term series problems and that the degree of reorientation symmetry is not perfect. The latter, however, can be entirely attributed to performance variation, since the responses of retested subjects to the same problems were only concordant to the same degree.
Modeling the Mirror effect in a Continuous Remember/Know Paradigm
Words of varying pre-experimental frequency were presented up to 10 times each. On each presentation, three responses were allowed—new, remember, and know—the last for words that seem familiar, but give no conscious recollection of an earlier presentation. A novel pattern of results was predicted by the SAC memory model. SAC used the same parameter values used in fits to other tasks and provided good fits to the participants' remember and know responses.
The Roles of Causes and Effects in Categorization
The effect of knowledge about causal relationships between category attributes on categorization decisions was investigated. Participants were taught that category attributes were causally related in either a common-cause or a common-effect causal pattern. The weight given to attributes during subsequent categorization depended on the causal pattern: In the common-cause condition the common cause was weighted most heavily, whereas in the common-effect condition the common effect was weighted most heavily. Participants also attended to correlations between causally related features, generating lower categorization ratings if a cause-effect relationship was violated. Participants displayed a wide variety of different strategies in making categorization decisions, including ones that employed higher-order configural information involving more than two attributes. There was no effect of the "kind" of the category (biological kind, nonliving natural kind, or artifact) on categorization decisions, and kind of category did not interact with causal pattern.
Simple Recurrent Networks and Natural Language: How Important is Starting Small?
Prediction is believed to be an important component of cognition, particularly in natural language processing. It has long been accepted that recurrent neural networks are best able to learn prediction tasks when trained on simple examples before incrementally proceeding to more complex sentences. Furthermore, the counter-intuitive suggestion has been made that networks and, by implication, humans may be aided in learning by limited cognitive resources (Elman, 1993, Cognition). The current work reports evidence that starting with simplified inputs is not necessary in training recurrent networks to learn pseudo-natural languages; in fact, delayed introduction of complex examples is often an impediment. We suggest that the structure of natural language can be learned without special teaching methods or limited cognitive resources.
Neural Correlates of Mathematical Reasoning: An fMRI Study of Word-Problem Solving
We examined brain activation, as measured by functional magnetic resonance imaging, during mathematical problem solving in six young, healthy participants. Participants solved problems selected from the Necessary Arithmetic Operations Test (NAOT) which is known to correlate with fluid reasoning tasks. In three conditions, participants solved problems requiring (1) one operation (Easy problems), (2) two operations (Hard problems) or (3) simple reading and matching of words (Match problems) in order to control for perceptual, motor and text reading demands of the NAOT problems. Major bilateral frontal activation and minimal posterior activation were observed while subjects solved Easy problems relative to Match problems. Minor bilateral frontal, temporal and lateralized activation of left parietal regions was observed in the Hard problems relative to Easy problems. All of these regions were activated more by Hard than by Match problems. Many of these activations occurred in regions associated with working memory. These results suggest that fluid reasoning is mediated by a composite of working memory systems that include central executive and domain specific numerical and verbal working memory.
The Many Functions of Discourse Particles: A Computational Model of Pragmatic Interpretation
We present a connectionist model for the interpretation of discourse particles in real dialogues that is based on neuronal principles of categorization (categorical perception, prototype formation, contextual interpretation). It can be shown that discourse particles operate just like other morphological and lexical items with respect to interpretation processes. The description proposed locates discourse particles in an elaborate model of communication which incorporates many different aspects of the communicative situation. We therefore also attempt to explore the content of the category discourse particle. We present a detailed analysis of the meaning assignment problem and show that 80%-90% correctness for unseen discourse particles can be reached with the feature analysis provided. Furthermore, we show that 'analogical transfer' from one discourse particle to another is facilitated if prototypes are computed and used as the basis for generalization. We conclude that the interpretation processes which are a part of the human cognitive system are very similar with respect to different linguistic items. However, the analysis of discourse particles shows clearly that any explanatory theory of language needs to incorporate a theory of communication processes.
General and Specific Expertise in Scientific Reasoning
Previous research on scientific reasoning has shown that it involves a diverse set of skills. Yet, little is known about the generality of those skills, an important issue to theories of expertise and to attempts to automate scientific reasoning skills. We present a study examining what kinds of skills psychologists actually use in designing and interpreting experiments. The results suggest: 1) that psychologists use many domain-general skills in their experimentation; 2) that bright and motivated undergraduates are missing many of these skills; 3) that some domain-general skills are not exclusive to scientists; and 4) that some domain-specific skills can be acquired with minimal domain experience.
A Model of Rapid Memory Formation in the Hippocampal System
Our ability to remember events and situations in our daily life demonstrates our ability to rapidly acquire new memories. There is a broad consensus that the hippocampal system (HS) plays a critical role in the formation and retrieval of such memories. A computational model is described that demonstrates how the HS may rapidly transform a transient pattern of activity representing an event or a situation into a persistent structural encoding via long-term potentiation and long-term depression.
The Language of Physics Equations
The central hypothesis of this paper is that physics students learn to understand equations in terms of a number of conceptual elements that are referred to as "symbolic forms." Each symbolic form associates a simple conceptual schema with a pattern of symbols in an equation. Taken together, the set of symbolic forms constitutes a vocabulary of elements out of which novel expressions can be constructed, and in terms of which expressions can be understood. The work described here is based on an extensive analysis of a corpus of videotapes of moderately advanced university students solving physics problems.
On Using Theory and Data in Misconception Discovery
Approaches to concept formation tend to rely solely on similarities in the data, with the few that take into consideration causalities in the background knowledge doing so prior to or upon completion of a similarity-based learning phase. In this paper, we examine a multistrategic approach to misconception discovery that utilizes data and theory in a more tightly coupled way.
The Relation of Similarity to Naming: Chinese versus American Conceptions of Bottles and Jars
We distinguish two forms of categorization: recognizing objects and choosing a name for them. Understanding the relation between similarity -- which we take to underlie recognition -- and naming is therefore fundamental. Two sources of complexity in naming are described that distinguish recognition from naming. We distinguish the tasks empirically by comparing linguistic category boundaries and perceived similarity for speakers of Chinese and English for sixty common containers. Although the two groups have different linguistic category boundaries, their similarity judgments are largely convergent.
How Currency Traders Think About the Spot Market's Thinking
This paper discusses a model of decision making in environments characterized by information that may change more rapidly than the decision maker can respond. The exemplar environment is the spot market for currency. The discussion focuses on the part of the trading model that explains how spot currency traders anticipate the market.
Effects of Goal Specificity and Explanations on Instance Learning and Rule Learning
We distinguish between instance learning and rule learning (e.g. Shanks & St. John, 1994). Instance learning involves memorizing learning instances while rule learning involves the abstraction of an underlying rule. Instance learning and rule learning can be explained by a dual space model of learning (Klahr & Dunbar, 1988; Simon & Lea, 1974). In relation to Simon and Lea's model, instance learning can be said to occur in instance space while rule learning makes use of both instance space and hypothesis space. We describe an experiment to test the view that whether instance learning or rule learning occurs depends on the learning goal and on whether or not the subjects explain what they are doing. Subjects were asked to learn a dynamic computer control task guided by either a specific or a non-specific goal. During learning, subjects also carried out a secondary task: they either described what they were doing during learning or explained what they were doing. We predicted that giving descriptions would favour instance learning and prevent rule learning irrespective of the learning goal, since giving descriptions forces subjects to focus on the task itself. Giving explanations should favour rule learning when subjects are given a non-specific goal, since both the non-specific goal and giving explanations focus on the reasons for the computer's behaviour. Giving explanations should not lead to rule learning when subjects have a specific goal, since the specific goal forces subjects to focus on a search of instance space and to neglect the hypothesis space. The results confirmed these predictions. They support the view that goal specificity guides learning by directing attention to either instance space or to both instance space and rule space, and that giving explanations encourages the revision of hypotheses in the light of the evidence.
Architecture and Experience in Sentence Processing
Models of the human sentence processing mechanism have traditionally appealed to innate architectural restrictions to explain observed patterns of behavior. Recently, a number of proposals have instead emphasized the role of linguistic experience in guiding sentence interpretation, suggesting that various frequency measures play a crucial role in ambiguity resolution. What has been lacking thus far is a detailed analysis of the linguistic and computational properties that could explain why those particular aspects of experience are effective in shaping behavior. In this paper, we present a linguistic analysis that reveals restrictions on the representational ability of the sentence processor, explaining its sensitivity to particular factors in the linguistic environment. The proposal receives strong support from a large-scale corpus analysis.
Centered Segmentation: Scaling up the Centering Model to Global Referential Discourse Structure
We introduce a methodology for determining referents in full-length texts in a computationally parsimonious way. Based on the centering model, whose focus is on the local coherence of discourse, we build up a hierarchy of referential discourse segments from the local centering data. The spatial extension and nesting of these discourse segments constrain the reachability of potential antecedents of an anaphoric expression above the level of adjacent center pairs. Thus, the centering model is scaled up to the level of global discourse structure.
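A minimal sketch of the reachability idea, under the simplifying assumption that referential segments form a tree and that an anaphor may only search the current segment and its ancestors. The segment names and entities below are invented for illustration; the paper's segmentation is derived from centering data rather than given by hand.

```python
# Hypothetical segment hierarchy: each segment lists its parent and the
# discourse entities (potential antecedents) introduced in it.
segments = {
    "S1":   {"parent": None, "entities": ["the committee"]},
    "S1.1": {"parent": "S1", "entities": ["the chairwoman", "a report"]},
    "S1.2": {"parent": "S1", "entities": ["a journalist"]},
}

def reachable_antecedents(segment, segments):
    """Collect candidate antecedents from the current segment and every
    enclosing segment; segment nesting thus constrains antecedent search
    above the level of adjacent utterance pairs."""
    candidates = []
    while segment is not None:
        candidates.extend(segments[segment]["entities"])
        segment = segments[segment]["parent"]
    return candidates

# An anaphor in S1.2 can reach entities in S1.2 and S1, but not in S1.1.
print(reachable_antecedents("S1.2", segments))
```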
A Rational Analysis of Alternating Search and Reflection Strategies in Problem Solving
In this paper two approaches to problem solving, search and reflection, are discussed, and combined in two models, both based on rational analysis (Anderson, 1990). The first model is a dynamic growth model, which shows that alternating search and reflection is a rational strategy. The second model is a model in ACT-R, which can discover and revise strategies to solve simple problems. Both models exhibit the explore-insight pattern normally attributed to insight problem solving.
Solution Compression in Mathematical Problem Solving: Acquiring Abstract Knowledge That Promotes Transfer
The purpose of this study was to find the level of abstraction that facilitates transfer in mathematical problem solving. Two experiments showed that subjects who formed good abstractions showed better transfer (Experiment 1) and that an abstracted schema can be taught quickly (Experiment 2), although a hint was necessary at test. The abstracted schema was the idea of how to construct correct equations for target problems. This schema was at a more abstract level than the form of the equations. Thus, we argue that the process we call solution compression, in which two or more equations are understood as constructed from one idea, is needed in order to generalize this schema and to promote transfer in mathematical problem solving.
Medical Analogies: Why and How
This paper describes the purposes served by medical analogies (why they are used) and the different cognitive processes that support those purposes (how they are used). Historical and contemporary examples illustrate the theoretical, experimental, diagnostic, therapeutic, technological, and educational value of medical analogies. Four models of analogical transfer illuminate how analogies are used in these cases.
Dissociation between Categorization and Similarity Judgments
A dissociation between categorization and similarity was found by Rips (1989). In one experiment, Rips found that a stimulus half-way between a pizza and a quarter was categorized as a pizza but was rated as more similar to a quarter. Smith & Sloman (1994) discuss these results in terms of the role of necessary and characteristic features. In one experiment, participants had to learn to categorize new stimuli (unknown shapes) built with necessary and characteristic features. We compared two experimental conditions in which we manipulated the association between the characteristic features and the two categories. Contrary to the suggestion made by Smith and Sloman, subjects categorized the stimuli on the basis of a necessary feature. However, their similarity judgments relied on the characteristic features. This resulted, for one of the two experimental conditions, in a perfect dissociation between similarity and categorization. According to Rips, the dissociation indicates that categorization and similarity rating are different processes. We suggest instead that categorization and similarity are the same processes, but that they sometimes operate on different subsets of features.
When children fail to learn new categories: the role of irrelevant features
When subjects are confronted with new stimuli that they have to learn to categorize, they have to segment them into features relevant for categorization. Two experiments with four- to eleven-year-old children investigated whether certain irrelevant perceptual aspects of the stimuli prevent learning of the relevant features for categorization. The first experiment showed that children used salient holistic aspects of the stimuli for categorization even though these were only partially relevant, whereas perfect cues for categorization that required analysis were not discovered. The second experiment showed that children cannot abstract the relevant cues for categorization when irrelevant perceptual characteristics are crossed with the relevant ones; when these irrelevant cues were absent, children could learn the relevant cues. Children's biases towards locally salient properties can thus impair, or even prevent, the learning of new categories when these are defined by comparatively less salient features. Results are discussed in terms of the relation between children's cognitive competences and the abstraction of relevant descriptors for new stimuli.
Connectionism and Psychological Notions of Similarity
Kitcher (1996) offers a critique of connectionism based on the belief that connectionist information processing relies inherently on metric similarity relations. Metric similarity measures are independent of the order of comparison (they are symmetrical) whereas human similarity judgments are asymmetrical. We answer this challenge by describing how connectionist systems naturally produce asymmetric similarity effects. Similarity is viewed as an implicit byproduct of information processing (in particular categorization) whereas the reporting of similarity judgments is a separate and explicit meta-cognitive process. The view of similarity as a process rather than the product of an explicit comparison is discussed in relation to the spatial, feature, and structural theories of similarity.
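One toy way to see how asymmetry can fall out of directed processing rather than an explicit comparison: treat the "similarity" of a probe concept to a target as how strongly the probe pattern activates the target's prototype unit. With unequal connection strengths (e.g., frequency-weighted), the two directions give different values. The feature vectors, strength values, and concept labels below are invented; this is a sketch of the general point, not the account developed in the paper.

```python
import numpy as np

# Invented feature vectors for two concepts; concept_b is assumed to be the
# more familiar (more frequently trained) one, so its prototype unit has
# stronger connection weights.
features = {"concept_a": np.array([1.0, 0.0, 1.0, 0.0]),
            "concept_b": np.array([1.0, 1.0, 1.0, 0.0])}
strength = {"concept_a": 0.4, "concept_b": 1.0}   # e.g. frequency-scaled weights

def directed_similarity(probe, target):
    """How strongly the probe pattern activates the target's prototype unit.
    The asymmetry is a byproduct of the learned weights, not of an explicit
    comparison process."""
    return float(features[probe] @ (strength[target] * features[target]))

print(directed_similarity("concept_a", "concept_b"))  # a judged against b
print(directed_similarity("concept_b", "concept_a"))  # b judged against a (smaller)
```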
Beyond Representativeness: Productive Intuitions About Probability
Although research has found many flaws in people's probabilistic reasoning, we have found that middle-school students have many productive ideas about probability. This study examines the probabilistic reasoning used by middle-school students as they used a technology-mediated inquiry environment that was conceptualized and developed to engage students in the task of analyzing the fairness of games of chance. This research demonstrates that students employ productive probabilistic reasoning when participating in this task, and also demonstrates that commonly reported heuristics such as representativeness do not adequately describe student reasoning.
Causal Judgements That Violate the Predictions of the Power PC Theory of Causal Induction
The causal power theory of the probabilistic contrast model (or power PC theory) of causal induction (Cheng, in press) states that estimates of the causal importance of a candidate cause are determined by the covariation between the cause and the effect and the probability of the effect as indexed by the probability of the effect in the absence of the cause. In two causal induction experiments we tested predictions derived from the equations of the power PC theory. In Experiment 1, the power PC theory predicted equivalent causal estimates in conditions where the probability of the effect given the presence of the cause, P(effect | cause), equalled 1 and in conditions where P(effect | cause) equalled 0. Judgments, however, differed significantly within these conditions and conformed to the predictions of a simpler contingency model. These prediction failures might be attributable to the particular values of P(effect | cause), and thus Experiment 2 set this probability to values other than 1 or 0. Causal judgments again disconfirmed the predictions of the power PC theory and this time significantly failed to conform to the predictions of a simple contingency model.
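For reference, the causal power expressions from Cheng's theory are usually rendered as below (our rendering, with ΔP = P(e|c) − P(e|¬c)). On this reading, setting P(e|c) = 1 makes the generative power equal 1 whatever the base rate, and setting P(e|c) = 0 makes the preventive power equal 1, which is the sense in which equivalent estimates are predicted across those conditions.

```latex
% Generative causal power, with \Delta P = P(e \mid c) - P(e \mid \neg c):
p_c = \frac{\Delta P}{1 - P(e \mid \neg c)}
% Preventive causal power:
p_c = \frac{-\Delta P}{P(e \mid \neg c)}
```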
An Architectural Account of Errors in Foreign Language Learning
It has often been observed among teachers of English as a foreign language that the English article system is difficult for learners to master. This paper provides a processing account which pinpoints the source of these errors as being within the learner's architecture for production. We illustrate our account with a computational model of one group of foreign language learners embedded within NL-Soar. The model's control structure and learning mechanism are used to explain the architectural character of errors and to predict the conditions required for overcoming them.
Dual-task Interference When a Response is Not Required
When subjects are required to respond to two stimuli presented in rapid succession, responses to the second stimulus are delayed. Such dual-task interference has been attributed to a fundamental processing bottleneck preventing simultaneous processing on both tasks. Two experiments show dual-task interference even when the first task does not require a response. The observed interference is caused by a bottleneck in central cognitive processing, rather than in response initiation or execution.
Modeling planning and reaching
Recently developed models of reaching have been based on the general principle that an actor first specifies a task goal, then plans a goal posture that can achieve the task, and then specifies a movement to that goal posture. Selection of a particular goal posture is based on the degree to which movement from the starting posture to possible candidate goal postures best satisfies a number of constraints, including biomechanical efficiency and the avoidance of obstacles. We describe methods used to simulate and test this model.
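A minimal sketch of the constraint-weighting step, assuming candidate goal postures are joint-angle vectors scored by a weighted sum of a movement-effort cost and an obstacle penalty. The postures, weights, and penalty function below are invented for illustration; the actual model's constraints are richer than this.

```python
import numpy as np

# Hypothetical candidate goal postures (joint-angle vectors), a starting
# posture, and constraint weights.
candidates = {
    "posture_a": np.array([0.2, 1.1, 0.4]),
    "posture_b": np.array([0.9, 0.3, 0.8]),
}
start = np.array([0.0, 0.5, 0.5])
weights = {"effort": 1.0, "obstacle": 2.0}

def obstacle_penalty(posture):
    """Stand-in penalty: pretend postures with a large second joint angle
    bring the arm near an obstacle."""
    return max(0.0, posture[1] - 0.8)

def cost(posture):
    # Weighted sum of constraint violations: movement effort from the
    # starting posture plus an obstacle-avoidance penalty.
    effort = np.linalg.norm(posture - start)
    return weights["effort"] * effort + weights["obstacle"] * obstacle_penalty(posture)

goal = min(candidates, key=lambda name: cost(candidates[name]))
print(goal)   # selected goal posture; a movement would then be specified toward it
```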
How Motivation Affects Learning
In our cognitive-motivational process model (Vollmeyer & Rheinberg, in press) we assumed that motivational factors have an impact on how people learn about a task and how well they can perform it. Many motivation theories (if not all) have such assumptions in common. Our approach emphasizes four task specific motivational factors: mastery confidence, incompetence fear, interest, and challenge. We investigated how these motivational factors influence the learning outcome through mediators. Our framework proposes that the motivational state and the strategy systematicity could mediate motivational effects on learning. Path analysis supported this assumption in two studies.
Building Lexical Representations Dynamically Using Artificial Neural Networks
The topic of this paper is the development of dynamic lexical representations using artificial neural networks. Much previous work on connectionist natural language processing has experimented with manually encoded lexical representations for words. However, from both a cognitive and an engineering point of view, it is difficult to find appropriate representations for the lexicon entries for a given task. In this context, this paper explores building word representations during training for a particular task. Using simple recurrent networks, principal component analysis and hierarchical clustering, we show how lexical representations can be formed dynamically, especially for neural network modules in large, real-world computational speech-language models.
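As an illustration of the general recipe (run a recurrent layer over text, collect the hidden state produced for each word, then analyze those states), here is a small Python sketch using randomly initialized weights and hierarchical clustering. In the paper the weights are learned for a real task, and principal component analysis is also applied; this sketch omits both the training and the PCA step.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(0)

# Toy vocabulary and a hypothetical Elman-style recurrent layer (weights are
# random here; in practice they would come from training on the task).
vocab = ["boy", "girl", "eats", "sees", "bread"]
idx = {w: i for i, w in enumerate(vocab)}
n_hidden = 6
W_in = rng.normal(size=(n_hidden, len(vocab)))
W_rec = rng.normal(size=(n_hidden, n_hidden))

def hidden_states(sentence):
    """Run the recurrent layer over a sentence and record the hidden state
    produced as each word is read; these per-word states serve as the
    dynamically formed lexical representations."""
    h = np.zeros(n_hidden)
    states = {}
    for w in sentence:
        x = np.zeros(len(vocab)); x[idx[w]] = 1.0
        h = np.tanh(W_in @ x + W_rec @ h)
        states.setdefault(w, []).append(h.copy())
    return states

# Collect states over a small corpus, average per word, then cluster.
corpus = [["boy", "eats", "bread"], ["girl", "sees", "boy"], ["girl", "eats", "bread"]]
collected = {}
for sent in corpus:
    for w, hs in hidden_states(sent).items():
        collected.setdefault(w, []).extend(hs)
reps = np.array([np.mean(collected[w], axis=0) for w in vocab])
print(dendrogram(linkage(reps, method="average"), no_plot=True, labels=vocab)["ivl"])
```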
Is Mental Rotation a Motor Act?
We find evidence for a tight coupling between motor action and transformation of visual mental images: in a dual-task experiment involving both mental and manual rotation, mental rotation of abstract visual images is faster and less error-prone when accompanied by manual rotation in the same direction, and slower and more error-prone when the motor rotation is in the opposite direction. Variations in motor speed, on both large and small scales, are accompanied by corresponding variations, in the same direction, in mental rotation speed. We briefly speculate on the mechanisms that could give rise to this interaction.
The Influence of Semantic Magnitude Representations on Arithmetic: Theory, Data, and Simulation
Arithmetic research reveals longer RTs for large problems (6x8) than small problems (2x3). While several factors have been implicated, they cannot be dissociated in normal arithmetic. Subjects were trained on an artificial operation designed to independently manipulate these variables. Results suggest that semantic operand representations and presentation frequency are involved. A new theory of arithmetic fact retrieval is introduced which suggests that arithmetic facts are stored and retrieved using a magnitude representation of the problem operands. Simulations suggest the theory is able to account for the major arithmetic fact retrieval phenomena.
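To make the magnitude-representation idea concrete, the sketch below codes each operand as a Gaussian-tuned pattern on a compressed (logarithmic) number line and retrieves stored facts by similarity to the probe's joint operand code. The coding width, the compression choice, and the stored fact set are illustrative assumptions, not the simulation reported in the paper; under such compression, large operands overlap more with their neighbours, which is one route to a problem-size effect.

```python
import numpy as np

positions = np.log(np.arange(1, 10))          # compressed (log) magnitude scale

def magnitude_code(n, width=0.35):
    """Gaussian-tuned code for an operand on a compressed number line."""
    code = np.exp(-(positions - np.log(n)) ** 2 / (2 * width ** 2))
    return code / code.sum()

def problem_code(a, b):
    # Joint representation of the two operands.
    return np.concatenate([magnitude_code(a), magnitude_code(b)])

# Store a few multiplication facts keyed by their operand magnitude codes.
facts = {(a, b): problem_code(a, b) for a in (2, 3, 6, 8) for b in (2, 3, 6, 8)}

def retrieval_scores(a, b):
    """Similarity of the probe problem to every stored fact; noisier, more
    overlapping codes for large operands would slow and confuse retrieval."""
    probe = problem_code(a, b)
    return {fact: float(probe @ code) for fact, code in facts.items()}

scores = retrieval_scores(6, 8)
print(max(scores, key=scores.get), sorted(scores.values(), reverse=True)[:3])
```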
Negative Effects of Domain Knowledge on Creative Problem Solving
Experts generally solve problems in their fields more effectively than novices because their well-structured, easily-activated knowledge allows for efficient search of a solution space. But what happens when a problem requires a broad search for a solution? One concern is that subjects with a large amount of domain knowledge may actually be at a disadvantage because their knowledge may confine them to an area of the search space where the solution does not reside. In other words, domain knowledge may act as a mental set, promoting fixation in creative problem solving attempts. Two experiments using an adapted version of Mednick's (1962) Remote Associates Task demonstrate conditions under which domain knowledge may inhibit creative problem solving.
Deductive Reasoning Competence: Are Rule-Based and Model-Based Methods Distinguishable in Principle?
Much argument has been generated concerning the problem whether human deductive performance can best be viewed as rule-based (e.g. Rips) or model-based (e.g. Johnson-Laird). This paper argues that the distinction is ill-founded, and demonstrates that an ostensibly model-based syllogistic reasoning method can easily be implemented in a natural deduction calculus, which moreover makes fully explicit reference to the different possible interpretations of the premisses. More generally, it is unclear that other model-based methods cannot be given similar natural-deduction treatments, raising doubts about the distinguishability in principle of rule-based and model-based methods.
Sublexical Processing in Reading Chinese
The nature of sublexical processing in reading logographic Chinese was investigated in three primed naming experiments. Experiment 1 showed that phonetic radicals in low frequency complex characters are automatically decomposed and used to activate their own phonological representations. Experiments 2 and 3 demonstrated that the semantic properties of these phonetic radicals are also activated. It is argued that sublexical processing in reading Chinese is both a phonological and a semantic event and that there is no fundamental difference between sublexical processing of phonetic radicals and lexical processing of single and complex characters. The implications of these results for theories of lexical processing are discussed.
Spread of Activation in the Mental Lexicon
Spread of activation and interaction between different types of knowledge representations in the mental lexicon were investigated in three semantically mediated phonological priming experiments, conducted in both English and Chinese. Facilitatory effects were found in naming not only for words (e.g., boy) that were semantically related to their primes (e.g., girl), but also for words that were homophonic to the semantic targets (e.g., buoy). The amount of priming varied according to whether homophone targets were also orthographically similar to the semantic targets. An inhibitory priming effect was also found for words that were orthographically similar to but phonologically different from the semantic targets. It is concluded that spread of activation between words sharing semantic properties is not encapsulated in the semantic system. The phonological and orthographic representations of words receiving spread of semantic activation are also automatically and immediately activated, even though they are not supported directly by sensory input.
How Do They Do It? Delving Into The World Of An Aging Medical Expert
It is well established that there are declines in basic cognitive functions associated with aging. However, for individuals with extensive knowledge in a particular domain (e.g., experts), there do not appear to be age-related limitations in performance. Although expert performance relies on certain fundamental cognitive processes, such as information processing and memory capacity, the strategies with which an expert maintains a high level of functioning in his or her domain may be altered. One may view this alteration as compensation for age-related limitations, or one may attribute it to the natural course of extensive practice in the field. The aim of this paper is to explore diagnostic reasoning processes in an aging medical specialist. Specifically, this study explores which aspects of performance approximate those of a younger expert, and which deviate from the current model of expertise in medicine.
Short Papers
Learning Pathways to Temporal Inference
Temporal inference is defined as the cognitive capacity that motivates and implements the acquisition and use of a system's derivatives to infer future conditions and influence behavior. This poster discusses learning pathways to develop temporal inference using "information-flow/processor" graphs.
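One minimal way to read "using a system's derivatives to infer future conditions" is difference-based extrapolation: estimate first and second differences from the last few observations and project one step ahead (equivalent to extrapolating the quadratic through the last three points). The sketch below is only an illustrative reading of that idea, not the information-flow/processor graph formalism proposed in the poster.

```python
def predict_next(history):
    """Anticipate the next value of a series from its first and second
    differences, a crude estimate of the system's derivatives."""
    if len(history) < 3:
        return history[-1]
    first = history[-1] - history[-2]                       # ~ first derivative
    second = history[-1] - 2 * history[-2] + history[-3]    # ~ second derivative
    return history[-1] + first + second

print(predict_next([1.0, 4.0, 9.0]))   # 16.0 for the quadratically growing series 1, 4, 9, ...
```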
Modifying Mental Models of Studying
When persons intentionally act to learn, they use a mental model of studying to choose among their repertoire of study acts on the basis of beliefs about the effectiveness of these acts. One important educational objective is to bring that mental model into closer agreement with our scientific knowledge about studying. In this experiment, subjects were asked to recommend study actions for fictitious students described in computer-presented scenarios. Feedback for one group was designed to reflect our scientific knowledge about learning; for the other it was randomly determined. Subjects' repertoires of study acts expanded in both groups, and the acts selected in the scientific-feedback group became more congruent with scientific knowledge about studying.