About
The annual meeting of the Cognitive Science Society is aimed at basic and applied cognitive science research. The conference hosts the latest theories and data from the world's best cognitive science researchers. Each year, in addition to submitted papers, researchers are invited to highlight some aspect of cognitive science.
Volume 13, 1991
Paper Presentations -- Discourse and Text
Empirical Analysis of a Discourse Model for Natural Language Interfaces
A structural model of discourse for natural language interaction developed for the LINLIN system is evaluated using the Wizard-of-Oz method. 21 dialogues were collected using five different background systems, making it possible to vary the type and number of tasks the users could perform. The results indicate that the discourse in man-machine dialogues is structurally simpler than most human dialogue, at least for information retrieval and some types of ordering systems, suggesting that computationally simpler discourse models can be used in these domains.
A Distributed Representation and Model for Story Comprehension and Recall
An optimal control theory of story comprehension and recall is proposed within the framework of a "situation" state space. A point in situation state space is specified by a collection of propositions, each of which can have the value "present" or "absent". Story comprehension is viewed as finding a temporally-ordered sequence of situations, or "trajectory", which is consistent with story-imposed constraints. Story recall is viewed as finding a trajectory consistent with episodic memory constraints. A multistate probabilistic (MSP) machine representational scheme is then introduced for compactly and formally assigning a "degree of belief" (i.e., a probability value) to each trajectory in the state space. A connectionist model is also introduced which searches for trajectories which are highly probable with respect to a set of constraints and an MSP machine representation. Like human subjects, the model (i) recalls propositions with greater causal connectivity as retention interval is increased, and (ii) demonstrates how misordered propositions tend to "drift" more towards their canonical position in a text as retention interval is increased.
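To make the trajectory formulation concrete, here is a minimal sketch in our own notation (the paper's exact MSP formulation may differ): a trajectory is a sequence of binary situation vectors, and a degree of belief over trajectories can be factored into transition probabilities.

```latex
% Notation ours, for illustration only: s_t is the binary vector of
% proposition values ("present"/"absent") at step t of a trajectory.
P(s_1, s_2, \ldots, s_T) = P(s_1) \prod_{t=1}^{T-1} P(s_{t+1} \mid s_t)
```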
Tests of Some Mechanisms That Trigger Questions
We have identified mechanisms that generate questions when individuals solve problems, comprehend text, and engage in conversations. Some of these mechanisms have been discussed in previous research in cognitive science and discourse processing, whereas other mechanisms were discovered when we analyzed videotapes of student-tutor interactions. The present study tested whether anomalous information causes an increase in questions when individuals solve mathematics problems and comprehend stories. College students were instructed to generate questions while they were solving problems (i.e., algebra and statistics) or while they were comprehending stories (e.g., fables and parables). There were several different versions of each problem or story: (1) complete original, (2) deletion of critical information, (3) addition of contradictory information, and (4) addition of irrelevant information. The deletion versions elicited the most questions whereas the original versions elicited the fewest questions; the addition versions were in-between. The validity of some of the question generation mechanisms is supported by the fact that these transformations of content caused an increase in questions.
In the Eye of the Beholder: The Coherence of Nonstandard Discourse
Researchers investigating discourse coherence typically examine the various mechanisms that bring about coherence. This body of research has acknowledged that the specific coherence relations which unite the individual discourse units work as a result of an assumption about the coherence of discourse in general. The standard approach to coherence investigation has been to analyze conventional texts and conversations in which both coherence relations and the assumption of coherence are present. By limiting themselves to the analysis of standard discourse, researchers have ignored nonstandard sources, which can provide insight into the necessity and sufficiency of these mechanisms. This paper provides several examples of nonstandard discourse. From these examples, we conclude that an assumption of coherence is the only necessary and sufficient mechanism required for judgments of coherence.
The Story Gestalt: A Model of Knowledge Intensive Processes in Text Comprehension
How are knowledge intensive text comprehension processes computed? Specifically: 1) how are explicit propositions remembered correctly, 2) how are pronouns resolved, 3) how are coherence and prediction inferences drawn, 4) how are on-going interpretations revised as more information becomes available, and 5) how is information learned in specific contexts generalized to novel texts? The Story Gestalt model, which uses a constraint satisfaction process to compute these processes, is successful because each of the above processes can be seen as an example of the same process of constraint satisfaction, constraints can have strengths to represent the degrees of correlation among information, and the independence of constraints provides insight into generalization. In the model, propositions describing a simple event, such as going to the beach or a restaurant, are sequentially presented to a recurrent PDP network. The model is trained to process the texts by requiring it to answer questions about the texts. Each question is the bare predicate from a proposition in the text or a proposition that is inferable from the text. The model answers the question by completing the proposition to which the predicate belongs. The model accomplishes the five processing tasks listed above and provides insight into how a constraint satisfaction model can compute knowledge intensive processes in text comprehension.
Paper Presentations -- Imagery
Encoding Images into Constraint Expressions
This paper presents a method, generalization to interval, that can encode images into symbolic expressions. This method generalizes over instances of spatial patterns, and outputs a constraint program that can be used declaratively as a learned concept about spatial patterns, and procedurally as a method for reasoning about spatial relations. Thus our method transforms numeric spatial patterns into symbolic declarative/procedural representations. We have implemented generalization to interval with Acorn, a system that acquires knowledge about spatial relations by observing 2-D raster images. We have applied this system to some layout problems to demonstrate the ability of the system and the flexibility of constraint programs for knowledge representation.
Imagery and Categories: The Indeterminacy Problem
One of the classical problems faced by theories of mental imagery is the Indeterminacy Problem: a certain level of detail seems to be required to construct an image from a generating description, but such detail might not be available from abstract, categorical descriptions. If we commit to unjustified details and incorporate them into an image, subsequent queries of the image might indiscriminately report not only information implied by the description but also information that was arbitrarily fixed. The Indeterminacy Problem is studied in a simplified domain, and a computational model is proposed in which images can be incrementally adjusted to satisfy a set of inter-constraining assertions as well as possible. In this model, queries can discriminate between those details in an image which are necessary (implied by the generating description) and those which are incidental (consistent but arbitrarily fixed). The computational model exploits the graded prototypicality of the categorical relations in the simplified domain, and suggests the importance of a grounded language for reasoning with categories.
Telling Where One is Heading and Where Things Move Independently
We summarize our recent novel approach to computing the Focus of Expansion for an observer moving with unrestricted motion in a scene with objects of unrestricted shape. This method also detects points not moving rigidly with the scene. The approach, using collinear image points, is based on an exact method for cancelling effects of the observer's rotation from optic flow. The computational results are being presented elsewhere (da Vitoria Lobo & Tsotsos 1991). Here, we argue that this algorithm is biologically plausible.
A Knowledge Representation Scheme for Computational Imagery
After many years of neglect, the topic of mental imagery has recently emerged as an active area of debate. One aspect of this ongoing debate is whether an image is represented as a description or a depiction of its components. This paper is not so concerned with how mental images are stored, as with what machine representations will provide a basis for imagery as a problem solving paradigm in artificial intelligence. In fact, we argue that a knowledge representation scheme that combines the ability to reason about both descriptions and depictions of images best facilitates the efficient implementation of the processes involved in imagery.
Can Images Be Rotated and Inspected? A Test of the Pictorial Medium Theory
Since the "equivalence" of imagery and perception has been one of the central tenets of the pictorial theory, the negative results of Chambers and Reisberg (1985) on an image reinterpretation task may be seen as posing a fundamental challenge for the pictorial account. Finke, Pinker and Farah's (1989) claimed refutation of these negative results may be questioned on a number of methodological grounds. In addition to examining these issues, we report results of an experiment which tests what is seemingly another direct prediction of pictorial theories. Our investigation employs newly devised imagery tasks whose success depends on being able to "rotate", "inspect" and reinterpret images. Our negative results add further weight to a tacit knowledge account of images as intrinsically interpreted, abstract symbols.
Paper Presentations -- Neuroscience Models of Language
A Connectionist Model of Alphabetic Spelling Development and Developmental and Acquired Dysgraphia
In this paper we describe a connectionist model of the development of alphabetic spelling. The model learns to spell regular words more quickly than words with irregular spellings. When the computational resources available to the model are restricted, the model learns more slowly and, analogously to developmental dyslexics, fails to learn some of the irregular items in its vocabulary. Experimental evidence is reported, which shows that both normal and dyslexic children of various ages have difficulty with particular word types that are similar to the problems experienced by the model on the same words. Finally, the model is "lesioned," and its performance is then similar to that of "surface dysgraphics." The good fit between model and data is taken as evidence that, throughout much of the relevant developmental period, the task facing children can be usefully viewed as a statistical one.
Generating Expressions Referring to Eventualities
We note (a) the well-rehearsed linguistic observation that eventualities can be referred to by using either noun phrases or sentences, and (b) the seductive ontological parallels drawn by Bach [1986] between eventualities and individuals. We show how the mechanisms for knowledge representation and referring expression generation in an existing natural language generation system [Dale 1988, 1989] can be easily extended to combine these two insights in the generation of a wide variety of forms of reference to eventualities.
Effects of Word Abstractness in a Connectionist Model of Deep Dyslexia
Deep dyslexics are patients with neurological damage who exhibit a variety of symptoms in oral reading, including semantic, visual and morphological effects in their errors, a part-of-speech effect, and better performance on concrete than abstract words. Extending work by Hinton & Shallice (1991), we develop a recurrent connectionist network that pronounces both concrete and abstract words via their semantics, defined so that abstract words have fewer semantic features. The behavior of this network under a variety of "lesions" reproduces the main effects of abstractness on deep dyslexic reading: better correct performance for concrete words, a tendency for error responses to be more concrete than stimuli, and a higher proportion of visual errors in response to abstract words. Surprisingly, severe damage within the semantic system yields better performance on abstract words, reminiscent of CAV, the single, enigmatic patient with "concrete word dyslexia."
Language and the Primate Brain
New data on the large number of modality-specific areas in the post-central cortex of several non-human primates, and recent anatomical and functional studies of the human brain suggest that very little of the cortex consists of poly-modal 'association' areas. These observations are used to reinterpret psychological and neuropsychological data on language comprehension in normal and brain-damaged humans. I argue that language comprehension in sighted people might best be thought of as a kind of code-directed scene comprehension that draws heavily upon specifically visual, and probably largely prelinguistic processing constraints. The key processes of word-recognition and the assembly of visual word meaning patterns into interacting chains, however, may be mediated in part by species-specific activity patterns in secondary auditory cortex similar
Focal and Diffuse Lesions of Cognitive Models
With the recent ability to construct fault tolerant computer models using connectionist approaches, researchers are now able to investigate the effects of damage to these models. This has great appeal for cognitive science as it provides a further way to verify or falsify a computer model. Existing studies employ a concept of network "lesioning" that fails to have explanatory adequacy for neurobiology. While using anatomically plausible architectures for cognitive models, they nonetheless use biologically implausible methods for simulating neurological damage to these networks. This paper examines the different objects of computational networks and their analogical neurobiological counterparts, and suggests a taxonomy of connectionist network lesion methods. Finally, an existing visual system model is used as a testbed to study the differential effects of focal and diffuse lesions. The experiments with focal damage versus diffuse damage suggest that while the effects of focal brain injury may be due to the particular computations performed in some brain area, the effects of diffuse brain injury or degeneration may cause cognitive deficits because of the inherent nature of the brain as a distributed computational device, and not through differential local effects.
Paper Presentations -- Planning and Action
Incorporating Resource Analyses into an Action System
OOPS is a reactive planner which integrates sensory perception and action selection. A principal feature of the OOPS architecture is the use and discovery of cheap, diagnostic features to indicate opportunities, which are then verified in more expensive computations. This diagnostic relationship is established analytically and refined using tools from decision theory. The use and refinement of diagnostic features depends upon assumptions of conditional independence. However, in the case of a multiplanning agent (one which simultaneously pursues several goals), while conditional independence holds true for features vis-à-vis individual opportunities, plans may interact in the world, and conditional independence may not hold. In this paper we discuss how the knowledge needed to avoid detrimental action interactions can be incorporated into OOPS's inexpensive diagnostic computations, with benefits for robustness, performance, and learning.
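As a hedged illustration of the diagnostic relationship and the conditional-independence assumption described above (notation ours, not taken from the OOPS papers): a cheap feature signals an opportunity through Bayes' rule, and several features can be combined cheaply only when they are conditionally independent given the opportunity.

```latex
% Notation ours. A cheap feature f is diagnostic of opportunity o via
P(o \mid f) = \frac{P(f \mid o)\,P(o)}{P(f)},
% and combining two features cheaply assumes conditional independence:
P(f_1, f_2 \mid o) = P(f_1 \mid o)\,P(f_2 \mid o).
```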
Opportunistic Memory and Visual Search
In earlier work, we proposed a memory model that would facilitate the detection of opportunities to satisfy suspended goals. In this opportunistic memory model, suspended goals are indexed under feature sets that are predictive of the presence of the opportunity, and which are likely to be encountered in the normal course of future activity. The functional benefit of such encoding depends crucially on the particular vocabulary of features used, the costs of their detection, and the overlap of features relevant to the pursuit of different goals. In this paper we investigate the feature vocabulary implied by recent work on visual search [Treisman, 1985; Tsotsos, 1990], and its use in indexing goals suspended due to the lack of a particular object.
Determining What to Learn in a Multi-Component Planning System
An intelligent agent which is involved in a variety of cognitive tasks must be able to learn new methods for performing each of them. We discuss how this can be achieved by a system composed of sets of rules for each task. To learn a new rule, the system first isolates the rule set which should be augmented, and then invokes an explanation-based learning mechanism to construct the new rule. This raises the question of how appropriate target concepts for explanation can be determined for each task. We discuss the solution to this problem employed in the CASTLE system, which retrieves target concepts in the form of performance specifications of its components, and demonstrate the system learning rules for several different tasks using this uniform mechanism.
Adaptive Action Selection
In earlier papers we presented a distributed model of action selection in an autonomous intelligent agent (Maes, 1989a, 1989b, 1991a, 1991b). An interesting feature of this algorithm is that it provides a handful of parameters that can be used to tune the action selection behavior of the algorithm. They make it possible, for example, to trade off goal-orientedness for data-orientedness, speed for quality, bias (inertia) for adaptivity, and so on. In this paper we report on an experiment we did in automating the tuning and run-time adaptation of these parameters. The same action selection model is used on a meta-level to select actions that alter the values of the parameters, so as to achieve the action selection behavior that is appropriate for the environment and task at hand.
Memory for Incomplete Tasks: A Re-examination of the Zeigarnik Effect
An important feature of human memory is the ability to retrieve previously unsolved problems, particularly when circumstances are more favorable to their solution. Zeigarnik (1927) has been widely cited for the finding that interrupted tasks are better remembered than completed ones; however, frequent replications and non-replications have been explained in terms of social psychological variables (Prentice, 1944). The present study examines differences in memory for tasks based on completion status by appealing to cognitive variables such as the nature of interruption, time spent during processing, and set size. In one experiment using word problems, subjects were interrupted on half of the problems after a short interval of active problem solving, and completed tasks were in fact better remembered than interrupted ones. However, interrupted problems necessarily received less processing time. A second experiment held time constant, allowing subjects to abandon tasks they could not complete. In this experiment, the opposite result occurred, replicating Zeigarnik and showing better access to unsolved problems in free recall. However, enhanced memorability in this study may have resulted from a subject-generated impasse in problem solving rather than "interruption" per se. This successful replication also included set size differences in favor of incomplete problems. Under these conditions, the status of completion can serve as a useful index to past problem situations. These experiments are successful in identifying cognitive variables that explain when one can suspend effort on a failed problem, and recall it at a later time.
Paper Presentations -- Reasoning and Mental Models
Characterizing, Rationalizing, and Reifying Mental Models of Recursion
Mental models reflect people's knowledge about entities and systems around them. Therefore, knowing and understanding mental models can help in exploring cognitive issues in instruction including why a student takes a certain approach or applies a particular strategy to solve a problem, why a student makes mistakes, and why and how misconceptions are developed. Four different mental models of recursion, used for synthesizing solutions to recursive programming problems, have been identified through students' protocols. Each model has been characterized in a way consistent with the students' protocols. Various problem solving behaviours are rationalized in terms of the models. Suggestions are made as to how the mental models develop and evolve in the course of learning. We also present a learning environment in which these mental models are reified and we show how mental models can be incorporated into an intelligent tutoring system.
Extending a Model of Human Plausible Reasoning
When one looks at transcripts of people answering questions or carrying on dialogues about everyday matters, their comments are filled with plausible inferences -- inferences that are not certain, but that make sense. Often, in forming these inferences, generalizations are made that are equally uncertain, but are nevertheless useful as a guide to their reasoning. This paper describes some extensions to our earlier description of a core theory of plausible reasoning (Collins and Michalski, 1989), based in large part on a recent protocol study. The primary focus is on the inductive inference patterns people use to form plausible generalizations, weakly held beliefs based on few examples. We also show how the model was extended to deal with plausible inferences involving continuous quantities and inequalities.
Learning Strategic Concepts from Experience: A Seven-Stage Process
One way novices improve their skill is by learning not to repeat mistakes. Often this requires learning entirely new concepts which must be operationalized for use in plans. We model this learning process in seven stages, starting with the generation of expectations which, when proven faulty, invoke mechanisms to modify decision making mechanisms in order to prevent the failure from occurring again. This process is demonstrated in the context of our testbed system which learns new rules for detecting threats, formulating counterplans and other cognitive tasks. It is shown how this process may be used to learn the concept of immobility as it occurs in the domain of chess.
Modeling the Self-explanation Effect with Cascade 3
Several investigations have found that students learn more when they explain examples to themselves while studying them. Moreover, they refer less often to the examples while solving problems, and they read less of the example each time they refer to it. These findings, collectively called the self-explanation effect, have been reproduced by our cognitive simulation program, Cascade. Cascade has two kinds of learning. It learns new rules of physics (the task domain used in the human data modeled) by resolving impasses with reasoning based on overly-general, non-domain knowledge. It acquires procedural competence by storing its derivations of problem solutions and using them as analogs to guide its search for solutions to novel problems. This paper discusses several runs of Cascade wherein the strategies for explaining examples are varied and the initial domain knowledge is held constant. These computational experiments demonstrate the computational sufficiency of a strategy-based account for the self-explanation effect.
Paper Presentations -- Case Representation and Adaptation
A Model-Based Approach to Case Adaptation
In case-based reasoning, a given problem is solved by adapting the solutions to similar problems encountered in the past. A major task in case-based problem solving is to generate modifications that are useful for adapting a previous solution to solve the present problem. We describe a model-based method that uses qualitative models of cases for generating useful modifications. The qualitative model of a case expresses a problem solver's comprehension of how the solution satisfies the constraints of the problem. We illustrate the model-based method in the context of case-based design of physical devices. A designer's understanding of how the structure of a previously encountered design produces its functions is expressed in the form of a function structure model. The functional differences between a given problem and a specific case are mapped into structural modifications by a family of modification generation plans. Each plan is applicable to a specific type of functional difference, and uses the function structure model to identify the specific components that need to be modified. We discuss the evaluation of this model-based method in
Towards a Content Model of Strategic Explanation
Over the past few years there has been a growing interest in the notion of using causal explanations in both learning ([DeJong and Mooney, 1986] [Mitchell et al., 1986]) and planning ([Hammond, 1989], [Hammond, 1987] and [Simmons and Davis, 1987]). The study of how complex causal explanations can be used in learning has turned into something of a cottage industry in AI; however, little attention has been paid to how explanations may be constructed. In this paper, we will examine some of the current proposals concerning the process of explanation, augment them with a few ideas of our own, and suggest a new, more strategic level of knowledge about explanations that can be used to guide the explanation process. In particular, we are interested in the problems involved with integrating rule-based methods of explanation construction with memory-based approaches.
Adapting Abstract Knowledge
For a case-based reasoner to use its knowledge flexibly, it must be equipped with a powerful case adapter. A case-based reasoner can only cope with variation in the form of the problems it is given to the extent that its cases in memory can be efficiently adapted to fit a wide range of new situations. In this paper, we address the task of adapting abstract knowledge about planning to fit specific planning situations. First we show that adapting abstract cases requires reconciling incommensurate representations of planning situations. Next, we describe a representation system, a memory organization, and an adaptation process tailored to this requirement. Our approach is implemented in BRAINSTORMER, a planner that takes abstract advice.
Adaptation Strategies for Case-Based Plan Recognition
Figuring out what plan another agent might be executing is an important and difficult type of explanation problem which involves a special type of knowledge about plans and goals. Elsewhere, we have discussed an approach to the general explanation problem that involves adapting stored explanations to new situations. In this paper, we review that discussion very briefly and then examine the special issues that arise when applying that approach to explaining intentional actions, focusing on adaptation strategies that are specifically relevant to plan recognition.
A Functional Taxonomy of Abstract Plan Failures
To reason about a plan failure requires an appropriate explanation of the failure. Such explanations, as has been argued elsewhere, can be generated by instantiating and adapting abstract knowledge structures. But the two fundamental goals in reasoning about failure, recovering from the failure and repairing the knowledge that led to the failure, require qualitatively different types of explanation. This paper presents an abstract taxonomization of failures, oriented toward the latter problem. Each abstract failure type in the taxonomy is tied to heuristics for recognizing occurrences of failures of that type and for identifying the knowledge that can be repaired to avoid future occurrences of the same failure.
Paper Presentations -- Memory for Objects
Using Semi-Distributed Representations to Overcome Catastrophic Forgetting in Connectionist Networks
In connectionist networks, newly-learned information can completely destroy previously-learned information unless the network is continually retrained on the old information. This behavior, known as catastrophic forgetting, is unacceptable both for practical purposes and as a model of mind. This paper advances the claim that catastrophic forgetting is a direct consequence of the overlap of the system's distributed representations and can be reduced by reducing this overlap. A simple algorithm is presented that allows a standard feedforward backpropagation network to develop semi-distributed representations, thereby significantly reducing the problem of catastrophic forgetting.
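One published form of this idea is node sharpening: the most active hidden units are nudged toward 1 and the rest toward 0, and the sharpened vector serves as an extra training target for the hidden layer. The sketch below is our minimal reconstruction with illustrative parameter values, not the paper's code.

```python
import numpy as np

def sharpen(hidden, k=1, alpha=0.3):
    """Return a semi-distributed target: the k most active hidden units
    are pushed toward 1, all other units toward 0. (Illustrative
    reconstruction; k and alpha are assumptions, not the paper's values.)"""
    target = hidden - alpha * hidden                 # dampen every unit toward 0
    winners = np.argsort(hidden)[-k:]                # indices of the k most active units
    target[winners] = hidden[winners] + alpha * (1.0 - hidden[winners])  # boost winners toward 1
    return target

# An extra backprop pass then moves the actual hidden activations toward
# this sharpened target, reducing representational overlap across patterns.
print(sharpen(np.array([0.2, 0.9, 0.4, 0.6])))       # -> [0.14 0.93 0.28 0.42]
```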
The Role of Physical Properties in Understanding the Functionality of Objects
We investigate the role of physical properties in determining how people select objects for use in physical activities. We propose a geometric model in which dimensions represent properties relevant to the goals of the activity and objects occur as points in this property space. An object's proximity to an ideal value on each property is additively combined across properties to produce a measure of the usefulness of the object for that activity. We report an experiment that shows that this ideal-point model successfully describes how people select an object for use in a physical activity by using physical properties as an intermediary factor. This model is derived from models of preference choice in which an individual selects objects that he or she prefers.
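A hedged formalization of the ideal-point model described above (symbols ours): an object's usefulness for an activity combines, additively across properties, how close the object lies to the activity's ideal value on each property.

```latex
% Symbols ours: x_{oi} is object o's value on property i, I_i the activity's
% ideal value on that property, w_i a property weight, g a decreasing function.
U(o) = \sum_{i} w_i \, g\!\left(\lvert x_{oi} - I_i \rvert\right)
```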
Blend Errors During Cued Recall
Connectionist models of memory account for recall behavior using processes which simultaneously access multiple memory traces and interactively construct the recalled information. This also allows the models to account for prototyping phenomena, but seems to predict retrieval of composite or "blended" information during ordinary recall. By contrast, models that simulate recall as a probabilistic selection of a single trace would not predict recall blend errors. To examine memory blending during recall, three experiments were performed; in each, subjects read sentences, some sharing words with one other sentence. They later recalled the sentences given partial-sentence cues. In all experiments subjects made blend errors, recalling one word from each of two similar sentences more often than one word from each of two dissimilar sentences, as predicted by multiple-trace models. The frequency of blend errors was relatively low, but a good account of this and other aspects of the results was provided by a multiple-trace model based on an Interactive Activation network.
Learning Object-Relative Spatial Concepts in the L0 Project
This paper reports on the learning of spatial concepts in the L0 project. The starting point is the identification of a visual primitive which appears to play a central role in the visually-based semantics for terms which express spatial relations between two objects. This primitive is simply the orientation of the imaginary ray connecting the two related objects where they are nearest each other. Given this, an important part of the learning consists of determining which other orientations this particular one should align with (e.g. it should align with upward vertical for "above"). These other orientations may be supplied by an object-centered coordinate frame, as in English "in front of" and Mixtec "cii", as well as by the upright coordinate frame. A central feature of the system design is the use of orientation-tuned Gaussian nodes which can learn their orientation and
The Ontogeny of Units in Object Categories
Theories of object recognition and categorization rely on a set of primitives to represent objects. The nature and the development of these primitives have been neglected in computational vision and in concept learning theories. We present a theory of part ontogeny in which not only perceptual, but also categorical constraints play a role. A two-phase experiment using categories of synthesized 3D objects (Martian rocks) was conducted to test the theory. The first phase tested the hypothesis that part identification is dependent on categorical context. The second phase tested whether the units extracted in the first phase played a conceptual role in learning a new category. In both phases, subjects interactively delineated the parts of the stimuli while learning the categories. The units subjects identified in the first phase were those that were predictive of the object's category. These units then influenced the perception of parts in the new categories of the second phase.
Paper Presentations -- Regularities and Estimation
Effects of Background Knowledge on Family Resemblance Sorting and Missing Features
Despite people's strong bias to sort exemplars based on a single dimension, various situations where family resemblance (FR) categories tend to be created have been identified. In a previous study (Ahn 1990b), knowing prototypes or theories underlying categories led subjects to create FR categories. The current study investigates why the existence of background knowledge encourages creation of FR categories. Comparison of results from two experiments indicates that there is no intrinsic tie between knowing theories or prototypes and FR structure. The role of background knowledge in FR sorting seems to lie in leading subjects to weight dimensions equally, in helping them to infer unavailable values in favor of FR sorting, and/or in relating surface dimensions in terms of a deeper feature.
Understanding and Improving Real-world Quantitative Estimation
One possible method for improving real-world quantitative estimation is to "seed the knowledge base" with explicit quantitative facts. This method was employed in two population estimation experiments. In Experiment 1, subjects estimated the populations of 99 countries. They then studied the populations of 24 of these countries. Finally, they estimated the populations of all 99 countries a second time. As predicted, the post-learning estimates for the 75 "transfer" countries were much more accurate (48%) than the pre-learning estimates. However, the rank-order correlations between estimated populations and true populations showed almost no improvement. These results suggested that there may be two analytically distinct components to estimation, a range component and a ranking component, and that an arbitrary set of quantitative facts is likely to affect the former but not the latter. The aim of Experiment 2 was to demonstrate that one can affect the ranking component by presenting subjects with a consistent set of population facts. In this experiment, one group of subjects was presented with facts that consistently confirmed their prior belief that European countries are quite large and Asian countries are quite small. Another group was presented with a set that consistently disconfirmed this view. As predicted, rank-order correlations between estimated and true populations were negatively affected by the bias-confirming facts and positively affected by the bias-disconfirming facts.
Implicit Detection of Event Interdependencies and a PDP Model of the Process
We report on an experiment in which subjects were asked to predict the location of a stimulus based on observation of a series of five events. Unbeknownst to subjects, the location of the sixth event was determined by a double contingency between the second and fourth events in the sequence. This material is therefore highly complex, since the relevant events are embedded in a large number of irrelevant contexts. The results indicated that subjects improved their prediction performance over 10 sessions encompassing over 2400 trials of training, despite the fact that they remained completely unaware of the existence of the rule, and unable to verbalize their knowledge of the contingencies in the material. We propose a model of performance in this task, in the form of a PDP model of sequence processing. The model successfully accounts for performance and illustrates how knowledge about the temporal context may develop in a way that does not necessarily yield decomposable representations. Interestingly, the model also predicts that performance would be worse if subjects were required to predict successive events rather than simply observe them.
Implicit Understanding of Functions in Quantitative Reasoning
We present a theoretical analysis of students' implicit understanding of the concept of variables and functions, and present a cognitive model of this understanding based on the idea that reasoning involves a successful interaction between psychological agents and the things and other people in a situation. In the first part of the paper, we provide evidence that middle- and high-school students demonstrate implicit understanding of functional relations among quantities when they reason about a physical model of linear functions. Implicit understanding is knowledge of concepts or principles that enables and constrains performance, but is not articulate. In the second part of the paper, we describe several theoretical properties of our computational model: a.) activities are modeled as interactions between a person and a situation; b.) reasoning is modeled as a form of activity that produces new information; and c.) understanding is modeled as attunement to the constraints of conceptual activities.
Does Memory Reflect Statistical Regularity in the Environment?
Anderson and Milson (1989) derived optimal performance functions for memory based on assumptions about the goals of memory, the computational costs of achieving those goals, and the statistical structure of the environment. Based on these assumptions, and a good deal of Bayesian analysis, they accounted for a substantial number of empirical findings. Here we started with the same assumptions about the goals of memory, but instead of simulating the statistical structure of the environment, we analyzed it directly. It was found that the factors that govern memory performance also predict the probability with which words are spoken in children's linguistic environments. These factors include frequency, recency, and spacing between exposures. The ability of these factors to predict word use was analyzed in the context of four laboratory memory phenomena: 1) the power law of practice; 2) the power law of forgetting; 3) the interaction between study spacing and retention interval; and 4) the combined effects of practice and retention. These factors predict information demand and lend strong support to Anderson and Milson's claim that memory behavior can be understood in terms of the statistical structure of the environment.
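For reference, the two power laws named above have the familiar forms below (a standard textbook rendering in our notation, not the paper's): retrievability grows as a power of the number of exposures and decays as a power of the retention interval.

```latex
% Notation ours: n = number of exposures, t = time since last exposure.
P(\text{need}) \propto n^{a}, \qquad P(\text{need}) \propto t^{-b}, \qquad a, b > 0
```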
Paper Presentations -- Word and Concept Learning
Shifting Novices' Mental Representations of Texts Toward Experts': Diagnosis and Repair of Novices' Mis- and Missing Conceptions
The shape of the memory representation for a 1000 word text was measured for the text's author, 7 independent subject matter experts and 2 groups of novices (N = 83 Air Force recruits). To measure the shapes, we chose the 12 most important concepts from the text, and then collected proximity data on all possible pairs of them. Then we made maps of the mental representations from the proximities. In Experiment 1, results of empirical tests of text learning showed that the novices' mental representations after reading the Original Version of the text were correlated only +.1 with the author's or experts' representations. But for a Principled Revision the correlations were above +.5. In Experiment 2, the proximity data from Experiment 1 were used to diagnose specific misconceptions and missing conceptions in both the Original text and the Principled Revision. This revealed unsuspected cognitive misconceptions, as well as intrusions of affective and attitudinal factors into the novices' mental representations. These diagnoses were then used to revise both texts to repair the misconceptions and insert the missing conceptions. Results of empirical tests of these revisions (N = 160 Air Force Recruits) showed that novices' correlations with the author's and experts' representations were shifted close to ceiling (r = +.8 to +.9). These results show that novices' mental representations can be shifted to correspond with experts' by using our methods to diagnose and repair mis- and missing conceptions.
Toward a Unified Theory of Lexical Error Recovery
The ambiguity inherent in natural language requires us to make many decisions about the meaning of what we hear or read. Yet most studies of natural language understanding have assumed that although language may be ambiguous, we always make the right choice when faced with a decision about ambiguity. Consequently, very little is said about how to recover from incorrect decisions. In this paper we look at two rare examples of investigations into recovering from erroneous decisions in resolving lexical ambiguity. After examining the corresponding theories, we find that what at first appear to be competing theories can in fact be resolved into a unified theory of lexical error recovery based upon a highly parallel architecture for language understanding.
How Students Misunderstand Definitions: Some Evidence for Contextual Representations of Word Meanings
This study is concerned with the issue of whether word meanings are mentally represented in a decontextualized form, similar to dictionary definitions. If this assumption is correct then students should understand the meaning of an unfamiliar word when they read its definition. To test this hypothesis, German high school students were given unfamiliar English words and their monolingual English dictionary entries. Students used each target word in an English sentence, and then translated their sentences into German. The translations permitted the assessment of comprehension and the specification of its underlying components. The results indicate that students often did not understand the meaning of an unfamiliar word even though they did "understand" its definition. Information that specified in which contexts an entry word and its definition are synonymous promoted comprehension. Meaning representations are therefore best conceived of as contextual representations.
Learning Words: Computers and Kids
We present a computer-based model of acquisition of word meaning from context. The model uses semantic role assignments to search through a hierarchy of conceptual information for an appropriate meaning for an unknown word. The implementation of this approach has led to many surprising similarities with work in modeling human language acquisition. We describe the learning task and the model, then present an empirical test and discuss the relationships between this approach and the work in psycholinguistics.
Ambiguity Resolution: Behavioral Evidence for a Delay
This paper presents experimental evidence for a model of human language processing in which ambiguity resolution is delayed when there is a conflict between semantic contextual bias and the syntactically preferred interpretation. If there is no conflict, an immediate decision is made. Decision is not delayed indefinitely; the length of the delay is limited by available processing resources.
Paper Presentations -- Category Formation and Similarity
Feature Diagnosticity as a Tool for Investigating Positively and Negatively Defined Concepts
Two methods of representing concepts are distinguished and empirically investigated. Negatively defined concepts are defined in terms of other concepts at the same level of abstraction. Positively defined concepts do not make recourse to other concepts at the same level of abstraction for their definition. In two experiments, subjects are biased to represent concepts underlying visual patterns in a positive manner by instructing subjects to form an image of the learned concepts and by initially training subjects on minimally distorted concept instances. Positively defined concepts are characterized by a greater use of nondiagnostic features in concept representations, relative to negatively defined concepts. The distinction between positively and negatively defined concepts can account for the dual nature of natural concepts: as directly accessed during the recognition of items, and as intricately interconnected to other concepts.
Context-Sensitive, Distributed, Variable-Representation Category Formation
This paper describes INC2, an incremental category formation system which implements the concepts of family resemblance, contrast-model-based similarity, and context-sensitive, distributed probabilistic representation. The system is evaluated in terms of both the structure of categories/hierarchies it generates and its categorization (prediction) accuracy in both noise-free and noisy domains. Performance is shown to be comparable to both humans and existing learning-from-example systems, even though the system is not provided with any category membership information during the category formation stage.
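The contrast model that INC2's similarity measure builds on is Tversky's (1977) formula, reproduced here for reference: f is a salience measure over feature sets, and the nonnegative weights trade off common against distinctive features.

```latex
% Tversky's contrast model; \theta, \alpha, \beta \ge 0.
\mathrm{sim}(A, B) = \theta f(A \cap B) - \alpha f(A \setminus B) - \beta f(B \setminus A)
```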
Constraints on Analogical Mapping: The Effects of Similarity and Order
One of the central problems in analogical mapping is overcoming the ambiguities which can occur when matching up corresponding concepts in two domains of knowledge; specifically, to ensure that one-to-many and many-to-one matches are resolved to be one-to-one matches. Various analogy theories have attempted to deal with these problems by maintaining that analogical matching is constrained in various ways: for example, that only predicates of the same structural type are matched, that primacy is given to matches that are similar or identical, and that a match which comes before an alternative match is preferred. Two experiments are reported, involving an attribute-mapping problem, which isolate the effects of similarity and order. The first shows that the semantic similarity of predicates in the two domains has a facilitating effect on analogical mapping when other constraints are held constant. The second experiment shows that analogical mapping is sensitive to the order in which matches are made. The implications of these results for current computational models of analogy are discussed, with a special emphasis on the consequences that order effects have for connectionist models.
Dimensional Attention Learning in Models of Human Categorization
When humans learn to categorize multidimensional stimuli, they learn which stimulus dimensions are relevant or irrelevant for distinguishing the categories. Results of a category learning experiment are presented, which show that categories defined by a single dimension are much easier to learn than categories defined by the combination of two dimensions. Three models are fit to the data: ALCOVE (Kruschke 1990a,b, in press), standard back propagation (Rumelhart, Hinton & Williams 1986), and the configural-cue model (Gluck & Bower 1988). It is found that ALCOVE, with its dimensional attention learning mechanism, can capture the trends in the data, whereas back propagation and the configural-cue model cannot. Implications for other models of human category learning are discussed.
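For context, ALCOVE's exemplar-node activation (in its standard city-block form, as published by Kruschke) applies learned attention strengths to each stimulus dimension; this attention mechanism is what the abstract credits with capturing the single-dimension advantage.

```latex
% c is a specificity constant, \alpha_i the learned attention strength on
% dimension i, h_{ji} exemplar j's coordinate, a_i^{\mathrm{in}} the stimulus.
a_j^{\mathrm{hid}} = \exp\!\Big(-c \sum_i \alpha_i \,\bigl\lvert h_{ji} - a_i^{\mathrm{in}} \bigr\rvert\Big)
```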
Commonalities, Differences and the Alignment of Conceptual Frames During Similarity Judgements
Tversky demonstrated that the similarity of two objects increases with their commonalities and decreases with their differences. We believe that determining commonalities and differences is a complex task. Using analogical mapping as a guide, we propose the process of frame-alignment which can be employed to find the commonalities and differences of structured representations. We then test the predictions of this approach by asking subjects to list the commonalities and differences of word pairs that vary in their degrees of similarity. The results of this study support the predictions of the frame-alignment view.
Paper Presentations -- Perception and Visual Search
Efficient Visual Search: A Connectionist Solution
Searching for objects in scenes is a natural task for people and has been extensively studied by psychologists. In this paper we examine this task from a connectionist perspective. Computational complexity arguments suggest that parallel feed-forward networks cannot perform this task efficiently. One difficulty is that, in order to distinguish the target from distractors, a combination of features must be associated with a single object. Often called the binding problem, this requirement presents a serious hurdle for connectionist models of visual processing when multiple objects are present. Psychophysical experiments suggest that people use covert visual attention to get around this problem. In this paper we describe a psychologically plausible system which uses a focus of attention mechanism to locate target objects. A strategy that combines top-down and bottom-up information is used to minimize search time. The behavior of the resulting system matches the reaction time behavior of people in several interesting tasks.
Perceptual Simplicity and Modes of Structural Generation
This paper describes a formal framework for perceptual categorization that can account for the salient qualitative predicates human observers are willing to ascribe to a closed class of objects, and consequently the simple groupings they can induce from small sets of examples. The framework hinges on the idea of a generative process that produces a given set of objects, expressed as a sequence of group-theoretic operations on a primitive element, thus ascribing algebraic structure to perceptual organization in a manner similar to Leyton (1984). Putatively, perceivers always seek to interpret any stimulus as a formally generic result of some sequence of operations; that is, they interpret each object as a typical product of some generative process. The principal formal structure is a "mode lattice," which a) exhaustively lists the qualitative shape predicates for the class of shapes, and b) defines the inferential preference hierarchy among them. The mechanics are worked out in detail for the class of triangles, for which the predicted qualitative features include such familiar geometric categories as "scalene," "isosceles," and "right," as well as more "perceptual" ones like "tall" and "short." Within the theory it is possible as well to define "legal" vs. "illegal" category contrasts; a number of examples suggest that our perceptual interpretations tend to regularize the latter to the former.
Featural Priming: Data Consistent With the Interactive Activation Model of Visual Word Recognition
The effect of featural priming on word identification was investigated as a test of the interactive activation model of word perception put forth by McClelland and Rumelhart (1981). Observers were presented with a 250 msec featural prime, which was either consistent with or inconsistent with the letters in the target word that immediately followed. Reading latencies were recorded for 96 trials per subject. A neutral prime condition consisting of a random dot pattern was used as a control in order to obtain baseline identification times. The prediction of the interactive activation model that mean reading latency would be significantly longer for words that were primed with inconsistent features than for those that were primed with consistent features was confirmed, adding to the empirical support for the model.
The Analysis of Resource-limited Vision Systems
This paper explores the ways in which resource limitations influence the nature of perceptual and cognitive processes. A framework is developed that allows early visual processing to be analyzed in terms of these limitations. In this approach, there is no one "best" system for any visual process. Rather, a spectrum of systems exists, differing in the particular trade-offs made between performance and resource requirements.
What Do Feature Detectors Detect? Features That Encode Context and the Binding Problem
The representation of visual features is investigated by examining the types of information that are encoded at the feature level which are used for feature binding. Features are often assumed to be bound together by virtue of their common location, but the current study shows that shared context, as well as location, acts to constrain the feature binding process and the formation of illusory conjunctions. Two different sorts of context manipulations are reported. In one manipulation, the context of each item in the display is established by flanking bars, and binding errors are examined as a function of this shared context. Also examined is a more global context manipulation in which the items presented form either a word or a nonword. Both sorts of contexts affect feature binding, although in different ways. Finally, some of the computational difficulties in implementing a feature representation that encodes context are considered.
Paper Presentations -- Phonology and Word Recognition
The Role of Input and Target Similarity in Assimilation
We investigate the situation in which some target values in the training set for a neural network are left unspecified. After training, unspecified outputs tend to assimilate to certain values as a function of features of the training environment. The roles of the following features in assimilation are analyzed: similarity between input vectors in the training set, similarity between target vectors, linearity versus non-linearity of the mapping, training set size, and error criterion. All are found to have significant effects on the assimilation value of an unspecified output node.
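A minimal sketch (our construction, not the paper's code) of how unspecified targets are typically handled in such training sets: error is computed only over the specified output units, so unspecified outputs receive no direct teaching signal and are free to assimilate.

```python
import numpy as np

def masked_error(output, target):
    """Squared-error term over specified targets only;
    NaN in `target` marks an unspecified output (our convention)."""
    mask = ~np.isnan(target)
    err = np.zeros_like(output)
    err[mask] = output[mask] - target[mask]   # unspecified units contribute zero error
    return err

target = np.array([1.0, np.nan, 0.0])         # middle output left unspecified
print(masked_error(np.array([0.8, 0.5, 0.1]), target))   # -> [-0.2  0.   0.1]
```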
Learning the Past Tense in a Recurrent Network: Acquiring the Mapping From Meaning to Sounds
The performance of a recurrent neural network in mapping a set of plan vectors, representing verb semantics, to associated sequences of phonemes, representing the phonological structure of verb morphology, is investigated. Several semantic representations are explored in an attempt to evaluate the role of verb synonymy and homophony in determining the patterns of error observed in the net's output performance. The model's performance offers several unexplored predictions for developmental profiles of young children acquiring English verb morphology.
What a Perceptron Reveals about Metrical Phonology
Metrical phonology is a relatively successful theory that attempts to explain stress systems in language. This paper discusses a perceptron model of stress, pointing out interesting parallels between certain aspects of the model and the constructs and predictions of metrical theory. The distribution of learning times obtained from perceptron experiments corresponds with theoretical predictions of "markedness." In addition, the weight patterns developed by perceptron learning bear a suggestive relationship to features of the linguistic analysis, particularly with regard to iteration and metrical feet. Our results suggest that simple statistical learning techniques have the potential to complement, and provide computational validation
A Connectionist Model of Auditory Word Perception in Continuous Speech
A connectionist model of auditory word perception in continuous speech is described. The aim is to model psycholinguistic data, with particular reference to the establishment of lexical percepts. There are no local representations of individual words: feature-level representations are mapped onto phoneme-level representations, with the training corpus reflecting the distribution of phonemes in conversational speech. Two architectures are compared for their ability to discover structure in temporally presented input. The model is applied to modelling the phoneme restoration effect and phoneme monitoring data.
A Computational Model of Connotation-Based Revision
Previous studies on text generation and revision seldom consider emotions and social relations as motivations for linguistic variations such as word choice and syntactic structure arrangement. This paper proposes a computational model that uses four attributes (affect, activity, power, and emphasis) to revise texts. These attributes express ideological beliefs, the connotations of lexical items, and connotation propagation properties of sentence structures. With these formal tools, an algorithm of backward chaining revises sentences with sensible choices of word and sentence structures. The model provides a basis for future research that can lead to a fully automated text revision system.
Paper Presentations -- Problem Solving and Transfer
Richard Catrambone
A proposal is made for representing the knowledge learners acquire from examples in terms of subgoals and methods. Furthermore, it is suggested that test problems can also be represented in terms of the subgoals and methods needed to solve them. Manipulations of examples can influence the particular subgoals and methods learned. Thus, transfer can be predicted by the overlap between the learned subgoals and methods and those required to solve a novel problem. A subgoal is an unknown entity, numerical or conceptual, that needs to be found in order to achieve a higher-level goal of a problem. A method is a series of steps for achieving a particular subgoal. The experiment presented here suggests that elaborations in example solutions that emphasize subgoals may be an efficient way of helping a learner to recognize and achieve those subgoals in a novel problem, that is, to improve transfer. It is argued that conceptualizing problem-solving knowledge in terms of subgoals and methods is a psychologically plausible approach for predicting transfer and has implications for the teaching and design of examples.
Strategy Shifts Without Impasses: A Computational Model of the Sum-to-Min Transition
The Sum-to-Min transition that children exhibit when learning to add provides an ideal domain for studying naturally occurring discovery processes. We discuss a computational model that accounts for this transition, including the appropriate intermediate strategies. In order to account for all of these shifts, the model must sometimes learn without the benefit of impasses. Our model smoothly integrates impasse-driven and impasse-free learning in a single, simple learning mechanism.
Learning, Memory, and Search in Planning
This paper describes DAEDALUS, a system that uses a variant of means-ends analysis to generate plans and uses an incremental learning algorithm to acquire probabilistic search heuristics from problem solutions. We summarize DAEDALUS' approach to search, knowledge organization, and learning, and examine its behavior on multi-column subtraction. We then evaluate the system in terms of its consistency with known results on human problem solving, comparing it to other psychological models of learning and planning.
Memory for Problem Solving Steps
A widely adopted theory of procedural learning claims that people construct new problem solving rules through induction over past problem solving steps. The underlying assumption that people store information about problem solving steps in memory was tested by measuring subjects' memory of their own problem solving steps in four different ways. The results support the assumption that people store enough information in memory to enable induction of new problem solving rules.
Knowledge Transfer among Programming Languages
Two experiments were conducted to investigate knowledge transfer from learned programming languages to learning new ones. The first experiment concerned transfer from knowing LISP to learning PROLOG; the results showed that subjects who knew LISP had significant advantages over subjects who did not. Moreover, among the subjects who knew LISP those who knew LISP better seemed to learn PROLOG faster. The second experiment studied transfer from knowing either PASCAL or PROLOG to learning LISP; attention was specifically focused on transfer of knowledge of writing recursive and iterative programs in these languages. The results indicated that PROLOG programmers, who were usually more knowledgeable on recursion, were more ready to learn the recursive part of the LISP language. Some general theoretical discussion about knowledge transfer among programming languages is also presented in the paper.
Paper Presentations -- Distributed Cognition
A Decision Support System for Generalized Negotiations
This paper reports on the development of GENIE, a Decision Support System (DSS) that aids participants in crisis negotiation simulations. GENIE is an instance of a general DSS which can be used to support a large class of decision problems. The major function of GENIE is to provide the user with on-line information about a complex decision scenario. To this end, GENIE utilizes a combination of graphic and textual information presentation formats to create an environment in which a user can develop a mental picture of the decision problem facing him/her and then dynamically formulate an effective negotiating strategy. The design and development of GENIE are described along with an explanation of the major features of the system. Experimental results from user evaluations and system log files are also discussed. These results allowed us to gain insight into the decision processes of the users and rate the effectiveness of our DSS design strategy. Experimental results indicate that simulation participants who had access to the system performed on average better than participants without access to the system.
The Structure of Reasoning in Conversation
The findings of philosophers, linguists, and psychologists are conjoined here in an effort to develop a descriptive/analytic account of reasoning as it actually occurs in social settings. The primary focus of this paper is the reasoning that occurs in discussions of controversial social issues by groups of peers. Preliminary analysis indicates that rather sophisticated argument structures emerge from these informal settings, and that conversational interaction stimulates the development of arguments. Although participants did not always fully state their arguments, the investigators felt justified in attributing implicit premises because these were referred to later in the conversation as if they had been stated. In this respect at least, instead of merely assuming that subjects are good at reasoning, this study provides evidence for the claim. In this paper, a system for coding elements of reasoning and a method for displaying the interactive structure of reasoning in conversations are developed. A long-term goal is to use the information gained from this study to help understand and correct two problems that arise in the teaching of reasoning: the difficulty many students have with acquiring the principles of reasoning in standard logic courses and the difficulty of transferring whatever reasoning skills are acquired in the classroom to new situations.
Active Language in the Collaborative Development of Cooking Skill
It is crucial to approach a cognitive account of development with an accurate picture of the parent-child system in hand; otherwise one will tend to underestimate the richness of support and dynamics of that system and so will tend to overestimate the complexity of the learning processes of the child. In order to understand the developmental functions of collaborative action and its accompanying linguistic activity, we examine the verbal and physical activity in parent-child cooking. We present an analysis of the physical collaborative structure of a baking soda measurement task from 36 parent-child dyads in three child age groups (3-, 4-, and 5-year-olds), and a qualitative analysis of some phenomena of active language in this setting. Active language is discussed in terms of its function in providing clues to lexical semantics, to the structure of the task, and to contextual cues and non-obvious aspects of the situation.
Forming Shared Mental Models
As problems increase in complexity it becomes impossible for any one person to know all the things necessary to make good decisions. A group of specialists, even if they possess the requisite knowledge, remain a collection of individuals until their expertise can be jointly brought to bear. The problem of fusing expertise where individuals have very detailed knowledge in their own areas and much weaker understanding of others is that no one knows what anyone else needs to know. This impasse cannot be broken until shared mental models are developed to provide the common perception needed to focus the activity of the group. This paper presents characteristics of shared mental models and a model of the effect, nature, and process of the formation of shared mental models in cooperative problem solving by a team of specialists.
Avoiding Mis-communication in Concept Explanations
This paper offers a mechanism for the generation of expressions that convey a concept to a hearer. These expressions often include rhetorical devices, such as Descriptions and Instantiations. Our mechanism is based on the representation of the preconditions for the understanding of a concept in terms of failures in communication. We distinguish between two types of communication failures based on their cause: failure due to a hearer's inability to evoke an intended concept, and failure due to the hearer's lack of expertise with respect to the aspects of a concept which are necessary for the comprehension of the discourse. This categorization supports the principled selection of rhetorical devices which are tailored to the prevention of undesirable situations that result in the failure of a communicative goal. These ideas are being implemented in a discourse planner in the domain of high-school algebra.
Paper Presentations -- Hybrid Representational Systems
A Connectionist Model of Intermediate Representations for Musical Structure
The communication of musical thoughts and emotions requires that some musical knowledge is shared by composers, performers, and listeners. Computational models of musical knowledge attempt to specify the intermediate representations required to generate adequate predictions of musical behavior. We describe a connectionist model that encodes the rhythmic organization and pitch contents of simple melodies. As the network learns to encode melodies, structurally more important events tend to dominate less important events, as described by reductionist theories of music (Lerdahl & Jackendoff, 1983; Schenker, 1979). We describe an empirical study in which improvisations on a tune by a skilled music performer are compared with the encodings produced by the network. The two are examined in terms of the relative importance of the musical structure they posit at intermediate levels of representation.
Combining a Connectionist Type Hierarchy with a Connectionist Rule-Based Reasoner
This paper describes an efficient connectionist knowledge representation and reasoning system that combines rule-based reasoning with reasoning about inheritance and classification within an IS-A hierarchy. In addition to a type hierarchy, the proposed system can encode generic facts such as 'Cats prey on birds' and rules such as 'if x preys on y then y is scared of x' and use them to infer that Tweety (who is a Canary) is scared of Sylvester (who is a cat). The system can also encode qualified rules such as 'if an animate agent walks into a solid object then the agent gets hurt'. The proposed system can answer queries in time that is only proportional to the length of the shortest derivation of the query and is independent of the size of the knowledge base. The system maintains and propagates variable bindings using temporally synchronous, i.e., in-phase, firing of appropriate nodes.
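The phase-based binding idea can be sketched abstractly (a schematic rendering, not the authors' network): each entity fires in its own phase, a role is bound to whichever filler shares its phase, and applying a rule simply re-uses phases rather than copying symbols.

```python
# Schematic phase-based binding: fillers firing in the same phase
# as a role are bound to it.
phase_of = {"Sylvester": 1, "Tweety": 2}           # each entity fires in its own phase
bindings = {("preys_on", "agent"): 1,              # preys_on(Sylvester, Tweety):
            ("preys_on", "patient"): 2}            # role nodes fire in the filler's phase

# Rule: preys_on(x, y) -> scared_of(y, x).
# Propagation just re-uses the phases, so no symbols are copied.
bindings[("scared_of", "agent")] = bindings[("preys_on", "patient")]
bindings[("scared_of", "patient")] = bindings[("preys_on", "agent")]

who = {v: k for k, v in phase_of.items()}
print(who[bindings[("scared_of", "agent")]],
      "is scared of", who[bindings[("scared_of", "patient")]])
# -> Tweety is scared of Sylvester
```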
The Connectionist Scientist Game: Rule Extraction and Refinement in a Neural Network
Scientific induction involves an iterative process of hypothesis formulation, testing, and refinement. People in ordinary life appear to undertake a similar process in explaining their world. We believe that it is instructive to study rule induction in connectionist systems from a similar perspective. We propose an approach, called the Connectionist Scientist Game, in which symbolic condition-action rules are extracted from the learned connection strengths in a network, thereby forming explicit hypotheses about a domain. The hypotheses are tested by injecting the rules back into the network and continuing the training process. This extraction-injection process continues until the resulting rule base adequately characterizes the domain. By exploiting constraints inherent in the domain of symbolic string-to-string mappings, we show how a connectionist architecture called RuleNet can induce explicit, symbolic condition-action rules from examples. RuleNet's performance is far superior to that of a variety of alternative architectures we've examined. RuleNet is capable of handling domains having both symbolic and subsymbolic components, and thus shows greater potential than purely symbolic learning algorithms. The formal string manipulation task performed by RuleNet can be viewed as an abstraction of several interesting cognitive models in the connectionist literature, including case role assignment and the mapping of orthography to phonology.
Hybrid Encoding: The Addressing Problem
Locality constraints are generally assumed to have their source at the hardware, or neuronal, level. However, the paper shows that the way symbols address constituent structure represented at the connectionist level limits their access to the encoded information. These limitations are expressed as the constructs of local and address domain and provide an explanatory basis for a wide range of cognitive constraints.
Connectionist Models of Rule-Based Reasoning
We investigate connectionist models of rule-based reasoning, and show that while such models usually carry out reasoning in exactly the same way as symbolic systems, they have more to offer in terms of commonsense reasoning. A connectionist architecture for commonsense reasoning, CONSYDERR, is proposed to account for commonsense reasoning patterns and to remedy the brittleness problem in traditional rule-based systems. A dual representational scheme is devised, utilizing both localist and distributed representations and exploring the synergy resulting from the interaction between the two. CONSYDERR is therefore capable of accounting for many difficult patterns in commonsense reasoning. This work shows that connectionist models of reasoning are not just "implementations" of their symbolic counterparts, but better computational models of commonsense reasoning.
Paper Presentations -- Language Understanding
Incremental Learning, or The Importance of Starting Small
Most work in learnability theory assumes that both the environment (the data to be learned) and the learning mechanism are static. In the case of children, however, this is an unrealistic assumption. First-language learning occurs, for example, at precisely that point in time when children undergo significant developmental changes. In this paper I describe the results of simulations in which network models are unable to learn a complex grammar when both the network and the input remain unchanging. However, when either the input is presented incrementally, or—more realistically—the network begins with limited memory that gradually increases, the network is able to learn the grammar. Seen in this light, the early limitations in a learner may play both a positive and critical role, and make it possible to master a body of knowledge which could not be learned in the mature system.
An On-Line Model of Human Sentence Interpretation
This paper presents a model of the human sentence interpretation process, concentrating on modeling psycholinguistic data through the use of rich semantic and grammatical knowledge and expectations. The interpreter is an on-line model, in that it reads words left-to-right, maintaining a partial interpretation of the sentence at all times. It is strongly interactionist in using both bottom-up evidence and top-down suggestions to access a set of constructions to be used in building candidate interpretations. It uses a coherence-based selection mechanism to choose among these candidate interpretations, and allows temporary limited parallelism to handle local ambiguities. The interpreter is a unified one, with respect to both representation and process. A single kind of knowledge structure, the grammatical construction, is used to represent lexical, syntactic and semantic knowledge, and a single processing module is used to access and integrate these structures.
Why Do Children Say "Me do it"?
A common feature of early speech is that children use case marking incorrectly. Several researchers have proposed that the child's mistakes are limited to the misuse of nominative case, and are corrected once the child acquires verbal morphology. In this paper I will show that this characterization of the problem is incorrect: children misuse all case forms, not just nominative case. In addition, I will show that the child's use of case is related to the acquisition of nominal morphology, not verbal. Case marking can be better understood as a result of the child learning the productive agreement processes of his language. This characterization accounts for the acquisition of case and the "waffling" which children exhibit, and does so within a unified theory of lexical and syntactic acquisition.
Integrating Knowledge Sources in Language Comprehension
Multiple types of knowledge (syntax, semantics, pragmatics, etc.) contribute to establishing the meaning of an utterance. Immediate application of these knowledge sources is necessary to satisfy the real-time constraint of 200 to 300 words per minute for adult comprehension, since delaying the use of a knowledge source introduces computational inefficiencies in the form of overgeneration. On the other hand, ensuring that all relevant knowledge is brought to bear as each word in the sentence is understood is a difficult design problem. As a solution to this problem, we present NL-Soar, a language comprehension system that integrates disparate knowledge sources automatically. Through experience, the nature of the understanding process changes from deliberate, sequential problem solving to recognitional comprehension that applies all the relevant knowledge sources simultaneously to each word. The dynamic character of the system results directly from its implementation within the Soar architecture.
A Graph Propagation Architecture for Massively-Parallel Constraint Processing of Natural Language
We describe a model of natural language understanding based on the notion of propagating constraints in a semantic memory. This model contains a massively-parallel memory network in which constraint graphs, representing syntactic and other constraints associated with the nodes that triggered activations, are propagated. The propagated constraint graphs of complement nodes that collide with the constraint graphs postulated by the head nodes are unified to perform constraint applications. This mechanism handles linguistic phenomena such as case, agreement, binding and control in a principled manner, in effect equivalent to the way they are handled in modern linguistic theories.
Articles
Representing Aspects of Language
We provide a conceptual framework for understanding similarities and differences among various schemes of compositional representation, emphasizing problems that arise in modelling aspects of human language. We propose six abstract dimensions that suggest a space of possible compositional schemes. Temporality turns out to play a key role in defining several of these dimensions. From studying how schemes fall into this space, it is apparent that there is no single crucial difference between AI and connectionist approaches to representation. Large regions of the space of compositional schemes remain unexplored, such as the entire class of active, dynamic models that do composition in time. These models offer the possibility of parsing real-time input into useful segments, and thus potentially into linguistic units like words and phrases.
Paper Presentations -- Philosophical Perspectives
Cognitive Plausibility of a Conceptual Framework
In this paper we investigate the cognitive plausibility of an integrated framework for qualitative prediction of behaviour. The framework is based on the KADS expertise modeling approach and integrates different approaches to qualitative reasoning. The framework is implemented in a program called GARP. To test the cognitive plausibility, a physics problem involving qualitative prediction of behaviour was constructed. The behaviour prediction of this problem generated by GARP was compared to think-aloud protocols of human subjects performing the same problem solving task.
Why Do Thought Experiments Work?
Thought experiments have played a central role in historical cases of major conceptual change in science. They are important both in constructing new representations of nature and in conveying those representations to others. It is proposed that research into the role of mental modelling in narrative comprehension can illuminate how and why thought experiments work. In constructing and "running" the thought experiment, we make use of inferencing mechanisms, existing representations, and general world knowledge to make realistic transformations from one possible physical state to the next; this process reveals the impossibility of applying existing concepts to the world and pinpoints the locus of needed conceptual reform.
Intentions, Commitments and Rationality
Intentions are an important concept in Cognitive Science and Artificial Intelligence (AI). Perhaps the most salient property of (future-directed) intentions is that the agents who have them are committed to them. If intentions are to be seriously used in Cognitive Science and AI, a rigorous theory of commitment must be developed that relates it to the rationality of limited agents. Unfortunately, the available theory (i.e., the one of Cohen & Levesque) defines commitment in such a manner that the only way in which it can be justified reduces it to vacuity. I present an alternative model in which commitment can be defined so as to have more of the intuitive properties we expect, and be closely connected to agent rationality. This definition is intuitively obvious, does not reduce to vacuity, and has useful consequences, e.g., that a rational agent ought not to be more committed to his means than to his ends.
Connectionism and Dynamical Explanation
A distinctive feature of connectionism as a research paradigm in psychology is its use of a form of scientific explanation here termed dynamical explanation. In dynamical explanation, the behavior of a system is explained by reference to points and trajectories in an abstract state space. This paper contrasts dynamical explanation with some other major forms of scientific explanation, and discusses how dynamical explanation of the behavior of artificial neural networks can constitute genuine psychological explanation.
Paper Presentations -- Reminding and Case Retrieval
MAC/FAC: A Model of Similarity-based Retrieval
We present a model of similarity-based retrieval which attempts to capture three psychological phenomena: (1) people are extremely good at judging similarity and analogy when given items to compare; (2) superficial remindings are much more frequent than structural remindings; and (3) people sometimes experience and use purely structural analogical remindings. Our model, called MAC/FAC (for "many are called but few are chosen"), consists of two stages. The first stage (MAC) uses a computationally cheap, non-structural matcher to filter candidates from a pool of memory items. That is, we redundantly encode structured representations as content vectors, whose dot product yields an estimate of how well the corresponding structural representations will match. The second stage (FAC) uses SME to compute a true structural match between the probe and output from the first stage. MAC/FAC has been fully implemented, and we show that it is capable of modeling patterns of access found in psychological data.
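A minimal sketch of the two-stage idea, with content vectors reduced to bags of predicate names (the items, predicates, and 90% cutoff are illustrative assumptions, and the structural SME stage is omitted):

```python
from collections import Counter

# Memory items as predicate "skeletons"; an item's content vector is
# just its bag of predicate names (illustrative, not SME itself).
memory = {
    "solar_system": ["attracts", "revolves_around", "more_massive"],
    "water_flow":   ["flows", "pressure_greater", "connected"],
    "atom":         ["attracts", "revolves_around", "more_massive"],
}

def mac_score(probe, item):
    """Cheap, non-structural stage: dot product of content vectors."""
    pv, iv = Counter(probe), Counter(item)
    return sum(pv[p] * iv[p] for p in pv)

probe = ["attracts", "revolves_around"]
# MAC: keep the best-scoring candidates (here, all within 90% of the max).
scores = {name: mac_score(probe, preds) for name, preds in memory.items()}
best = max(scores.values())
candidates = [n for n, s in scores.items() if s >= 0.9 * best]
print(candidates)  # ['solar_system', 'atom'] pass to the structural (FAC) stage
```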
A Functional Perspective on Reminding
This paper explores the relationship between human activity and remindings. We argue that the type of activity in which a person is engaged influences the kinds of features that trigger remindings, and, conversely, that remindings can change situated behavior over time. We argue further that the types of indices used in memory are not uniform, but depend upon the nature of the task that a person is engaged in as well as his history of interaction with the world.
Improving Case Retrieval Through Observing Expert Problem Solving
As case-based reasoners gain experience in a domain, they need to improve their case retrieval so that more useful cases are retrieved. One problem in doing this is that the reasoner who most needs to learn is least able to explain successes or failures. A second problem is that uncontrolled pursuit of an explanation could be very expensive. There are three keys to the approach presented. First, the student observes expert problem solving and sets up expectations for what the expert will do next. When expectations fail, the reasoner has its failure isolated to a single step, and the correct action for the situation has been provided. Second, if the student can retrieve part of a case that would have suggested a correct prediction, then that case snippet can be used to limit the explanation process, making the process more efficient. Third, when no explanation can be found, the reasoner resorts to empirical adjustment of feature importance.
Explanation-Based Retrieval in a Case-Based Learning Model
Retrieving previous similar cases from a memory of cases is central to case-based reasoning systems. In most systems, this retrieval is done by a detailed indexing mechanism. Thagard and Holyoak argue that indexing is the wrong way to retrieve analogues. They propose a retrieval model (ARCS) based on a competing constraint satisfaction approach. In this paper, an explanation-based retrieval method (EBR) is described for retrieving analogues from a case base in which cases are stored with respect to an interpretation produced by a cognitive diagnostic component. The system is designed for the domain of problem solving in LISP. In a simulation study, it is shown that the EBR method performs as well as or better than the ARCS method.
Retrieval Competition in Memory for Analogies
An important question for cognitive models of human memory is how analogical similarity affects memory retrieval. While the importance of surface lexical and semantic similarities between reminding cues and memory targets has been well-documented, clear empirical evidence that human memory retrieval is influenced by analogy has proven difficult to demonstrate. We report two experiments in which subjects used a series of single sentences as reminding cues for previously-seen mini-texts. Some cue sentences contained nouns and verbs that were hyponyms (i.e., words subordinate to the same category) of those in corresponding target sentences presented in one or two earlier passages. The role of analogical similarity in reminding was examined by varying the correspondence of noun case-role assignments of cue/target hyponyms. Results indicate that retrieval competition and analogical similarity influence reminding. Recall of semantically related passages was significantly greater for structurally consistent (i.e., analogical) cues. Retrieval access was impaired when two semantically related passages were present in memory. Access to the passage with analogical resemblance to the cue was decreased by retrieval competition to an extent consistent with a ratio rule.
Paper Presentations -- Attention and Learning
Attention, Automaticity, and Priority Learning
It is widely held that there is a distinction between attentive and automatic cognitive processing. In research on attention using visual search tasks, the detection performance of human subjects in consistent mapping paradigms is generally regarded as indicating a shift, with practice, from serial, attentional, controlled processing to parallel, automatic processing, while detection performance in varied mapping paradigms is taken to indicate that processing remains under attentional control. This paper proposes a priority learning mechanism to model the effects of practice and the development of automaticity in visual search tasks. A connectionist simulation model implements this learning algorithm. Five prominent features of visual search practice effects are simulated. These are: 1) in consistent mapping tasks, practice reduces processing time, particularly the slope of reaction times as a function of the number of comparisons; 2) in varied mapping tasks, there is no change in the slope of the reaction time function; 3) both the consistent and varied effects can occur concurrently; 4) reversing the target and distractor sets produces strong interference effects; and 5) the benefits of practice are a function of the degree of consistency.
Constraints on the Interaction Between Context and Stimulus Information
A central issue in the development of models of context effects, such as the interactive activation model, concerns the relationship between contextual and stimulus information. Empirical evidence regarding perception of spoken and printed words indicates that context and stimulus information make independent contributions to perceptual identification. Recent research on visual object recognition, however, suggests that context may have a direct influence on the rate or accuracy of visual analysis. These results imply that contextual influences in language comprehension and object recognition may operate in fundamentally different ways. A series of experiments is described that lead to a reinterpretation of the object recognition results. It is concluded that contextual information contributes to the interpretation of stimulus input without altering its form or the rate of its acquisition.
A Connectionist Simulation of Attention and Vector Comparison: The Need for Serial Processing in Parallel Hardware
Given the massively parallel nature of the brain, an obvious question is why so many information-processing functions are serial. In particular, this paper addresses the issue of the comparison process. Behavioral data show that in perceptual matching tasks (such as memory scanning and visual search) performance is systematically affected by stimulus load, in that required processing time increases with each additional comparison item. It is arguable whether this indicates a processing system that performs serial comparisons, or a system in which comparisons are done in parallel but reaction time is affected by load because of other system limitations. In this simulation we show that in a modular connectionist system vector transmission is possible in parallel, but the comparison process within a module must be done serially unless accuracy is sacrificed.
An Operator-Based Attentional Model of Rapid Visual Counting
In this paper we report on the use of our operator-based model of human covert visual attention [Wiesmeyer and Laird, 1990] to account for reaction times in counting tasks in which a stimulus is presented and left undisturbed until a response is made. Previous explanations have not employed an attentionally-driven model. Our model, which is based on the Model Human Processor [Card et al., 1983], is an early selection model in which an attentional "zoom lens" [Eriksen and Yeh, 1985] operates under the control of cognition in order both to locate features in visual space and to improve the quality of featural information delivered to short-term memory by perception. We have implemented our model and the control structures to simulate rapid counting tasks in the Soar cognitive architecture [Laird et al., 1987], which has been suggested as the basis for a unified theory of cognition [Newell, 1990]. Reaction times in the counting task are explained using operator traces that correspond to sequences of deliberate acts having durations in the 50 msec range.
Paper Presentations -- Computer Interfaces
Educational Tools for What You Wanted to Do Anyway
This paper describes a set of educational tools designed to support central cognitive skills such as argument analysis and construction, cooperative negotiation, collaborative writing and scientific inquiry. In building these tools we drew upon multi-media technologies, interactive video and object-oriented programming techniques. Our approach, however, was not motivated by the technologies but rather by a desire to embed our educational objectives in situations which were intrinsically interesting to the students. This approach, alluded to in the title of the paper, came out of our intention to have the student feel good about her personal interests, and further to have her feel good about her intellectual competence as a vehicle for furthering her interests.
The VCR Tutor: Evaluating Instructional Effectiveness
People use a wide variety of devices. Operation of a device can usually be described in terms of knowledge of specific procedural sequences. However, execution of procedures may also depend upon knowledge of the device, its behaviour, and the relationships between device features and device actions. A video cassette recorder (VCR) is one commonly used device. Programming a VCR to automatically record a chosen television program is an example of a device manipulation task. In designing a device tutor, it is relevant to ask how instruction about device operation should be designed, and to ask whether knowledge engineering for a device tutor should focus on procedural knowledge or involve factual and referential knowledge as well. Four versions of a tutoring system for the VCR device and programming task have been implemented, incorporating different tutorial approaches using different types of knowledge. The effectiveness of these versions has been examined experimentally. Subjects who used the knowledgeable tutoring version learned to program a VCR simulation using fewer steps and with fewer errors and error types than subjects who used a prompting version of the tutor.
ASK TOM: An Experimental Interface for Video Case Libraries
ASK TOM represents a new approach to structuring access to a newly emergent kind of knowledge base, the video case library. It is based on two premises: First, that cases and stories in the form of video clips can provide much of the viscerality and memorability that is lacking from textual forms of presentation; and second, that AI theories, specifically the approaches to memory organization derived from work in case-based reasoning, can provide the structure that is essential to achieving the shared context that makes communication possible. The aspect of AI research that is crucial in providing this structure does not concern algorithms. Rather, it is the content of domains and tasks that is paramount.
Language Differences in Face-to-Face and Keyboard-to-Keyboard Tutoring Sessions
Face-to-face and keyboard-to-keyboard tutoring sessions were recorded and analyzed as a first step in building a machine tutor that can understand and generate natural language dialogue. There were striking differences between these modes of interaction. The number of turns in an hour-long session dropped and so did the length of the sentences, although the number of sentences per turn stayed roughly the same. Students contributed 37% of the words in the face-to-face sessions; their share dropped to 25% in the keyboard sessions. Sentence structure is simpler in the keyboard sessions. The tutors ask more questions in the keyboard sessions; they explain this as a deliberate strategy to keep the dialogue going. The tutors also use a much wider range of expressions of acknowledgement in the keyboard sessions in a deliberate attempt to communicate verbally the kind of encouragement that is often expressed nonverbally in a face-to-face situation.
Paper Presentations -- Decision Making
Decision Making Connectionist Networks
A connectionist architecture is proposed that provides representations for probabilities and utilities, the basic elements of formal decision making theories. The outputs of standard feed-forward feature-extraction networks then become inputs to this decision making network. A formalism shows how the gradient of expected utility can be back-propagated through the decision making network "down" to the feature-extraction network. The formalism can be adapted to algorithms which optimize total or minimum expected utility. Utilities can be either given or estimated during learning. When utility estimation and decision making behavior adapt simultaneously, the learning dynamics show properties that can be contrasted with "puzzling" observations made in experimental situations with human subjects. The results illustrate the interesting computational properties that emerge from integrating elements of decision making formalisms with connectionist learning models.
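A small sketch of the core computation, assuming a softmax decision layer with given utilities (the layer and learning rate are illustrative, not the paper's architecture): expected utility is EU = sum_i p_i * u_i, and its gradient with respect to evidence component x_i works out to p_i * (u_i - EU), which could then be propagated further down into a feature-extraction network.

```python
import numpy as np

u = np.array([1.0, 0.0, -0.5])   # utilities of three actions (given, not learned)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def grad_eu(x):
    # EU = sum_i p_i u_i; since dp_i/dx_j = p_i (delta_ij - p_j),
    # the gradient simplifies to dEU/dx = p * (u - EU).
    p = softmax(x)
    return p * (u - p @ u)

x = np.zeros(3)
for _ in range(100):             # gradient *ascent* on expected utility
    x += 0.5 * grad_eu(x)
print(np.round(softmax(x), 3))   # probability mass shifts toward the high-utility action
```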
Emergency Decision-Making by Nurses in the Context of Telephone Interactions
In Montreal, nurses respond to 9-1-1 emergency calls for medical help, backed up by physicians when needed. In this context, they have to make rapid decisions based on limited and sometimes unreliable information. The purpose of this study was to describe the decision-making processes used by nurses in telephone triage and to examine these processes in relation to nurses' characteristics and performance. The study was conducted in real emergency conditions. The sample included 34 nurses and 50 calls. Each call was transcribed and subjected to performance evaluation and content analysis. This paper focuses on the cognitive analyses of two protocols associated with different outcomes. The results show that nurses' decision-making in triage situations is often based on surface features (patterns of symptoms) rather than the underlying pathophysiology, particularly in high urgency cases. High performance was related to decisions based on the evaluation of the whole emergency situation. The contribution of training and the effects of experience on triage performance are discussed.
Goal-based Decision Strategies
We present a process model of decision making mediated by goals and relationships. The model is implemented in the VOTE computer program, which simulates Congressional roll call voting. In this paper, we focus on VOTE's decision strategies, which are based on the need not only to arrive at a vote, but also to produce an explanation for each decision. We describe several typical strategies, as well as an indirect strategy, Deeper Analysis, that is invoked when the normal strategies fail to arrive at a decision.
Paper Presentations -- Discovery Learning
Preschoolers' Understanding of Gravity
Previous research suggests that preschoolers have logical or cognitive deficits that limit their understanding of gravity as an explanatory concept. Four experiments were designed to test whether, in contrast to the results from previous research, preschoolers have a coherent, consistent and theoretical understanding of gravity. In each study, preschoolers made judgments regarding objects' behavior in at least one gravity-related event (e.g., speed of falling objects, trajectory of thrown objects, the behavior of balance scales). Predictions were made about children's performance based on the hypothesis that preschoolers understand gravity to be a property of objects. Predicted age-related changes in causal judgments were found on each task, as were positive correlations in performances across the tasks. The results support the claim that preschoolers understand gravity as a property of objects, an understanding that undergoes conceptual change.
The Invention of the Airplane
The invention of the airplane spans a period of 110 years, from 1799 when Cayley first described the design of fixed-wing aircraft to 1909 when practical craft were flown at the Reims Air Show. At least 100 different designs were built and tested during this period, often at great expense, and occasionally at the cost of the pilot's life. With the exception of the Wright Brothers, progress was slow and sporadic. The Wrights needed only four years to develop their first airplane. Their efficiency is unique: no other inventor was able to duplicate the steady, rapid progress of the Wright Brothers or duplicate their results until details of the Wright craft became available in 1906.
Language Evolution and Human-Computer Interaction
Many of the issues that confront designers of interactive computer systems also appear in natural language evolution. Natural languages and human-computer interfaces share as their primary mission the support of extended "dialogues" between responsive entities. Because in each case one participant is a human being, some of the pressures operating on natural languages, causing them to evolve in order to better support such dialogue, also operate on human-computer "languages" or interfaces. This does not necessarily push interfaces in the direction of natural language; since one entity in this dialogue is not a human, this is not to be expected. Nonetheless, by discerning where the pressures that guide natural language evolution also appear in human-computer interaction, we can contribute to the design of computer systems and obtain a new perspective on natural languages.
Human Discovery of Laws and Concepts: An Experiment
In order to understand the relationship between human and machine discovery, it is necessary to collect data about human discoveries which can be compared with machine discovery performance. Historical data are difficult to obtain, are very sparse, and arguably do not reflect typical human performance. We contend that cognitive experiments can provide meaningful data, because the discovery tasks can be carefully defined and tailored to the comparison task, and the subject selection can be controlled in different ways. We describe an experimental study of the human processes of concept formation and discovery of regularities. In our experiments human subjects were allowed to interact with three world models on the computer. Our results demonstrate that humans use heuristics similar to those used in computer discovery systems such as BACON or FAHRENHEIT on comparable tasks, which include finding one-dimensional regularities, generalizing them to more dimensions, finding the scope of a regularity, and introducing intrinsic concepts. Virtually all our subjects made some relatively simple discoveries, while some of them were able to develop a complete theory of simple world models. The progress made by human subjects on comparably simple tasks was impeded significantly when the domain became richer, as measured by the number of regularities, their dimensionality, and the number of intrinsic concepts involved. This can be called a contextual complexity phenomenon. Our subjects demonstrated a definite pattern of chaotic experimentation and lack of theoretical progress when the level of complexity was too high.
Paper Presentations -- Heuristics in Reasoning
Hypothesis Generation and the Coordination of Theory and Evidence in Medical Diagnostic Reasoning
This paper investigates the process of hypothesis generation and the coordination of hypothesis and evidence in medical diagnostic tasks. Two issues are addressed: the generation of hypotheses and the directionality of reasoning. Two problems whose initial presentation suggested an initial hypothesis were presented to subjects with different degrees of expertise in clinical medicine. When faced with contradictory evidence against the initial hypothesis, 1) early novices either modified the initial hypothesis, or ignored, or reinterpreted the cues in the problem to fit the hypothesis; 2) intermediate novices generated concurrent hypotheses to account for different sets of data; and 3) advanced novices generated several initial hypotheses and subsequently narrowed the hypothesis space by generating a single coherent diagnostic hypothesis. All subjects used a mixture of forward reasoning and backward reasoning. More forward-directed reasoning was related to diagnostic accuracy. These results on diagnostic reasoning are discussed in relation to findings on scientific reasoning.
The Heuristics of Spatial Cognition
Distance estimation has been used extensively in the investigation of cognitive maps, yet it is not well understood as a cognitive process in its own right and, as a result, has been viewed as a simple read-out from a spatial representation. In contrast, this paper considers distance estimation to be a complex mental process in which heuristics guide the choice of strategies. Specifically, verbal protocols were collected on a distance estimation task from 20 undergraduates using a variety of city pairs in the U.S. and Canada. On the basis of these data, distance estimation is shown to be a constructive process, using a relatively limited number of heuristics, such as addition, hedges and ratios. The choice of heuristics and the time to make a judgment are shown to be related to variables such as the familiarity of locations and the distance to be judged. The advantage of viewing distance estimation as a constructive process rather than a passive read-out from an internal map is argued.
A Cascade-Correlation Model of Balance Scale Phenomena
The Cascade-Correlation connectionist architecture was used to model human cognitive development on balance scale problems. The simulations were characterized by gradual expansion of the training patterns, training bias in favor of equal distance problems, and test problems balanced for torque distance. Both orderly rule stages and torque difference effects were obtained. Analyses of the development of network structure revealed progressive sensitivity to distance information. It was noted that information salience effects, such as that for torque difference, are particularly difficult to capture in symbolic level models.
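For reference, the quantity Cascade-Correlation maximizes when recruiting a hidden unit is the summed magnitude of covariance between a candidate's activation and the residual output errors; a direct transcription of that score (with made-up data) looks like this:

```python
import numpy as np

# Fahlman & Lebiere's candidate score:
#   S = sum_o | sum_p (v_p - v_mean) * (e_po - e_mean_o) |
# where v_p is the candidate's activation on pattern p and e_po is the
# residual error at output o.  The data below are invented for illustration.
def candidate_score(v, E):
    v_c = v - v.mean()
    E_c = E - E.mean(axis=0)
    return np.abs(v_c @ E_c).sum()

rng = np.random.default_rng(1)
v = rng.random(8)              # candidate activations over 8 training patterns
E = rng.normal(size=(8, 2))    # residual errors at 2 output units
print(candidate_score(v, E))   # the candidate with the highest S gets recruited
```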
Poster Presentations
Dynamic Fact Communication Mechanism: A Connectionist Interface
Shastri and Ajjanagadde have proposed a biologically plausible connectionist rule-based reasoning system (hereafter referred to as a knowledge base, or KB) that represents a dynamic binding as the simultaneous, or in-phase, activity of the appropriate units [Shastri & Ajjanagadde, 1990]. The work presented in this paper continues this effort at providing a computational account of rapid, common-sense reasoning. The Dynamic Fact Communication Mechanism (DFCM) is a biologically plausible connectionist interface mechanism that extracts a temporally-encoded fact (i.e., a collection of dynamically-encoded bindings) from a source KB and incorporates the fact into a destination KB in a manner consistent with the knowledge already represented in the latter. By continually interpreting source KB activity in terms of target KB activity, DFCM is able to transfer facts between distinct KBs on the same time scale needed to perform a single rule application within a single KB. Thus, DFCM allows the benefits of decomposing a phase-based reasoning system into multiple KBs, each with its own distinct phase structure, while rendering the inter-module communication costs negligible. A simple modification to DFCM allows the unit of transfer to be groups of facts. Finally, the number of units that compose DFCM is linear in the size of the KB.
Reading Instructions
This paper describes a model for reading instructions. The basic framework is that an agent engages in an activity and resorts to using instructions only when "all else fails". That is, by reading the instructions during the period of engagement, the meaning of the instructions can be clarified by feedback from the world. This model has been implemented in a computer program, IIMP. IIMP is the instruction reading component of FLOABN (Alterman et al., 1991), an integrated architecture whose domain is reasoning about the usage of mechanical and electronic devices.
The Time Course of Metaphor Processing: Effects of Subjective Familiarity and Aptness
A cross-modal priming paradigm was used to investigate the time course of figurative activation for metaphors which varied in familiarity. In Experiment 1 the target was presented immediately at the offset of the vehicle. For highly familiar metaphors, both literal and figurative interpretations showed evidence of immediate availability. For low-familiarity metaphors, the literal interpretation was available but the figurative target showed inhibition. Experiment 2 delayed presentation of the target by 300 ms, and similar results were found, although inhibition of the figurative target decreased. Together, Experiments 1 and 2 showed that the figurative meaning is more readily available in highly familiar metaphors. The results of Experiment 3 suggest metaphor aptness is especially important for low-familiarity metaphors. The implications of these findings for models of non-literal language are discussed.
The Effects of Feature Necessity and Extrinsicity on Category Contrast in Natural Language Categories
This experiment tested two hypotheses: 1) that categories represented by features that many people believe to be necessary will demonstrate stronger category contrast than those represented by features that few people believe to be necessary, and 2) that categories that people believe are represented primarily by intrinsic features (i.e., features true of an entity in isolation) will have stronger category contrast than those that people believe are represented primarily by extrinsic features (i.e., features that represent relations between an entity and other entities). The findings support only the second hypothesis.
Double Dissociation and Isolable Cognitive Processes
Data from neuropsychology have been widely used both to test pre-existing cognitive theories and to develop new accounts. Indeed, several theorists have used dissociations, and in particular double dissociations, both in theory testing and in developing new theoretical accounts. Double dissociations are indeed believed to be a key tool in revealing the gross structure or "modularity" of cognitive processes. In this paper, in the light of a case study in which a simple electrical system is systematically lesioned, we argue that double dissociation in an arbitrary modular system need not, and typically will not, reveal that modularity. These results suggest that the observation of a double dissociation implies little about the structure of the underlying system. We finish by arguing that this weakness of the methods described means that neurobiological data have to be taken seriously into account in order to uncover the real structure of the cognitive system.
Neuro-Soar: A Neural-Network Architecture for Goal-Oriented Behavior
The ability to set and achieve a wide range of goals is one of the principal hallmarks of intelligence. The issue of goals, and of how they can be achieved, has been one of the major foci of Artificial Intelligence (AI), and the understanding of how to construct systems that can accomplish a wide range of goals has been one of the major breakthroughs provided by the study of symbolic processing systems in AI. Neural networks, however, have not shared this focus on the issue of goals to any significant extent. This article provides a progress report on an effort to incorporate such an ability into neural networks. The approach we have taken here is to implement a symbolic problem solver within a neural network; specifically we are creating Neuro-Soar, a neural-network reimplementation of the Soar architecture. Soar is particularly appropriate for this purpose because of its well-established goal-oriented abilities, and its mapping onto levels of human cognition — in particular, the ways in which it already either shares, or is compatible with, a number of key characteristics of neural networks.
Interpretation of Definite Reference with a Time-Constrained Memory
In this paper, I demonstrate how complex cases of definite reference resolution can be processed within the independently motivated framework of time-constrained memory, and, most importantly, without having to resort to the complex mechanisms assumed by Haddock (1987). The key idea of the solution is to initiate a referent search for the complex definite NP formed by the attachment of a prepositional phrase to a definite noun phrase.
Action Planning: The Role of Prompts in UNIX Command Production
Our goal is to provide empirical support for assumptions of the Doane, Kintsch, & Polson (1989; 1990) construction-integration model for generating complex commands in UNIX. In so doing we designed a methodology that may be used to examine the assumptions of other cognitive models. The planning task studied was the generation of complex sequences of UNIX commands. The sequences were novel, and as such could not be recalled from memory. We asked users whose UNIX experience varied to produce complex UNIX commands, and then provided help prompts when the commands they produced were erroneous. The help prompts were designed to assist the subjects with both knowledge and processes which our UNIX modeling efforts have suggested were lacking in less expert users. There are two major findings. First, it appears that experts respond to different prompts than do novices. Expert performance is helped by the presentation of abstract information, while novice and intermediate performance is modified by presentation of concrete information. Second, while presentation of specific prompts aids the less expert, it does not appear to provide sufficient information to obtain optimal performance. To do this, the less expert subjects require information about the ordering of the items in a command. Our analyses suggest that information about the ordering of prompts helps the less expert with memory-load problems in a manner consistent with skill acquisition theories.
Recovering Structure from Expression in Music Performance
Mental representations of structural content in music can be communicated to listeners by expressive variations in performance. We attempt to recover structural content from patterns of expression in skilled music performance, and we contrast possible mappings between structure and expression that allow communication of musical ideas. Three types of musical structure are investigated: metric, rhythmic grouping, and melodic accent structures. Skilled pianists performed musical sequences which were examined for expressive variations that coincide with each accent structure. The mapping of structure to expression is compared for music in which the accent structures are presented singly, are combined to coincide or conflict, or naturally co-occur. The findings suggest that associated sets of expressive variations in performance provide an unambiguous and flexible system for communicating musical structure.
Diagnostic Reasoning of High- and Low-Domain Knowledge Physicians
Thinking-aloud protocols previously obtained by Joseph and Patel were re-analyzed to determine the extent to which their conclusions could be replicated by independently developed coding schemes. The data set consisted of protocols from 4 cardiologists (low domain knowledge = LDK) and 4 endocrinologists (high domain knowledge = HDK), individually working on a diagnostic problem in endocrinology. Both analyses found that HDK physicians related data to potential diagnoses more than the LDK group, and that there were trends for HDK physicians to be more focused on the correct diagnostic components and to employ more single-cue inference and less multiple-cue inference. However, the re-analysis found no meaningful differences between groups in diagnostic accuracy, speed of diagnosis, or in the breadth of the search space used to seek a solution. The generalizability of results of protocol-analysis studies can be assessed by using several complementary coding schemes.
Symbolic Action, Behavioral Control and Active Vision
This paper is about the interface between continuous and discrete robot control. We advocate encapsulating continuous actions and their related sensing strategies into structures called situation specific activities, which can be manipulated by a symbolic reactive planner. The approach addresses the problem of turning symbolic actions into continuous activities, and the problem of mapping continuous input into discrete symbols for use in planning and modeling.
Modeling Human Memory Retrieval and Computer Information Retrieval: What Can the Two Fields Learn from Each Other?
Models of human memory and computer information retrieval have many similarities in the methods they use for representing and accessing information. This article examines the methods and representations used in both human memory modeling and computer information retrieval and discusses their similarities and differences. From these similarities and differences, the features that lead to successful retrieval in both human memory and computer information retrieval domains can be determined. An analysis of these features can then help in the future design of both human and computer retrieval models.
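One concrete point of overlap the comparison suggests is that both fields often reduce similarity to a vector comparison; a cosine measure over term counts is a common denominator (this example is illustrative, not taken from the article):

```python
import numpy as np

# Cosine similarity between a stored item and a retrieval cue,
# both represented as term-count vectors.
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

doc   = np.array([2, 0, 1, 3])   # term counts in a stored item
query = np.array([1, 0, 0, 2])   # term counts in a probe/cue
print(round(float(cosine(doc, query)), 3))
```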
Tabletop: An Emergent, Stochastic Model of Analogy-Making
This paper describes Tabletop, a computer program that models human analogy-making in a micro-world consisting of a small table covered with ordinary table objects. We argue for the necessity, even in this simple domain, of an architecture that builds its own representations by means of a continual interaction between an associative network of fixed concepts (the Slipnet) and simple low-level perceptual agents (codelets), that relies on local processing and (simulated) parallelism, and that is fundamentally stochastic. Several problems solved by the Tabletop program are used to illustrate these principles.
Can Double Dissociation Uncover the Modularity of Cognitive Processes?
Neuropsychological evidence has proved influential both in testing pre-existing cognitive theories and in developing new accounts. It has been argued that dissociations, and in particular double dissociations, are especially valuable in developing new theoretical accounts, since they may reveal the gross structure or "modularity" of cognitive processes. In this paper, we show that even fully distributed systems (i.e., systems with no modularity) can give rise to double dissociations. We give the example of a recurrent neural network which draws loops and spirals and which shows a double dissociation between the two tasks when lesioned. This result suggests that the observation of a double dissociation implies little about the modularity of the underlying system. In the final section we argue that a dual task technique can give additional hints about the structure of the underlying system because the class of distributed systems we describe are not able, in general, to perform two tasks at the same time. Finally, we argue that neurobiology has to be taken into account in order to interpret purely behavioral data.
The Development of the Notion of Sameness: A Connectionist Model
Comparison is of two types, the implicit sort that is behind all categorization and the explicit sort by which two object representations are compared in short-term memory. Children learn early on both to categorize and to compare explicitly, but they only learn to use dimensions in these processes considerably later. In this paper we present a connectionist model which brings together categorization and comparison, focusing on the development of the use of dimensions. The model posits (1) a general comparison mechanism which is blind to the nature of its inputs and (2) the sharing of internal object and dimension representations by categorization and comparison processes. Trained on the two processes, the system learns to use dimension inputs as filters on its representations for objects; it is these filtered representations which are matched in comparison. The model provides an account of the tendency for early comparison along one dimension to be disrupted by similarities along other dimensions and of the process by which the child might overcome this deficiency.
Concept Formation and Attention
In this paper, I combine the ideas of attention from cognitive psychology with concept formation in machine learning. My claim is that the use of attention can lead to a more efficient learning system, without sacrificing accuracy. Attention leads to a savings in efficiency because it focuses only on the relevant attributes, retrieves less information from the environment, and is therefore less costly than a system that uses every piece of information available. I present a working algorithm for attention, built onto the Classit concept formation system, and describe results from three domains.
On the Problems with Lexical Rule Accounts of Argument Structure
It has recently been suggested that the different valence possibilities of a single verb stem can be accounted for by postulating lexical rules that operate on the semantic structure of verbs, producing different verb senses. Syntactic expression is then taken to be predicted by general linking rules that map semantic structure onto syntactic form (Alsina and Mchombo 1990, Bresnan and Moshi 1989, Levin 1985, Pinker 1989, Rappaport, Laughren, and Levin 1987). In this paper, general problems with such approaches are discussed, including the following: a) such theories require a large number of both distinct verb senses and lexical rules, b) ad hoc and often implausible verb senses are required, c) an unwarranted asymmetry between different argument structures is posited, and d) many generalizations are obscured. An alternative is suggested that involves considering the various valences as templates or constructions that are paired with semantics independently of the verbs that may occur with them. For example, abstract semantics such as "X causes Y to receive Z," "X causes Y to become Z" etc. are associated directly with the skeletal syntactic ditransitive and resultative constructions, respectively, allowing the verbal predicates to be associated with richer frame-semantic representations.
Interactive Reasoning about Spatial Concepts
Spatial relations and spatial language form an important part of everyday reasoning. This paper describes SPATR, a system which addresses the labelling of components of objects and the interpretation of spatial relations between objects within the framework of adaptive planning. SPATR implements a model of spatial reasoning, which mediates among language, memory, and perception. Using a case-based approach for reasoning from past experience, SPATR makes use of spatial relationships corresponding to closed class terms, as well as a 3D, hierarchical representation of objects for retrieving relevant past experience.
The Microgenetic Analysis of an Origami Task
An experiment was conducted in which subjects repeatedly constructed Origami boxes, following example models displayed by the experimenters. A microgenetic analysis was performed on videotapes of the experiment. Results show an increase in speed generally following the power law of practice, and the rearrangement and combination of operations into larger units. They give evidence for the importance of external information in the tasks people perform, and contain a possible example of an occasion of insight prompted by earlier breakdowns. Most importantly, the experiment shows how an apparently straightforward improvement in performance can be dissected to uncover the myriad factors and effects that underlie it.
Partial Match and Search Control via Internal Analogy
In a previous study (Hickman & Larkin, 1990), we introduced a within-trial analogy mechanism called internal analogy that transfers both success and failure experiences between corresponding parts of the search tree for a single problem. In this paper, we describe powerful extensions to the learning procedure and their consequences on problem solving behavior. First, we explain how our similarity metric can be naturally augmented to provide a more flexible partial match. To overcome the need for a static measure, however, we propose a mechanism that learns the appropriate level of partial match through feedback from previous analogical reasoning. Second, we show how this partial match mechanism controls the problem solver's search. Protocol data from a subject working in a geometry theorem-proving domain provide support for the psychological fidelity of the extended internal analogy model.
Some Principles for Route Descriptions Derived from Human Advisers
Through a study with experienced driver-navigators, we have deduced some principles as to how route descriptions are constructed and expressed by humans. Some of these principles are implementable, and a rough outline of a program is presented. Given a plan of how to go from A to B in a city, the program produces a non-linguistic object that represents all the route information needed to present the route to a specific driver. A verbal description of that object is then produced. The goal is to incorporate verbal descriptions in route guidance systems, primarily aimed at driver-navigators with some knowledge of the city. Furthermore, we speculate about what kinds of cognitive processes are involved when humans choose and describe routes.
Adult Age Differences in Visual Mental Imagery: Evidence for Differentially Age-Sensitive Components
A set of tasks developed in accordance with Kosslyn, Van Kleeck, & Kirby's (1990b) neurologically plausible model of visual mental imagery was used to explore effects of aging on specific component processes involved in image generation and maintenance. Contrary to the widely held belief that age differences in cognition are attributable to a single mechanism of global effect (e.g., a reduction in critical processing resources, Salthouse, 1988), results indicate that processes involved in visual mental imagery are differentially age-sensitive. More precisely, components required to actively maintain images are particularly sensitive to effects of aging, while those that access visual information from memory are not especially affected. Advantages of a componential approach to understanding age differences in cognitive processing are discussed, as well as the potential for such age-related changes to be a readily exploitable source of information regarding the functional architecture of cognition.
A Framework for Opportunistic Abductive Strategies
Any single algorithm for abduction requires specific kinds of knowledge and ignores other kinds of knowledge. A knowledge-based system that uses a single abductive method is restricted to using the knowledge required by that method. This makes the system brittle, because the single fixed method can only respond appropriately in a limited range of situations and can only make use of a subset of the potentially relevant knowledge. In this paper, we describe a framework from which abductive strategies can be opportunistically constructed to reflect the problem being solved and the knowledge available to solve the problem. We also describe ABDSoar, a Soar-based implementation.
Button Theory: A Taxonomy of Student-Teacher Communication for Interface Design in Computer-Based Learning Environments
This paper introduces Button Theory, whose two principal goals are, first, to provide a taxonomy of the ways that students might usefully interact with and control a computer-based teacher, and second, to provide a natural mechanism by which they may exercise that control. We have developed a small but comprehensive set of messages that students would find it useful to convey to a teacher during a tutorial interaction, and have associated each message with a button presented iconically on the computer screen. We describe our experience with the use of Button Theory in a prototype computer-based teaching system, and demonstrate how, even with rather simple mechanisms, this framework enables surprisingly rich interactions.
Human Performance in Visually Directed Reaching Results in Systematic, Idiosyncratic Error
We study the performance of human subjects in a task which requires multi-jointed reaches to be made to targets spaced over a wide area. In accordance with established research, we find that subjects' reaches are not accurate when they cannot see either their hands or the targets. The errors subjects make are different at different targets, suggesting that they are due to an error in the planning of movements. However, contrary to existing models of this error, we find that it is highly idiosyncratic. This leads to the rejection of the most straightforward model of how reaching is learned, and poses problems which a future model must address.
Assessing Transfer of a Complex Skill
While recent studies have demonstrated various ways that transfer might be achieved in a domain, the measures used to assess transfer rarely stray from time and error data. This paper examines transfer in the complex skill of computer programming in order to explore more flexible and sensitive methods of assessing transfer. In the experiment, subjects wrote both a PASCAL and a LISP version of two programming problems. Although a simple accuracy measure provides evidence for knowledge transfer between the two programming languages, measures based on analyses of the task domain (i.e., partial-credit accuracy, strategy use) provide much stronger evidence. Curiously, these measures target different subjects as exhibiting transfer, suggesting that more than one type of knowledge may be available for transfer.
Interaction of Deductive and Inductive Reasoning Strategies in Geometry Novices
This paper is part of an effort to extend research on mathematical problem solving beyond the traditional focus on formal procedures (both in the classroom and in problem solving research). We are beginning to investigate students' inductive discovery-oriented strategies and the interaction between these and formal deductive strategies. In contrast to typical classroom problems in math and science which demand the application of a learned formal procedure (e.g., prove X), we gave students more open-ended problems (e.g., is X true?) for which the formal deductive procedure is useful, but other, possibly informal or inductive, strategies are also potentially useful. The normative approach for solving these problems, in fact, requires the use of both a deductive strategy, which is definitive only when X is true, and an inductive search for examples, which is definitive only when X is not universally true. When presented with these problems we found that geometry students have some limited facility to perform the deductive strategy (though less so in this context than when they are directly asked to write a proof) and use a degenerate version of the inductive strategy. Instead of considering multiple examples and looking for a counter-example, students tend to read off the conclusion from the single example (or model) we provided.
Multiassociative Memory
This paper discusses the problem of how to implement many-to-many, or multi-associative, mappings within connectionist models. Traditional symbolic approaches represent all alternatives either explicitly via stored links or implicitly through enumerative algorithms. Classical pattern association models ignore the issue of generating multiple outputs for a single input pattern, and while recent research on recurrent networks is promising, the field has not clearly focused upon multi-associativity as a goal. In this paper, we define multiassociative memory (MM) and several possible variants, and discuss its utility in general cognitive modeling. We extend sequential cascaded networks (Pollack 1987, 1990a) to fit the task, and perform several initial experiments which demonstrate the feasibility of the concept.
Processing Constraints and Problem Difficulty: A Model
In this paper we examine the role played by working memory demands in determining problem difficulty during the solution of Tower of Hanoi problem isomorphs. We do so by describing a production system model that accounts for subjects' performance on these problems via a dynamic analysis of the memory load imposed by the problem and of changes in that load during the problem solving episode. We also present the results of detailed testing of the model against human subject data. The model uses a highly constrained working memory to account for a number of features of the problem solving behavior, including the dichotomous (exploratory and final path) nature of the problem solving, the relative difficulty of the problems, the particular moves made in each state of the problem space, and the temporal patterning of the final path moves.
A Revision-Based Model of Instructional Multi-Paragraph Discourse Production
To communicate effectively, intelligent tutoring systems should be able to generate clear explanations of phenomena in their domain. To explain complex phenomena in scientific domains, they must be able to produce extensive multi-paragraph discourse. Traditionally, discourse planners have taken a monotonic approach to generation: once they make a decision, that decision is never revoked. Because these approaches make no provision for evaluating and modifying a plan after it has been constructed, their flexibility is limited. This inflexibility is particularly acute when attempting to generate multi-paragraph discourse. We propose a revision-based model of discourse planning that constructs instructional multi-paragraph discourse plans, evaluates them, and restructures them. This is accomplished by delaying organizational commitments as long as possible and interleaving the planner's content determination and organization activities. This model accords well with research on writing. It has been implemented in an experimental system, KNIGHT, a discourse generator for intelligent tutoring systems. KNIGHT generates multi-paragraph explanations in the domain of biology. A domain expert has analyzed KNIGHT's explanations and found them to be clear and accurate.
The Role of Conventionality in the Real-time Processing of Metaphor
This project is intended to ascertain the role of conventionality in the use of metaphors in natural language processing. It examines the relationship between the degree of conventionality of a metaphor and the degree of difficulty in processing metaphorical meanings. The overall purpose is to obtain evidence regarding the metaphoric knowledge approach (Martin 1990) which asserts that the interpretation of novel metaphors can be accomplished through systematic extension, elaboration, and combination of knowledge about already well-understood metaphors. Subjects were tested on parsing sentences with different degrees of metaphorical novelty. Reaction times along with their responses were analyzed. The results suggest that a) degrees of conventionality in metaphorical use have a significant effect on the processing of the metaphor, b) degrees of novelty are proportionally related to the degrees of difficulty in processing, and c) conventional metaphors are as privileged in sentence processing as the "literal meaning" uses.
Perception-mediated Learning and Reasoning in the CHILDLIKE System
Intelligent agents interacting with their environments combine information from several sense modalities and indulge in tasks that have components of perception, reasoning, learning and planning. Traditional AI systems focus on a single component. This paper highlights the importance of the integrated perceive-reason-act-learn loop, and describes a system designed to capture this loop. As a first step, it learns about simple objects, their qualities, and the words that name and describe them. The visual-linguistic associations formed serve as a bias in acquiring further knowledge about actions, which in turn aids the system in satisfying its internal needs (e.g., hunger, thirst, sleep, curiosity). Learning mechanisms that extract, aggregate, generate, de-generate and generalize build a hierarchical network (that serves as internal models of the environment) with which the system perceives and reasons.
ACKnowledge: An Integrated Workbench Supporting Knowledge Acquisition
Knowledge Acquisition is a crucial and time-consuming phase in the development of Knowledge Based Systems. The ACKnowledge project aims to improve the efficiency of the knowledge acquisition process. The approach is to analyze and evaluate a variety of existing knowledge acquisition techniques, including machine learning methods. Taking into account their complementarities, we integrate these techniques into a Knowledge Engineering Workbench that supports the Knowledge Engineer in his various tasks. This approach is tested on real life applications, simple ones (e.g., analysis in metal fractures) and more complex ones (e.g., failures in the Spanish data communications network).
Analogical Transfer by Constraint Satisfaction
The robustness of analogical transfer based on the ACME model of mapping by constraint satisfaction (Holyoak & Thagard, 1989) was investigated in a series of computational experiments using Hinton's (1986) "family tree" problem. Propositions were deleted randomly from the full representations of either both analogs (descriptions of an English and an Italian family) or just the target, and after mapping, a "copy with substitutions" procedure was used to generate transfer propositions intended to restore the full representational structures. If as many as 50% of the propositions in the target analog were deleted, the system was able to recreate all of the missing information without error; significant recovery was obtained even if as many as 80% of the target propositions were deleted. Robustness was only slightly reduced when the two analogs lacked any similar predicates, so that mapping depended solely on structural constraints. Transfer was much more impaired when deletions were made from both analogs, rather than just the target. The results indicate that for isomorphic representations, analogical transfer by constraint satisfaction can exceed the regenerative capacity of general learning algorithms, such as back-propagation.
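The "copy with substitutions" step described here is simple enough to sketch. Below is a minimal illustration, assuming propositions are represented as predicate/argument tuples; the function name and the toy family data are hypothetical, not the authors' code or materials.

```python
# Hedged sketch of "copy with substitutions": given a mapping between analogs,
# substitute mapped elements into source propositions to propose propositions
# missing from a degraded target analog.

def copy_with_substitutions(source_props, mapping, target_props):
    """Generate candidate target propositions by substituting mapped
    elements into source propositions, keeping only novel ones."""
    transferred = set()
    for pred, *args in source_props:
        # Transfer only if every element has a correspondence in the mapping.
        if pred in mapping and all(a in mapping for a in args):
            candidate = (mapping[pred], *(mapping[a] for a in args))
            if candidate not in target_props:
                transferred.add(candidate)
    return transferred

# Toy example in the spirit of the "family tree" problem (names are invented):
source = {("father", "Arthur", "Bill"), ("mother", "Penny", "Bill")}
mapping = {"father": "padre", "mother": "madre",
           "Arthur": "Arturo", "Penny": "Pia", "Bill": "Bruno"}
target = {("padre", "Arturo", "Bruno")}          # degraded target analog
print(copy_with_substitutions(source, mapping, target))
# -> {("madre", "Pia", "Bruno")}: the deleted target proposition is restored.
```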
A Constraint-Motivated Model of Concept Formation
A cognitive model for learning associations between words and objects is presented. We first list basic constraints to which the model must adhere. The constraints arise from two sources. First, they stem from observed psychological phenomena, including typicality effects, extension errors observed in children, and belief-dependent behavior. Second, they arise from our choice to integrate the model into a unified theory of cognition. In presenting the constraints on the model's construction, we motivate our design decisions while describing our algorithm, which takes a symbolic, production-based approach. The model's adherence to the constraints is further supported by some empirical results.
A Computational Basis for Brown's Results on Morpheme Order Acquisition
This paper presents the result that a computer program can mimic the acquisition by children of a selected set of grammatical morphemes. Roger Brown [Brown, 1973] studied the acquisition of 14 morphemes, and showed how a set of partial order relations describes this aspect of child language learning. We show that these relations can be given a computational basis. They follow directly from a class of Boolean learning algorithms which obey three simple constraints on the manner in which they consider hypotheses. I will call these three constraints the CAM constraints. CAM constraint 1 is to increase the length of the conjuncts one term at a time. The second CAM constraint is to consider all hypotheses of the same length simultaneously. Finally, CAM constraint 3 is to collect all single-term hypotheses involving noun features into a single conjunction prior to Boolean learning.
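The first two CAM constraints amount to a breadth-first search over conjunctive hypotheses. The sketch below illustrates them under stated assumptions (set-valued examples, a simple consistency test); constraint 3, which pre-collects noun-feature terms, is omitted, and all names and data are illustrative.

```python
# Minimal sketch of a Boolean learner obeying CAM constraints 1 and 2:
# grow conjunctions one term at a time, and evaluate every hypothesis of a
# given length before moving on to longer ones.

from itertools import combinations

def cam_learn(features, positives, negatives, max_len=3):
    """Return the shortest conjunctions consistent with the examples."""
    for length in range(1, max_len + 1):           # CAM constraint 1
        layer = list(combinations(features, length))
        consistent = [h for h in layer             # CAM constraint 2: whole
                      if all(set(h) <= p for p in positives)   # layer at once
                      and not any(set(h) <= n for n in negatives)]
        if consistent:
            return consistent                      # stop at the shortest layer
    return []

# Toy morpheme-style data: each example is the set of features it exhibits.
pos = [{"action", "ongoing"}, {"action", "ongoing", "plural"}]
neg = [{"action"}, {"ongoing"}]
print(cam_learn(["action", "ongoing", "plural"], pos, neg))
# -> [("action", "ongoing")]
```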
An Alternative to Deduction
Deductive representations are well defined, easily inspected, and precise. However, they are also brittle, inflexible, and difficult to debug. We propose a plausible representation whose inference mechanism is weaker than its deductive counterpart. This allows it to reason with knowledge which is less precise, and replaces the notion of global consistency with the weaker constraint of local consistency in its explanations.
Searching an Hypothesis Space When Reasoning About Buoyant Forces: The Effect of Feedback
This study addressed the following three questions: (1) To what extent can people's naive, complex, and idiosyncratic knowledge about a real physical domain be captured in a formal representation of an hypothesis space? (2) How does exposure to increasingly complex instances affect subjects' search through the hypothesis space? (3) What is the effect of feedback on hypothesis revision? Six adult subjects solved a series of physics problems involving buoyant forces and liquid displacement. An analysis of subjects' verbal protocols suggests: (1) Naive, complex, and idiosyncratic knowledge can be characterized by an hypothesis space, and changes in that knowledge can then be described as a search through the hypothesis space; (2) People who receive feedback from experimental outcomes change their hypotheses and reach a higher level in the hypothesis space. Mere problem exposure, without feedback, did not lead to hypothesis revision.
Classifying Faces by Race and Sex Using an Autoassociative Memory
We examine the ability of an autoassociative memory trained with faces to classify faces by race and by sex. The model learns a low-level visual coding of Japanese and Caucasian male and female faces. Since recall of a face from the autoassociative memory is equivalent to computing a weighted sum of the eigenvectors of the memory matrix, faces can be represented by these weights and the set of corresponding eigenvectors. We show that reasonably accurate classification of the faces by race and sex can be achieved using only these weights. Hence, race and sex information can be extracted in the model without explicitly learning the classification itself.
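The weight-based representation described here can be sketched in a few lines of linear algebra. The following is a minimal illustration on synthetic data, assuming a Hebbian memory matrix and a nearest-centroid classifier; neither the data nor the classifier is taken from the paper.

```python
# Hedged sketch: represent each "face" by its weights on the eigenvectors of
# an autoassociative memory, then classify using those weights alone.

import numpy as np

rng = np.random.default_rng(0)
faces = rng.standard_normal((40, 256))      # 40 synthetic "face" vectors
sex = np.array([0, 1] * 20)                 # synthetic binary labels

M = faces.T @ faces                         # Hebbian autoassociative memory
_, eigvecs = np.linalg.eigh(M)              # eigenvalues ascend, so take
basis = eigvecs[:, -10:]                    # the last 10 eigenvectors

weights = faces @ basis                     # each face as 10 eigen-weights

# Nearest-centroid classification in weight space:
centroids = np.stack([weights[sex == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(
    np.linalg.norm(weights[:, None] - centroids[None], axis=2), axis=1)

# Training accuracy on the synthetic data; real face images would be needed
# for a meaningful test of the paper's claim.
print("training accuracy:", (pred == sex).mean())
```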
Effect of Format on Information and Problem Solving
This study reports the effect of differences in format of Prolog tracers on Prolog problem solving tasks. Three different tracers (Spy, TPM, and EPTB) in different formats were tested to check for their relative effectiveness in solving five different Prolog problems. 43 subjects attempted to solve each problem with each trace (15 problems in total). Preliminary analysis of solution times and response data indicates that EPTB performed best across all problems. An account for this finding is presented, as is one for a number of interesting interactions between the effects of problem type and trace format, which supports the general conclusion that while format is a significant determiner of access to information, it can also constrain the sorts of problems that could be solved readily with that information.
A Modular Natural Language Processing Architecture to Aid Novel Interpretation
Successful and robust natural language processing must efficiently integrate multiple types of information to produce an interpretation of input. Previous approaches often rely heavily on either syntax or semantics, on verb-specific or highly general representations. A careful task analysis identifies the principled subsets of information from across these spectra that are needed. This presents challenges to efficient and accurate processing. We present a modular architecture whose components reflect the distinct types of information used in processing. Its control mechanism specifies the principled manner in which components share information. We believe this architecture provides benefits for processing sentences with novel verbs, ambiguous sentences, and sentences with constituents placed outside their canonical position.
Information Gathering as a Planning Task
Existing planners fall into two broad categories: reactive planners that can react quickly to changes in the world, but do not project the expected results of a proposed sequence of actions, and classical planners that perform detailed projections, but make assumptions that are unrealistic when operating in a complex and dynamic world. Ideally, a planning agent in such a world should be able to do both. In order to do this, the agent has to be able to differentiate between those situations in which detailed information would aid it in making its decisions and those in which such information would not materially improve its performance. We propose an approach to this problem, using well-characterized heuristics to decide what information would be useful, whether to gather it, and, if so, how.
Evaluation of Explanatory Hypotheses
Abduction is often viewed as inference to the "best" explanation. However, the evaluation of the goodness of candidate hypotheses remains an open problem. Most artificial intelligence research addressing this problem has concentrated on syntactic criteria, applied uniformly regardless of the explainer's intended use for the explanation. We demonstrate that syntactic approaches are insufficient to capture important differences in explanations, and propose instead that choice of the "best" explanation should be based on explanations' utility for the explainer's purpose. We describe two classes of goals motivating explanation: knowledge goals reflecting internal desires for information, and goals to accomplish tasks in the external world. We describe how these goals impose requirements on explanations, and discuss how we apply those requirements to evaluate hypotheses in two computer story understanding systems.
Goal Inference in Information-seeking Environments
In cooperative information-seeking environments, we have observed that the dialogues have the following characteristics: (1) they contain sufficient relevant information, (2) they are coherent, and (3) they are well-structured. In this paper, we describe a mechanism for plan inference which takes advantage of these observed features to reduce the number of alternate interpretations of a user's statements. This reduction is achieved as follows: initially, we take advantage of the relevant information trait by using guiding principles and meta predicates to constrain the number of possible interpretations of a single statement. Discourse coherence considerations are then applied to integrate subsequent statements and drop incoherent interpretations. The retained interpretations are evaluated using a measure of information content, which is used to prefer the interpretations that have more relevant information. The entire mechanism is based on an approach that takes advantage of the well-structured nature of information-seeking dialogues to arrive at the intended interpretation as efficiently as possible.
Towards Fair Comparisons of Connectionist Algorithms through Automatically Optimized Parameter Sets
The learning rate and convergence of connectionist learning algorithms are often dependent on their parameters. Most algorithms, if their parameters have been optimized at all, have been optimized by hand. This leads to absolute and relative performance problems. In absolute terms, researchers may not be getting optimal performance from their networks. In relative terms, comparisons of unoptimized or hand-optimized algorithms may not be fair. (Sometimes even one is optimized and the other not.) This paper reports data suggesting that comparisons done in this manner are suspect. An example algorithm is presented that finds better parameter sets more quickly and fairly. Use of this algorithm (or similar techniques) would improve performance in absolute terms, provide fair comparisons between algorithms, and encourage the inclusion of parameter set behavior in algorithmic comparisons.
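As a rough illustration of the kind of automatic parameter optimization being argued for, the sketch below runs a simple random search over each algorithm's parameter space before any comparison is made. The search strategy and the stand-in objective are assumptions, not the algorithm the paper presents.

```python
# Hedged sketch: optimize an algorithm's parameters automatically under a
# fixed evaluation budget, so that comparisons are between tuned algorithms
# rather than hand-tuned ones.

import random

def random_search(train_fn, space, budget=50, seed=0):
    """Return the best parameter set found within the evaluation budget."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(budget):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = train_fn(**params)       # e.g., epochs to convergence (lower = better)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Stand-in objective: pretend convergence time is minimized near lr=0.3, mom=0.9.
def fake_train(lr, momentum):
    return (lr - 0.3) ** 2 + (momentum - 0.9) ** 2

space = {"lr": (0.01, 1.0), "momentum": (0.0, 0.99)}
print(random_search(fake_train, space))
```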
Indexing Cases for Planning and Acting in Dynamic Environments: Exploiting Hierarchical Goal Structures
We examine how acting in dynamic, complex, not entirely predictable environments affects the indexing, storage, and retrieval of cases in a memory-based system. We discuss how a hierarchical goal structure can be exploited to provide indices for searching and storage when planning and acting in everyday environments under time pressure. The tradeoffs between the costs and utility associated with attempting to prevent repeating a failure or missing an opportunity are briefly examined. Considering these tradeoffs leads to distinguishing between when failures can be allowed to recur and when they should be anticipated and avoided. The amount of effort expended when handling failures differs for the two situations, but in both cases a hierarchical goal structure can be used to choose effective indices efficiently. This paper describes the approach taken in our EXPEDITER system and briefly compares it to other approaches.
Belief Relativity
This paper describes a model of belief systems called belief relativity (BR), which addresses the relationships and structure of knowledge held by multiple interacting agents. This paradigm uses belief reference frames (b-frames) as the main unit of belief spaces, within which an agent's beliefs are stored. BR is concerned with how beliefs are created and revised, how they influence each other within or between b-frames, and how one searches for b-frames that are useful (e.g., that remove contradictions). BR also deals with degrees of belief, propagated along the influences that relate beliefs and b-frames to each other. BR attempts to combine the best features of these ideas into a unified, synergistic framework.
Modeling an Experimental Study of Explanatory Coherence
The problem of evaluating explanatory hypotheses is to choose the hypothesis or theory that best accounts for, or explains, the given evidence. Thagard (e.g., 1989) and Ranney (in press; Ranney & Thagard, 1988) describe a theory of explanatory coherence intended to account for a variety of explanatory evaluations; this theory has been implemented in a connectionist computer model, ECHO. In this study, we examine three questions regarding the relationship between human explanatory reasoning and ECHO's explanatory evaluations: Does ECHO predict subjects' evaluations of interrelated propositions? Are local temporal order differences (not explicitly modeled by ECHO) important to the subjects? Does ECHO predict subjects' inferential reasoning? We found that subjects often entertain competing hypotheses as nonexclusive and presume an implied backing for certain (superordinate) hypotheses. These tendencies were modeled in ECHO by assigning a fraction of data priority (usually reserved for evidence) to the superordinate hypotheses. In sum, the ECHO model helps to interpret subjects' reasoning patterns, and shows continued potential for simulating explanatory coherence processes.
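For readers unfamiliar with ECHO, the following is a hedged sketch of an ECHO-style constraint network, using the activation-update rule from Thagard (1989); the example propositions, the link weights, and the fractional data-priority link for a superordinate hypothesis are illustrative choices, not the study's materials.

```python
# Hedged ECHO-style sketch: evidence and hypotheses are units; explanation
# links are excitatory, competition links inhibitory; evidence gets data
# priority via a link to a clamped SPECIAL unit, and here a superordinate
# hypothesis gets a *fraction* of that priority, as the abstract describes.

import numpy as np

labels = ["SPECIAL", "E1", "H1", "H2", "Hsuper"]
n = len(labels)
W = np.zeros((n, n))

def link(a, b, w):
    i, j = labels.index(a), labels.index(b)
    W[i, j] = W[j, i] = w

link("SPECIAL", "E1", 0.05)      # evidence: full data priority
link("SPECIAL", "Hsuper", 0.02)  # superordinate hypothesis: fractional priority
link("E1", "H1", 0.04)           # H1 explains E1
link("E1", "H2", 0.04)           # H2 also explains E1
link("H1", "H2", -0.06)          # competing hypotheses inhibit each other
link("Hsuper", "H1", 0.04)       # Hsuper backs H1

a = np.full(n, 0.01)
a[0] = 1.0                       # SPECIAL unit clamped at 1
decay = 0.05
for _ in range(200):             # settle the network
    net = W @ a
    a = a * (1 - decay) + np.where(net > 0, net * (1 - a), net * (a + 1))
    a = np.clip(a, -1.0, 1.0)
    a[0] = 1.0                   # keep SPECIAL clamped

print(dict(zip(labels, np.round(a, 3))))
# H1 should settle above H2, since it receives backing from Hsuper.
```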
Empirical and Analytical Performance of Iterative Operators
Macro-operators and chunks have long been used to model the acquisition and refinement of procedural knowledge. However, it is clear that human learners use more sophisticated techniques to encode more powerful operators than simple linear macro-operators: specifically, linear macro-operators cannot represent arbitrary repetitions of operators. This paper presents a process model for the acquisition of iterative macro-operators, which are an efficient representation of repeating operators. We show that inducing iterative macro-operators from empirical problem-solving traces provides dramatically better efficiency results than simple linear macro-operators. This domain-independent learning mechanism is integrated into the FERMI problem-solver, giving more evidence that humans have a similar learning capability.
Where am I? Similarity Judgement and Expert Localization
How do skilled map-readers use topographic maps to figure out where in the world they are? Our research addresses this question by studying the problem solving of experienced map-readers as they solve localization ("Where am I?") problems. Localization relies upon judgments of similarity and difference between the contour information of the map and the topographic information in the terrain. In this paper we discuss experiments that focus on how map-readers use attributes and structural relations to support judgments of similarity and difference. In our field and laboratory experiments, experienced map-readers implicitly define attributes to be detailed descriptors of individual topographic features. They use structural relations that link two or more topographic features as predicates. The time-course of their problem solving suggests that attributes and relations are psychologically distinct. Attributes like slope, e.g., "steep (hill)", support only initial judgments of difference. Relations like "(this hill) falls steeply down into (a valley)" are more powerful, supporting both judgments of difference and judgments of similarity. Judgments based on relations are used to test hypotheses about location. Experienced map-readers exploit the distinction between attributes and relations as they solve localization problems efficiently.
Syntactic Category Formation with Vector Space Grammars
A method for deriving phrase structure categories from structured samples of a context-free language is presented. The learning algorithm is based on adaptation and competition, as well as error backpropagation in a continuous vector space. These connectionist-style techniques become applicable to grammars as the traditional grammar formalism is generalized to use vectors instead of symbols as category labels. More generally, it is argued that the conversion of symbolic formalisms to continuous representations is a promising way of combining the connectionist learning techniques with the structures and theoretical insights embodied in classical models.
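The central representational move here, vector-valued category labels whose combinations are learned by error-driven adaptation, can be sketched compactly. The single linear "rule" matrix and the delta-rule update below are assumptions made for illustration, not the paper's algorithm.

```python
# Hedged sketch: category labels are vectors rather than atomic symbols, and a
# phrase-structure "rule" is a learned map from constituent vectors to the
# parent category's vector.

import numpy as np

rng = np.random.default_rng(1)
dim = 4
cat = {"Det": rng.standard_normal(dim),   # vector-valued category labels
       "N": rng.standard_normal(dim),
       "NP": rng.standard_normal(dim)}

W = np.zeros((dim, 2 * dim))              # one linear rule: (left, right) -> parent
lr = 0.05

# Structured sample: the pair Det + N should map to NP.
x = np.concatenate([cat["Det"], cat["N"]])
x /= np.linalg.norm(x)                    # normalize input for stable learning
for _ in range(200):
    err = cat["NP"] - W @ x
    W += lr * np.outer(err, x)            # delta rule on the rule matrix

print("residual:", round(float(np.linalg.norm(cat["NP"] - W @ x)), 6))
# The residual shrinks toward zero: the rule has learned to compute the
# parent category vector from its constituents' vectors.
```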
Dynamic Inferencing in Parallel Distributed Semantic Networks
The traditional approach to dynamic inferencing is to represent knowledge in a symbolic hierarchy, find the most specific information in the hierarchy that relates to the input, and apply the attached inferences. This approach provides for inheritance and parallel retrieval but at the expense of very complex learning and access mechanisms. Parallel Distributed Processing (PDP) systems have recently emerged as an alternative. PDP systems use a very simple processing mechanism, but can only access high-level knowledge sequentially and require an enormous amount of training time. This paper presents Parallel Distributed Semantic (PDS) Networks, an approach that integrates the best features of symbolic and PDP systems by storing the content of symbolic hierarchies in ensembles of PDP networks, connecting the networks in the manner of a semantic network, and using Propagation Filters to determine how information is passed between networks. Simulation results are presented which indicate that PDS Networks and Propagation Filters are able to perform pattern completion from partial input, generate dynamic inferences, and propagate role bindings.
Accessing Meaning vs. Form at Different Levels of Comprehension Skill
We examined adult and 10-13 year old skilled and average comprehenders' representation of spoken sentences. In immediate probe tasks, skilled adults were better than average adults at accessing word order information, but they were poorer at accessing sentence meaning. After hearing a text, skilled adults were more accurate than average adults in recognizing meaning, but they were less accurate in recognizing the wording of test sentences. Speeded speech increased the differences between skill groups more for memory for wording than for memory for meaning. The results suggest that comprehenders compute representations of surface form and meaning independently and simultaneously. These representations compete for attention.
Providing Natural Representations To Facilitate Novices' Understanding in a New Domain: Forward and Backward Reasoning in Programming
In many domains, novices exhibit a bias in the direction in which they reason about problems. Earlier studies of LISP programmers using a graphical representation suggested that novice LISP programmers tend to reason forward, working from initial input data toward the goal. We examined novice programmers learning LISP using the GIL programming tutor and manipulated the direction subjects were allowed to reason (forward, backward, or free). Subjects who were required to work backwards (from goal toward givens) exhibited more difficulty solving the problems than subjects working forward or subjects left free to choose their direction. Backward subjects required more time to solve problems, made more errors, and required more time to plan each solution. We suggest that these effects and preferences occur because forward reasoning is more congruent with the way novices reason about computer programs, resulting in an increased working memory load for subjects required to work backward.
A Schema-based Approach to Cooperative Behavior
Agents can rely on the patterns in the world to make their problem solving more efficient. When working with others, agents can also rely on patterns - patterns for communication and group behavior. We discuss how these patterns may be captured in schemas. We present two types of schemas: procedural schemas, which suggest a course of action for a specific situation, and contextual schemas, which contain knowledge about specific kinds of problem solving. Both of these types of schemas affect an agent's ability to solve problems and communicate. Both types of schemas also guide the coordination of the groups working together to solve problems. In this paper, we focus particularly on the ways in which a schema-based approach can help agents to work together by integrating their individual problem solving with the constraints of coordinated behavior.
A Case-Based Model of Creativity
Creating new solutions to problems is an integral part of the problem-solving process. This paper presents a cognitive model of creativity in which a case-based problem-solver is augmented with a set of creativity heuristics. New solutions are discovered by solving a slightly different problem and adapting that solution to the original problem. This model has been implemented in a computer program called MINSTREL.
Efficient Nonlinear Problem Solving using Casual Commitment and Analogical Replay
Complex interactions among conjunctive goals motivate the need for nonlinear planners. Whereas the literature addresses least commitment approaches to the nonlinear planning problem, we advocate a casual-commitment approach that finds viable plans incrementally. In essence, all decision points are open to introspection, reconsideration, and learning. In the presence of background control knowledge - heuristic or definitive - only the most promising parts of the search space are explored to produce a solution plan efficiently. An analogical replay mechanism is presented that uses past problem solving episodes as background control guidance. Search efforts are hence amortized by automatically compiling and reusing past experience by derivational analogy. This paper reports on the full implementation of the casual-commitment nonlinear problem solver of the PRODIGY architecture. The principles of nonlinear planning are discussed, the algorithms in the implementation are described in some detail, and empirical results are presented that illustrate the search reduction when the nonlinear planner combines casual commitment and analogical replay.
A Neural Model of Temporal Sequence Generation with Interval Maintenance
Based on an interference theory of forgetting in short-term memory (STM), we model STM by a network of neural units with mutual inhibition. Sequences are acquired by combining a Hebbian learning rule and a normalization rule with sequential system activation. Once sequences are acquired, they can be recognized without being affected by presentation speed. The model of sequence reproduction consists of two reciprocally connected networks, one of which behaves as a set of sequence recognizers. Reproduction of complex sequences is shown to be able to maintain the interval lengths of sequence components. A mechanism of degree self-tuning based on a global inhibitor is proposed that allows the model to optimally learn the context lengths required to disambiguate associations in complex sequence reproduction.
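The combination of a Hebbian increment with a normalization rule can be illustrated on a toy sequence. In the sketch below, the network size, learning rate, and recall step are all illustrative assumptions; the paper's actual model (mutual inhibition, interval maintenance, the global inhibitor) is considerably richer than this.

```python
# Hedged sketch of the learning-rule combination: a Hebbian increment between
# successively active units, followed by normalization of each unit's
# incoming weights so total input strength stays bounded.

import numpy as np

n = 5                        # one unit per sequence item
W = np.full((n, n), 0.1)     # W[i, j]: strength of the j -> i association
eta = 0.5

sequence = [0, 1, 2, 3, 4]
for prev, nxt in zip(sequence, sequence[1:]):
    W[nxt, prev] += eta      # Hebbian: sequential co-activation strengthens
    W[nxt] /= W[nxt].sum()   # normalization: bounded incoming weight mass

# Recall: from each item, the strongest outgoing association names the successor.
for item in sequence[:-1]:
    print(item, "->", int(np.argmax(W[:, item])))
```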
A Continuum of Induction Methods for Learning Probability Distributions with Generalization
Probabilistic models of pattern completion have several advantages, namely, the ability to handle arbitrary conceptual representations, including compositional structures, and the explicitness of distributional assumptions. However, a gap in the theory of induction of priors has hindered probabilistic modeling of cognitive generalization biases. We propose a family of methods parameterized by a value γ that controls the degree to which the probability distribution being induced generalizes from the training set. The extremes of the γ-continuum correspond to relative frequency methods and extreme maximum entropy methods. The methods apply to a wide range of pattern representations, including simple feature vectors as well as frame-like feature DAGs.
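One concrete way to realize such a γ-continuum, offered here only as a hedged illustration, is additive smoothing: γ = 0 reproduces relative frequencies, while large γ approaches the uniform, maximum-entropy distribution. The paper's actual parameterization may differ.

```python
# Hedged sketch of a gamma-continuum between relative-frequency induction
# (gamma = 0) and maximum-entropy induction (gamma -> infinity), realized
# here as additive smoothing over a finite outcome space.

from collections import Counter

def induce(training_set, outcomes, gamma):
    """Induce a distribution over `outcomes` from observed data."""
    counts = Counter(training_set)
    n = len(training_set)
    k = len(outcomes)
    return {x: (counts[x] + gamma) / (n + gamma * k) for x in outcomes}

data = ["a", "a", "a", "b"]
for gamma in (0.0, 1.0, 100.0):
    print(gamma, induce(data, ["a", "b", "c"], gamma))
# gamma = 0: pure relative frequency ("c" gets probability 0);
# gamma large: near-uniform, i.e., maximal generalization beyond the data.
```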
The Interaction of Internal and External Representations in a Problem Solving Task
In these studies I examine the role of distributed cognition in problem solving. The major hypothesis explored is that intelligent behavior results from the interaction of internal cognition, external objects, and other people, where a cognitive task can be distributed among a set of representations, some internal and some external. The Tower of Hanoi problem is used as a concrete example for these studies. In Experiment 1 I examine the effects of the distribution of internal and external representations on problem solving behavior. Experiments 2 and 3 focus on the effects of the structural change of a problem on problem solving behavior and how these effects depend on the nature of the representations. The results of all studies show that distributed cognitive activities are produced by the interaction among the internal and external representations. External representations are not simply peripheral aids. They are an indispensable part of cognition. Two of the factors determining the performance of a distributed cognitive system are the structure of the abstract problem space and the distribution of representations across an internal mind and the external world.