About
The annual meeting of the Cognitive Science Society is aimed at basic and applied cognitive science research. The conference hosts the latest theories and data from the world's best cognitive science researchers. Each year, in addition to submitted papers, invited researchers highlight some aspect of cognitive science.
Volume 12, 1990
Paper Presentations -- Group 1: Reasoning
Effect of Structure of Analogy and Depth of Encoding on Learning Computer Programming
This research addresses the need for effective ways of teaching computer programming. It focuses on two aspects of instruction. First, the research investigates the use of analogy in teaching programming. It extends existing research by investigating what constitutes a good analogy. Second, the research investigates the effect of depth of encoding on programming performance. The factors analogy and encoding were manipulated in a 3 × 2 factorial design. Analogy was operationalized by varying the clarity and systematicity/abstractness of the analogies used. Encoding was operationalized by varying the frequency with which deep encoding and elaboration of learned material were invoked by the presentation of questions on the learned material. The dependent variables were the scores obtained on program comprehension and program composition tasks and the time taken to perform the tasks. Research subjects were 15- to 17-year-olds without prior exposure to computer programming. Differences in mathematics ability and age were controlled. The results provide empirical support for a predictive theory of the relative goodness of competing analogies. They provide only marginal support for depth of encoding (as operationalized) in learning computer programming effectively. Post hoc data analysis suggests that good analogies assist the learning of semantics but not syntax. Furthermore, the effect of encoding was apparent only in learning syntax, not semantics.
Evaluating and Debugging Analogically Acquired Models
We describe elements of a cognitive theory of analogical reasoning. The theory was developed using protocol data and has been implemented as a computer model. In order to constrain the theory, it has been developed within a problem-solving context, reflecting the purpose of analogical reasoning. This has allowed us to develop: a purpose-constrained mapping process, which makes learning and debugging more tractable; an evaluation process that actively searches for bugs; and a debugging process that maintains functional aspects of base models while adding target-appropriate causal explanations. The active, knowledge-based elements of our theory are characteristic of mechanisms needed to model complex problem-solving.
Analogical Process Performance
Analogy is one of the primary mechanisms of cognition, particularly in problem-solving and learning. However, people do not use analogies very effectively. I postulate seven separate processes for analogy that could be responsible for weak analogical reasoning and test those processes independently. The results suggest that two of these processes, analysis of the problem and confirmation of the appropriateness of the analogy, may both contribute to analogical deficits.
The Effects of Familiar Labels on Young Children's Performance in an Analogical Mapping Task
This research investigates the role of language in children's ability to perform an analogical mapping task. We first describe the results of a simple mapping task in which preschool children performed poorly. In the current study, we taught the children to apply relational labels to the stimuli, and their performance improved markedly. It appears that relational language can call attention to domain relations and hence improve children's performance in an analogical mapping task. A computer simulation of this mapping task was performed using domain representations that differed in their degree of elaboration of the relational structure. The results of the simulation paralleled the experimental results: that is, given deeply elaborated representations, SME's preferred interpretation produced the correct mapping response, while given shallow representations its preferred interpretation produced an object similarity response. Taken together, the empirical and computational findings suggest that the development of analogy and similarity may be explainable in large measure by changes in domain representation, as opposed to maturational changes in processing. They further suggest that relational language may play an important role in bringing about these representational changes.
Representational Issues in Analogical Transfer
Lack of transfer may result in part from a critical, though often ignored, factor: the form of the initial representation of information during the process of analogical transfer. In a replication of Gick and Holyoak (1980, 1983), subjects read a story in the guise of a memory experiment and were later required to solve a problem which could be solved using an analogous strategy suggested by the story. Transfer performance was measured by the presence or absence of this target solution in subjects' protocols. The text of the original General story (from Gick & Holyoak) was modified slightly in one condition, where one role in the story was replaced by another type of actor. The changes were minor, as shown by the fact that the story modification did not affect similarity ratings between the story and problem. However, the changes did appear to affect subjects' initial representation of the story and, as a result, improve subsequent transfer to the problem. The results indicate that forming an initial representation of the story that is congruent with important features of the problem is critical for analogical transfer. Subjects' abstraction of a general problem solving schema is an inadequate explanation of these results.
Analogical Mapping During Similarity Judgments
We propose that carrying out a similarity comparison of two objects or scenes requires that their components be aligned in a manner akin to analogical mapping. We present an experiment which supports this claim and then examine a computer simulation of these results which is consistent with the idea that a process of mapping and alignment occurs during similarity judgments.
Internal Analogy: A Model of Transfer within Problems
Understanding problem solving and methods for learning is a main goal of cognitive science. Analogical reasoning simplifies problem solving by transferring previously learned knowledge from a source problem to the current target problem in order to reduce search. To provide a more detailed analysis of the mechanisms of transfer, we describe a process called internal analogy that transfers experience from a completed subgoal in the same problem to solve the current target subgoal. We explain what constitutes an appropriate source problem and what knowledge to transfer from that source, in addition to examining the associated memory organization. Unlike case-based reasoning methods, this process does not require large amounts of accumulated experience before it is effective; it provides useful search control at the outset of problem solving. Data from a study of subjects solving DC-circuit problems designed to facilitate transfer support the psychological validity of the mechanism.
Making SME greedy and pragmatic
The Structure-Mapping Engine (SME) has successfully modeled several aspects of human analogical processing. However, it has two significant drawbacks: (1) SME constructs all structurally consistent interpretations of an analogy. While useful for theoretical explorations, this aspect of the algorithm is both psychologically implausible and computationally inefficient. (2) SME contains no mechanism for focusing on interpretations relevant to an analogizer's goals. This paper describes modifications to SME which overcome these flaws. We describe a greedy merge algorithm which efficiently computes an approximate "best" interpretation, and can generate alternate interpretations when necessary. We describe pragmatic marking, a technique which focuses the mapping to produce relevant, yet novel, inferences. We illustrate these techniques via example and evaluate their performance using empirical data and theoretical analysis.
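A minimal sketch of a greedy merge of the kind the abstract describes, assuming kernels (internally consistent sets of match hypotheses) have already been computed; the data structures, predicate names, and scoring function here are illustrative, not SME's actual implementation:

```python
def greedy_merge(kernels, consistent, score):
    """Approximate a 'best' global interpretation by merging kernels.

    kernels    -- list of sets of match hypotheses, each internally consistent
    consistent -- consistent(a, b): True if the union of the two sets preserves
                  structural consistency (e.g., one-to-one mapping)
    score      -- score(k): structural evaluation of a kernel
    """
    interpretation = set()
    # Consider the highest-scoring kernels first; keep each one that still fits.
    for kernel in sorted(kernels, key=score, reverse=True):
        if consistent(interpretation, kernel):
            interpretation |= kernel
    return interpretation
```

Under this kind of scheme, alternate interpretations can be generated when necessary, for instance by re-seeding the merge with a different initial kernel.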
Analogical Interpretation in Context
This paper examines the principles underlying analogical similarity and describes three important limitations of traditional views. It describes contextual structure-mapping, a more knowledge-intensive approach that addresses these limitations. The principal insight is that each element of an analogue description has an identifiable role, corresponding to the dependencies it satisfies or its relevant properties in the given context. Analyzing role information, first, provides a powerful framework for characterizing analogical similarity, relaxing the one-to-one mapping restriction prevalent in computational treatments of analogy, and understanding how such similarities may be used to assist problem solving. Second, it provides a unifying view of some of the central intuitions behind a number of converging efforts in analogy research.
Multiple Abstracted Representations in Problem Solving and Discovery in Physics
We discuss the process of mathematization in science, focusing on uses that theorists make of physical representations that we refer to as abstracted models. We review abstracted models constructed by Faraday and Maxwell in the mathematization of electromagnetic phenomena, including Maxwell's use of an analogy between continuum dynamics and electromagnetism. We discuss ways in which this example requires major modifications of current cognitive theories of analogical reasoning and scientific induction, especially in the need to understand the use of abstracted models containing theoretically meaningful objects that can be manipulated and modified in the development of new concepts and mathematized representations.
Viewing Design as a Cooperative Task
Design can be modeled as a multi-agent planning task where several agents that possess different expertise and evaluation criteria cooperate to produce a design. The differences may result in conflicts that have to be resolved during design. The process by which conflict resolution is achieved is negotiation. In this paper, we propose a model of group problem solving among cooperating experts that supports negotiation. The model incorporates access to information from a case memory of existing designs, communication of design rationale, and evaluation and critiquing of design decisions. Incremental design modifications are performed based on constraint relaxation and comparison of utilities.
The Temporal Nature of Scientific Discovery: The Roles of Priming and Analogy
One of the most frequently mentioned sources of scientific hypotheses is analogy. Despite the attractiveness of this mechanism of discovery, there has been little success in demonstrating that people actually use analogies while solving problems. The study reported below attempted to foster analogical transfer in a scientific discovery task. Subjects worked on two problems that had the same type of underlying mechanism. On day 1, subjects discovered the mechanism that controls virus reproduction. On day 2, subjects returned to work on a problem in molecular genetics that had a similar underlying mechanism. The results showed that experience at discovering the virus mechanism did facilitate performance on the molecular genetics task. However, the verbal protocols do not indicate that subjects analogically mapped knowledge from the virus to the genetics domain. Rather, experience with the virus problem appeared to prime memory. It is argued that analogical mapping can be used flexibly in scientific discovery contexts and that primed knowledge structures can also provide access to relevant information when analogical mapping fails.
Reasoning Directly from Cases in a Case-Based Planner
A good deal of the reasoning done in a case-based planning system can be done directly from (episodic) cases, as opposed to specialized memory structures. In this paper, we examine the issues involved in such direct reasoning including how this representation can support multiple uses, and what role execution plays in such a framework. We illustrate our points using COOKIE, a direct case-based planner in the food preparation domain.
An Internal Contradiction of Case-Based Reasoning
In a case-based reasoning system, one simple approach to assessment of similarity of cases to a given problem situation is to create a linear ordering of the cases by similarity according to each relevant domain factor. Using Arrow's Impossibility Theorem, a result from social welfare economics, a paradox is uncovered in the attempt to find a consistent overall ordering of cases by similarity that satisfactorily reflects these individual rankings. The implications of the paradox for case-based reasoning are considered.
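The flavor of the paradox can be seen in a small, hypothetical example: aggregate three per-factor similarity rankings of cases A, B, and C by pairwise majority vote (one simple aggregation rule of the kind Arrow's theorem constrains), and the overall ordering becomes cyclic. The rankings below are made up for illustration:

```python
# Three stored cases ranked by similarity to the problem, one ranking per
# relevant domain factor (most similar first).
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_prefers(x, y):
    """True if case x outranks case y on a majority of the factors."""
    wins = sum(r.index(x) < r.index(y) for r in rankings)
    return wins > len(rankings) / 2

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(x, ">", y, ":", majority_prefers(x, y))
# All three print True: A beats B, B beats C, and C beats A, so this
# aggregation rule yields no consistent overall similarity ordering.
```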
Qualitative Reasoning about the Geometry of Fluid Flow
Understanding the interaction between dynamics and geometry is crucial to capturing commonsense physics. This paper presents a qualitative analysis of the direction of fluid flow. This analysis is dependent on qualitative descriptions of the surface geometry of rigid bodies in contact with the fluid and a pressure change in the fluid. The key problem in designing an intelligent system to reason about fluid motion is how to partition the fluid at an appropriate level of representation. The basic idea of our approach is to incrementally generate the qualitatively different parts of the fluid. We do this by dynamically analyzing the interaction of geometry and pressure disturbance. Using this technique, we can derive all possible fluid flows.
Is There a Default Similarity Distance for Categories?
How do people decide whether or not an item belongs to a new category, the variability of which they do not know? We postulate that people have a default similarity distance (DSD) which they use when no other information about the variability of a category is available. To test our claim, subjects were asked to tell how they would instruct a being from another world to distinguish members of a category, by showing pictures. The categories were from different levels, thus differing in variability. For highly variable categories subjects tended to present multiple positive instances (thus indicating their extraordinary variability), whereas for narrow categories they tended to present negative instances (thus explicitly delimiting them). These results indicated that a norm, relative to which additional information is supplied, lay in between. Indeed, there was a level at which subjects apparently relied on DSD, finding it sufficient to show but a single exemplar of the category. This happened with basic-level categories for 5th graders and adults and with subordinate categories for 2nd graders, thus demonstrating a developmental trend in what is considered a normal standard category.
Learning Attribute Relevance in Context in Instance-Based Learning Algorithms
There has been an upsurge of interest, in both artificial intelligence and cognitive psychology, in exemplar-based process models of categorization, which preserve specific instances instead of maintaining abstractions derived from them. Recent exemplar-based models provided accurate fits for subject results in a variety of experiments because, in accordance with Shepard's (1987) observations, they define similarity to degrade exponentially with the distance between instances in psychological space. Although several researchers have shown that an attribute's relevance in similarity calculations varies according to its context (i.e., the values of the other attributes in the instance and the target concept), previous exemplar models define attribute relevance to be invariant across all instances. This paper introduces the GCM-ISW model, an extension of Nosofsky's GCM model that uses context-specific attribute weights for categorization tasks. Since several researchers have reported that humans make context-sensitive classification decisions, our model will fit subject data more accurately when attribute relevance is context-sensitive. We also introduce a process component for GCM-ISW and show that its learning rate is significantly faster than the rates of previous exemplar-based process models when attribute relevance varies among instances. GCM-ISW is both computationally more efficient and more psychologically plausible than previous exemplar-based models.
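For reference, the standard GCM machinery that such an extension builds on (response-bias terms omitted here): similarity decays exponentially with weighted distance in psychological space, and classification probability is the relative summed similarity to each category's exemplars. The change described above is, roughly, to let the attribute weights w_k vary with context rather than remain fixed across all instances:

```latex
% Weighted distance between instances i and j (r = 1 city-block, r = 2 Euclidean)
d(i,j) = \Bigl[\sum_k w_k \,\lvert x_{ik} - x_{jk}\rvert^{r}\Bigr]^{1/r}
% Exponentially decaying similarity (Shepard, 1987), with sensitivity c
\eta(i,j) = e^{-c\,d(i,j)}
% Probability of assigning instance i to category C
P(C \mid i) = \frac{\sum_{j \in C} \eta(i,j)}{\sum_{C'} \sum_{j \in C'} \eta(i,j)}
```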
Effects of Background Knowledge on Family Resemblance Sorting
Previous studies on category construction have shown that people have a strong bias toward creating categories based only on a single dimension. Ahn and Medin (1989) have developed a two-stage model of category construction to explain why we have categories structured on the basis of overall similarity of members in spite of this bias. The current study investigates effects of background knowledge on category construction. The results showed that people created family resemblance categories more frequently when they had a priori knowledge of prototypes of potential family resemblance categories. It was also found that people created family resemblance categories much more frequently when they had knowledge of underlying dimensions which integrated surface features of examples. How the two-stage model should be extended is discussed.
Superordinate and Basic Level Categories in Discourse: Memory and Context
Representations for natural category systems and a retrieval-based framework are presented that provide the means for applying generic knowledge about the semantic relationships between entities in discourse and the relative salience of these entities imposed by the current context. An analysis of the use of basic and superordinate level categories in discourse is presented, and the use of our representations and processing in the task of discourse comprehension is demonstrated.
Learning Overlapping Categories
Models of human category learning have predominantly assumed that both the structure in the world and the analogous structure of the internal cognitive representations are best modeled by hierarchies of disjoint categories. Strict taxonomies do, in fact, capture important structure of the world. However, there are realistic situations in which systems of overlapping categories can engender more accurate inferences than can taxonomies. Two preliminary models for learning overlapping categories are presented and their benefit is illustrated. The models are discussed with respect to their potential implications for theory-based category learning and conceptual combination.
The Right Concept at the Right Time: How Concepts Emerge as Relevant in Response to Context-Dependent Pressures
A central question about cognition is how, faced with a situation, one explores possible ways of understanding and responding to it. In particular, how do concepts initially considered irrelevant, or not even considered at all, become relevant in response to pressures evoked by the understanding process itself? We describe a model of concepts and high-level perception in which concepts consist of a central region surrounded by a dynamic nondeterministic "halo" of potential associations, in which relevance and degree of association change as processing proceeds. As the representation of a situation is built, associations arise and are considered in a probabilistic fashion according to a parallel terraced scan, in which many routes toward understanding the situation are tested in parallel, each at a rate and to a depth reflecting ongoing evaluations of its promise. We describe a computer program that implements this model in the context of analogy-making, and illustrate, using screen dumps from a run, how the program's ability to flexibly bring in appropriate concepts for a given situation emerges from the mechanisms we are proposing.
Classification of Dot Patterns with Competitive Chunking
Chunking, a familiar idea in cognitive science, has recently been formalized by Servan-Schreiber and Anderson (in press) into a theory of perception and learning, and it successfully simulated the human acquisition of an artificial grammar through the simple memorization of exemplar sentences. In this article I briefly present the theory, called Competitive Chunking, or CC, as it has been extended to deal with the task of encoding random dot patterns. I explain how CC can be applied to the classic task of classifying such patterns into multiple categories, and report a successful simulation of data collected by Knapp and Anderson (1984). The tentative conclusion is that people seem to process dot patterns and artificial grammars in the same way, and that chunking is an important part of that process.
Can Causal Induction Be Reduced to Associative Learning?
A number of researchers have recently claimed that higher-order human learning, such as categorization and causal induction, can be explained by the same principles as govern lower-order learning, such as classical conditioning in animals. An alternative view is that people often impose abstract causal models on observations, rather than simply associating inputs with outputs. We report three experiments using a multiple-cue learning paradigm in which models based on associative learning versus abstract causal models make opposing predictions. We show that different causal models can yield radically different learning from identical observations. In particular, we compared people's abilities to learn when the positive cases were defined by a linear cue-combination rule versus a rule involving a within-category correlation between cues. The linear structure was more readily learned when the cues were interpreted as possible causes of an effect to be predicted, whereas the correlated structure was more readily learned when the cues were interpreted as the effects of a cause to be diagnosed. The results disconfirm all associative models of causal induction in which inputs are associated with outputs without regard for causal directionality.
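A toy rendering of the two contrasted structures, with made-up binary cues, may make the distinction concrete: under a linear rule, category membership depends on a weighted sum of cues; under a correlational rule, it depends on whether two cues agree, which no linear combination of the individual cues can express:

```python
# Illustrative only: each case is three binary cues.
def linear_rule(cues):
    return sum(cues) >= 2          # positive iff enough cues are present

def correlation_rule(cues):
    return cues[0] == cues[1]      # positive iff cues 1 and 2 agree (XNOR)

case = (1, 0, 1)
print(linear_rule(case), correlation_rule(case))   # True False
```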
Decision Models: A Theory of Volitional Explanation
This paper presents a theory of motivational analysis: the construction of volitional explanations to describe the planning behavior of agents. We discuss both the content of such explanations and the process by which an understander builds them. Explanations are constructed from decision models, which describe the planning process that an agent goes through when considering whether to perform an action. Decision models are represented as explanation patterns, which are standard patterns of causality based on previous experiences of the understander. We discuss the nature of explanation patterns, their use in representing decision models, and the process by which they are retrieved, used, and evaluated.
Knowledge Goals: A Theory of Interestingness
Combinatorial explosion of inferences has always been one of the classic problems in AI. Resources are limited, and inferences potentially infinite; a reasoner needs to be able to determine which inferences are useful to draw from a given piece of text. But unless one considers the goals of the reasoner, it is very difficult to give a principled definition of what it means for an inference to be "useful." This paper presents a theory of inference control based on the notion of interestingness. We introduce knowledge goals, the goals of a reasoner to acquire some piece of knowledge required for a reasoning task, as the focusing criteria for inference control. We argue that knowledge goals correspond to the interests of the reasoner, and present a theory of interestingness that is functionally motivated by consideration of the needs of the reasoner. Although we use story understanding as the reasoning task, many of the arguments carry over to other cognitive tasks as well.
The Dempster-Shafer Theory of Evidence as a Model of Human Decision Making
Many psychology researchers have shown that humans do not process probabilistic information in a manner consistent with Bayes' theorem [9, 10, 16, 24, 23, 27]. Robinson and Hastie [24, 23] showed that humans made non-compensatory probability updates, produced super-additive distributions, and resuscitated zero-probability possibilities. While most researchers have classified these behaviors as nonnormative, we found that the Dempster-Shafer theory could model each of these behaviors in a normative and theoretically sound fashion. While not claiming that the theory models human processes, we claim that the similarities should aid user acceptance of Dempster-Shafer based decision systems.
Feature Selection and Hypothesis Selection Models of Induction
Recent research has shown that the prior knowledge of the learner influences both how quickly a concept is learned and the types of generalizations that a learner produces. We investigate two learning frameworks that have been proposed to account for these findings. Here, we contrast feature selection models of learning with hypothesis selection models. We report on an experiment that suggests that human learners use prior knowledge both to indicate what features may be relevant and to influence how the features are combined to form hypotheses. We present an extension to the PostHoc system, a hypothesis selection model of concept learning, that is able to account for differences in learning rates observed in the experiment.
A Rule Based Model of Judging Harm-doing
A rule-based computational model of the judgment of harm-doing is presented that qualitatively simulates the major principles of an emerging psychological theory of common sense moral reasoning. Simulation results indicate that the model, called MR for Moral Reasoner, generates verdicts in substantial agreement with those reached in somewhat difficult court cases. A higher rate of agreement with outcomes produced in simpler cases from traditional cultures suggests that the model possesses a good deal of cultural universality. Systematic damaging of the rules in the model indicated that most of the rules are essential in producing a high rate of agreement with court decisions and identified some rules regarding the mental state of the accused that, individually, are less essential because they compensate for each other.
The Mechanism of Restructuring in Geometry
Restructuring consists of a change in the representation of the current search state, a process which breaks an impasse during problem solving by opening up new search paths. A corpus of 52 think-aloud protocols from the domain of geometry was scanned for evidence of restructuring. The data suggest that restructuring is accomplished by re-parsing the geometric diagram.
Noticing Opportunities in a Rich Environment
Opportunistic planning requires a talent for noticing plans' conditions of applicability in the world. In a reasonably complex environment, there is a great proliferation of features, and their relations to useful plans are very intricate. Thus, "noticing" is a very complicated affair. To compound difficulties, the need to efficiently perceive conditions of applicability holds simultaneously for the thousands of possible plans an agent might use. We examine the implications of this problem for memory and planning behavior, and present an architecture developed to address it. Tools from signal detection theory and numerical optimization provide the model with a form of learning.
Recognizing Novel Uses for Familiar Plans
Analogical design and invention is a central task in human cognition. Often during the process the designer/inventor gets stuck, backs off from the problem, and only later, after having put the problem aside, discovers that some familiar plan can be used in a novel way to solve the problem. We describe a system which uses a causal case memory to check the side effects, preconditions, etc. of incoming events in order to model this phenomenon. The method used makes this work relevant to case-based reasoning as well as design. It also forms a companion issue to execution-time planning.
Planning to Learn
The thesis of this paper is that learning is planful, goal-directed activity - that acquiring knowledge is intentional action. I present evidence that learning from one's experiences requires making decisions about what is worth learning, regardless of the specific mechanisms underlying the learning or of the degree of consciousness or automaticity or level of effort of the learning. Decisions about what is worth learning are the expressions of desires about knowledge. I then sketch a theory of whence desires for knowledge arise, how they are represented, and how they are used. A taxonomy of learning actions is also proposed. This theory has been partially implemented in two computer models, which are briefly described.
Brainstormer: A Model of Advice-Taking
Research on advice-taking in artificial intelligence is motivated by the promise of knowledge-based systems that can accept high-level, human-like instruction [11]. Examining the activity of human advice-taking is a way of determining the key computational problems that a fully automated advice taker must solve. In this paper, we identify three features of human advice-taking that pose computational problems, and address them in the context of brainstormer, a planning system that takes advice in the domain of terrorist crisis management.
Reasoning With Function Symbols In A Connectionist System
One important problem to be addressed in realizing connectionist reasoning systems is that of dynamically creating more complex structured objects out of simpler ones during the reasoning process. In rule-based reasoning, this problem typically manifests as the problem of dynamically binding objects to the arguments of a relation. In [1,7], we described a rule-based reasoning system based on the idea of using synchronous activation to represent bindings. As in almost all other connectionist reasoning systems developed so far, we there restricted our focus to the problem of binding only static objects to arguments. This paper describes how the synchronous activation approach can be extended to bind dynamically created objects to arguments, to form more complex dynamic objects. This extension allows the rule-based reasoning system to deal with function symbols. A forward reasoning system incorporating function terms is described in some detail. A backward reasoning system with similar capabilities is briefly sketched, and the way of encoding long-term facts involving function terms is indicated. Several extensions to the work are briefly described, one of them being the combination of the rule-based reasoner with an equality reasoner operating in parallel. The equality reasoner derives new facts by substituting equivalent terms for the terms occurring in the facts derived by the rule-based reasoner.
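A very loose sketch of the binding-by-synchrony idea, under the simplifying assumption that a phase in the oscillation cycle can be modeled as an integer label: an object is bound to an argument by assigning both the same phase, and a dynamically created function term gets a fresh phase so the compound object it denotes can itself fill argument slots. All names and structure here are illustrative, not the paper's encoding:

```python
phases = {}                     # node -> phase index within the global cycle
_fresh = iter(range(10**6))     # supply of unused phase labels

def bind(argument, obj):
    """Bind obj to argument: both nodes fire in the same phase."""
    if obj not in phases:
        phases[obj] = next(_fresh)
    phases[argument] = phases[obj]

def function_term(fn, arg):
    """A dynamically created compound object such as mother(John) gets its
    own phase, so it can be bound to arguments like any static object."""
    term = f"{fn}({arg})"
    if term not in phases:
        phases[term] = next(_fresh)
    return term

bind("give.recipient", "John")
bind("owns.owner", function_term("mother", "John"))
print(phases)   # give.recipient shares John's phase; owns.owner shares
                # the phase of the compound object mother(John)
```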
Not All Potential Cheaters are Equal: Pragmatic Strategies In Deductive Reasoning
This work briefly discusses one of the central problems in the current psychology of reasoning: that of explaining the effects of content. Two competing theories recently proposed to explain such effects (pragmatic reasoning schemas and social contract theories) are illustrated with reference to an experiment on reasoning in children employing a selection problem, which requires a search for the potential counterexamples of a conditional rule. On the one hand, the theory of pragmatic schemas (i.e. clusters of rules related to pragmatically relevant actions and goals) predicts that correct selection performance derives from the activation of specific contractual schemas, such as obligation and permission, the production rules of which correspond to the logic of implication. On the other hand, according to the social contract theory, people are able to detect potential counterexamples only when they correspond to the potential cheaters of rules having the form 'If benefit A is received, then cost B must be paid'. The results of the experiment show that performance on tasks of this kind is not determined simply by the possibility of representing the rule in question in cost-benefit terms; to predict performance, one necessary factor is knowledge of the nature of the possible cheating behaviour that one is requested to check.
Explanations in Cooperative Problem Solving Systems
It is our goal to build cooperative problem solving systems, knowledge-based systems that leverage the asymmetry between the user's and the system's strengths and thus allow the dyad of user and computer system to achieve what neither alone could achieve. Our experience has shown that in these cooperative systems, the need for explanations is even more evident than in traditional expert systems. This is due to the fact that these new systems are more open-ended and flexible and therefore allow for more possibilities in which a user can reach an impasse, a point at which it is not clear how to proceed. Observation of human-human problem solving shows that people are sensitive to the domain under discussion and the other's knowledge of that domain. People tend to construct explanations that are minimal in the number of concepts or chunks. These explanations are not comprehensive, and the communication partner is able to follow up on aspects which are still unclear.
Improving Explanatory Competence
Explanation plays an important role in acquiring knowledge, solving problems, and establishing the credibility of conclusions. One approach to gaining explanatory competence is to acquire proofs of the domain inference rules used during problem solving. Acquiring proofs enables a system to strengthen an imperfect theory by connecting unexplained rules to the underlying principles and tacit assumptions that justify their use. This paper formalizes the task of improving explanatory competence through acquiring proofs of domain inference rules and describes KI, a knowledge acquisition tool that discovers proofs of rules as it integrates new information into a knowledge base. KI's learning method includes techniques for controlling the search for proofs and evaluating multiple explanations of a proposition to determine when they can be transformed into proofs of domain inference rules.
Incremental Envisioning: The Flexible Use of Multiple Representations in Complex Problem Solving
In this paper we describe two properties of most psychological and AI models of scientific problem solving: they are one-pass and feed-forward. We then discuss the results of an experiment which suggests that experts use problem solving representations more flexibly than these models suggest. We introduce the concept of incremental envisioning to account for this flexible behavior. Finally, we discuss the implications of this work for psychological models of scientific problem solving and for AI programs which solve problems in scientific domains.
Task-Based Criteria for Judging Explanations
AI research on explanation has not seriously addressed the influence of explainer goals on explanation construction. Likewise, psychological research has tended to assume that people's choice between explanations can be understood without considering the explainer's task. We take the opposite view: that the influence of task is central to judging explanations. Explanations serve a wide range of tasks, each placing distinct requirements on what is needed in an explanation. We identify eight main classes of reasons for explaining novel events, and show how each one imposes requirements on the information needed from an explanation. These requirements form the basis of dynamic, goal-based explanation evaluation implemented in the program ACCEPTER. We argue that goal-based evaluation of explanations offers three important advantages over static criteria: First, it gives a way for an explainer to know what to elaborate if an explanation is inadequate. Second, it allows cross-contextual use of explanations, by deciding when an explanation built in one context can be applied to another. Finally, it allows explainers to make a principled decision of when to accept incomplete explanations without further elaboration, allowing explainers to conserve processing resources, and also to deal with situations they can only partially explain.
Are There Developmental Milestones in Scientific Reasoning?
This paper presents a conceptual framework that integrates studies on scientific reasoning that have been conducted with subjects of different ages and across different experimental tasks. Traditionally, different aspects of scientific reasoning have been emphasized in studies with subjects of different ages, and the different literatures are somewhat unconnected. However, this separation leads to a disjointed view of the development of scientific reasoning, and it leaves unexplained certain adult behaviors in very difficult scientific reasoning contexts. In this paper we attempt to integrate these three approaches into a single framework that describes the process of scientific reasoning as a search in a hypothesis space and an experiment space. We will present the results from a variety of studies conducted with preschool, elementary school, and adult subjects, and will show how differences in performance can be viewed as differences in the knowledge and strategies used to search the two spaces. Finally, we will present evidence showing that, in sufficiently challenging situations, adults exhibit deficits of the same sort that young children exhibit, even though one might have expected that these developmental milestones were long since passed.
Why Fodor and Pylyshyn Were Wrong: The Simplest Refutation
This paper offers both a theoretical and an experimental perspective on the relationship between connectionist and Classical (symbol-processing) models. Firstly, a serious flaw in Fodor and Pylyshyn's argument against connectionism is pointed out: if, in fact, a part of their argument is valid, then it establishes a conclusion quite different from that which they intend, a conclusion which is demonstrably false. The source of this flaw is traced to an underestimation of the differences between localist and distributed representation. It has been claimed that distributed representations cannot support systematic operations, or that if they can, then they will be mere implementations of traditional ideas. This paper presents experimental evidence against this conclusion: distributed representations can be used to support direct structure-sensitive operations, in a manner quite unlike the Classical approach. Finally, it is argued that even if Fodor and Pylyshyn's argument that connectionist models of compositionality must be mere implementations were correct, then this would still not be a serious argument against connectionism as a theory of mind.
Paper Presentations -- Group 2: Language
Phonological Rule Induction: An Architectural Solution
Acquiring phonological rules is hard, especially when they do not describe generalizations that hold for all surface forms. We believe it can be made easier by adopting a more cognitively natural architecture for phonological processing. We briefly review the structure of M³P, our connectionist Many Maps Model of Phonology, in which extrinsic rule ordering is virtually eliminated and "iterative" processes are handled by a parallel clustering mechanism. We then describe a program for inducing phonological rules from examples. Our examples, drawn from Yawelmani, involve several complex rule interactions. The parallel nature of M³P rule application greatly simplifies the description of these phenomena, and makes a computational model of rule acquisition feasible.
Discovering Faithful 'Wickelfeature' Representations in a Connectionist Network
A challenging problem for connectionist models is the representation of varying-length sequences, e.g., the sequence of phonemes that compose a word. One representation that has been proposed involves encoding each sequence element with respect to its local context; this is known as a Wickelfeature representation. Handcrafted Wickelfeature representations suffer from a number of limitations, as pointed out by Pinker and Prince (1988). However, these limitations can be avoided if the representation is constructed with a priori knowledge of the set of possible sequences. This paper proposes a specialized connectionist network architecture and learning algorithm for the discovery of faithful Wickelfeature representations, ones that do not lose critical information about the sequence to be encoded. The architecture is applied to a simplified version of Rumelhart and McClelland's (1986) verb past-tense model.
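For readers unfamiliar with the underlying local-context encoding, here is a sketch at the symbol-trigram ("Wickelphone") level; the learned, distributed Wickelfeature layer this paper discovers sits on top of an encoding of this kind:

```python
def wickelphones(word):
    """Encode a varying-length string as an unordered set of fixed-size
    units, each a symbol plus its immediate left and right context.
    '#' marks the word boundaries."""
    padded = "#" + word + "#"
    return {padded[i - 1:i + 2] for i in range(1, len(padded) - 1)}

print(sorted(wickelphones("came")))   # ['#ca', 'ame', 'cam', 'me#']
```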
Constraints on Assimilation in Vowel Harmony Languages
Over the last 10 years, the assimilation process referred to as vowel harmony has served as a test case for a number of proposals in phonological theory. Current autosegmental approaches successfully capture the intuition that vowel harmony is a dynamic process involving the interaction of a sequence of vowels; still, no theoretical analysis has offered a non-stipulative account of the inconsistent behavior of the so-called "transparent", or disharmonic, segments. The current paper proposes a connectionist processing account of the vowel harmony phenomenon, using data from Hungarian. The strength of this account is that it demonstrates that the same general principle of assimilation which underlies the behavior of the "harmonic" forms accounts as well for the apparently exceptional "transparent" cases, without stipulation.
Recency Preference and Garden-Path Effects
Following Fodor (1983), it is assumed that the language processor is an automatic device that maintains only the best of the set of all compatible representations for an input string. One way to make this idea explicit is to assume the serial hypothesis: at most one representation for an input string is permitted at any time (e.g., Frazier & Fodor (1978), Frazier (1979), and Pritchett (1988)). This paper assumes an alternative formulation of local memory restrictions within a parallel framework. First of all, it is assumed that there exists a number of structural properties, each of which is associated with a processing load. One structure is preferred over another if the processing load associated with the first structure is markedly lower than the processing load associated with the second. Thus a garden-path effect results if the unpreferred structure is necessary for a grammatical sentence. This paper presents three structural properties within this framework: the first two, the Properties of Thematic Assignment and Reception, are derivable from the θ-Criterion of Government and Binding Theory (Chomsky (1981)); and the third, the Property of Recency Preference, prefers local attachments over more distant attachments. This paper shows how these properties interact to give appropriate predictions (garden-path effects or not) for a large array of local ambiguities.
A Connectionist Treatment of Grammar for Generation
Connectionist language generation promises better interaction between syntactic and lexical considerations and thus improved output quality. To realize this requires a connectionist treatment of grammar. This paper explains one way to do so. The basic idea is that constructions and their constituents are nodes in the same network that encodes world knowledge and lexical knowledge. The principal novelty is reliance on emergent properties. This makes it unnecessary to make explicit syntactic choices or to build up representations of sentence structure. The scheme includes novel ways of handling constituency, word order and optional constituents; and a simple way to avoid the problems of instantiation and binding. Despite the novel approach, the syntactic knowledge used is expressed in a form similar to that often used in linguistics; this representation straightforwardly defines parts of the knowledge network. These ideas have been implemented in FIG, a 'flexible incremental generator.'
Harmonic Grammar - A Formal Multi-Level Connectionist Theory of Linguistic Well-Formedness: Theoretical Foundations
In this paper, we derive the formalism of harmonic grammar, a connectionist-based theory of linguistic well-formedness. Harmonic grammar is a two-level theory, involving a low-level connectionist network using a particular kind of distributed representation, and a second, higher-level network that uses local representations and which approximately and incompletely describes the aggregate computational behavior of the lower-level network. The central hypothesis is that the connectionist well-formedness measure Harmony can be used to model linguistic well-formedness; what is crucial about the relation between the lower and higher level networks is that there is a harmony-preserving mapping between them: they are isoharmonic (at least approximately). In a companion paper (Legendre, Miyata, & Smolensky, 1990; henceforth, "LMS1"), we apply harmonic grammar to a syntactic problem, unaccusativity, and show that the resulting network is capable of a degree of coverage of difficult data that is unparalleled by symbolic approaches of which we are aware: of the 760 sentence types represented in our data, the network correctly predicts the acceptability in all but two cases. In the present paper, we describe the theoretical basis for the two-level approach, illustrating the general theory through the derivation from first principles of the unaccusativity network of LMS1.
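For readers unfamiliar with the measure, the Harmony of an activation vector a over a network with weight matrix W is, in its simplest form, the quadratic form below; higher-Harmony states are better-formed, and the isoharmonic requirement is that the lower- and higher-level networks assign (at least approximately) the same Harmony to corresponding states:

```latex
H(\mathbf{a}) \;=\; \sum_{i<j} a_i \, W_{ij} \, a_j
```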
A Parallel Constraint Satisfaction and Spreading Activation Model for Resolving Syntactic Ambiguity
This paper describes a computational architecture whose emergent properties yield an explanatory theory of human structural disambiguation in syntactic processing. Linguistic and computational factors conspire to dictate a particular integration of symbolic and connectionist approaches, producing a principled cognitive model of the processing of structural ambiguities. The model is a hybrid massively parallel architecture, using symbolic features and constraints to encode structural alternatives, and numeric spreading activation to capture structural preferences. The model provides a unifying explanation of a range of serial and parallel behaviors observed in the processing of structural alternatives. Furthermore, the inherent properties of active symbolic and numeric information correspond to general cognitive mechanisms which subsume a number of proposed structural preference strategies.
Functional Constraints on Backwards Pronominal Reference
How does the syntax of a sentence constrain speakers' selection of pronominal referents? Drawing on work by functionalist grammarians, we describe the communicative effect of using a pronoun vs. a definite noun phrase, a matrix vs. a subordinate clause, and the simple past tense vs. anterior/imperfective aspect. Our analysis allowed us to predict differences in coreference judgements for the following three sentence types: (1) He worked on a top-secret project when John was ordered to quit. (2) He was working on a top-secret project when John was ordered to quit. (3) When he worked on a top-secret project, John was ordered to quit. Coreference judgements from 70 speakers supported our predictions and our research program: An adequate characterization of how syntax constrains sentence comprehension requires reference to the communicative functions performed by syntactic forms.
A Semantic Analysis of Action Verbs Based on Physical Primitives
We develop a representation scheme for action verbs and their modifiers based on decompositional analysis emphasizing the implementability of the underlying semantic primitives. Our primitives pertain to mechanical characteristics of the tasks denoted by the verbs; they refer to geometric constraints, kinematic and dynamic characteristics, and certain aspectual characteristics such as repetitiveness of one or more sub-actions, and definedness of termination points.
Semantic Classification of Verbs from their Syntactic Contexts: Automated Lexicography with Implications for Child Language Acquisition
Young children and natural language processing programs share an insufficient knowledge of word meanings. Children catch up by learning, using innate predisposition and observation of language use. However, no one has demonstrated artificial devices that robustly learn lexical semantic classifications from example sentences. This paper describes the ongoing development of such a device. An early version discovers verbs with a non-stative sense by searching in unrestricted text for verbs in syntactic constructions forbidden to statives. Our program parses unrestricted text to the extent necessary for classification. Once the parsing is done, recognizing the telltale constructions is so easy even a two-year-old child could do it. In fact, Landau and Gleitman (1985) and especially Gleitman (1989) argue that children must, can, and do use the syntactic constructions in which verbs appear to support meaning acquisition. In this paper we use our program to examine the difficulty of exploiting two particular syntactic constructions to discover the availability of non-stative senses, concluding that only very little sophistication is needed. This conclusion bolsters the general position of Gleitman (1989) that children can exploit syntactic context to aid in semantic classification of verbs.
Sense Generation or How to Make the Mental Lexicon Flexible
In this paper we address some key issues in the psychology of word meaning, and thereby motivate a Sense Generation approach to the diversity of senses that a word may have. We note that an adequate account must allow for the flexibility and specificity of senses, and must also make appropriate distinctions between default and non-default senses of a word, and between different senses for vague and ambiguous words. We then discuss two central components of a theory of sense: firstly, lexons, the stable representations, in a "mental lexicon", of word meanings; secondly, senses, the mentally represented descriptions associated with particular uses of words. We argue that the crucial issues in accounting for the diversity of sense are the number of lexons we need to postulate, and the relationship between the contents of those lexons and their associated senses. Sense Selection accounts, of which we distinguish Strong and Weak versions, both of which find considerable support in the cognitive science literature, fail to account for the flexibility and specificity of senses in a way that is consonant with linguistic evidence regarding the ambiguity of words, and psychological evidence regarding the coherence which underlies their use. We will show how the Sense Generation approach, by positing a nonmonotonic relationship between lexons and their senses, respects these considerations. We sketch this approach, and finally note some of its promising implications for other aspects of word meaning.
A Distributed Feature Map Model Of The Lexicon
DISLEX models the human lexical system at the level of physical structures, i.e. maps and pathways. It consists of a semantic memory and a number of modality-specific symbol memories, implemented as feature maps. Distributed representations for the word symbols and their meanings are stored on the maps, and linked with associative connections. The memory organization and the associations are formed in an unsupervised process, based on co-occurrence of the physical symbol and its meaning. DISLEX models processing of ambiguous words, i.e. homonyms and synonyms, and dyslexic errors in input and in production. Lesioning the system produces lexical deficits similar to human aphasia. DISLEX-1 is an AI implementation of the model, which can be used as the lexicon module in distributed natural language processing systems.
Efficient Learning of Language Categories: The Closed-Category Relevance Property and Auxiliary Verbs
This paper describes the mechanism used by the ALACK language acquisition program for identification of auxiliary verbs. Pinker's approach to this problem (Pinker, 1984) is a general learning algorithm that can learn any Boolean function but takes time exponential in the number of feature dimensions. In this paper, we describe an approach that improves upon Pinker's method by introducing the Closed-Category Relevance Property, and showing how it provides the basis of an algorithm that learns the class of Boolean functions that is believed sufficient for natural language, and does not require more than linear time as feature dimensions are added.
The Representation of Word Meaning
This article shows that a substantial portion of the empirical evidence regarding the representation of word meaning can be explained by the definitional semantic theory of Jerrold J. Katz (henceforth ST). First, we look at the relative complexities of four types of negatives which, according to Fodor, Fodor, and Garrett (1975; henceforth FFG), show that definitions are not psychologically real. The ST definitional structures explain the FFG results in terms of the number of disjuncts generated by the negative elements in their ST representations. Next we look at the arguments of Gentner for componential structure from evidence from a recall experiment that considers connectedness relationships between the noun phrases of sentences with three types of verbs. Gentner's results can be explained in terms of the number of argument places in the ST representations, and the same explanation can be used with respect to evidence from studies by
Lexical Cooccurrence Relations in Text Generation
In this paper we address the organization and use of the lexicon giving special consideration to how the salience of certain aspects of abstract semantic structure may be expressed. We propose an organization of the lexicon and its interaction with grammar and knowledge that makes extensive use of lexical functions from the Meaning-Text Theory of Mel'cuk. We integrate this approach with the architecture of the PENMAN text generation system, showing some areas where that architecture is insufficient, and illustrating how the lexicon can provide functionally oriented guidance for the generation process.
Learning Lexical Knowledge in Context: Experiments with Recurrent Feed Forward Networks
Recent work on representation in simple recurrent feed forward connectionist networks suggests that a computational device can learn linguistic behaviors without any explicit representation of linguistic knowledge in the form of rules, facts, or procedures. This paper presents an extension of these methods to the study of lexical ambiguity resolution and semantic parsing. Five specific hypotheses are discussed regarding network architectures for lexical ambiguity resolution and the nature of their performance: (1) A simple recurrent feed forward network using back propagation can learn to predict correctly the object of the ambiguous verb "take out" in specific contexts; (2) Such a network can likewise predict a pronoun of the correct gender in the appropriate contexts; (3) The effect of specific contextual features increases with their proximity to the ambiguous word or words; (4) The training of hidden recurrent networks for lexical ambiguity resolution improves significantly when the input consists of two words rather than a single word; and (5) The principal components of the hidden units in the trained networks reflect an internal representation of linguistic knowledge. Experimental results supporting these hypotheses are presented, including analysis of network performance and acquired representations. The paper concludes with a discussion of the work in terms of computational neuropsychology, with potential impact on clinical and basic neuroscience.
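A minimal sketch of the forward pass of a simple recurrent network of this general kind, in which the hidden layer receives the current input together with a copy of its own previous state; the layer sizes are arbitrary, and the training loop, back propagation, and the specific corpus are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 20, 10, 20                      # arbitrary sizes
W_ih = rng.normal(scale=0.1, size=(n_hid, n_in))     # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))    # context (recurrent)
W_ho = rng.normal(scale=0.1, size=(n_out, n_hid))    # hidden -> output

def step(x, h_prev):
    """One time step: new hidden state, then a softmax over the next word."""
    h = np.tanh(W_ih @ x + W_hh @ h_prev)
    y = np.exp(W_ho @ h)
    return y / y.sum(), h

h = np.zeros(n_hid)
for x in np.eye(n_in)[[3, 7, 3]]:                    # a toy one-hot sequence
    y, h = step(x, h)                                # y predicts the next word
```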
How to Describe What? Towards a Theory of Modality Utilization
In this paper we outline the first steps of an investigation of the nature of representations of information, an investigation that uses as a starting point the various ways in which people tend to communicate different kinds of information. Our hope is that by identifying the regularities of presentation, in particular by finding out when people decide to switch presentation modalities and what they then tend to do, we will be able to shed light on the nature of the underlying representations and processes of communication between people.
Towards a Failure-driven Mechanism for Discourse Planning: a Characterization for Learning Impairments
In the process of generating discourse, speakers generate utterances which directly achieve the communicative goal of conveying an information item to a hearer, and they also generate utterances which prevent the disruption of correct beliefs maintained by a hearer or the inception of incorrect beliefs. In this paper, we propose a representation scheme which supports a discourse planning mechanism that exhibits both behaviors. Our representation is based on a characterization of commonly occurring impairments to the knowledge acquisition process in terms of a model of a hearer's beliefs. As a testbed for these ideas, a discourse planner called WISHFUL is being implemented in the domain of high-school algebra.
Coherence Relation Reasoning in Persuasive Discourse
One major element of discourse understanding is to perceive coherence relations between portions of the discourse. Previous computational approaches to coherence relation reasoning have focused only on expository discourse, such as task-oriented dialog or database querying. For these approaches, the main processing concern is the clarity of the information that is to be conveyed. However, in persuasive discourse, such as debates or advertising, the emphasis is on the adequacy of presenting the information, not just on clarity. This paper proposes a formalism and a system in which coherence relations corresponding to speech actions such as clarify, make adequate, and remind are represented. Furthermore, studies of human reasoning in general have revealed that implicational and associative reasoning schemata are prevalent across various domains; this formalism demonstrates that coherence relation reasoning is similar to such human reasoning, in the sense that coherence relations can be defined by domain-independent implicational and associative schemata. A prototype system based on this formalism, in which real-world advertisements are processed, is also demonstrated in this paper.
Analyzing Research Papers Using Citation Sentences
By focusing only on the citation sentences in a research document, one can get a good feel for how the paper relates to other research and its overall contribution to the field. The main purpose of a citation is to explicitly link one research paper to another. We present a taxonomy of citation types based upon empirical data and claim that we can recognize these citation types using domain-independent predictive parsing techniques. Finally, an experiment based on a corpus of research papers in the field of machine learning demonstrates that this is a promising new approach for processing expository text.
Participating in Plan-Oriented Dialogs
Participants in plan-oriented dialogs often state beliefs about plan applicability conditions, enablements, and effects. Often, they provide these beliefs as pieces of mostly unstated chains of reasoning that justify their holding various beliefs. Understanding a dialog response requires recognizing which beliefs are being justified and inferring the unstated but necessary beliefs that are part of the justification. And producing a response requires determining which beliefs need to be justified and constructing the reasoning chains that justify holding these beliefs. This paper presents a knowledge-structure approach to these tasks. It shows how participants can use general, commonsense planning heuristics to recognize which reasoning chains are being used, and to construct the reasoning chains that justify their beliefs. Our work differs from other work on understanding dialog responses in that we focus on recognizing justifications for beliefs about a participant's plans and goals, rather than simply recognizing the plans and goals themselves. And our work differs from other work on producing dialog responses in that we rely solely on domain-independent knowledge about planning, rather than on domain- or task-specific heuristics. This approach allows us to recognize and formulate novel belief justifications.
Fictional Narrative Comprehension
An analysis of the structure of sentences found in fictional text, and of the interpretation that one gives to them, has led to the proposal that all fictional text is written from a perspective within the fictional world of the story. In a like manner, readers read the story from a similar perspective. The author "pretends" to be in the story by locating an image of him- or herself somewhere within the space-time of the story (even at times within characters of the story) and creates the sentences from that vantage point. The story and its sentences must contain cues so that readers can use the text to discover the perspectival sources of the sentences. They can then pretend that they are "in" the story and can read it from those perspectives. The perspective from which the sentences are read is called the "Deictic Center." This proposal is associated with ongoing research to implement a cognitive model which reads fictional text according to these principles.
Thematic Roles and Pronoun Comprehension
Two experiments tested the view that thematic role information triggers the rapid retrieval of general knowledge in pronoun comprehension. Pairs of thematic roles were contrasted as antecedents of a subsequent pronoun. The results showed that interpretation of the pronoun depended on the thematic role of the antecedent. Experiment one measured reading rates for the clause which contained the pronoun. Rates were faster when the antecedent was an Agent subject, a Patient object, a Goal, or an Experiencer. Rates were slower when the antecedent was an Agent object, a Patient subject, a Source, or a Stimulus. Experiment two required subjects to write completions to sentence fragments such as "Jill likes Sue and she...", and the number of references to each antecedent was recorded. The results confirmed the findings from Experiment one, although there was also an antecedent position effect (first vs. second mention) in some of the sentences. We suggest that these results are consistent with the view that thematic role information triggers the retrieval of canonical events in the real world, and may thus be responsible for the rapid retrieval of general knowledge in language comprehension.
Parallel Processes During Question Answering
Question answering involves several processes: representation of the question concept, identification of the question type, memory search and/or inference generation, and output. Researchers tend to view these processes as stages and have developed primarily serial models of question answering. Word-by-word reading times of questions, however, suggest that some processing is done in parallel. Questions were read more slowly but answered more quickly when the question type was apparent from the first question word (the usual English construction of a question) than when the question word came last. Serial models cannot explain such data easily. It is argued that the processes associated with a particular question type are active during processing of the question concept and that they can direct memory search during question parsing. Some parallel models of question answering consistent with the data are discussed.
Spreading Activation in PDP Networks
One argument in favor of current PDP models has been that the availability of "hidden units" allows the system to create an internal representation of the input domain and to use this representation in producing output. The "microfeatures" learned by sets of hidden units, it has been argued, provide an alternative to symbols for certain reasoning tasks. In this paper we try to further this argument by demonstrating several results that indicate that such representations are formed. We show that by using a spreading activation model over the weights learned by networks trained via backpropagation, we can model certain cognitive effects. In particular, we show some results in the areas of modeling phoneme confusions and handling word-sense disambiguation, and some preliminary results demonstrating that priming effects can be modeled by this activation-spreading approach.
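As a rough illustration of the approach's central step, the sketch below spreads decayed activation over a fixed association matrix standing in for learned network weights. The nodes, weights, and update rule are invented; the paper's actual propagation scheme may differ.

```python
# Hedged sketch of spreading activation over a weight matrix. The weights
# are invented stand-ins for weights a trained network might supply; the
# update rule (fan-out with decay, then normalization) is one standard
# choice, not necessarily the paper's.
import numpy as np

nodes = ["bank", "money", "river", "loan", "water"]
W = np.array([            # symmetric association strengths (illustrative)
    [0.0, 0.8, 0.6, 0.5, 0.1],
    [0.8, 0.0, 0.0, 0.7, 0.0],
    [0.6, 0.0, 0.0, 0.0, 0.9],
    [0.5, 0.7, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.9, 0.0, 0.0],
])

def spread(source, steps=5, decay=0.5):
    a = np.zeros(len(nodes))
    a[nodes.index(source)] = 1.0
    for _ in range(steps):
        a = a + decay * (W @ a)      # each node passes decayed activation out
        a = a / a.max()              # keep activations bounded
    return dict(zip(nodes, a.round(2)))

# Priming "money" should raise "loan" and the financial sense of "bank"
# more than "river" or "water".
print(spread("money"))
```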
Paper Presentations -- Group 3: Vision
A Connectionist Model of Attentional Enhancement and Signal Buffering
A connectionist/control simulation of attentional enhancement, signal maintenance, and buffering of information is described. The system implements a hybrid connectionist architecture incorporating auto-association in the hidden layer and gain control on the hidden and output layers. The structure of the model parallels major features of modular cortical structure. The attentional selection simulations show that as one channel is attenuated, the system exhibits attentional capture in which only the more intense stimulus is transmitted to higher levels. The signal maintenance simulations show that small levels of auto-associative feedback can faithfully maintain short bursts of input for extended periods of time. With high auto-associative feedback, one module can buffer information from a previous transmission while the module blocks the interference resulting from concurrent transmissions. The combination of auto-associative feedback and gain control allows extensive control of information flow in a modular connectionist architecture.
Visual Search as Constraint Propagation
A handful of prominent theories have been proposed to explain a large quantity of experimental data on visual attention. We are developing a connectionist network model of visual attention which provides an alternative theory of attention based on computational principles. In this paper, we describe aspects of the model relevant to the dependence of visual search times on display size (the number of objects in the stimulus image). Duncan's stimulus similarity theory provides the characterization of the experimental data which we use in simulating and evaluating our model. The characteristics of the network model that support the continuously varying dependence of search time on display size are the constraint propagation search implemented by a winner-take-all mechanism in the attention layer, and the lateral inhibition network within each primitive feature map, which provides the feature contrast needed to filter out background textures. We report the results of simulations of the model, which agree with experimental data on visual attention in human subjects.
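A minimal sketch of the winner-take-all idea, assuming a simple self-excitation and mutual-inhibition update; the saliences and constants are invented, and the paper's network is considerably richer (feature maps with lateral inhibition feeding the attention layer).

```python
# Hedged sketch of an iterative winner-take-all (WTA) attention layer:
# each unit excites itself and inhibits its competitors until one unit
# remains active. All numbers are illustrative.
import numpy as np

def winner_take_all(salience, self_excite=1.2, inhibit=0.3, steps=50):
    a = np.array(salience, dtype=float)
    for _ in range(steps):
        total = a.sum()
        a = self_excite * a - inhibit * (total - a)  # inhibited by all rivals
        a = np.clip(a, 0.0, None)                    # activations stay >= 0
        if (a > 0).sum() <= 1:                       # one winner left
            break
    return a

# Item 2 is most salient, so attention should converge on it.
print(winner_take_all([0.4, 0.5, 0.9, 0.3]))
```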
On the Evolution of a Visual Percept
Information processing systems face a continual challenge of extracting only the most important information from the environment, resulting in awareness of some but not all of the available information. The current study investigates the psychophysical determinants of awareness. It examines the hypothesis that the relative energy level of a stimulus is the critical factor in determining what comes to consciousness. In Experiments 1 and 3, two conceptually incompatible stimuli (one lexical, one pictorial) are presented successively in the same position of a tachistoscope for only 1 msec each, with a zero interstimulus interval. Observers report seeing one or the other, or nothing at all, at a more-or-less chance level when the stimulus durations are equal. As soon as one stimulus is given as little as one-quarter to one-half a millisecond more duration than the other, the longer stimulus is reported on 68% to 84% of the test trials. In Experiment 2 an advantage for word perception at these low energy levels was investigated and measured. The results of these experiments indicate that extremely small differences in duration (about 0.25 or 0.50 msec) were sufficient to bring one concept to the consciousness of the observer at the expense of the other. Small stimulus intensity differences were investigated in Experiment 4, yielding similar results. The results can be accounted for by contemporary parallel-distributed-processing, connectionist network models of perception and cognition which use a winner-take-all decision rule.
A Computational Model of Visual Pattern Discrimination in Toads
It has been found behaviorally that visual habituation in toads exhibits locus specificity and partial stimulus specificity. Dishabituation among different configurations of visual worm stimuli forms an ordered hierarchy. This paper presents a computational model of the toad visual system involved in pattern discrimination, including retina, tectum, and anterior thalamus. In the model we propose that the toad discriminates visual objects based on temporal responses, and that anterior thalamus has differing representations of different stimulus configurations. This theory is developed through a large-scale neural simulation. With a minimum number of hypotheses, we demonstrate that anterior thalamus, in response to different worm stimuli, shows the same hierarchy as shown in the behavioral experiment. The successful simulation allows us to provide an explanation of neural mechanisms for visual pattern discrimination. The theory predicts that retinal R2 cells play a primary role in the discrimination via tectal small pear cells (SP), while R3 cells refine the feature analysis by inhibition. The simulation also demonstrates that the retinal response to the trailing edge of a stimulus is as crucial for pattern discrimination as the response to the leading edge. New dishabituation hierarchies are predicted by shrinking stimulus size and reversing stimulus-background contrast.
Binding and Type-Token Problems in Human Vision
Two computational problems which are trivial for symbol-manipulating systems but which pose serious challenges to connectionist networks are the binding problem and the type-token problem. These difficulties arise because representations in connectionist networks do not automatically (i) specify which features go with which object tokens, or (ii) distinguish between different tokens of the same type. Nevertheless, these processing shortcomings may constitute advantages when connectionist networks are taken as models of human visual information processing. Perception research shows evidence not only of binding errors, for example in Treisman's illusory conjunctions (Treisman and Schmidt, 1982), but also of type-token errors, as seen in repetition blindness (Kanwisher, 1987) and other phenomena.
Dynamic Binding: A Basis for the Representation of Shape by Neural Networks
A neural network model for object recognition based on Biederman's (1987) theory of Recognition by Components (RBC) is described. RBC assumes that objects are recognized as configurations of simple volumetric primitives called geons. The model takes a representation of the edges in an object as input and, as output, activates an invariant, entry-level representation of the object that specifies the object's component geons and their interrelations. Local configurations of image edges first activate cells representing local viewpoint-invariant properties (VIPs), such as vertices and 2-D axes of parallelism and symmetry. Once activated, VIPs are bound into sets through temporal synchrony in the firing patterns of cells representing the VIPs and image edges belonging to a common geon. The synchrony is established by a mechanism which operates only between pairs of (a) collinear, (b) parallel, and (c) coterminating edge and VIP cells. This design for perceptual organization through temporal synchrony is a major contribution of the model. A geon's bound VIPs activate independent representations of the geon's major axis and cross section, location in the visual field, aspect ratio, size, and orientation in 3-space. The relations among the geons in an image are then computed from the representations of the geons' locations, scales, and orientations. The independent specification of geon properties and interrelations uses representational resources efficiently and yields a representation that is completely invariant with translation and scale and largely invariant with viewpoint. In the final layers of the model, this representation is used to activate cells that, through self-organization, learn to respond to individual objects.
Caricature Recognition in a Neural Network
In a caricature drawing, the artist exaggerates the facial features of a person in proportion to their deviations from the average face. Empirically, it has been shown that caricature drawings are more quickly recognized than veridical drawings (Rhodes, Brennan, & Carey, 1987). Two competing hypotheses have been postulated to account for the caricature advantage. The caricature hypothesis claims that the caricature drawing finds a more similar match in memory than the veridical drawing because the underlying face representation is stored as an exaggeration. The distinctive features hypothesis claims that the caricature drawing produces speeded recognition by graphically emphasizing the distinctive properties that serve to individuate that face from other faces stored in memory. A computational test of the two hypotheses was performed by training a neural network model to recognize individual face vectors and then testing the model's ability to recognize both caricatured and veridical versions of the face vectors. It was found that the model produced a higher level of activation to caricature face vectors than to the non-distorted face vectors. The obtained caricature advantage stems from the model's ability to abstract the distinctive features from a learned set of inputs. Simulation results were therefore interpreted as support for the distinctive features hypothesis.
Equilateral Triangles: A Challenge for Connectionist Vision
In this paper we explore the problem of dynamically computing visual relations in a connectionist system. The task of detecting equilateral triangles from clusters of points is used to test our architecture. We argue that this is a difficult task for traditional feed-forward architectures although it is a simple task for people. Our solution implements a biologically inspired network which uses an efficient focus of attention mechanism and cluster detectors to sequentially extract the locations of the vertices.
The Computation of Emotion in Facial Expression Using the Jepson & Richards Perception Theory
Facial expressions are vital communicators of emotions, and it is in partial response to these expressions that we innately and accurately discern the emotional states of those around us. This paper identifies the activatable facial features that form the language of emotional expression in the face, and the set of emotions that each such feature tends to express. Finally, it is shown how the fault lattice perception theory [6] can be used to compute the emotion being registered on a face, given the configuration of the salient features. It is posited that the ability of a computer to make such interpretations would significantly enhance human-computer interaction.
Imagery and Problem Solving
In this paper we discuss the role of imagery in understanding problems and the processes of using images to solve problems. On the basis of two experiments and computer simulation, we show how subjects, in solving a particular problem, form mental images to represent a changing physical state. By "running" and watching the mental image they can draw qualitative conclusions about the situation, then derive a quantitative equation to solve the problem.
Psychological Simulation and Beyond
In this paper, we examine the suggestion that inferences about another person's state of mind can proceed by simulation. According to that suggestion, one performs such reasoning by imagining oneself in that person's state of mind, and observing the evolution of that imagined cognitive state. However, this simulation-based theory of psychological inference suffers from a number of limitations. In particular, whilst one can perhaps observe the probable effects of a given cognitive state by putting oneself in that state, one cannot thus observe its probable causes. The purpose of this paper is to propose a solution to this problem, within the spirit of the simulation-based theory of psychological inference. According to the indexing thesis, certain cognitive mechanisms required for non-psychological inference can be re-used for hypothesising psychological causes. The paper concludes by discussing some of the possible implications of the indexing thesis.
Formal Models for Imaginal Deduction
Systems with inherently spatial primitives have advantages over traditional sentential ones for representing spatial structure. The question is how such representations might be used in reasoning. This paper explores a simple kind of deductive reasoning where picture-like entities, instead of symbol-strings, are given first-class status. It is based on a model of deduction as the composition of mappings between sets, and allows generalized notions of unification and binding, which in turn permit the definition of various formal, "imaginal" deduction systems. The axioms and rules of inference are all pictures or fundamentally picture-based, and are used to derive pictorial "theorems". After sketching the generalized theory needed, several possible strategies are mentioned, and a prototype, the BITPICT computation system, is described in some detail.
Paper Presentations -- Group 4: Learning and Memory
Language Acquisition via Strange Automata
Sequential Cascaded Networks are recurrent higher-order connectionist networks which are used, like finite-state automata, to recognize languages. Such networks may be viewed as discrete dynamical systems (Dynamical Recognizers) whose states are points inside a multi-dimensional hypercube, whose transitions are defined not by a list of rules, but by a parameterized non-linear function, and whose acceptance decision is defined by a threshold applied to one dimension. Learning proceeds by the adaptation of weight parameters under error-driven feedback from performance on a teacher-supplied set of exemplars. The weights give rise to a landscape where input tokens cause transitions between attractive points or regions, and induction in this framework corresponds to the clustering, splitting, and joining of these regions. Usually, the resulting landscape settles into a finite set of attractive regions, and is isomorphic to a classical finite-state automaton. Occasionally, however, the landscape contains a "Strange Attractor" (e.g., Fig. 3g), to which there is no direct analogy in finite automata theory.
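A minimal sketch of the recognizer dynamics as the abstract characterizes them: each input token selects a parameterized nonlinear map on the state, and acceptance thresholds one state dimension. Here the weights are hand-set to accept strings ending in "b"; in the paper they are learned from exemplars.

```python
# Hedged sketch of a "dynamical recognizer". Weights are hand-set for a
# trivial language purely to show the machinery; learning would adapt them.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

N = 2                                    # state lives in the unit square
W = {                                    # one (N x (N+1)) map per input token
    "a": np.array([[0.0, 0.0, -4.0],     # bias drives dimension 0 low
                   [0.0, 0.0,  0.0]]),
    "b": np.array([[0.0, 0.0,  4.0],     # bias drives dimension 0 high
                   [0.0, 0.0,  0.0]]),
}

def accepts(string, theta=0.5):
    s = np.full(N, 0.5)                            # start in the center
    for tok in string:
        s = sigmoid(W[tok] @ np.append(s, 1.0))    # state transition
    return s[0] > theta                            # threshold one dimension

for w in ["ab", "ba", "abb", "aba"]:
    print(w, accepts(w))    # True exactly when the string ends in "b"
```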
Miniature Language Acquisition: A touchstone for cognitive science
Cognitive Science, whose genesis was interdisciplinary, shows signs of reverting to a disjoint collection of fields. This paper presents a compact, theory-free task that inherently requires an integrated solution. The basic problem is learning a subset of an arbitrary natural language from picture-sentence pairs. We describe a very specific instance of this task and show how it presents fundamental (but not impossible) challenges to several areas of cognitive science including vision, language, inference and learning.
The Disruptive Potential of Immediate Feedback
Three experiments investigate the influence of feedback timing on skill acquisition in the context of learning LISP. In experiment 1, subjects receiving immediate feedback went through the training material in 40% less time than did those receiving delayed feedback, and learning was not impaired. A second experiment involved the use of an improved editor and less supportive testing conditions. Though subjects in the immediate condition went through the training problems 18% faster than did those in the delay condition, they were slower on the test problems and made twice as many errors. The results of experiment 3, a partial replication of the first two experiments, indicated a general advantage for delayed feedback in terms of errors, time on task, and the percentage of errors that subjects self-corrected. A protocol analysis suggests that immediate feedback competes for working memory resources, forcing out information necessary for operator compilation. In addition, more delayed feedback appears to foster the development of secondary skills such as error detection and self-correction, skills necessary for successful performance once feedback has been withdrawn (Schmidt, Young, Swinnen, & Shapiro, 1989).
Learning The Structure of Event Sequences
How is complex sequential material acquired, processed, and represented when there is no intention to learn? Recent research (Lewicki, Hill & Bizot, 1988) has demonstrated that subjects placed in a choice reaction time task progressively become sensitive to the sequential structure of the stimulus material despite their unawareness of its existence. This paper aims to provide a detailed information-processing model of this phenomenon in an experimental situation involving complex and probabilistic temporal contingencies. We report on two experiments exploring a six-choice serial reaction time task. Unbeknownst to subjects, successive stimuli followed a sequence derived from "noisy" finite-state grammars. After considerable practice (60,000 exposures), subjects acquired a body of procedural knowledge about the sequential structure of the material, although they were unaware of the manipulation and displayed little or no verbalizable knowledge about it. Experiment 2 attempted to identify limits on subjects' ability to encode the temporal context by using more distant contingencies that spanned irrelevant material. Taken together, the results indicate that subjects become progressively more sensitive to the temporal context set by previous elements of the sequence, up to three elements. Responses are also affected by carry-over effects from recent trials. A PDP model that incorporates sensitivity to the sequential structure and carry-over effects is shown to capture key
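A sketch of the stimulus-generation scheme described above, assuming an invented three-state grammar: successive positions follow the grammar except on a fraction of "noisy" trials, where a random position is substituted.

```python
# Hedged sketch of "noisy" finite-state-grammar stimulus generation.
# The grammar below is invented for illustration, not the study's.
import random

random.seed(1)
# transitions: state -> list of (next_state, emitted_position)
GRAMMAR = {
    0: [(1, 2), (2, 5)],
    1: [(2, 1), (0, 4)],
    2: [(0, 3), (1, 6)],
}
POSITIONS = [1, 2, 3, 4, 5, 6]   # the six response alternatives

def generate(n_trials, noise=0.15):
    seq, state = [], 0
    for _ in range(n_trials):
        state, pos = random.choice(GRAMMAR[state])
        if random.random() < noise:
            # noisy trial: a random position, though the grammar
            # state still advances underneath
            seq.append(random.choice(POSITIONS))
        else:
            seq.append(pos)
    return seq

stimuli = generate(60000)    # the abstract reports ~60,000 exposures
print(stimuli[:20])
```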
Explanation-based Learning of Correctness: Towards a Model of the Self-Explanation Effect
Two major techniques in machine learning, explanation-based learning and explanation completion, are both superficially plausible models for Chi's self-explanation effect, wherein the amount of explanation given to examples while studying them correlates with the amount the subject learns from them. We attempted to simulate Chi's protocol data with the simpler of the two learning processes, explanation completion, in order to find out how much of the self-explanation effect it could account for. Although explanation completion did not turn out to be a good model of the data, we discovered a new learning technique, called explanation-based learning of correctness, that combines explanation-based learning and explanation completion and does a much better job of explaining the protocol data. The new learning process is based on the assumption that subjects use a certain kind of plausible reasoning.
Indexing Libraries of Programming Plans
This study extends the work of Druhan et al. (1989) and Mathews et al. (1989b) by applying their computational model of implicit learning to the task of learning artificial grammars (AG) without feedback. The ability of two induction algorithms, the forgetting algorithm, which learns by inducing new rules from presented exemplars, and the genetic algorithm, which heuristically explores the space of possible rules, to induce the grammar rules through experience with exemplars of the grammar is evaluated and compared with data collected from human subjects performing the same AG task. The computational model, based on Holland et al.'s (1986) induction theory, represents knowledge about the grammar as a set of partially valid condition-action rules that compete for control of response selection. The induction algorithms induce new rules that enter into competition with existing rules. The strengths of rules are modified by internally generated feedback. Strength accrues to those rules that best represent the structure present in the presented exemplars. We hypothesized that the forgetting algorithm would successfully learn to discriminate valid from invalid exemplars when the set of exemplars was high in family resemblance. We also proposed that the genetic algorithm would perform better than chance but not as well as the forgetting algorithm. Results supported those hypotheses. Interestingly, the Mathews et al. (1989a) subjects performed no better than chance on the same AG learning task. We concluded that this discrepancy between the simulation results and the human data is caused by interference from the unconstrained hypothesis generation of our human subjects. Support for this conclusion is two-fold: (1) subjects are able to learn the AG when the task is designed so that hypothesis generation is inhibited, and (2) informal inspection of verbal protocols from human subjects indicates they are generating and maintaining hypotheses of little or no validity.
Associative Memory-Based Reasoning: Some Experimental Results
Deduction, induction and analogy are considered as slightly different manifestations of one and the same reasoning process. A model of this reasoning process called associative memory-based reasoning is proposed. A computer simulation demonstrates that deduction, induction and analogy in problem solving could be performed by a single mechanism which combines the neural network approach with symbol level processing. Psychological experiments on priming effects in problem solving tasks have been carried out in order to test the hypothesis about the uniformity of human reasoning. In particular it has been shown that there are priming effects in all three cases (deduction, induction and analogy) and these priming effects decrease in the course of time which corresponds to the model's predictions based on the retrieval mechanism. The computer simulation demonstrates the same type of priming effects as observed in the psychological experiments.
Uniformity of Associative Impairment in Amnesia
The fragmentation model of associative memory has the attraction of specifying neither a spatial metaphor nor a symbolic representation for remembering. It was used in order to compare the recall of groups of unrelated words by amnesic and normal people. Similarly, a schema model was used in order to compare their recall of groups of related words. It was found that the impairment in remembering with amnesia revealed by these models was remarkably uniform rather than selective. This suggests that the level at which the memory storage system is damaged in amnesia is a relatively low one. In a connectionist formulation, this would presumably correspond to widespread random damage to units and the connections between them.
Predictive Utility In Case-Based Memory Retrieval
The problem of access to prior cases in memory is a central issue in case-based reasoning. Previous work on thematic knowledge structures has shown that using a complete exemplar of a thematic pattern allows access to the structure and related cases in memory. However, the knowledge and expectations provided by such structures can aid in planning and problem-solving. Therefore, to be most useful, the information should become available before the input pattern is complete. Retrieval must therefore be possible based on only a subset of the features present in the full thematic pattern. This study investigated whether a pattern that contains elements predicting an outcome, but not the outcome itself, would result in access comparable to that found when a full pattern is used. The results showed that subjects were less successful accessing the thematic structure using partial patterns than they were when using full patterns. However, remindings based on partial patterns occurred more often than would be expected by chance. We conclude that partial patterns contain some predictive features that can allow access to a thematic knowledge structure before the pattern is complete.
Episodic Memory in Connectionist Networks
A major criticism of backprop-based connectionist models (CMs) has been that they exhibit "catastrophic interference" when trained in a sequential fashion without repetition of groups of items; in terms of memory, such CMs seem incapable of remembering individual episodes. This paper shows that catastrophic interference is not inherent in the architecture of these CMs, and may be avoided once an adequate training rule is employed. Such a rule is introduced herein, and is used in a memory-modeling network. The architecture used is a standard, nonlinear, multilayer network, thus showing that the known advantages of such powerful architectures need not be sacrificed. Simulation data are presented, showing not only that the model shows much less interference than its backprop counterpart, but also that it naturally models episodic memory tasks such as frequency discrimination.
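The paper's training rule is not reproduced here; the sketch below only illustrates the phenomenon at issue, training a standard backprop network on one pattern set and then another so that recall of the first set can be re-measured. All patterns and sizes are invented.

```python
# Hedged sketch of catastrophic interference (the problem, not the fix):
# learn set A, then set B, and watch recall of A degrade.
import numpy as np

rng = np.random.default_rng(0)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

n_in, n_hid, n_out = 8, 6, 8
W1 = rng.normal(0, 0.5, (n_hid, n_in))
W2 = rng.normal(0, 0.5, (n_out, n_hid))

def train(pairs, epochs=2000, lr=0.5):
    global W1, W2
    for _ in range(epochs):
        for x, t in pairs:
            h = sigmoid(W1 @ x); y = sigmoid(W2 @ h)
            dy = (y - t) * y * (1 - y)
            dh = (W2.T @ dy) * h * (1 - h)
            W2 -= lr * np.outer(dy, h)
            W1 -= lr * np.outer(dh, x)

def error(pairs):
    return np.mean([np.abs(sigmoid(W2 @ sigmoid(W1 @ x)) - t).mean()
                    for x, t in pairs])

set_a = [(rng.integers(0, 2, n_in).astype(float),
          rng.integers(0, 2, n_out).astype(float)) for _ in range(4)]
set_b = [(rng.integers(0, 2, n_in).astype(float),
          rng.integers(0, 2, n_out).astype(float)) for _ in range(4)]

train(set_a); print("error on A after learning A:", round(error(set_a), 3))
train(set_b); print("error on A after learning B:", round(error(set_a), 3))
# the second figure is typically much worse: sequential training on B
# overwrites the weights that supported A
```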
On the Domain Specificity of Expertise in an Ill-Structured Domain
An important issue concerns the relation between expertise in highly structured domains and ill-structured domains. This study explored the information-processing abilities associated with expertise in literature, an ill-structured domain. Literary experts were superior to novices in gist-level recall, the extraction of interpretations, and the breadth of aspects addressed for literary texts, but not for a scientific text. The results indicate that expertise in literature appears to share features with expertise in highly structured domains, including domain-specificity and an abstract level of representation.
Paper Presentations -- Group 5: Cognition in Context
Learning from Indifferent Agents
In many situations, learners have the opportunity to observe other agents solving problems similar to their own. While not as favorable as learning from fully explained solutions, this has advantages over solving problems in isolation. In this paper we describe the general situation of learning from indifferent agents and outline a preliminary theory of how it may afford improved performance. Because one of our long-term goals is to improve educational methods, we identify a domain that allows us to observe humans learning from indifferent agents, and we summarize verbal protocol evidence indicating when and how humans learn.
Using Knowledge Representation to Study Conceptual Change in Students for Teaching Physics
Our goal is to understand the development of physics concepts in students. We take the perspective that individuals construct their own understanding so as to 'fit' their experiences. This constructive activity results in conceptions about the physical world. The major challenges in physics instruction, then, are the tasks of identifying and inducing change in students' conceptions about the physical world. Our efforts to understand the nature of conceptual change are aided by knowledge representation techniques. We present examples in which some of the finer structure of conceptual change is represented, which illustrate the potential of knowledge representation for studying conceptual change.
The Effect of Feedback Control on Learning to Program with the Lisp Tutor
Control and content of feedback were manipulated as students practiced coding functions with the Lisp Tutor. Four feedback conditions were employed: (1) immediate error feedback and correction, (2) immediate error flagging but immediate correction not required, (3) feedback on demand, and (4) no tutorial assistance. The wide range in feedback conditions did not affect mean learning rate as measured by individual production firing, time to complete the exercises, or post-test performance. However, post-test results were more highly correlated with student ability as tutorial assistance decreased across conditions. Feedback conditions also affected students' monitoring of the learning process. Across groups, students found the material easier and believed they had learned it better as assistance decreased across conditions. However, students who received more assistance estimated their mastery of the material more accurately. Finally, students reported relatively little preference for one tutoring condition over the others. Students who could exercise the most control over feedback reacted fairly passively to the tutoring conditions; students in condition 3 tended not to ask for much help, and students in condition 2 tended to correct errors immediately although it was not required.
Supporting Linguistic Consistency and Idiosyncrasy with an Adaptive Interface Design
Despite the goal of permitting freedom of expression, natural language interfaces remain unable to recognize the full range of language that occurs in spontaneously generated user input. Simply increasing the linguistic coverage of a large, static interface is a poor solution; as coverage increases, system response slows, regardless of whether the extensions benefit any particular user. Instead, we propose that an adaptive interface be dedicated to each user. By automatically acquiring the idiosyncratic language of each individual, an adaptive interface permits greater freedom of expression while slowing system response only insofar as there is ambiguity in the individual's language. The usefulness of adaptation relies on the presence of three regularities in users' linguistic behaviors: within-user consistency, across-user variability, and limited user adaptability. We show that these behaviors are characteristic of users under conditions of frequent use.
The Cognitive Space of Air Traffic Control: A Parametric Dynamic Topological Model
Recent observational work on controller behavior in simulations of air traffic control sessions suggests that controllers formulate and modify their plans in terms of clusters of aircraft, rather than individual aircraft, and that they cluster aircraft based on their closeness in an abstract cognitive space, rather than simple separation in physical space. A mathematical model of that space is presented as a background for further work to determine the cognitive strategies that controllers use to navigate that space. The model is topological in that neighborhood constraints play a central role; it is dynamic in that more than one topology interacts to define its essential characteristics; and it is parametric in that an entire class of spaces can be obtained by varying the values of some parameters.
Models of Neuromodulation and Information Processing Deficits in Schizophrenia
This paper illustrates the use of connectionist models to explore the relationship between biological variables and cognitive deficits. The models show how empirical observations about biological and psychological deficits can be captured within the same framework to account for specific aspects of behavior. We present simulation models of three attentional and linguistic tasks in which schizophrenics show performance deficits. At the cognitive level, the models suggest that a disturbance in the processing of context can account for schizophrenic patterns of performance in both attention- and language-related tasks. At the same time, the models incorporate a mechanism for processing context that can be identified with the function of the prefrontal cortex, and a parameter that corresponds to the effects of dopamine in the prefrontal cortex. A disturbance in this parameter is sufficient to account for schizophrenic patterns of performance in the three cognitive tasks simulated. Thus, the models offer an explanatory mechanism linking performance deficits to a disturbance in the processing of context which, in turn, is attributed to a reduction of dopaminergic activity in prefrontal cortex.
A Functional Role for Repression in an Autonomous, Resource-constrained Agent
We discuss the capabilities required by intelligent "agents" that must carry out their activities amidst the complexities and uncertainties of the real world. We consider important challenges faced by resource-constrained agents who must optimize their goal-directed actions within environmental and internal constraints. Any real agent confronts limits on the quality and amount of input information, knowledge of the future, access to relevant material in memory, availability of alternative strategies for achieving its current goals, etc. We specify a "minimalist" architecture for a resource-constrained agent, based on Global Workspace Theory (Baars, 1988) and on the research of Fehling (Fehling & Breese, 1988). We show how problem-solving and decision-making within such a system adapt to critical resource limitations that confront an agent. These observations provide the basis for our analysis of the functional role of repression in an intelligent agent. We show that active repression of information and actions might be expected to emerge and play a constructive role in our model of an intelligent, resource-constrained agent.
A Goal-Based Model Of Interpersonal Relationships
Interpersonal relationships are a pervasive dimension of human behavior and decision making. Actors make choices based both on personal goals and on goals derived from interpersonal relationships. We present a goal-based model of decision making that combines the motives of the actor with agendas adopted through relationships. A unifying feature of the model is the use of importance as a means of ranking both goals and relationships. We describe a computer simulation of the model in the domain of Congressional roll-call voting.
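A minimal sketch of the abstract's unifying idea, with invented goals, weights, and a toy roll-call example: both personal goals and relationship-derived agendas carry importance weights, and the chosen action is the one with the greatest weighted support.

```python
# Hedged sketch of importance-weighted, goal-based choice. All names,
# weights, and the voting scenario are invented for illustration.

personal_goals = {"lower_deficit": 0.8, "please_district": 0.9}
relationships = {   # relationship -> (importance, goals urged by that party)
    "party_leader": (0.7, {"support_bill": 1.0}),
    "donor_group":  (0.4, {"oppose_bill": 1.0}),
}
satisfies = {       # which goals each vote would satisfy (invented)
    "yea": {"support_bill", "please_district"},
    "nay": {"oppose_bill", "lower_deficit"},
}

def score(vote):
    # personal goals count at their own importance...
    s = sum(w for g, w in personal_goals.items() if g in satisfies[vote])
    # ...adopted agendas are discounted by the relationship's importance
    for _, (imp, goals) in relationships.items():
        s += sum(imp * w for g, w in goals.items() if g in satisfies[vote])
    return s

print(max(satisfies, key=score))   # the vote with the highest weighted support
```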
Volition and Advice: Suggesting Strategies for Fixing Problems In Social Situations
Just as an abstract causal analysis of a plan's faults can suggest repair strategies that will eliminate those faults [6], so too an abstract causal account of how a problem arises in a social situation can suggest relevant advice to correct the problem. In the social world, most problems arise as results of agents' actions; the best way to fix such problems is to modify the behavior that produces the problem. The vocabulary of volition developed in this paper is proposed as an abstract level of motivational analysis useful for discriminating among strategies for changing behavior. Volitional analysis focuses on the agents involved in an action. In addition to the actor, there is often a motivator agent who influences the actor and sometimes a third-party agent used as a tool by the motivator. If any of these agents can be swayed, the problematic action may be avoided. By identifying these agents and classifying the influences working on them, volitional analysis can suggest relevant modifications. The influences most often depend on the social context that links agents and establishes goal-generating themes. Behavior, however, is not always directly goal-governed, and volitional analysis recognizes these exceptional cases as well.
Poster Presentations
A Connectionist Approach to High-Level Cognitive Modeling
In this paper a connectionist framework is outlined which combines the advantages of symbolic and parallel distributed processing. With regard to the acquisition of cognitive skills by adult humans, symbolic computation is more strongly related to the early stages of performance, whereas parallel distributed processing is related to later, highly practiced performance. In order to model skill acquisition, two interacting connectionist systems are developed. The first system is able to implement symbolic data structures: it reliably stores and retrieves distributed activity patterns. It can also be used to match one activity pattern in parallel against all other stored patterns. This leads to an efficient solution of the variable binding problem and to parallel rule matching. A disadvantage of this system is that it can only focus on a fixed amount of knowledge at each moment in time. The second system, consisting of recurrent back-propagation networks, can be trained to process and to produce sequences of elements. After sufficient training with examples it possesses all the advantages of parallel distributed processing, e.g., the direct application of knowledge without interpreting mechanisms. In contrast to the first system, these networks can learn to hold sequentially presented information of varying length simultaneously active in a highly distributed (superimposed) manner. In earlier systems, like the model of past-tense learning by Rumelhart and McClelland, such encodings had to be done "by hand" with much human effort. These networks are also compared with the tensor product representation used by Smolensky.
Fuzzy Implication Formation in Distributed Associative Memory
An analysis is presented of the emergence of implicational relations within associative memory systems. Implication is first formulated within the framework of Zadeh's theory of approximate reasoning. In this framework, implication is seen to be a fuzzy relation holding between linguistic variables, that is, variables taking linguistic terms (e.g., "young", "very old") as values. The conditional expressions that obtain from this formulation may be naturally cast in terms of vectors and matrices representing the membership functions of the fuzzy sets that, in turn, represent the various linguistic terms and fuzzy relations. The resulting linear algebraic equations are shown to directly correspond to those that specify the operation of certain distributed associative connectionist memory systems. In terms of this correspondence, implication as a fuzzy relation can be seen to arise within the associative memory by means of the operation of standard unsupervised learning procedures. That is, implication emerges as a simple and direct result of experience with instances of events over which the implicational relationship applies. This is illustrated with an example of emergent implication in a natural coarsely coded sensory system. The percepts implied by sensory inputs in this example are seen to exhibit properties that have, in fact, been observed in the system in nature. Thus, the approach appears to have promise for accounting for the induction of implicational structures in cognitive systems.
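The paper works with linear matrix memories; the sketch below instead uses the classical fuzzy-set machinery (an outer-product relation and max-min composition) purely to illustrate how an implication between linguistic terms can be stored by a Hebbian-style step and then applied. Membership values are invented.

```python
# Hedged sketch: store a fuzzy implication ("red implies ripe") as an
# outer-product relation, then infer a consequent by max-min composition.
import numpy as np

red  = np.array([0.0, 0.3, 0.7, 1.0])   # membership of "red" over color levels
ripe = np.array([0.1, 0.5, 0.9])        # membership of "ripe" over ripeness levels

# Hebbian-style learning step: R[i, j] = min(red[i], ripe[j]).
R = np.minimum.outer(red, ripe)

# Compositional rule of inference: given an observed color profile,
# the implied ripeness profile is max_i min(input[i], R[i, j]).
very_red = np.array([0.0, 0.1, 0.5, 1.0])
inferred = np.max(np.minimum(very_red[:, None], R), axis=0)
print(inferred)    # a fuzzy ripeness profile close to "ripe"
```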
Scenes from Exclusive-Or: Back Propagation is Sensitive to Initial Conditions
This paper explores the effect of initial weight selection on feed-forward networks learning simple functions with the back-propagation technique. We first demonstrate, through the use of Monte Carlo techniques, that the magnitude of the initial condition vector (in weight space) is a very significant parameter in convergence time variability. In order to further understand this result, additional deterministic experiments were performed. The results of these experiments demonstrate the extreme sensitivity of back propagation to initial weight configuration.
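A sketch of the kind of Monte Carlo experiment reported: train small backprop networks on XOR from random initial weights of varying magnitude and record convergence. The learning rate, convergence criterion, and trial counts are invented, and the exact numbers will differ from the paper's.

```python
# Hedged sketch: Monte Carlo study of XOR convergence vs. initial
# weight magnitude, with a 2-2-1 network and online backprop.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([0, 1, 1, 0], float)
def sigmoid(z): return 1 / (1 + np.exp(-z))

def epochs_to_converge(scale, rng, lr=0.5, max_epochs=10000):
    W1 = rng.uniform(-scale, scale, (2, 3))   # hidden weights (incl. bias)
    W2 = rng.uniform(-scale, scale, 3)        # output weights (incl. bias)
    for epoch in range(max_epochs):
        worst = 0.0
        for x, t in zip(X, T):
            xb = np.append(x, 1.0)
            h = sigmoid(W1 @ xb); hb = np.append(h, 1.0)
            y = sigmoid(W2 @ hb)
            dy = (y - t) * y * (1 - y)
            dh = W2[:2] * dy * h * (1 - h)
            W2 -= lr * dy * hb
            W1 -= lr * np.outer(dh, xb)
            worst = max(worst, abs(y - t))
        if worst < 0.4:            # lax convergence criterion
            return epoch
    return None                    # failed to converge in time

rng = np.random.default_rng(0)
for scale in [0.1, 1.0, 5.0]:      # initial-weight magnitudes to compare
    runs = [epochs_to_converge(scale, rng) for _ in range(10)]
    ok = [r for r in runs if r is not None]
    print(scale, "converged:", len(ok), "median epochs:",
          int(np.median(ok)) if ok else "-")
```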
A Computational Model Of Attentional Requirements In Sequence Learning
This paper presents a computational model of attentional requirements in sequence learning. The structure of a keypressing sequence affects subjects' abilities to learn the sequence in a dual-task paradigm (Cohen, Ivry, & Keele, 1990). Sequences containing unique associations among successive positions (i.e., 1-5-4-2-3) are learned during distraction. Sequences containing repeated positions with ambiguous associations (i.e., 3-1-2-1-3-2) are not learned during distraction. Cohen et al. proposed two fundamental operations in sequence learning. An associative mechanism mediates learning of the unique patterns (1-5-4-2-3). These associations do not require attention to be learned. Such an associative mechanism is poorly suited for learning the sequence with repeated elements and ambiguous associations. These sequences must be parsed and organized in a hierarchical manner. This hierarchical organization requires attention. The simulations reported in this paper were run on an associative model of sequence learning developed by Jordan (1986). Sequences of differing structures were presented to the model under two conditions: unparsed, and parsed into subsequences. The simulations closely modeled the keypressing task used by Cohen, Ivry and Keele (1990). The simulations (1) replicate the empirical findings, and (2) suggest that imposing hierarchical organization on sequences with ambiguous associations significantly improves the model's ability to learn those sequences. Implications for the analysis of fundamental computations underlying a system of skilled movement are discussed.
Harmonic Grammar - A Formal Multi-Level Connectionist Theory of Linguistic Well-Formedness: An Application
We describe harmonic grammar, a connectionist-based approach to formal theories of linguistic well-formedness. The general approach can be applied to various kinds of linguistic well-formedness, e.g., phonological and syntactic. Here, we address a syntactic problem: unaccusativity. Harmonic grammar is a two-level theory, involving a distributed, lower-level connectionist network whose relevant aggregate computational behavior is described by a local, higher-level network. The central hypothesis is that the connectionist well-formedness measure called "harmony" can be used to model linguistic well-formedness; what is crucial about the relation between the lower and higher level networks is that there is a harmony-preserving mapping between them: they are isoharmonic (at least approximately). A companion paper (Legendre, Miyata, & Smolensky, 1990; henceforth "LMS2") describes the theoretical basis for the two-level approach, starting from general connectionist principles. In this paper, we discuss the problem of unaccusativity, give a high-level characterization of harmonic syntax, and present a higher-level network to account for unaccusativity data in French. We interpret this network as a fragment of the grammar and lexicon of French expressed in "soft rules." Of the 760 sentence types represented in our data, the network correctly predicts acceptability in all but two cases. This coverage of real, problematic syntactic data greatly exceeds that of any other formal account of unaccusativity of which we are aware.
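The core quantity can be stated compactly: the harmony of an activation pattern a over weights W is H(a) = a^T W a, and the best-formed analysis is the most harmonic one. The sketch below uses invented features and soft-rule weights, not the paper's French unaccusativity network.

```python
# Hedged sketch of harmony maximization over candidate analyses.
# Features, weights ("soft rules"), and candidates are invented stand-ins.
import numpy as np

features = ["verb_telic", "arg_agentive", "aux_etre", "aux_avoir"]
W = np.array([            # soft rules: positive = compatible, negative = penalty
    [ 0.0, -0.5,  0.8, -0.8],
    [-0.5,  0.0, -0.7,  0.9],
    [ 0.8, -0.7,  0.0, -1.0],
    [-0.8,  0.9, -1.0,  0.0],
])

def harmony(a):
    return a @ W @ a          # H(a) = a^T W a

candidates = {                # alternative analyses of one sentence
    "unaccusative (etre)": np.array([1.0, 0.0, 1.0, 0.0]),
    "unergative (avoir)":  np.array([1.0, 0.0, 0.0, 1.0]),
}
for name, a in candidates.items():
    print(name, round(float(harmony(a)), 2))
print("preferred:", max(candidates, key=lambda k: harmony(candidates[k])))
```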
PDP Models for Meter Perception
A basic problem in music perception is how a listener develops a hierarchical representation of the metric structure of music of the sort proposed in the generative theory of Lerdahl & Jackendoff (1983). This paper describes work on a constraint satisfaction approach to the perception of the metric structure of music in which many independent "agents" respond to particular events in the music, and in which a representation of the metric structure emerges as a result of distributed local interactions between the agents. This approach has been implemented in two PDP simulation models that instantiate the constraints in different ways. The goal of this work is to develop psychologically and physiologically plausible models of meter perception.
Some Criteria for Evaluating Designs
Most non-trivial design tasks are under-specified, which makes evaluating designs subjective and problematic. In this paper, we address the evaluation criteria that are left implicit in problem specifications. We propose that these criteria evaluate designs in terms of specific types of consistency and completeness. In particular, we divide consistency into constraint, representational, and goal consistency, and we decompose completeness into the specificity, depth, and breadth of a solution. These distinctions are useful because they organize criteria for evaluating designs. This model of evaluation is largely implemented in a program called JULIA that plans the presentation and menu of meals to satisfy multiple, interacting constraints.
Ad-Hoc, Fail-Safe Plan Learning
Artificial Intelligence research has traditionally treated planning, execution, and learning as independent, sequential subproblems decomposing the larger problem of intelligent action. Recently, several lines of research have challenged the separation of planning and acting. This paper suggests that integration with planning and acting is also important for learning. We present an integrated system, SCAVENGER, combining an adaptive planner with an ad-hoc learner. Situated plans are retrieved from memory; adaptation during execution extends these plans to cope with contingencies that arise and to tease out descriptions of situations to which these plans pertain. These changes are then integrated into the plan and incorporated into memory. Every situation of action is an opportunity for learning. Adaptive planning makes learning fail-safe by compensating for imperfections and omissions in learning and variability across situations. We discuss a learning example in the domain of mechanical devices.
Networks Modeling The Involvement of the Frontal Lobes in Learning and Performance of Flexible Movement Sequences
Networks that model the planning and execution of goal directed sequences of movements are described, including the involvement of both the prefrontal cortex and the corpus striatum. These networks model behavioral data indicating that frontal damage does not disrupt the learning and performance of an invariant sequence of movements. If the order of performance of the movements is allowed to vary, however, frontal
Discovering Grouping Structure in Music
GTSIM, a computer simulation of Lerdahl and Jackendoff's (1983) A Generative Theory of Tonal Music, is a model of human cognition of musical rhythm. GTSIM performs left-to-right, single-pass processing on a symbolic representation of information taken from musical scores. A rule-based component analyzes the grouping structure, which is the division of a piece of music into units like phrases and the combination of these phrases into motives, themes, and the like. The resulting analysis often diverges from the analysis we would produce using our musical intuition; we explore some of the reasons for this. In particular, GTSIM needs to have an algorithm for determining parallel structures in music. We consider alphabet encoding (Deutsch and Feroe, 1981) and discrimination nets (Feigenbaum and Simon, 1984) as algorithms for parallelism.
Cross-Domain Transfer of Planning Strategies: Alternative Approaches
We discuss the problem of transferring learned knowledge across domains, and characterize two possible approaches. Transfer through reoperationalization involves learning concepts in a domain-specific form and transferring them to other domains by recharacterizing them in each domain as necessary. Abstraction-based transfer involves learning concepts at a high level of abstraction to facilitate transferring them to other domains without recharacterization. We discuss these approaches and present an example of the abstraction-based transfer of a method of projection, or selective lookahead, from the game of chess to the game of checkers, as implemented in our test-bed system for failure-driven learning in planning domains. We then discuss a continuum of abstraction to characterize learned concepts, and propose a corresponding continuum characterizing the time at which the computation necessary for cross-domain transfer is accomplished.
The Effect of Alternative Representations of Relational Structure on Analogical Access
Retrieval of an appropriate analogy from memory is often difficult because the structure common to two analogous domains is embedded in specific contexts that differ at the surface level. The present study examines an aspect of domain representations that may affect the access of analogs in memory. Subjects were asked to identify analogies between new and previously learned passages. Passages varied in the manner in which analogous relations were described. In all passages the relations were embedded in a particular context that was dissimilar at the surface level between analogs. However, the expression of relations within a passage varied in level of abstraction. In "abstract" passages, relations were lexicalized with relatively abstract terms and were described with little domain-specific detail. In "specific" passages, more specific terms were used and extensive domain-specific detail was given about how relations were instantiated within the domain. In "mixed" passages, both abstract and specific descriptions of relations were given. Subjects reading abstract passages were best at identifying analogies. The present results suggest that even though analogous relations are embedded in dissimilar contexts, the way in which those relations themselves are represented can affect analogical access. Subjects are relatively successful at analogical access when the relations are represented in a relatively general and sparse form.
What Should I Do Now? Using Goal Sequitur Knowledge to Choose the Next Problem Solving Step
Many problems require multi-step solutions. This is true of both planning and diagnosis. How can a problem solver best generate an ordered sequence of actions to resolve a problem? In many domains, complete pre-planning is not an option because the results of steps can vary; a large tree of possible sequences would thus have to be generated. We propose a method that integrates the use of previous plans or cases with the use of knowledge of relationships between goals, and the use of reasoning from domain knowledge, to incrementally suggest the actions to take. The suggestion process is constrained by heuristics that specify the circumstances under which an instance of a particular reasoning goal can follow from an instance of other reasoning goals. We discuss the general approach, then present the suggestion methods and the constraints.
A Case-Based Approach to Creativity in Problem Solving
One of the major activities creative problem solvers engage in is exploration and evaluation of alternatives, often adapting and merging several possibilities to create a solution to the new problem. We propose a process that models this activity and discuss the requirements it puts on representations and reasoning processes and present a program that solves problems by following this procedure.
Invited Symposia
Designing an Integrated Architecture
Artificial intelligence has progressed to the point where multiple cognitive capabilities are integrated into computational architectures, such as SOAR, PRODIGY, THEO, and ICARUS. This paper reports on the PRODIGY architecture, describing its planning and problem solving capabilities and touching upon its multiple learning methods. Learning in PRODIGY occurs at all decision points and integration in PRODIGY is at the knowledge level; the learning and reasoning modules produce mutually interpretable knowledge structures. Issues in architectural design are discussed, providing a context to examine the underlying tenets of the PRODIGY architecture.
Execution-time Response: Applying plans in a dynamic world
This panel is aimed at the issue of how to use and modify plans during the course of execution. The relationship between a plan and the actions that an agent takes has generated a great deal of interest in the past few years. This is, in part, a result of the realization that planning in the abstract is an intractable problem and that much of the complexity of behavior is best understood in terms of the complexity of the environment in which that behavior occurs. This panel presents five distinct personalities and approaches to this problem:
• Agre looks at replacing "planning" with situated activity. In particular, he has been considering the problems involved with the reference assumptions of classical planning.
• Firby's hierarchical planner has primitive actions that are instantiated at execution time. The execution of these primitives generates information that can be used to guide selection of later operators.
• In Alterman's model of run-time adaptation, the executive responds to failures by using external cues to move between alternative steps or approaches stored in an existing network of semantic/episodic information.
• Simmons has been exploring techniques to create robust, reactive systems that can handle multiple tasks in spite of the robot's limited sensors and processors. His approach takes full advantage of the resources that the robot does have, including hierarchical coarse-to-fine control strategies, concurrency whenever feasible, and explicit focusing of attention on the robot's tasks and monitored conditions.
• Hammond suggests a theory of agency which casts planning as embedded within a memory-based understanding system connected to the environment. Within this approach, the environment, plan selections, decisions, conflicts, and actions are viewed through the single eye of situation recognition and response.