About
The annual meeting of the Cognitive Science Society showcases basic and applied research in cognitive science. The conference presents the latest theories and data from the world's leading cognitive science researchers. Each year, in addition to submitted papers, invited researchers highlight selected aspects of cognitive science.
Volume 18, 1996
Invited Symposia
Imaging Studies of Vision, Attention and Language
Variation in second language acquisition is evident from the earliest stages. This study examined effects of learning tasks (retrieval practice, comprehension, verbal repetition) on comprehension of Turkish as a new language. Undergraduates (N = 156) engaged with Turkish spoken dialogues in a computer-assisted language learning session via Zoom, with learning tasks manipulated between subjects. Participants completed pre/posttests assessing comprehension of Turkish number and case marking, a vocabulary test, and open-response questions gauging explicit awareness. The retrieval-practice group showed the highest performance overall, after controlling for significant effects of nonverbal ability and pretest. For comprehension of number/case marking, the comprehension group performed comparably to the retrieval-practice group. For vocabulary comprehension, the verbal-repetition group performed comparably to the retrieval-practice group. Differential performance associated with learning tasks indicates benefits of testing and production and aligns with transfer-appropriate processing. As predicted by the noticing hypothesis, explicit awareness of number and case marking correlated with comprehension accuracy.
Primitives as a basis for movement synthesis
Recent data from spinal frogs and mammals suggest that movements may be constructed from a standard set of primitives which represent postures and force patterns around postures. These postural primitives may be combined for movement synthesis and may also interact non-linearly. New data show that the set of primitives may also contain a collection of members which encapsulate aspects of movement control and dynamics. The linear interactions, non-linear interactions, and dynamic controls provide a means of bootstrapping motor learning. The non-linear interactions enable a basic pattern generator and a reflex functionality which can be parameterized and modified for elaboration of more complex behaviors.
Reinforcement Learning in Factories: The Auton Project
Factories are fascinating test-beds for integrated learning systems. In recent years their sensory capabilities have, in many cases, been advanced and integrated so that data from all over the plant is available in real time over a LAN. Here we discuss how reinforcement learning, and related machine learning methods, can take advantage of this information to learn to improve performance, to adapt to change, and to exploit databases of historical records or similar processes in different plants.
Submitted Symposia
Can Symbolic Algorithms Model Cognitive Development?
Symbolic decision-tree learning algorithms can provide a powerful and accurate transition mechanism for modeling cognitive development. They are valid alternatives to connectionist models.
Paper Presentations
Beyond Computationalism
By computationalism in cognitive science I mean the view that cognition essentially is a matter of the computations that a cognitive system performs in certain situations. The main thesis I am going to defend is that computationalism is only consistent with symbolic modeling or, more generally, with any other type of computational modeling. In particular, those scientific explanations of cognition which are based on (i) an important class of connectionist models or (ii) nonconnectionist continuous models cannot be computational, for these models are not the kind of system which can perform computations in the sense of standard computation theory. Arguing for this negative conclusion requires a formal explication of the intuitive notion of a computational system. Thus, if my thesis is correct, we are left with the following alternative: either we construe computationalism by explicitly referring to some nonstandard notion of computation, or we simply abandon the idea that computationalism is a basic hypothesis shared by all current research in cognitive science. I will finally suggest that a different hypothesis, dynamicism, may represent a viable alternative to computationalism. According to it, cognition essentially is a matter of the state evolutions that a cognitive system undergoes in certain situations.
Qualia: The Hard Problem
One issue that has been raised time and again in philosophy of mind and more recently in cognitive science is the question of qualia, or "raw feels." What are qualia, and how do they fit into the cognitive science conception of mind? We consider some of the classic qualia thought experiments and two proposed solutions to the qualia problem, eliminativism and content-dependence. While neither of these solutions is actually able to dismiss or explain qualia as claimed, the content-based solution does clarify the relation between cognitive science and qualia. Because qualia are precisely the part of our experience that is not related to informational content (and therefore not intersubjective), and cognitive science is primarily based on informational content, qualia are not within the domain of cognitive science.
Connectionism, Systematicity, and Nomic Necessity
In their provocative 1988 paper, Fodor and Pylyshyn issued a formidable challenge to connectionists, viz., to provide a non-classical explanation of the empirical phenomenon of systematicity in cognitive agents. Since the appearance of F&P's challenge, a number of connectionist systems have emerged which prima facie meet this challenge. However, Fodor and McLaughlin (1990) advance an argument, based upon a general principle of nomological necessity, to show that one of these systems (Smolensky's) could not satisfy the Fodor-Pylyshyn challenge. Yet, if Fodor and McLaughlin's analysis is correct, it is doubtful whether any existing connectionist system would fare better than Smolensky's. In the view of Fodor and McLaughlin, humans and classical architectures display systematicity as a matter of nomological necessity (necessity by virtue of natural law), but connectionist architectures do not. I argue, however, that the Fodor-Pylyshyn-McLaughlin appeal to nomological necessity is untenable. There is a sense in which neither classical nor connectionist architectures possess nomological (or 'nomic') necessity, and the sense in which classical architectures do possess nomic necessity applies equally well to at least some connectionist architectures. Representational constituents can have causal efficacy within both classical and connectionist architectures.
Fodor's New Theory of Content and Computation
In his new book, The Elm and the Expert, Fodor attempts to reconcile the computational model of human cognition with information-theoretic semantics, the view that semantic content consists of nothing more than causal or nomic relationships between words and the world, and intentional content of nothing more than causal or nomic relationships between brain states and the world. We do not challenge the project, not in this paper. Nor do we show that Fodor has failed to carry it out. Instead, we urge that his analysis, when made explicit, turns out rather differently than he thinks. In particular, where he sees problems, he sometimes shows that there is no problem. And while he says two conceptions of information come to much the same thing, his analysis shows that they are very different.
Integrating World Knowledge with Cognitive Parsing
The work presented in this article builds on the account of cognitive parsing given by the SOUL system (Konieczny & Strube, 1995), an object-oriented implementation of Parameterized Head Attachment (Konieczny et al., 1991) based on Head-Driven Phrase Structure Grammar (Pollard & Sag, 1994). We describe how the initial semantic representation proposed by the parser is translated into a logical form suitable for inference, thus making it possible to integrate world knowledge with cognitive parsing. As a semantic and knowledge representation system we use the most expressive implemented logic for natural language understanding, Episodic Logic (Hwang & Schubert, 1993), and its computational implementation, Epilog (Schaeffer et al., 1991).
The Role of Ontology in Creative Understanding
Successful creative understanding requires that a reasoner be able to manipulate known concepts in order to understand novel ones. A major problem arises, however, when one considers exactly how these manipulations are to be bounded. If a bound is imposed which is too loose, the reasoner is likely to create bizarre understandings rather than useful creative ones. On the other hand, if the bound is too tight, the reasoner will not have the flexibility needed to deal with a wide range of creative understanding experiences. Our approach is to make use of a principled ontology as one source of reasonable bounding. This allows our creative understanding theory to have good explanatory power about the process while allowing the computer implementation of the theory (the ISAAC system) to be flexible without being bizarre in the task domain of reading science fiction short stories.
Working Memory in Text Comprehension: Interrupting Difficult Text
We compare the effects of interrupting texts dealing with familiar or unfamiliar domains with either arithmetic or sentence-reading tasks. Readers were interrupted after each of the eight sentences of a text, either at the end or in the middle of each sentence. Previous findings of minimal effects of interruptive tasks on comprehension measures (e.g., Glanzer & Nolan, 1986) were replicated in this study. Also, as found by Glanzer and his colleagues, interruptions after each sentence of a familiar text by an unrelated sentence increased reading times by approximately 400 ms per sentence. In contrast, for difficult, unfamiliar texts, mid-sentence interruptions significantly lengthened reading times, by 1262 ms for sentence and 1784 ms for arithmetic interruptions. These findings are explained in terms of Ericsson and Kintsch's (1995) memory model, which proposes that skilled memory performance relies on the use of long-term memory as an extension of working memory, or long-term working memory.
Reasoning from multiple texts: An automatic analysis of readers' situation models
In reading multiple texts, a reader must integrate information from the texts with his or her background knowledge. The resulting situation model represents a rich elaborated structure of events, actions, objects, and people involved in the text organized in a manner consistent with the reader's knowledge. In order to evaluate a reader's situation model, a reader's summary must be analyzed in relation to texts the subject has read as well as to more general knowledge such as an expert's knowledge. However, this analysis can be both time-consuming and difficult. In this paper, we use an automatic approach called Latent Semantic Analysis (LSA) for evaluating the situation model of readers of multiple documents. LSA is a statistical model of word usage that generates a high-dimensional semantic space that models the semantics of the text. This paper describes three experiments. The first two describe methods for analyzing a subject's essay to determine from what text a subject learned the information and for grading the quality of information cited in the essay. The third experiment analyzes the knowledge structures of novice and expert readers and compares them to the knowledge structures generated by the model. The experiments illustrate a general approach to modeling and evaluating readers' situation models.
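The core of LSA can be sketched in a few lines: build a word-by-document count matrix, take its singular value decomposition, and compare items by cosine in the reduced space. The toy matrix, the choice of k, and the example words below are illustrative assumptions, not the paper's data:

```python
import numpy as np

# Minimal LSA sketch: SVD of a word-by-document count matrix,
# then cosine similarity in the reduced "semantic" space.
counts = np.array([
    [2, 0, 1, 0],   # "doctor"
    [1, 0, 2, 0],   # "nurse"
    [0, 2, 0, 1],   # "engine"
    [0, 1, 0, 2],   # "wheel"
], dtype=float)

U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2                                # retain the k largest singular values
word_vecs = U[:, :k] * s[:k]         # word coordinates in the semantic space

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(word_vecs[0], word_vecs[1]))  # related words: high similarity
print(cosine(word_vecs[0], word_vecs[2]))  # unrelated words: low similarity
```

A summary or essay vector (for instance, the sum of its word vectors) could then be compared against each source text in the same space, which is the spirit of the grading analyses described above.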
Lexical Limits on the Influence of Context
This paper introduces an approach to modeling the interpretation of semantically underspecified logical metonymies, such as John began the book. A distinctive feature of the theory presented is its emphasis on accounting for their behavior in discourse contexts. The approach depends on the definition of a pragmatic component which interacts in the appropriate manner with lexicosyntactic information to establish the coherence of a discourse. The infelicity of certain logical metonymy constructions in some discourses is shown to stem from the non-default nature of the lexicosyntactically determined interpretation for such constructions. The extent of the influence of contextual information from the discourse on the interpretation of logical metonymies is therefore constrained by the lexical properties of the constituents of the metonymies. Contextually-cued interpretations are shown to be unattainable when indefeasible lexical information conflicts with these interpretations.
Dynamics of Rule Induction by Making Queries: Transition Between Strategies
The induction of rules by making queries is a dynamical process based on seeking information. Experimenters typically look for one dominant strategy that is used by subjects, which may or may not agree with normative models of this psychological process. In this study we approach this problem from a different perspective, related to work in learning theory (see, for example, Baum, 1991; Freund et al., 1995). Using information theory in a Bayesian framework, we estimated the information gained by queries when the task is to find a specific rule in a hypothesis space. Assuming that at each point subjects have a preferred working hypothesis, we considered several possible strategies, and determined the best one so that information gain is maximized at each step. We found that when the confidence in the preferred hypothesis is weak, "Confirmation Queries" result in maximum information gain; the information gained by "Investigation Queries" is higher when the confidence in the preferred hypothesis is high. Considering the dynamical process of searching for the rule, starting with low confidence in the preferred hypothesis and gradually raising confidence, there should be a transition from the "Confirmation Strategy" to the "Investigative Strategy" as the search proceeds. If we assume that subjects update their beliefs regarding the task while performing, we would expect that the "Positive Confirmation Strategy" would yield more information at low confidence levels while the "Negative Confirmation Strategy" (simple elimination) would be more informative at higher confidence levels.
We tested subjects' performance in such a task, using a paradigm introduced by Wason (1960). All subjects first assumed a hypothesis and then made positive confirmation queries. Upon receiving confirmation, half the subjects presented negative confirmation queries and, later, half switched to investigative queries before attempting to guess the experimenter's rule. Also, the frequency of queries in the more 'advanced' strategies went down as the confidence level required to evoke the strategy went up. We conclude that subjects appear to be using different strategies at different stages of the search, which is theoretically optimal when queries are guided by a paradigm that maximizes information gain at each step.
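The query-selection analysis can be illustrated with a toy hypothesis space: the expected information gain of a yes/no query is the prior entropy minus the expected posterior entropy over the two answers. The rules, uniform prior, and probe items below are invented for illustration:

```python
import math

# Toy sketch of query selection by expected information gain.
hypotheses = {
    "even":          lambda n: n % 2 == 0,
    "multiple_of_4": lambda n: n % 4 == 0,
    "positive":      lambda n: n > 0,
    "power_of_2":    lambda n: n > 0 and (n & (n - 1)) == 0,
}
prior = {h: 1 / len(hypotheses) for h in hypotheses}

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_gain(query, dist):
    """Expected entropy reduction from asking whether `query` fits the rule."""
    gain = entropy(dist)
    for answer in (True, False):
        consistent = {h: p for h, p in dist.items()
                      if hypotheses[h](query) == answer}
        p_answer = sum(consistent.values())
        if p_answer > 0:
            posterior = {h: p / p_answer for h, p in consistent.items()}
            gain -= p_answer * entropy(posterior)
    return gain

for q in (8, 7, 6, -4):
    print(q, round(expected_gain(q, prior), 3))
```

Here the probe 8 is consistent with every rule, so as a pure confirmation query it yields zero information, while probes that split the hypothesis space more evenly (6 or -4) yield the most, mirroring the confirmation-versus-investigation contrast discussed above.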
The Impact of Information Representation on Bayesian Reasoning
Previous research on Bayesian inference, reporting poor performance by students and experts alike, has often led to the conclusion that the mind lacks the appropriate cognitive algorithm. We argue that this conclusion is unjustified because it does not take into account the information format in which this cognitive algorithm is designed to operate. We demonstrate that a Bayesian algorithm is computationally simpler when the information is represented in a frequency format rather than in the probability format used in previous research. A frequency format corresponds to the way information is acquired in natural sampling—sequentially and without constraints on which observations will be included in the sample. Based on the assumption that performance will reflect computational complexity, we predict that a frequency format yields more Bayesian solutions than a probability format. We tested this prediction in a study conducted with 48 physicians. Using outcome and process analysis, we categorized their individual solutions as Bayesian or non-Bayesian. When information was presented in the frequency format, 46% of their inferences were obtained by a Bayesian algorithm, as compared to only 10% when the problems were presented in the probability format. We discuss the impact of our results on teaching statistical reasoning.
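The computational contrast between the two formats can be made concrete. The disease and test figures below are hypothetical, chosen only to show that the frequency format needs a single division where the probability format needs the full machinery of Bayes' rule:

```python
# Frequency vs. probability format for the same Bayesian inference.

# Frequency format: raw counts, as acquired in natural sampling.
sick, healthy = 10, 990          # out of 1000 people
true_pos = 8                     # sick people with a positive test
false_pos = 95                   # healthy people with a positive test
p_freq = true_pos / (true_pos + false_pos)

# Probability format: the same information as normalized probabilities,
# requiring explicit multiplication and renormalization (Bayes' rule).
p_sick = 0.01
p_pos_given_sick = 0.8
p_pos_given_healthy = 95 / 990
p_prob = (p_sick * p_pos_given_sick) / (
    p_sick * p_pos_given_sick + (1 - p_sick) * p_pos_given_healthy)

print(round(p_freq, 3), round(p_prob, 3))  # identical answers, unequal effort
```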
On Reasoning with Default Rules and Exceptions
We report empirical results on factors that influence how people reason with default rules of the form "Most x's have property P", in scenarios that specify information about exceptions to these rules and in scenarios that specify default-rule inheritance. These factors include (a) whether the individual, to which the default rule might apply, is similar to a known exception, when that similarity may explain why the exception did not follow the default, and (b) whether the problem involves classes of naturally occurring kinds or classes of artifacts. We consider how these findings might be integrated into formal approaches to default reasoning and also consider the relation of this sort of qualitative default reasoning to statistical reasoning.
Satisficing Inference and the Perks of Ignorance
Most approaches to modeling rational inference do not take into account that in the real world, organisms make inferences under limited time and knowledge. In this tradition, the mind is treated as a calculating demon equipped with unlimited time, knowledge, and computational might. We propose a family of satisficing algorithms based on a simple psychological mechanism: one-reason decision making. These fast and frugal algorithms violate fundamental tenets of classical rationality; for example, they neither look up nor integrate all information. By computer simulation, we held a competition between the satisficing Take The Best algorithm and various more "optimal" decision procedures. The Take The Best algorithm matched or outperformed all competitors in inferential speed and accuracy. Most interesting was the finding that the best algorithms in the competition, those which used a form of one-reason decision making, exhibited a startling "less-is-more" effect: they performed better with missing knowledge than with complete knowledge. We discuss the less-is-more effect and present evidence of it in human reasoning. This counter-intuitive effect demonstrates that the mind can satisfice and seize upon regularities in the environment to the extent that it can exploit even the absence of knowledge as knowledge.
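A minimal sketch of one-reason decision making in the spirit of Take The Best follows; the cue values and validities are invented, and the algorithm's recognition and guessing steps are omitted for brevity:

```python
# Cues are ordered by validity; the first cue that discriminates decides.
cues = [            # (name, validity), highest validity first
    ("capital",     0.95),
    ("soccer_team", 0.87),
    ("industry",    0.78),
]

# Cue values per object: 1 = positive, 0 = negative, None = unknown.
objects = {
    "city_a": {"capital": 1, "soccer_team": None, "industry": 1},
    "city_b": {"capital": 0, "soccer_team": 1,    "industry": 1},
}

def take_the_best(a, b):
    """Infer which object scores higher on the criterion (e.g., population)."""
    for name, _validity in cues:
        va, vb = objects[a][name], objects[b][name]
        if va == 1 and vb != 1:
            return a          # first discriminating cue: stop and decide
        if vb == 1 and va != 1:
            return b
    return None               # no cue discriminates: guess

print(take_the_best("city_a", "city_b"))  # -> city_a (decided by "capital")
```

No further cues are looked up once a cue discriminates, which is exactly the sense in which the algorithm neither looks up nor integrates all information.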
A Connectionist Treatment of Negation and Inconsistency
A connectionist model capable of encoding positive as well as negated knowledge and using such knowledge during rapid reasoning is described. The model explains how an agent can hold inconsistent beliefs in its long-term memory without being "aware" that its beliefs are inconsistent, but detect a contradiction whenever inconsistent beliefs that are within a certain inferential distance of each other become co-active during an episode of reasoning. Thus the model is not logically omniscient, but detects contradictions whenever it tries to use inconsistent knowledge. The model also explains how limited attentional focus or action under time pressure can lead an agent to produce an erroneous response. A biologically significant feature of the model is that it uses only local inhibition to encode negated knowledge. The model encodes and propagates dynamic bindings using temporal synchrony.
Hearing with the eyes: A distributed cognition perspective on guitar song imitation
Many guitarists learn to play by imitating recordings. This style of learning allows guitarists to master both new songs and new techniques. To imitate a song, a guitarist repeatedly listens to a recording until the entire song, or the desired portion of it, can be reproduced. This kind of imitation can be a very difficult process, particularly if the recorded guitarist plays fast and other instruments are involved. Besides the difficulty of hearing the guitar part, the many different ways to finger and articulate the same notes and chords on a guitar can also make playing the music difficult. In this paper, we describe some of the knowledge guitarists use to minimize these difficulties. We then propose an external representation that guitarists can use to unload some of the cognitive burden imposed by the imitation process. This external representation — the bar chord — transforms many of the imitation activities from those requiring both internal computations and memory to those that require the guitarist to merely look and see the desired results. Moreover, bar chords facilitate the social distribution of these individual benefits. This research contributes to the growing field of distributed cognition and to our understanding of both internal and external representations used during music learning and improvisation.
Constraints on the experimental design process in real-world science
The goal of the research reported in this paper is to uncover the cognitive processes involved in designing complex experiments in contemporary biology. Models of scientific reasoning often assume that the experimental design process is primarily theoretically constrained. However, designing an experiment is a very complex process in which many steps and decisions must be made even when the theory is fully specified. We uncover a number of crucial cognitive steps in experimental design by analyzing the design of an experiment at a meeting of an immunology laboratory. Based on our analysis, we argue that experimental design involves the following processes: unpacking and specifying slots in possible experimental designs, locally evaluating specific components of proposed designs, and coordinating and globally evaluating possible experimental designs. Four sets of criteria guide local and global evaluation: ensuring a robust internal structure to the experiment, optimizing the likelihood that experiments will work, performing cost/benefit analyses on possible design components, and ensuring acceptance of results by the scientific community. Our analyses demonstrate that experimental design is constrained by many non-theoretical factors. In particular, the constant threat of error in experimental results lies behind many of the strategies scientists use.
Teaching/Learning Events in the Workplace: a Comparative Analysis of their Organizational and Interactional Structure
It is widely acknowledged that teaching and learning are organized quite differently in and out of school settings. This paper describes two strips of interaction, selected from a data corpus that documents naturally-occurring work in adult settings often considered to be targets for science and mathematics education. In the first strip (civil engineering), we follow how engineers with different levels of organizational responsibility use an evaluative term, "brutal," in relation to features of a proposed roadway design. In the second strip (field biology), we follow participants' initially conflicting uses of the register terms, "difference" and "distance," as they collaborate across disciplinary specialties. In both cases, disagreements about the use of terms are detected in ongoing interaction, alternative meanings are actively assembled across different types of media, and disagreements are resolved around pre-existing organizational asymmetries. We raise three general questions about teaching/learning in the workplace: (i) What is accessible to participants as teachers/learners under different organizational conditions; (ii) How are disagreements about shared meaning managed, given asymmetries between participants in these events; and (iii) What do these kinds of studies tell us about the acquisition of word meaning as an unproblematic relation between term and referent?
Distributed Reasoning: An Analysis of Where Social and Cognitive Worlds Fuse
The goal of this paper was to examine the influence of social and cognitive factors on distributed reasoning within the context of scientific laboratory meetings. We investigated whether a social factor, status, and cognitive factors such as discussion topic and time orientation of the research influenced distributed reasoning. The impact of status on distributed reasoning was examined using three lab meetings in which a technician presented (low status) and three lab meetings in which a graduate student presented (high status). Two cognitive variables were also examined: the focus of the discussion topic (theory, method, findings, and conclusions) and the time orientation of the distributed reasoning (past, current, and future research). Pooled (cross-sectional/time-series) analysis, a regression technique, was used to perform the analyses. We found that the status of the presenter influenced the structure of distributed reasoning: when the presenter was of high status, the principal investigator was an important influence on distributed reasoning. In contrast, when the presenter was of low status, other lab members were more likely to contribute to distributed reasoning. Our analyses also show that distributed reasoning is not influenced by the discussion topic but appears to focus on the discussion of future research.
The Impact of Letter Classification Learning on Reading
When people read, they classify a relatively long string of characters in parallel. Machine learning principles predict that classification learning with such high-dimensional inputs and outputs will fail unless biases are imposed to reduce input and output variability and/or the number of candidate input/output mapping functions evaluated during learning. The present paper draws insight from observed reading behaviors to propose some potential sources of such biases, and demonstrates, through neural network simulations of letter-sequence classification learning, that (1) increasing dimensionality does hinder letter classification learning and (2) the proposed sources of bias do reduce dimensionality problems. The result is a model that explains word superiority and word frequency effects, as well as consistencies in eye fixation positions during reading, solely in terms of letter classification learning.
Where Defaults Don't Help: the Case of the German Plural System
The German plural system has become a focal point for conflicting theories of language, both linguistic and cognitive. We present simulation results with three simple classifiers - an ordinary nearest neighbour algorithm, Nosofsky's 'Generalized Context Model' (GCM), and a standard three-layer backprop network - predicting the plural class from a phonological representation of the singular in German. Though these are absolutely 'minimal' models, in terms of architecture and input information, they nevertheless do remarkably well. The nearest neighbour predicts the correct plural class with an accuracy of 72% for a set of 24,640 nouns from the CELEX database. With a subset of 8,598 (non-compound) nouns, the nearest neighbour, the GCM and the network score 71.0%, 75.0% and 83.5%, respectively, on novel items. Furthermore, they outperform a hybrid, 'pattern-associator + default rule', model, as proposed by Marcus et al. (1995), on this data set.
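A toy version of the nearest-neighbour classifier conveys the idea. The real simulations used phonological representations of CELEX nouns; the mini-lexicon and the crude right-aligned letter match below are invented stand-ins for that similarity measure:

```python
# Toy nearest-neighbour sketch of plural-class prediction.
train = [   # (singular, plural class) -- invented mini-lexicon
    ("Hund", "-e"), ("Tag", "-e"), ("Kind", "-er"),
    ("Bild", "-er"), ("Frau", "-en"), ("Tür", "-en"),
    ("Auto", "-s"), ("Kino", "-s"),
]

def similarity(a, b):
    """Count matching letters, aligned from the word's right edge."""
    return sum(x == y for x, y in zip(reversed(a.lower()), reversed(b.lower())))

def predict_plural_class(noun):
    return max(train, key=lambda pair: similarity(noun, pair[0]))[1]

print(predict_plural_class("Wand"))   # -> "-e"  (nearest to "Hund" here)
print(predict_plural_class("Radio"))  # -> "-s"  (ends like "Auto"/"Kino")
```

The interesting empirical point is that even a similarity rule this shallow captures much of the system, which is why the minimal models above fare so well against the default-rule account.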
Selective attention in the acquisition of the past tense
It is well known that children generally exhibit a "U-shaped" pattern of development in the process of acquiring the past tense. Plunkett & Marchman (1991) showed that a connectionist network, trained on the past tense, would exhibit U-shaped learning effects. This network did not completely master the past tense mapping, however. Plunkett & Marchman (1993) showed that a network trained with an incrementally expanded training set was able to achieve acceptable levels of mastery, as well as show the desired U-shaped pattern. In this paper, we point out some problems with using an incrementally expanded training set. We propose a model of selective attention that enables our network to completely master the past tense mapping and exhibit U-shaped learning effects without requiring external manipulation of its training set.
Word Learning and Verbal Short-Term Memory: A Computational Account
Recent behavioral evidence suggests that human vocabulary acquisition processes and verbal short-term memory abilities may be related (Gathercole & Baddeley, 1993). Investigation of this relationship has considerable significance for our understanding of human language, of working memory, and of the relationship between short- and long-term memory systems. This paper presents a computational model of word learning, nonword repetition, and immediate serial recall. By providing an integrated account of these three abilities, the model provides a specification of how the mechanisms of immediate serial recall may be related to mechanisms of language processing more generally. Furthermore, the model provides fresh insight into the observed behavioral correlations between word learning and immediate serial recall. According to the model, these correlations can arise because of the common dependence of these two abilities on core phonological and semantic processing mechanisms. This contrasts with the explanation proposed in the working memory literature, viz., that word learning is dependent on verbal short-term memory (Gathercole et al., 1992). We discuss how the two explanations can be reconciled in terms of the present model.
Spatial cognition in the mind and in the world - the case of hypermedia navigation
We present the results of a study of spatial cognition and its relationship to hypermedia navigation. The results show that a distinction can be made between two kinds of spatial cognition: one that concerns concomitant acting in the physical world, and one that is a purely internal mental activity. This conclusion is supported by two kinds of data. First, a factor analysis of the subtests used in this study groups them into these two categories; second, only the internal factor is related to subjects' performance in using a hypertext-based on-line help system. In the final section we point to the theoretical connections between this work and work in the areas of situated cognition and on different kinds of mental representations, and discuss various possibilities that the results from this study suggest for the development of interface tools that will help users with low spatial abilities to use hypermedia systems.
Individual differences in proof structures following multimodal logic teaching
We have been studying how students respond to multimodal logic teaching with Hyperproof. Performance measures have already indicated that students' pre-existing cognitive styles have a significant impact on teaching outcome. Furthermore, a substantial corpus of proofs has been gathered via automatic logging of proof development. We report results from analyses of final proof structure, exploiting (i) 'proofograms', a novel method of proof visualisation, and (ii) corpus-linguistic bigram analysis of rule use. Results suggest that students' cognitive styles do indeed influence the structure of their logical discourse, and that the effect may be attributable to the relative skill with which students manipulate graphical abstractions.
Functional Roles for the Cognitive Analysis of Diagrams in Problem Solving
This paper proposes that a novel form of cognitive analysis for diagrammatic representations is in terms of the functional roles that they can play in problem solving. Functional roles are capacities or features that a diagram may possess, which can support particular forms of reasoning or specific problem solving tasks. A person may exploit several functional roles of a single diagram in one problem. A dozen functional roles have been identified, which can be considered as a framework to bridge the gulf between (i) studies of the properties of diagrams in themselves and (ii) investigations of human reasoning and problem solving with diagrammatic representations. The utility of the framework is demonstrated by examining how the functional roles can explain why certain diagrams facilitate problem solving in thermodynamics. The thermodynamics diagrams are interesting, in themselves, as examples of complex cognitive artefacts that support a variety of sophisticated forms of reasoning.
A Study of Visual Reasoning in Medical Diagnosis
The purpose of this paper is to describe experimental work conducted in the area of diagnostic radiology, with an emphasis on how perception and problem solving interact in this type of task. This work was part of a larger project whose goals included the development of an information-processing model of visual interaction, and the subsequent design of an intelligent cooperative assistant for this domain.
Verbal protocol data was collected from eight radiologists (six residents and two experts) while they examined seven different computer-displayed chest x-rays. A brief overview of the methodology and analysis techniques is presented, together with specific results from one x-ray case. More general results are then discussed in the framework of issues important to the later modeling effort.
The Interaction of Semantic and Phonological Processing
Models of spoken word recognition vary in the ways in which they capture the relationship between speech input and meaning. Modular accounts prohibit a word's meaning from affecting the computation of its form-based representation, whereas interactive models allow semantic activation to affect phonological processing. To test these competing hypotheses we manipulated word familiarity and imageability, using auditory lexical decision and repetition tasks. Responses to high imageability words were significantly faster than to low imageability words. Response latencies were also analysed as a function of cohort variables: cohort size and frequency of cohort members. High and low imageable words were divided into two sets: (a) large cohorts with many high frequency competitors, and (b) small cohorts with few high frequency competitors. Analyses showed that there was a significant imageability effect only for the words which were members of large cohorts. These data suggest that when the mapping from phonology to semantics is difficult (when a spoken word activates a large cohort consisting of many high frequency competitors), semantic information can help the discrimination process. Because highly imageable words are "semantically richer" and/or more context-independent, they provide more activation to phonology than do low imageability words. Thus, these data provide strong support for interactive models of spoken word recognition.
The combinatorial lexicon: Priming derivational affixes
In earlier research we argued for a morphemically decomposed account of the mental representation of semantically transparent derived forms, such as happiness, rebuild, and punishment. We proposed that such forms were represented as stems linked to derivational affixes, as in {happy} + {-ness} or {re-} + {build}. A major source of evidence for this was the pattern of priming effects, between derived forms and their stems, in a cross-modal repetition priming task. In two new experiments we investigated the prediction of this account that derivational affixes, such as {-ness} or {re-}, should also exist as independent entities in the mental lexicon, and should also be primable. We tested both prefixes and suffixes, split into productive and unproductive groups (where "unproductive" means no longer used to form new words), and found significant priming effects in the same cross-modal task. These effects were strongest for the productive suffixes and prefixes, as in prime-target pairs such as darkness/toughness and rearrange/rethink, where the overall effects were as strong as those for derived/stem pairs such as absurdity/absurd, and where possible phonological effects are ruled out by the absence of priming in phonological control and pseudo-affix conditions. We interpret this as evidence for a combinatorial approach to lexical representation.
Lexical Ambiguity and Context Effects in Spoken Word Recognition: Evidence from Chinese
Chinese is a language that is extensively ambiguous on a lexical-morphemic level. In this study, we examined the effects of prior context, frequency, and density of a homophone on spoken word recognition of Chinese homophones in a cross-modal experiment. Results indicate that prior context affects the access of the appropriate meaning from early on, and that context interacts with frequency of the individual meanings of a homophone. These results are consistent with the context-dependency hypothesis which argues that ambiguous meanings of a word may be selectively accessed at an early stage of recognition according to sentential context. However, the results do not support a pre-selection process in which the contextually appropriate meaning can be activated prior to the perception of the relevant acoustic signal.
Phonological Reduction, Assimilation, Intra-Word Information Structure, and the Evolution of the Lexicon of English: Why Fast Speech isn't Confusing
Phonological reduction and assimilation are intrinsic to speech. We report a statistical exploration of an idealised phonological version of the London-Lund Corpus and describe the computational consequences of phonological reduction and assimilation. In terms of intra-word information structure, the overall effect of these processes is to flatten out the redundancy curve calculated over consecutive segment-positions. We suggest that this effect represents a general principle of the presentation of information to the brain: information should be spread as evenly as possible over a representational surface or across time. We also demonstrate that the effect is partially due to the fact that when assimilation introduces phonological ambiguity, as in fat man coming to resemble fap man, then the ambiguity introduced is always in the direction of a less frequent segment: /p/ is less frequent than /t/. We show that this observation, the "Move to Markedness", is true across the board for changes in segment identity in English. This distribution of segments means that the number of erroneous lexical hypotheses introduced by segment-changing processes such as assimilation is minimised. We suggest that the Move to Markedness within the lexicon is the result of pressure from the requirements of a very efficient word recognition device that is sensitive to changes of individual phonological features.
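The redundancy curve itself is straightforward to compute: estimate the entropy of the segment distribution at each successive position. The letter-string "lexicon" below is an invented stand-in for the corpus's phonological transcriptions:

```python
import math
from collections import Counter

# Sketch of a redundancy curve over consecutive segment positions.
lexicon = ["cat", "cap", "can", "cot", "cut", "dog", "dot", "dig"]

def entropy(counter):
    total = sum(counter.values())
    return -sum((n / total) * math.log2(n / total) for n in counter.values())

for pos in range(3):
    dist = Counter(word[pos] for word in lexicon)
    print(pos, round(entropy(dist), 2))  # low entropy = high redundancy
```

A flat curve means information arrives at a roughly constant rate across the word, which is the even-spreading principle suggested above.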
Color Influences Fast Scene Categorization
A critical aspect of early visual processing is to extract shape data for matching against memory representations for recognition. Many theories of recognition assume that this is done on luminance information alone. However, studies in psychophysics have revealed that color is used by many low-level visual modules such as motion, stereopsis, texture, and 2D shape. Should color really be discarded from theories of recognition? In this paper, we present two studies which seek to understand the role of chromatic information in the recognition of real scene pictures. We used three versions of scene pictures (gray-level, normally colored, and abnormally colored) drawn from two broad classes of categories. In the first class, color was diagnostic of the category (e.g., beach, forest and valley); in the second class, color was not diagnostic (e.g., city, road and room). Results revealed that chromatic information is registered and facilitates recognition even after a 30 ms exposure to the scene stimuli. Similar results were recorded with exposures of 120 ms. However, influences of color on speeded categorizations were only observed with the color-diagnostic categories. No influence of color was observed with the other categories.
Categorical Perception of Novel Dimensions
Categorical perception is a phenomenon in which people are better able to distinguish between stimuli along a physical continuum when the stimuli come from different categories than when they come from the same category. In a laboratory experiment with human subjects, we find evidence for categorical perception along a novel dimension that is created by interpolating (i.e., morphing) between two randomly selected Bezier curves. A neural network qualitatively models the empirical results with the following assumptions: 1) hidden "detector" units become specialized for particular stimulus regions with a topologically structured competitive learning algorithm, 2) simultaneously, associations between detectors and category units are learned, and 3) feedback from the category units to the detectors causes the detectors to become concentrated near category boundaries. The particular feedback used, implemented in an "S.O.S. network," operates by increasing the learning rate of weights connecting inputs to detectors that are neighbors to a detector that produces an improper categorization.
Categorical Perception in Facial Emotion Classification
We present an automated emotion recognition system that is capable of identifying six basic emotions (happy, surprise, sad, angry, fear, disgust) in novel face images. An ensemble of simple feed-forward neural networks is used to rate each of the images. The outputs of these networks are then combined to generate a score for each emotion. The networks were trained on a database of face images that human subjects consistently rated as portraying a single emotion. The system achieves 86% generalization on novel face images (individuals the networks were not trained on) drawn from the same database. The neural network model exhibits categorical perception between some emotion pairs. A linear sequence of morph images is created between two expressions of an individual's face and this sequence is analyzed by the model. Sharp transitions in the output response vector occur in a single step in the sequence for some emotion pairs and not for others. We plan to use the model's response to limit and direct testing in determining whether human subjects exhibit categorical perception in morph image sequences.
MetriCat: A Representation for Basic and Subordinate-level Classification
An important function of human visual perception is to permit object classification at multiple levels of specificity. For example, we can recognize an object as a "car" (the basic level), a "Ford Mustang" (the subordinate level), and "Joe's Mustang" (the instance level). Although this capacity is fundamental to human object perception, most computational models of object recognition focus either exclusively on basic-level classification (e.g., Biederman, 1987; Hummel & Biederman, 1992; Hummel & Stankiewicz, 1996) or exclusively on instance-level classification (e.g., Ullman & Basri, 1991; Edelman & Poggio, 1990). A computational account that naturally integrates both levels of classification remains elusive. We describe a general approach to representing numerical properties (e.g., those that characterize object shape) that simultaneously supports both basic and subordinate/instance-level recognition. The account is based on a general nonlinear coding for numerical quantities describing both featural variables (such as degree of curvature and aspect ratio) and configural variables (such as relative position). Used as the input to a classifier with Gaussian receptive fields, this representation supports recognition at multiple levels of specificity, and suggests an account of the role of attention and time in the classification of objects at different levels of abstraction.
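One standard way to realize a coarse code over a numerical shape property is a bank of Gaussian receptive fields. The centers, width, and example values below are illustrative assumptions rather than MetriCat's actual coding scheme:

```python
import numpy as np

# Coarse-coding a numerical shape property (e.g., aspect ratio)
# with a bank of Gaussian receptive-field units.
centers = np.linspace(0.0, 4.0, 9)   # preferred values of 9 detector units
sigma = 0.5

def encode(x):
    """Population response of the Gaussian units to the value x."""
    return np.exp(-((x - centers) ** 2) / (2 * sigma ** 2))

basic = encode(1.9)       # two similar aspect ratios produce overlapping
instance = encode(2.1)    # but still distinguishable activation patterns
print(np.round(basic, 2))
print(np.round(instance, 2))
```

The broad overlap between nearby values supports basic-level grouping, while the residual differences in the pattern still separate subordinate-level instances, which is the dual use the abstract describes.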
Similarity to reference shapes as a basis for shape representation
We present a unified approach to visual representation, addressing both the needs of superordinate and basic-level categorization and of identification of specific instances of familiar categories. According to the proposed theory, a shape is represented by its similarity to a number of reference shapes, measured in a high-dimensional space of elementary features. This amounts to embedding the stimulus in a low-dimensional proximal shape space. That space turns out to support representation of distal shape similarities which is veridical in the sense of Shepard's (1968) notion of second-order isomorphism (i.e., correspondence between distal and proximal similarities among shapes, rather than between distal shapes and their proximal representations). Furthermore, a general expression for similarity between two stimuli, based on comparisons to reference shapes, can be used to derive models of perceived similarity ranging from continuous, symmetric, and hierarchical, as in the multidimensional scaling models (Shepard, 1980), to discrete and non-hierarchical, as in the general contrast models (Tversky, 1977; Shepard and Arabie, 1979).
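A minimal sketch of the embedding idea, with random feature vectors standing in for shapes and an assumed exponential similarity measure:

```python
import numpy as np

# Represent a shape by its similarities to a few reference shapes.
rng = np.random.default_rng(0)
references = rng.normal(size=(5, 50))   # 5 reference shapes, 50 features

def embed(shape):
    """Proximal representation: similarity of `shape` to each reference."""
    distances = np.linalg.norm(references - shape, axis=1)
    return np.exp(-distances / distances.mean())

a = rng.normal(size=50)
b = a + 0.1 * rng.normal(size=50)       # a shape similar to `a`
c = rng.normal(size=50)                 # an unrelated shape

# Second-order isomorphism in miniature: nearby distal shapes receive
# nearby 5-dimensional proximal embeddings.
print(np.linalg.norm(embed(a) - embed(b)))  # small
print(np.linalg.norm(embed(a) - embed(c)))  # larger
```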
Integrating Discourse and Local Constraints in Resolving Lexical Thematic Ambiguities
We conducted sentence completion and eye-tracking reading experiments to examine the interaction of multiple constraints in the resolution of a lexical thematic ambiguity. The ambiguity was introduced with prepositional "by"-phrases in passive constructions, which can be ambiguous between agentive and locative interpretations (e.g., "built by the contractor" versus "built by the corner"). The temporarily ambiguous sentences were embedded in contexts that created expectations for one or the other interpretation. The constraints involved, including discourse-based expectations, verb biases, and contingent lexical frequencies, were independently quantified with a corpus analysis and a rating experiment. Our results indicate that there was an interaction of contextual and more local factors such that the effectiveness of the contexts was mediated by the local biases. Application of an explicit integration-competition model to the off-line (sentence completion) and on-line (eye-tracking) results suggests that, during the processing of these ambiguous prepositional phrases, there was an immediate and simultaneous integration of the relevant constraints resulting in competition between partially active alternative interpretations.
Evidence for a Tagging Model of Human Lexical Category Disambiguation
We investigate the explanatory power of very simple statistical mechanisms within a modular model of the Human Sentence Processing Mechanism. In particular, we borrow the idea of a 'part-of-speech tagger' from the field of Natural Language Processing, and use this to explain a number of existing experimental results in the area of lexical category disambiguation. Not only can each be explained without the need to posit extra mechanisms or constraints, but the exercise also suggests a novel account for some established data.
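The tagger idea can be sketched with a toy bigram model solved by Viterbi search; the tag set, the probabilities, and the noun/verb ambiguity of "desert" below are invented for illustration, not corpus estimates or the paper's actual model:

```python
import math

# Minimal bigram tagger sketch: choose the tag sequence maximizing
# P(tags) * P(words | tags) by Viterbi search over toy probabilities.
tags = ["DET", "N", "V"]
trans = {("<s>", "DET"): 0.8, ("<s>", "N"): 0.1, ("<s>", "V"): 0.1,
         ("DET", "N"): 0.9, ("DET", "V"): 0.05, ("DET", "DET"): 0.05,
         ("N", "V"): 0.6, ("N", "N"): 0.3, ("N", "DET"): 0.1,
         ("V", "DET"): 0.6, ("V", "N"): 0.3, ("V", "V"): 0.1}
emit = {("DET", "the"): 1.0,
        ("N", "desert"): 0.7, ("V", "desert"): 0.3,
        ("N", "troops"): 0.8, ("V", "troops"): 0.2}

def viterbi(words):
    best = {t: (math.log(trans[("<s>", t)]) +
                math.log(emit.get((t, words[0]), 1e-9)), [t]) for t in tags}
    for w in words[1:]:
        best = {t: max(((score + math.log(trans[(prev, t)]) +
                         math.log(emit.get((t, w), 1e-9)), path + [t])
                        for prev, (score, path) in best.items()))
                for t in tags}
    return max(best.values())[1]

print(viterbi(["the", "desert", "troops"]))  # -> ['DET', 'N', 'N'] here
```

With these toy numbers the lexically ambiguous "desert" and "troops" both resolve to nouns; shifting the transition and emission values flips "troops" to a verb reading, which is the kind of category disambiguation the statistical mechanism is meant to capture.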
Parallel Activation of Distributed Concepts: Who put the P in the PDP?
An investigation of the capacity of distributed systems to represent patterns of activation in parallel is presented. Connectionist models of lexical ambiguity have captured this capacity by activating the arithmetic mean of the vectors representing the relevant meanings to form a lexical blend. However, a more extreme test of this system occurs in a distributed model of lexical access in speech perception, which may require a lexical blend to represent transiently the meanings of hundreds of words. I show that there is a strict limit on the number of distributed patterns that can be represented effectively by a lexical blend. This limit is dependent to some extent on the structure and content of the distributed space, which in the case of lexical access corresponds to structure and content of the mental lexicon. This limitation implies that distributed models cannot be simple re-implementations of parallel localist models and offers a valuable opportunity to distinguish experimentally between localist and distributed models of cognitive processes.
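The capacity question can be probed directly: blend n random bipolar word vectors by averaging and ask whether the constituents remain the blend's nearest neighbours in the lexicon. The dimensionality and lexicon size below are arbitrary choices for illustration:

```python
import numpy as np

# Capacity limit on lexical blends formed by vector averaging.
rng = np.random.default_rng(1)
dim, lexicon_size = 200, 1000
lexicon = rng.choice([-1.0, 1.0], size=(lexicon_size, dim))

for n in (2, 5, 20, 100):
    blend = lexicon[:n].mean(axis=0)       # blend of the first n "words"
    sims = lexicon @ blend                 # similarity of every word to it
    members = set(range(n))
    top_n = set(np.argsort(sims)[-n:])     # the n words closest to the blend
    recovered = len(members & top_n) / n
    print(n, recovered)  # recovery degrades as the blend packs in more words
```

As n grows, the constituents' similarity advantage over unrelated words shrinks toward the noise level, which is the strict representational limit the paper exploits to distinguish distributed from localist accounts.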
Discrete Multi-Dimensional Scaling
In recent years, a number of models of lexical access based on attractor networks have appeared. These models reproduce a number of effects seen in psycholinguistic experiments, but all suffer from unrealistic representations of lexical semantics. In an effort to improve this situation we are looking at techniques developed in the information retrieval literature that use the statistics found in large corpora to automatically produce vector representations for large numbers of words. This paper concentrates on the problem of transforming the real-valued cooccurrence vectors produced by these statistical techniques into the binary- or bipolar-valued vectors required by attractor network models, while maintaining the important inter-vector distance relationships. We describe an algorithm we call discrete multidimensional scaling which accomplishes this, and present the results of a set of experiments using this algorithm.
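The problem the algorithm addresses can be seen in the naive baseline, sign-thresholding, which the paper's discrete multidimensional scaling is designed to improve on; the toy vectors below are illustrative:

```python
import numpy as np

# Baseline discretization: turn real-valued co-occurrence vectors into
# bipolar vectors, then check how well pairwise distances are preserved.
rng = np.random.default_rng(2)
real_vecs = rng.normal(size=(30, 64))          # toy "co-occurrence" vectors
bipolar = np.where(real_vecs > 0, 1.0, -1.0)   # naive sign-thresholding

def pairwise_distances(vs):
    diffs = vs[:, None, :] - vs[None, :, :]
    d = np.sqrt((diffs ** 2).sum(-1))
    return d[np.triu_indices(len(vs), k=1)]

d_real = pairwise_distances(real_vecs)
d_bip = pairwise_distances(bipolar)
corr = np.corrcoef(d_real, d_bip)[0, 1]
print(round(corr, 3))   # how faithfully the bipolar space mirrors the real one
```

Discrete multidimensional scaling, as described above, goes beyond this baseline by explicitly optimizing the discrete vectors to preserve the inter-vector distance relationships the attractor network needs.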
Collaboration in Primary Science Classrooms: Learning about Evaporation
We have been studying collaboration in the context of children conducting science investigations in British primary classrooms. The classroom is the site of action where learning occurs and it is the teacher who plays the key role in manipulating the learning environment and selecting and structuring tasks to achieve the best learning effect for all children. In this paper we describe our general approach and focus in particular on the data we collect to explore how children's conceptual understanding of evaporation progresses. The paper highlights some of the messages emerging about how collaboration can sometimes enhance learning, and sometimes thwart it.
Transferring and Modifying Terms in Equations
Labeling and elaboration manipulations were used in examples to affect the likelihood of students learning to represent workers' rates and times in algebra word problems dealing with work. Learners studying examples with labels for rates and times were more likely to transfer and correctly modify the representations compared to learners who did not see the labels. An elaborative statement describing the possible representations for the different terms in the work equation did not reliably affect performance. These results extend prior work (Catrambone, 1994, 1995) on subgoal learning by demonstrating that representations, not just sets of steps, can be successfully transferred and modified through a manipulation (labeling) that has been shown to aid subgoal learning.
Understanding Constraint-Based Processes: A Precursor to Conceptual Change in Physics
Chi (1992; Chi and Slotta, 1993; Slotta, Chi and Joram, 1995) suggests that students experience difficulty in learning certain physics concepts because they inappropriately attribute to these concepts the ontology of material substances (MS). According to accepted physics theory, these concepts (e.g., light, heat, electric current) are actually a special type of process that Chi (1992) calls "Constraint-Based Interactions" (CBI). Students cannot understand the process-like nature of these concepts because of their bias towards substance-like conceptions, and also because they are unfamiliar with the CBI ontology. Thus, conceptual change can be facilitated by providing students with some knowledge of the CBI ontology before they receive the relevant physics instruction. This CBI training was provided by means of a computer-based instructional module in which students manipulated simulations as they read an accompanying text concerning four attributes of the CBI ontology. A control group simply read a (topically similar) text from the computer screen. The two groups then studied a physics textbook concerning concepts of electricity, and performed a post-test which was assessed for evidence of conceptual change. As a result of their training in the CBI ontology, the experimental group showed significant evidence of conceptual change with regard to the CBI concept of electric current.
The Role of Generic Models in Conceptual Change
We hypothesize that generic models are central to conceptual change in science. This hypothesis has its origins in two theoretical sources. The first source, constructive modeling, derives from a philosophical theory that synthesizes analyses of historical conceptual changes in science with investigations of reasoning and representation in cognitive psychology. The theory of constructive modeling posits generic mental models as productive in conceptual change. The second source, adaptive modeling, derives from a computational theory of creative design. The theory of adaptive modeling uses generic mental models to enable analogical transfer. Both theories posit situation-independent domain abstractions, i.e., generic models. Using a constructive modeling interpretation of the reasoning exhibited in protocols collected by John Clement (1989) of a problem-solving session involving conceptual change, we employ the representational constructs and processing structures of the theory of adaptive modeling to develop a new computational model, ToRQUE. Here we describe a piece of our analysis of the protocol to illustrate how our synthesis of the two theories is being used to develop a system for articulating and testing ToRQUE. The results of our research show how generic modeling plays a central role in conceptual change. They also demonstrate how such an interdisciplinary synthesis can provide significant insights into scientific reasoning.
Using orthographic neighborhoods of interlexical nonwords to support an interactive-activation model of bilingual memory
Certain models of bilingual memory based on parallel, activation-driven self-terminating search through independent lexicons can reconcile both interlingual priming data (which seem to support an overlapping organization of bilingual memory) and homograph recognition data (which seem to favor a separate-access dual-lexicon approach). But the dual-lexicon model makes a prediction regarding recognition times for nonwords that is not supported by the data. The nonwords that violate this prediction are produced by changing a single letter of non-cognate interlexical homographs (words like appoint, legs, and mince that are words in both French and English, but have completely different meanings in each language), thereby producing regular nonwords in both languages (e.g., appaint, ligs, monce). These nonwords are then classified according to the comparative sizes of their orthographic neighborhoods in each language. An interactive-activation model, unlike the dual-lexicon model, can account for reaction times to these nonwords in a relatively straightforward manner. For this reason, it is argued that an interactive-activation model is the more appropriate of the two models of bilingual memory.
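Computing an orthographic neighbourhood is simple to sketch; the mini-lexicons below are invented, whereas the real classification would use full English and French word lists:

```python
def neighbours(nonword, lexicon):
    """Words in `lexicon` differing from `nonword` by exactly one letter."""
    return [w for w in lexicon
            if len(w) == len(nonword)
            and sum(a != b for a, b in zip(w, nonword)) == 1]

# Tiny invented lexicons, standing in for full word lists.
english = {"legs", "lags", "logs", "pigs", "wigs", "lips"}
french  = {"lits", "lins", "lais"}

nonword = "ligs"   # derived from the interlexical homograph "legs"
print(len(neighbours(nonword, english)))  # English neighbourhood size -> 6
print(len(neighbours(nonword, french)))   # French neighbourhood size  -> 2
```

Sorting the nonwords by the relative sizes of these two neighbourhoods yields the classification whose reaction-time profile separates the interactive-activation account from the dual-lexicon one.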
Conscious and Unconscious Perception: A Computational Theory
We propose a computational theory of consciousness and model data from three experiments in visual perception. The central idea of our theory is that the contents of consciousness correspond to temporally stable states in an interconnected network of specialized computational modules. Each module incorporates a relaxation search that is concerned with achieving semantically well-formed states. We claim that being an attractor of the relaxation search is a necessary condition for awareness. We show that the model provides sensible explanations for the results of three experiments, and makes testable predictions. The first experiment (Marcel, 1980) found that masked, ambiguous prime words facilitate lexical decision for targets related to either prime meaning, whereas consciously perceived primes facilitate only the meaning that is consistent with prior context. The second experiment (Fehrer and Raab, 1962) found that subjects can make detection responses in constant time to simple visual stimuli regardless of whether they are consciously perceived or masked by metacontrast and not consciously perceived. The third experiment (Levy and Pashler, 1996) found that visual word recognition accuracy is lower than baseline when an earlier speeded response was incorrect, and higher than baseline when the early response was correct, consistent with a causal relationship between conscious perception and subsequent processing.
In Search Of Articulated Attractors
Recurrent attractor networks offer many advantages over feedforward networks for the modeling of psychological phenomena. Their dynamic nature allows them to capture the time course of cognitive processing, and their learned weights may often be easily interpreted as soft constraints between representational components. Perhaps the most significant feature of such networks, however, is their ability to facilitate generalization by enforcing "well formedness" constraints on intermediate and output representations. Attractor networks which learn the systematic regularities of well formed representations by exposure to a small number of examples are said to possess articulated attractors. This paper investigates the conditions under which articulated attractors arise in recurrent networks trained using variants of backpropagation. The results of computational experiments demonstrate that such structured attractors can spontaneously appear in an emergence of systematicity, if an appropriate error signal is presented directly to the recurrent processing elements. We show, however, that distal error signals, backpropagated through intervening weights, pose serious problems for networks of this kind. We present simulation results, discuss the reasons for this difficulty, and suggest some directions for future attempts to surmount it.
A Recurrent Network that performs a Context-Sensitive Prediction Task
We address the problem of processing a context-sensitive language with a recurrent neural network (RN). So far, the language processing capabilities of RNs have only been investigated for regular and context-free languages. We present an extremely simple RN with only one parameter z for its two hidden nodes that can perform a prediction task on sequences of symbols from the language {(ba^k)^n | k >= 0, n > 0}, a language that is context-sensitive but not context-free. The input to the RN consists of any string of the language, one symbol at a time. The network should then, at all times, predict the symbol that should follow. This means that the network must be able to count the number of a's in the first subsequence and to retain this number for future use. We present a value for the parameter z for which our RN can solve the task for k = 1 up to k = 120. As we do not give any method to find a good value for z, this does not say anything about the learning capabilities of our network. It does, however, show that context-sensitive information (the count of a's) can be represented by the network; we analyse in detail how this is done. Hence our work shows that, at least from a representational point of view, connectionist architectures can handle more complex formal languages than was previously known.
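To make the prediction task concrete, the sketch below generates strings of the language and produces the required predictions with an explicit counter. This is only a restatement of the task, not the authors' two-node network; the counter stands in for whatever internal representation the network develops, and k is taken as given rather than inferred.

```python
def language_string(k, n):
    """A string of the context-sensitive language (b a^k)^n."""
    return ("b" + "a" * k) * n

def predictions(s, k):
    """After each symbol, predict the next by counting a's since the last b."""
    preds = []
    run = 0
    for ch in s[:-1]:
        run = run + 1 if ch == "a" else 0
        preds.append("a" if run < k else "b")
    return "".join(preds)

s = language_string(k=3, n=2)   # "baaabaaa"
print(s[1:])                    # actual continuation:    "aaabaaa"
print(predictions(s, k=3))      # predicted continuation: "aaabaaa"
```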
Competition in Analogical Transfer: When Does a Lightbulb Outshine an Army?
This study investigated competition in analogical transfer to a problem solution. In two experiments, subjects read two stories, then attempted to solve Duncker's (1945) radiation problem, which has both a convergence and an open-passage solution. Stories were constructed that suggested each of these solutions; a third story was irrelevant. Subjects in the competitive conditions read both solution-suggesting stories, and subjects in the two noncompetitive conditions read one of these and the irrelevant story. In Experiment 1, in the noncompetitive conditions, convergence solutions and open-passage solutions were produced at comparable rates, but in the competitive condition, convergence solutions overwhelmed open-passage solutions. This asymmetry is too large to be explained by unidimensional models of retrieval and reflects the multidimensional nature of retrievability. In Experiment 2, the source stories suggesting each solution type were reversed, and the open-passage solution rate was higher than the convergence solution rate in all three conditions. In both experiments, subjects were able to successfully apply both source stories once cued to do so, indicating that the competition is at the retrieval stage of transfer, not at the mapping stage. Computational models of analogical transfer (e.g., ARCS and MAC/FAC) predict some competition but may have difficulty explaining the extreme nature of these results.
Can a real distinction be made between cognitive theories of analogy and categorisation?
Analogy has traditionally been defined by use of a contrast definition: analogies represent associations or connections between things distinct from the 'normal' associations or connections determined by our 'ordinary' concepts and categories. Research into analogy, however, is also distinct from research into concepts and categories in terms of the richness of its process models. A number of detailed, plausible models of the analogical process exist (Forbus, Gentner and Law, 1995; Holyoak and Thagard, 1995): the same cannot be said of categorisation. In this paper we argue that in the absence of an acceptable account of categorisation, this contrast definition amounts to little more than a convenient fiction which, whilst useful in constraining the scope of cognitive investigations, confuses the relationship between analogy and categorisation, and prevents models of these processes from informing one another. We present a study which addresses directly the question of whether analogy can be distinguished from categorisation by contrasting categorisational and analogical processes, and following from this, whether theories of analogy, notably Gentner's structure mapping theory (Gentner, 1983; Forbus et al, ibid.), can also be used to model parts of the categorisation process.
LISA: A Computational Model of Analogical Inference and Schema Induction
The relationship between analogy and schema induction is widely acknowledged and constitutes an important motivation for developing computational models of analogical mapping. However, most models of analogical mapping provide no clear basis for supporting schema induction. We describe LISA (Hummel & Holyoak, 1996), a recent model of analog retrieval and mapping that is explicitly designed to provide a platform for schema induction and other forms of inference. LISA represents predicates and their arguments (i.e., objects or propositions) as patterns of activation distributed over units representing semantic primitives. These representations are actively (dynamically) bound into propositions by synchronizing oscillations in their activation: Arguments fire in synchrony with the case roles to which they are bound, and out of synchrony with other case roles and arguments. By activating propositions in LTM, these patterns drive analog retrieval and mapping. This approach to analog retrieval and mapping accounts for numerous findings in human analogical reasoning (Hummel & Holyoak, 1996). Augmented with a capacity for intersection discovery and unsupervised learning, the architecture supports analogical inference and schema induction as a natural consequence. We describe LISA's account of schema induction and inference, and present some preliminary simulation results.
Alignability and Attribute Importance in Choice
When people choose between two alternatives, such as two colleges, some of the available information is comparable across the alternatives (alignable) and some is noncomparable (nonalignable). For example, when comparing colleges, the academic reputation of both schools may be known (alignable), while the quality of teaching may only be known for one school (nonalignable). Recent research has shown that people use more alignable than nonalignable information in decision making. In this experiment, we consider whether alignable information is preferred even when nonalignable information is important. In the study, some participants rated the importance and valence of a series of statements about colleges that differed in alignability. Other participants made choices between pairs of colleges whose descriptions incorporated these statements. The results indicate that alignable information is preferred to nonalignable information even when the nonalignable information is important. Results also showed that the interpretation of attribute valence depends on alignability. These observations suggest that alignability is more influential than attribute importance in the processing of choice information and that the use of alignable information may facilitate the interpretation of attribute information.
Computational Bases of Two Types of Developmental Dyslexia
The bases of developmental dyslexia were explored using connectionist models. The behavioral literature suggests that there are two dyslexic subtypes: "phonological" dyslexia involves impairments in phonological knowledge whereas in "surface" dyslexia phonological knowledge is apparently intact and the deficit may instead reflect a more general developmental delay. We examined possible computational bases for these impairments within connectionist models of the mapping from spelling to sound. Phonological dyslexia was simulated by reducing the capacity of the models to represent this type of information. The surface pattern was simulated by reducing the number of hidden units. Performance of the models captured the major behavioral phenomena that distinguish the two subtypes. Phonological impairment has a greater impact on generalization (reading nonwords such as NUST); the hidden unit limitation has a greater impact on learning exception words such as PINT. More severe impairments produce mixed cases in which both nonwords and exceptions are impaired. Thus, the simulations capture the effects of different types and degrees of impairment within a major component of the reading system.
Integrating Multiple Cues in Word Segmentation: A Connectionist Model using Hints
Children appear to be sensitive to a variety of partially informative "cues" during language acquisition, but little attention has been paid to how these cues may be integrated to aid learning. Borrowing the notion of learning with "hints" from the engineering literature, we employ neural networks to explore the notion that such cues may serve as hints for each other. A first set of simulations shows that when two equally complex, but related, functions are learned simultaneously rather than individually, they can help bootstrap one another (as hints), resulting in faster and more uniform learning. In a second set of simulations we apply the same principles to the problem of word segmentation, integrating two types of information hypothesized to be relevant to this task. The integration of cues in a single network leads to a sharing of resources that permits those cues to serve as hints for each other. Our simulation results show that such sharing of computational resources allows each of the tasks to facilitate the learning (i.e., bootstrapping) of the other, even when the cues are not sufficient on their own.
Statistical Cues in Language Acquisition: Word Segmentation by Infants
A critical component of language acquisition is the ability to learn from the information present in the language input. In particular, young language learners would benefit from leaming mechanisms capable of utilizing the myriad statistical cues to linguistic structure available in the input. The present study examines eight-month-old infants' use of statistical cues in discovering word boundaries. Computational models suggest that one of the most useful cues in segmenting words out of continuous speech is distributional information: the detection of consistent orderings of sounds. In this paper, we present results suggesting that eight-month-old infants can in fact make use of the order in which sounds occur to discover word-like sequences. The implications of this early ability to detect statistical information in the language input will be discussed with regard to theoretical issues in the field of language acquisition.
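The distributional cue in question is often formalized as the transitional probability between adjacent syllables, which is high within words and drops at word boundaries. A minimal sketch, using invented three-syllable "words" rather than the experimental stimuli:

```python
import random
from collections import Counter

# Transitional probability P(next | current) over a continuous syllable
# stream: high inside words, low across word boundaries. The three toy
# "words" are invented; they are not the study's stimuli.
random.seed(0)
words = ["tupiro", "golabu", "bidaku"]
stream = []
for _ in range(200):
    w = random.choice(words)
    stream.extend(w[i:i + 2] for i in range(0, len(w), 2))

pairs = Counter(zip(stream, stream[1:]))
firsts = Counter(stream[:-1])
tp = lambda x, y: pairs[(x, y)] / firsts[x]

print("within-word  tu->pi:", tp("tu", "pi"))             # 1.0
print("across-words ro->go:", round(tp("ro", "go"), 2))   # about 1/3
```

A learner tracking these statistics could posit word boundaries wherever the transitional probability dips.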
Cognition and the Statistics of Natural Signals
This paper illustrates how the statistical structure of natural signals may help us understand cognitive phenomena. We focus on a regularity found in audiovisual speech perception. Experiments by Massaro and colleagues consistently show that optic and acoustic speech signals have separable influences on perception. From a Bayesian point of view this regularity reflects a perceptual system that treats optic and acoustic speech as if they were conditionally independent signals. In this paper we perform a statistical analysis of a database of audiovisual speech to check whether optic and acoustic speech signals are indeed conditionally independent. If so, the regularities found by Massaro and colleagues could be seen as an optimal processing strategy of the perceptual system. We analyze a small database of audiovisual speech using hidden Markov models, the most successful models in automatic speech recognition. The results suggest that acoustic and optic speech signals are indeed conditionally independent and that therefore, the separability found by Massaro and colleagues may be explained in terms of optimal perceptual processing: Independent processing of optic and acoustic speech results in no significant loss of information.
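The question being asked of the data can be posed in miniature: for a fixed word class, does a joint model of the acoustic and optic features fit better than the product of the two marginal models? The sketch below uses Gaussians on synthetic one-dimensional features purely as a schematic stand-in for the hidden Markov model analysis described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_loglik(x, mean, var):
    """Log-likelihood of 1-D data under a Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var).sum()

# One "word" class; acoustic (a) and optic (v) features are conditionally
# independent by construction in this synthetic example.
a = rng.normal(1.0, 0.5, 500)
v = rng.normal(-1.0, 0.8, 500)

# Factored model: product of the two marginals.
ll_indep = gauss_loglik(a, a.mean(), a.var()) + gauss_loglik(v, v.mean(), v.var())

# Joint model: full-covariance bivariate Gaussian.
X = np.stack([a, v], axis=1)
mu, cov = X.mean(0), np.cov(X.T)
diff = X - mu
ll_joint = -0.5 * (
    len(X) * (2 * np.log(2 * np.pi) + np.log(np.linalg.det(cov)))
    + np.einsum("ij,jk,ik->", diff, np.linalg.inv(cov), diff)
)
print(f"independent: {ll_indep:.1f}  joint: {ll_joint:.1f}")  # nearly equal
```

When the two log-likelihoods are close, allowing dependence buys nothing, which is the sense in which independent processing loses no significant information.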
An Abstract Computational Model of Learning Selective Sensing Skills
In this paper we review the benefits of abstract computational models of cognition and present one such model of behavior in a flight-control domain. The model's central assumptions are that differences among subjects are due to differences in sensing skills, and that the main form of learning involves updating statistics to distinguish relevant from irrelevant features. We report an implementation of this abstract model of sensory learning, along with a system that searches the space of parameter settings in order to fit the model to observations. We compare the sensory-learning framework to an alternative based on the power law, finding that the latter fits the data slightly better but that it requires many more parameters.
Epistemic Action Increases With Skill
On most accounts of expertise, as agents increase their skill, they are assumed to make fewer mistakes and to take fewer redundant or backtracking actions. Contrary to such accounts, in this paper we present data collected from people learning to play the videogame Tetris which show that as skill increases, the proportion of game actions that are later undone by backtracking also increases. Nevertheless, we also found that as game skill increases, players speed up as predicted by the power law of practice. We explain the observed increase in backtracking as the result of an interactive search process in which agent-internal and agent-external actions are interleaved, making the cognitive computation more efficient (i.e., faster). We refer to external actions which simplify an agent's computation as epistemic actions.
Perseverative Subgoaling and Production System Models of Problem Solving
Perseverative subgoaling, the repeated successful solution of subgoals, is a common feature of much problem solving, and its pervasive nature suggests that it is an emergent property of a problem solving architecture. This paper presents a set of minimal requirements on a production system architecture for problem solving which will allow perseverative subgoaling whilst guaranteeing the possibility of recovery from such situations. The fundamental claim is that perseverative subgoaling arises during problem solving when the results of subgoals are forgotten before they can be used. This prompts further attempts at the offending subgoals. In order for such attempts to be effective, however, the production system must satisfy three requirements concerning working memory structure, production structure, and memory decay. The minimal requirements are embodied in a model (developed within the COGENT modelling software) which is explored with respect to the task of multicolumn addition. The inter-relationship between memory decay and task difficulty within this task (measured in terms of the number of columns) is discussed.
Probabilistic Plan Recognition for Cognitive Apprenticeship
Interpreting the student's actions and inferring the student's solution plan during problem solving is one of the main challenges of tutoring based on cognitive apprenticeship, especially in domains with large solution spaces. We present a student modeling framework that performs probabilistic plan recognition by integrating in a Bayesian network knowledge about the available plans and their structure and knowledge about the student's actions and mental state. Besides predictions about the most probable plan followed, the Bayesian network provides probabilistic knowledge tracing, that is, an assessment of the student's domain knowledge. We show how our student model can be used to tailor scaffolding and fading in cognitive apprenticeship. In particular, we describe how the information in the student model and knowledge about the structure of the available plans can be used to devise heuristics to generate effective hinting strategies when the student needs help.
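The core inference can be sketched as Bayesian filtering over candidate plans: after each observed action, each plan's probability is reweighted by the likelihood of that action under the plan. All plans, actions, and likelihood values below are invented for illustration; the framework described above embeds this step in a richer Bayesian network.

```python
# Prior over two hypothetical solution plans.
plans = {"plan_A": 0.5, "plan_B": 0.5}

# Invented likelihoods P(action | plan).
likelihood = {
    ("draw_diagram", "plan_A"): 0.7, ("draw_diagram", "plan_B"): 0.2,
    ("write_equation", "plan_A"): 0.3, ("write_equation", "plan_B"): 0.8,
}

def observe(posterior, action):
    """Bayesian update of the plan posterior after one observed action."""
    post = {p: pr * likelihood[(action, p)] for p, pr in posterior.items()}
    z = sum(post.values())
    return {p: v / z for p, v in post.items()}

for action in ["draw_diagram", "draw_diagram", "write_equation"]:
    plans = observe(plans, action)
    print(action, {p: round(v, 2) for p, v in plans.items()})
```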
Dissociating Performance from Learning: An Empirical Evaluation of a Computational Model
This paper presents a follow-up to the ATM-Soar models presented at 1993 Meeting of the Cognitive Science Society and the CHI 1994 Research Symposium. The original work described the use of the Soar cognitive architecture to simulate user learning with different ATM interfaces. In particular, it focused on the relative effects of interface instructions (e.g., "Insert card into slot") and perceptual attentional cues (e.g., a flashing area around the card slot) on learning and performance. The study described here involves getting human data on the same tasks to test the predictions of the computational models. The ATM task is simulated on a PC in order to contrast three types of interface conditions: just instructions, instructions plus flashing, and just flashing. Subjects must insert a bank card, check the account balance, and withdraw money. They are asked to repeat the task four times so that the effects of training on performance and learning can be observed. The data suggest that subjects learn to perform the task faster with attentional attractors, as the Soar model predicted. More interestingly, the Soar model also predicted that people would do better without instructions when there are attentional attractors. This prediction was supported as well.
Rhythmic Commonalities between Hand Gestures and Speech
Studies of coordination in rhythmic limb movement have established that certain phase relationships among cycling limbs are preferred, i.e. patterns such as synchrony and anti-synchrony are produced more often and more reliably than arbitrary relations. A speech experiment in which subjects attempt to place a phrase-medial stress at a range of phases within an overall phrase repetition cycle is presented, and analogous results are found. Certain phase relations occur more frequently and exhibit greater stability than others. To a first approximation, these phases are predicted by a simple harmonic model. The observed commonalities between limb movements and spoken rhythm support Lashley's conjecture that a common control strategy underlies the coordination of all rhythmic activity.
Modeling Beat Perception with a Nonlinear Oscillator
The perception of beat and meter is fundamental to the perception of rhythm, yet modeling this phenomenon has proven a formidable problem. This paper outlines a dynamic model of beat perception in complex, metrically structured rhythms that has been described in detail elsewhere (Large, 1994; Large & Kolen, 1994). A study is described in which pianists performed notated melodies and improvised variations on these same melodies. The performances are analyzed in terms of amount of rubato and rhythmic complexity, and the model's ability to simulate beat perception in these melodies is assessed.
Emotional Decisions
Recent research has yielded an explosion of literature that establishes a strong connection between emotional and cognitive processes. Most notably, Antonio Damasio draws an intimate connection between emotion and cognition in practical decision making. Damasio presents a "somatic marker" hypothesis which explains how emotions are biologically indispensable to decisions. His research on patients with frontal lobe damage indicates that feelings normally accompany response options and operate as a biasing device to dictate choice. What Damasio's hypothesis lacks is a theoretical model of decision making which can advance the conceptual connection between emotional and cognitive decision making processes. In this paper we combine Damasio's somatic marker hypothesis with the coherence theory of decision put forward by Thagard and Millgram. The juxtaposition of Damasio's hypothesis with a cognitive theory of decision making leads to a new and better theory of emotional decisions.
Lateral Connections In The Visual Cortex Can Self-Organize Cooperatively With Multisize RFs Just As With Ocular Dominance and Orientation Columns
Cells in the visual cortex are selective not only to ocular dominance and orientation of the input, but also to its size and spatial frequency. The simulations reported in this paper show how size selectivity could develop through Hebbian self-organization, and how receptive fields of different sizes could organize into columns like those for orientation and ocular dominance. The lateral connections in the network self-organize cooperatively and simultaneously with the receptive field sizes, and produce patterns of lateral connectivity that closely follow the receptive field organization. Together with our previous work on ocular dominance and orientation selectivity, these results suggest that a single Hebbian self-organizing process can give rise to all the major receptive field properties in the visual cortex, and also to structured patterns of lateral interactions, some of which have been verified experimentally and others predicted by the model.
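The driving mechanism is a normalized Hebbian update. The fragment below shows only that core rule, with arbitrary network sizes and learning rate, and omits the lateral connections and receptive-field geometry of the full model.

```python
import numpy as np

# Normalized Hebbian learning in miniature: weights strengthen in
# proportion to correlated pre- and postsynaptic activity, then are
# renormalized so that no unit's weights grow without bound. Sizes,
# inputs, and the learning rate are arbitrary illustrative choices.
rng = np.random.default_rng(1)
n_in, n_out, lr = 16, 4, 0.1
W = rng.random((n_out, n_in))

for _ in range(1000):
    x = rng.random(n_in)                   # input pattern (stand-in for afferent input)
    y = W @ x                              # feedforward activation
    W += lr * np.outer(y, x)               # Hebbian update
    W /= W.sum(axis=1, keepdims=True)      # normalization prevents runaway growth
```

In the full model, a corresponding rule also shapes the lateral connections, which is how the connectivity patterns come to follow the receptive-field organization.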
Neuronal Homeostasis and REM Sleep
We propose a novel mechanism of synaptic maintenance whose goal is to preserve the performance of an associative memory network undergoing synaptic degradation, and to prevent the development of pathological attractors. This mechanism is demonstrated by simulations performed on a low-activity neural model which implements local neuronal homeostasis. We hypothesize that, whereas Hebbian synaptic modifications occur as a learning process during wakefulness and SWS consolidation, the neural-based regulatory mechanisms proposed here take place during REM sleep, where they are driven by bouts of random cortical activity. The role of REM sleep, in our model, is not to prune spurious attractor states, as previously proposed by Crick and Mitchison and by Hopfield, Feinstein, and Palmer, but to maintain synaptic integrity in the face of ongoing synaptic turnover. Our model provides a possible reason for the segmentation of sleep into repetitive SWS and REM phases.
The perception of causality: Feature binding in interacting objects
When one billiard ball strikes and launches another, most observers report seeing the first ball cause the second ball to move. Michotte (1963) argued that the essence of phenomenal causality is "ampliation" of movement, in which the motion of the first object is perceptually transferred to the second object. Michotte provided only phenomenological evidence, however. We extend the reviewing paradigm of Kahneman, Treisman, and Gibbs (1992) to Michotte-style launching events and report response-time data consistent with Michotte's notion of ampliation. We discuss how contemporary theories of feature binding can extend to the domain of interacting objects and address our results. We also suggest that our treatment of ampliation helps clarify controversies regarding whether perceived causality is direct or interpreted and whether it is innate or learned.
Judging the Contingency of a Constant Cue: Contrasting Predictions from an Associative and a Statistical Model
Two contingency judgment experiments are reported where one predictive cue was present on every trial of the task. This constant cue was paired with a second variable cue that was either positively correlated (Experiment 1) or negatively correlated with the outcome event (Experiment 2). Outcome base rate was independently varied in both experiments. Probabilistic contrasts could be calculated for the variable cue but not for the constant cue since the probability of the outcome occurring in the absence of the constant cue was undefined. Cheng & Holyoak's (1995) probabilistic contrast model therefore cannot uniquely specify the way in which the constant cue will be judged. In contrast, judgments of the constant cue were systematically influenced by the variable cue's contingency as well as by the outcome base rate. Specifically, judgments of the constant cue 1) were discounted when the variable cue was a positive predictor of the outcome but were enhanced when the variable cue was a negative predictor of the outcome, and 2) were proportional to the outcome base rate. These effects were anticipated by a connectionist network using the Rescorla-Wagner learning rule.
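For reference, a minimal sketch of the Rescorla-Wagner rule that generated these predictions: on each trial, every cue that is present changes its associative strength in proportion to the shared prediction error. The trial structure below (constant cue on every trial, variable cue positively contingent on the outcome, as in Experiment 1) is a simplification, and the learning-rate value is arbitrary.

```python
import random

random.seed(0)
V = {"constant": 0.0, "variable": 0.0}   # associative strengths
alpha_beta = 0.1                         # combined learning-rate parameters

for _ in range(500):
    variable_present = random.random() < 0.5
    outcome = 1.0 if variable_present else 0.0     # positive contingency
    cues = ["constant"] + (["variable"] if variable_present else [])
    error = outcome - sum(V[c] for c in cues)      # shared prediction error
    for c in cues:
        V[c] += alpha_beta * error

# The variable cue absorbs the contingency; the constant cue is discounted.
print({c: round(v, 2) for c, v in V.items()})
```

Reversing the contingency (outcome more likely when the variable cue is absent) drives the variable cue's strength negative and correspondingly enhances the constant cue, matching the judgment pattern reported above.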
What Language Might Tell Us About the Perception of Cause
In English, causation can be expressed with either a lexical or periphrastic causative verb. Lexical causatives include both the notion of CAUSE and the notion of RESULT (frequently change-of-state) (e.g. Mulder sunk the boat); periphrastic causatives encode the notion of CAUSE without the notion of RESULT (e.g. Mulder made the boat sink). According to many linguists, these two kinds of sentences have different meanings: lexical causatives are used for situations involving direct causation while periphrastic causatives are used for situations involving either direct or indirect causation. This research investigated how this distinction might be cognitively determined. Subjects watched 3D animations of marbles hitting one another and then described the scenes and enumerated the total number of events. When causers were inanimate, lexicalization and enumeration were guided by physical contact. When causers were animate, lexicalization and enumeration were guided by factors other than physical contact, possibly intention or ultimate causation. The results suggest how different kinds of causation and their expression might be related to the perception of events.
Mutability, Conceptual Transformation, and Context
Features differ in their mutability. For example, a robin could still be a robin even if it lacked a red breast; but it would probably not count as one if it lacked bones. I have hypothesized (Love & Sloman, 1995) that features are immutable to the extent that other features depend on them. We can view a feature's mutability as a measure of transformational difficulty. In deriving new concepts, we often transform existing concepts (e.g. we can go from thinking about a robin to thinking about a robin without a red breast). The difficulty of this transformation, as measured by reaction time, increases with the immutability of the feature transformed. Conceptual transformations are strongly affected by context, but in a principled manner, also explained by feature dependency structure. A detailed account of context's effect on mutability is given, as well as corroborating data. I conclude by addressing how mutability-dependency theory can be applied to the study of similarity, categorization, conceptual combination, and metaphor.
On putting milk in coffee: The effect of thematic relations on similarity judgments
All existing accounts of similarity assume that it is a function of matching and mismatching attributes between mental representations. However, Bassok and Medin (1996) found that the judged similarity of sentences does not necessarily reflect the degree of overlap between the properties of paired stimuli. Rather, similarity judgments are often mediated by a process of thematic integration and reflect the degree to which stimuli can be integrated into a common thematic scenario. We present results of a study which extend this surprising finding by showing that it also applies to similarity ratings of objects and occurs whether or not subjects explain their judgments. Also, consistent with the Bassok and Medin findings, the tendency towards thematic integration was more pronounced when the paired stimuli shared few attributes, but it was still an important factor in similarity judgments between objects which shared many attributes. We discuss the implications of these findings for models of cognitive processes which use similarity as an explanatory construct.
The Role of Situations in Concept Learning
This study examines how situation information is incorporated in concept learning and representation. Unlike most concept learning studies, this study includes situation information during concept learning. Unlike most studies about the influence of situations on episodic memory, this study investigates how situations affect conceptual processing. Experiment 1 demonstrates that people rely on situation information when processing concepts. Subjects verified a concept's property more quickly if the property was learned and tested in the same situation. Experiment 2 shows that in order for a situation to produce priming, the situation must be related to the property in a meaningful manner. Mere cooccurrence between a property and a situation is not sufficient.
Modeling Interference Effects In Instructed Category Learning
Category learning is often seen as a process of inductive generalization from a set of class-labeled exemplars. Human learners, however, often receive direct instruction concerning the structure of a category before being presented with examples. Such explicit knowledge may often be smoothly integrated with knowledge garnered by exposure to instances, but some interference effects have been observed. Specifically, errors in instructed rule following may sometimes arise after the repeated presentation of correctly labeled exemplars. Despite perfect consistency between instance labels and the provided rule, such inductive training can drive categorization behavior away from rule following and towards a more prototype-based or instance-based pattern. In this paper we present a general connectionist model of instructed category learning which captures this kind of interference effect. We model instruction as a sequence of inputs to a network which transforms such advice into a modulating force on classification behavior. Exemplar-based learning is modeled in the usual way: as weight modification via backpropagation. The proposed architecture allows these two sources of information to interact in a psychologically plausible manner. Simulation results are provided on a simple instructed category learning task, and these results are compared with human performance on the same task.
Posters
Ethical Reasoning Strategies and Their Relation to Case-Based Instruction: Some Preliminary Results
This paper describes some preliminary results of an experiment to collect, analyze and compare protocols of arguments concerning practical ethical dilemmas prepared by novice and more experienced ethical reasoners. We report the differences we observed between the novice and experienced reasoners' apparent strategies for analyzing ethical dilemmas. We offer an explanation of the differences in terms of specific differences in the difficulty of the strategies' information processing requirements. Finally, we attempt to explain the utility of case-based ethics instruction in terms of the need to inculcate information processing skills required by the experienced reasoners' strategy.
Explaining preferred mental models in Allen inferences with a metrical model of imagery
We present a simple metrical representation and algorithm to explain putative imagery processes underlying the empirical mental model preferences found by Knauff, Rauh and Schlieder (1995) for Allen inferences (Allen, 1983). The computational theory is compared with one based on ordinal information only (Schlieder, in preparation). Both provide good fits with the data. They differ psychologically in background theories, visualisation strategies motivated by these, and model construction processes generating models with the properties indicated as desirable by the strategies. They differ computationally in assumptions about knowledge strength (ordinal: weaker) and algorithmic simplicity (metrical: simpler). Our theory and its comparison with the ordinal theory provide the basis for a discussion of issues pertaining to imagery in general: Using the assumption of imagery inexactness, we develop a sketch theory of mental images and motivate a new visualisation strategy ('regularisation'). We demonstrate systematic methods of modelling imagery processes and of analysing such models. We also outline some criteria for comparison (and future integration?) of cognitive modelling approaches.
Can We Unmask the Phonemic Masking Effect? The Problem of Methodological Divergence
In studying cognition, we infer the presence of mental structures in an idealized setting from performance in various experimental settings. Although experimental settings are believed to tap the mental structure of interest, they also always reflect idiosyncratic task-specific properties. Indeed, distinct methods often diverge in their outcomes. How can we assess the presence of the mental structure in the idealized setting given divergent outcomes of distinct methods? We illustrate this problem in a specific example concerning the contribution of phonology in reading. Evidence for the role of phonology in the "idealized" reading setting is assessed by different methods. Methods of masked and unmasked display disagree in their outcomes. The contribution of phonology appears robust under masking, but limited under unmasked display. We outline two alternative explanations for the robustness of phonological effects under masking. On one view, phonemic masking effects are a true reflection of early reading stages (Berent & Perfetti, 1995). Conversely, Verstaen et al. (1995) argue (1) that masking overestimates the contribution of phonology and (2) that phonemic masking effects are eliminated by a manipulation that discourages reliance on phonology. We demonstrate that (2) is incorrect, but (1) cannot be resolved empirically.
The Evaluation of the Communicative Effect
The aim of our research is to analyze the inferential processes involved in a speaker's evaluation of the communicative effect achieved on a hearer. We present a computational model where such an evaluation process relies on two main factors which may vary according to their strength: 1. the verbal commitment of the hearer to play his role in the behavioral game actually bid by the speaker, 2. the personal beliefs of the speaker concerning the hearer's beliefs. The hypothesis was tested as follows. First, we devised a questionnaire in order to collect human subjects' evaluations of communicative effects. Subjects were required to consider some scenarios and to identify themselves with a speaker. Their task was to evaluate, for each scenario, the communicative effect they had reached on the hearer (acceptance to play the game, refusal, or indecision). Then, we implemented our computational model in a connectionist network; we chose a set of input variables whose combination describes all the scenarios, and we used part of the experimental data to train the network. Finally, we compared the outputs of the network with the evaluations performed by the human subjects. The results are satisfactory.
Computational Power and Realistic Cognitive Development
We explore the ability of a static connectionist algorithm to model children's acquisition of velocity, time, and distance concepts under architectures of different levels of computational power. Diagnosis of rules learned by networks indicated that static networks were either too powerful or too weak to capture the developmental course of children's concepts. Networks with too much power missed intermediate stages; those with too little power failed to reach terminal stages. These results were robust under a variety of learning parameter values. We argue that a generative connectionist algorithm provides a better model of development of these concepts by gradually increasing representational power.
Learning Qualitative Relations in Physics with Law Encoding Diagrams
This paper describes a large scale experiment that evaluates the effectiveness of Law Encoding Diagrams (LEDs) for learning qualitative relations in the domain of elastic collisions in physics. A LED is a representation that captures the laws or important relations of a domain in the internal structure of a diagram by means of diagrammatic constraints. The subjects were 88 undergraduate physics students, divided into three learning trial conditions. One group used computer based LEDs, another used conventional computer based representations (tables and formulas), and the third was a nonintervention control group. Only the LED subjects had a significant improvement in their pre-test to post-test qualitative reasoning. The LEDs appear to make it easier for subjects to explore more of the space of different forms of collisions and hence gain a better qualitative understanding of the domain.
Building A Baby
We show how an agent can acquire conceptual knowledge by sensorimotor interaction with its environment. The method has much in common with the notion of image-schemas, which are central to Mandler's theory of conceptual development. We show that Mandler's approach is feasible in an artificial agent.
The Iteration of Concept Combination in Sense Generation
We report work in progress on the computational modelling of a theory of concepts and concept combination. The sense generation approach to concepts provides a perspicuous way of treating a range of recalcitrant concept combinations: privative combinations (e.g., fake gun, stone lion, apparent friend). We argue that a proper treatment of concept combination must respect important syntactic constraints on the combination process, the simplest being the priority of the syntactic modifier over the head in case of conflicts. We present a model of privative concept combinations based on the sense generation approach. The model was developed using COGENT, an object-oriented modelling environment designed to simplify and clarify the implementation process by minimising the 'distance' between the box/arrow 'language' of psychological theorising and the theory's implementation. In addition to simple privatives (i.e., ones with a single modifier, like fake gun) the model also handles iterated, or complex, privative combinations (i.e., ones with more than one modifier, like fake stone lion), and reflects their associated modification ambiguities. We suggest that the success of this model reflects both the utility of COGENT as a modelling framework and the adequacy of sense generation as a theory of concept combination.
Sociocultural Approaches to Analyzing Cognitive Development in Interdisciplinary Teams
This paper considers whether a sociocultural theory of cognition can supply a suitable perspective for analyzing the nature of interdisciplinary collaboration within groups in the National Institute for Science Education (NISE). We discuss the metaphors of apprenticeship and voice in conversation to identify relevant elements of analysis in group discourse. The NISE group shows evidence of cognitive apprenticeship and of multiple voicedness, but the theories do not fully explain the impact of interdisciplinary interaction on group cognitive development. Although both the apprenticeship metaphor and the voice metaphor provide useful tools for analysis, it would be useful to have a metaphor that deals more directly with interaction among members of equal status from mature communities of practice.
Modeling Qualitative Differences in Symmetry Judgments
Symmetry perception is an important cognitive process across many areas of cognition. This research explores symmetry as a special case of similarity—self-similarity—and proposes that qualitative relationships play a role in the early perception of symmetry. To support this claim, we present evidence from two psychological studies where subjects performed symmetry judgments for randomly constructed polygons. Subjects were faster and/or more accurate at detecting asymmetry for stimuli with qualitative asymmetries than for stimuli with equivalent quantitative asymmetries. Aspects of this effect are replicated using the MAGI computational model, which detects symmetry using a method of structural alignment. The results of this study suggest that qualitative information influences early perception of symmetry, and provide further support for the MAGI model.
Unification of Language Understanding, Device Comprehension and Knowledge Acquisition
Cognitive agents often acquire knowledge of how devices work by reading a book. We describe a computational theory of understanding a natural language description of a device, comprehending how the device works, and acquiring a device model. The theory posits a complex interplay between language, memory, comprehension, problem-solving and learning faculties. Long-term memory contains cases of previously encountered devices and associated structure-behavior-function (SBF) models that explain how the known device works. Language processing is both bottom-up and top-down. Bottom-up processing is done through spreading-activation networks, where the semantics of the nodes and links in the network arises from the SBF ontology. The comprehension process constructs an SBF model for the new device by adapting the known device models - we call this process adaptive modeling. This multifaculty computational theory is instantiated in an operational computer system called KA that (i) reads and understands English language descriptions of devices from David Macaulay's popular science book The Way Things Work, (ii) comprehends how the described device works, and (iii) acquires an SBF model for the device.
Cognitive Modeling of Action Selection Learning
Our goal is to develop a hybrid cognitive model of how humans acquire skills on complex cognitive tasks. We are pursuing this goal by designing hybrid computational architectures for the NRL Navigation task, which requires competent sensorimotor coordination. In this paper, we describe results of directly fitting human execution data on this task. We next present and then empirically compare two methods for modeling control knowledge acquisition (reinforcement learning and a novel variant of action models) with human learning on the task. The paper concludes with an experimental demonstration of the impact of background knowledge on system performance. Our results indicate that the performance of our action models approach more closely approximates the rate of human learning on this task than does reinforcement learning.
The Effect of Selection Instructions on Reasoning about Thematic Content Rules in Wason's Card Selection Task
This study examined the effects of selection instruction and thematic content on subjects' reasoning performance on the Wason card selection task. Facilitation has frequently been demonstrated when subjects are instructed to check for violations of a conditional rule that involves thematic content. We noted that the thematic rules previously used are also pragmatic rules that express regulations. We compared reasoning about two kinds of thematic rules: pragmatic and nonpragmatic. Subjects were instructed either to determine if the rule has been violated or to determine if the rule is true or false. The results indicate an interaction between instruction type and thematic rule type. Contrary to previous findings of facilitation on thematic materials with violation instructions, we found facilitation for true/false instructions relative to violation instructions on non-pragmatic content rules. These results stand in contrast to previous descriptions of true/false instructions as more difficult and cognitively demanding than violation instructions. We explain our findings in terms of differences in the inherent status of the two types of thematic rules.
Integration and Shielding of Regular and Irregular Items in MLPs
Multi-layer perceptrons (MLPs) can learn both regular and irregular items given sufficient interleaved training, but not from sequential presentation of items. McClelland, McNaughton and O'Reilly (1994) addressed this problem in their proposal that the hippocampus and neocortex (H/NC) form a two component memory system in which the hippocampus interleaves training of items to the neocortex so that it can develop structure without interference of later items on earlier ones. We have been studying such an interleaving system under the constraint of limiting the capacity of the training batch (analogous to a finite limit on the hippocampus). In previous simulations (Gray & Wiles, 1996) we demonstrated that a quasi-regular learning task trained with a recency rehearsal scheme did not suffer interference to a catastrophic level, but did suffer interference on irregular and similar regular items. The current study introduces a new rehearsal scheme in which items are retained in a finite training batch based on how well the MLP has learned them: error rehearsal enabled the MLP to (1) learn a high proportion of the domain, (2) retain both regular and irregular items from the initial training batch, and (3) partially shield both regular and irregular items from later interference. The results demonstrate that although finite training batches can pose a problem for MLPs, an error rehearsal scheme can reduce interference on both regular and irregular items, even when they are no longer in the current training batch. Implications for the role of the hippocampus in interleaving items for the neocortex are discussed.
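A minimal sketch of the error-rehearsal idea, with a toy linear "network" standing in for the MLP and an invented regression task: the finite batch keeps the items with the highest current error rather than the most recent ones.

```python
import numpy as np

# Error rehearsal in miniature: the finite training batch (a stand-in for
# limited hippocampal capacity) retains the items the network currently
# gets most wrong. The linear model and task are placeholders, not the
# quasi-regular task used in the paper.
rng = np.random.default_rng(0)
w = rng.normal(size=3)
net = lambda x: float(w @ x)

def rehearse(batch, new_item, capacity):
    """Add an item; if over capacity, drop the best-learned (lowest-error) one."""
    batch = batch + [new_item]
    batch.sort(key=lambda it: (net(it[0]) - it[1]) ** 2, reverse=True)
    return batch[:capacity]

batch = []
for _ in range(200):
    x = rng.normal(size=3)
    batch = rehearse(batch, (x, float(x.sum())), capacity=10)
    for bx, bt in batch:                  # interleaved training on the batch
        w += 0.02 * (bt - net(bx)) * bx   # delta-rule update

print("final weights:", np.round(w, 2))   # approaches [1, 1, 1]
```

Under a recency scheme the sort key would be arrival time; keying on error instead is what keeps hard (often irregular) items in rehearsal until they are actually learned.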
Weighting in Similarity Judgements: Investigating the "MAX Hypothesis"
Most models of similarity assume differential weights for the represented properties. However, comparatively little work has addressed the issue of how the cognitive system assigns these weights. Of particular interest to the modelling of similarity are factors which arise from the comparison process itself. One such factor is defined by Goldstone, Medin & Gentner's (1991) 'MAX Hypothesis'. We present a series of experiments which clarify the main components of 'MAX' and examine its scope.
Incremental Centering and Center Ambiguity
In this paper, we present a model of anaphor resolution within the framework of the centering model. The consideration of an incremental processing mode introduces the need to manage structural ambiguity at the center level. Hence, the centering framework is further refined to account for local and global parsing ambiguities which propagate up to the level of center representations, yielding moderately adapted data structures for the centering algorithm.
A Connectionist Architecture with Inherent Systematicity
For connectionist networks to be adequate for higher level cognitive activities such as natural language interpretation, they have to generalize in a way that is appropriate given the regularities of the domain. Fodor and Pylyshyn (1988) identified an important pattern of regularities in such domains, which they called systematicity. Several attempts have been made to show that connectionist networks can generalize in accordance with these regularities, but not to the satisfaction of the critics. To address this challenge, this paper starts by establishing the implications of systematicity for connectionist solutions to the variable binding problem. Based on the work of Hadley (1994a), we argue that the network must generalize information it learns in one variable binding to other variable bindings. We then show that temporal synchrony variable binding (Shastri and Ajjanagadde, 1993) inherently generalizes in this way. Thereby we show that temporal synchrony variable binding is a connectionist architecture that accounts for systematicity. This is an important step in showing that connectionism can be an adequate architecture for higher level cognition.
Empirical Evidence for Constraint Relaxation in Insight Problem Solving
Using a newly developed task environment that allows control over the depth and width of the problem space (Match Stick Algebra problems), three experiments were conducted to investigate the role of implicit constraints in insight problem solving. The first experiment showed that constraints caused by prior knowledge of common algebra led to large differences in solution times when they were encountered for the first time. No differences were found after the constraints had been relaxed. In the second experiment complementary moves had to be applied in two different equation structures, one similar to common algebra, one dissimilar to common algebra. Consistent with our predictions, different problem structures led to a reversed order of task difficulty for the same moves, depending on the activation of prior knowledge from real algebra. In the third experiment it was shown that a re-distribution of activation in a network causes the removal of constraints. Non-detectable priming of the solution led to significantly more solutions in the experimental group as compared to a control group.
Context Effects on Problem Solving
Context effects on problem solving demonstrated so far in the literature are the result of systematic manipulation of elements of the problem description that are supposedly irrelevant to the solution. Little attention has been paid to the role of casual entities in the environment which are not part of the problem description, but which might influence the problem solving process. The main purpose of the current paper is to avoid this limitation and to study the context effects (if any) caused by such accidental elements from the problem solver's environment and in this way to test the predictions made by the dynamic theory of context and its implementation in the DUAL cognitive architecture. Two experiments have been performed. In Experiment I the entities whose influence is being tested are part of the illustrations accompanying the target problem descriptions and therefore they belong to the core of the context, while in Experiment II the tested entities are part of the illustrations accompanying other problems' descriptions; they are accidental with respect to the target problem and therefore they possibly belong to the periphery of the context (if a context effect could be demonstrated at all). The results demonstrate both near and far context effects on problem solving caused by core (Experiment I) and peripheral elements (Experiment II) of the perception-induced context, respectively.
Linking Adaptation and Similarity Learning
The case-based reasoning (CBR) process solves problems by retrieving prior solutions and adapting them to fit new circumstances. Many studies examine how case-based reasoners learn by storing new cases and refining the indices used to retrieve cases. However, little attention has been given to learning to refine the process for applying retrieved cases. This paper describes research investigating how a case-based reasoner can learn strategies for adapting prior cases to fit new situations, and how its similarity criteria may be refined pragmatically to reflect new capabilities for case adaptation. We begin by highlighting psychological research on the development of similarity criteria and summarizing our model of case adaptation learning. We then discuss initial steps towards pragmatically refining similarity criteria based on experiences with case adaptation.
Lifelong science learning: A longitudinal case study
How do students link school and personal experiences to develop a useful account of complex science topics? Can science courses provide a firm foundation for lifelong science learning? To answer these questions we analyze how "Pat" integrates and differentiates ideas and develops models to explain complex, personally-relevant experience with thermal phenomena. We examine Pat's process of conceptual change during an 8th grade science class where a heat flow model of thermal events is introduced as well as after studying biology in ninth grade and after studying chemistry in the 11th grade. Pat regularly links new ideas from science class and personal experience to explain topics like insulation and conduction or thermal equilibrium. Thus Pat links experience with home insulation to experiments using wool as an insulator. This linkage leads Pat to consider "air pockets" as a factor in insulation and to distinguish insulators (with air pockets) from metal conductors that "attract heat." These linkages help Pat construct a heat flow account of thermal events and connect it to the microscopic model introduced in chemistry. Pat's process of conceptual change demonstrates how longitudinal case studies contribute to the understanding of conceptual development. Future work will synthesize the conceptual change process of all 40 students we have studied longitudinally.
Dissociating Semantic and Associative Word Relationships Using High-Dimensional Semantic Space
The Hyperspace Analogue to Language (HAL) model is a methodology for capturing semantics from a corpus by analysis of global co-occurrence. A priming experiment from Lund et al. (1995) which did not produce associative priming with humans or in the HAL simulation is repeated with rearranged control trials. Our experiment now finds associative priming with human subjects, while the HAL simulation again does not produce associative priming. Associative word norms are examined in relation to HAL's semantics in an attempt to illuminate the semantic bias of the model. Correlations with association norms are found in the temporal sequence of words within the corpus. When the associative norm data are split according to simulation semantic distances, a minority of the associative pairs that are close semantic neighbors are found to be responsible for this correlation. This result suggests that most associative information is not carried by temporal word sequence in language. This methodology is found to be useful in separating typical "associative" stimuli into pure-associative and semantic-associative subsets. The notion that associativity can be characterized by temporal association in language receives little or no support from our corpus analysis and priming experiments. The extent to which "word associations" can be characterized by temporal association seems to be more a function of semantic neighborhood which is a reflection of semantic similarity in HAL's vector representations.
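HAL's global co-occurrence analysis can be sketched as a sliding window that adds distance-weighted counts (nearer words weigh more). The toy sentence and window size below are illustrative; the actual model used a much larger window over a very large corpus.

```python
from collections import defaultdict

corpus = "the cat chased the mouse and the dog chased the cat".split()
window = 4
vectors = defaultdict(lambda: defaultdict(float))

for i, word in enumerate(corpus):
    for d in range(1, window + 1):       # look back up to `window` words
        if i - d >= 0:
            # closer context words receive larger weights
            vectors[word][corpus[i - d]] += window + 1 - d

print(dict(vectors["cat"]))
```

Each word's accumulated row then serves as its vector representation, and distances between such vectors are what carry the semantic (as opposed to purely associative) relationships discussed above.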
Inferential Realization Constraints on Functional Anaphora in the Centering Model
We present an inference-based text understanding methodology for the resolution of functional anaphora in the context of the centering model. A set of heuristic realization constraints is proposed, which incorporate language-independent conceptual criteria (based on the well-formedness and conceptual strength of role chains in a terminological knowledge base) and language-dependent information structure constraints (based on topic/comment or theme/rheme orderings). We state text-grammatical predicates for functional anaphora and then turn to the procedural aspects of their evaluation within the framework of an actor-based implementation of a lexically distributed text parser.
On the Nature of Timing Mechanisms in Cognition
The ability to resolve timing differences within and between patterns is critical to the perception of music and speech; similarly, many motor skills such as music performance require fine temporal control of movements. Two important issues concern (1) the nature of the mechanism used for time measurement and (2) whether timing distinctions in perception and motor control are based on the same mechanism. In this paper, clock- and entrainment-based conceptions of time measurement are discussed; and predictions of both classes of model are then evaluated with respect to a tempo-discrimination experiment involving isochronous auditory sequences. The results from this experiment are shown to favor entrainment- over clock-based approaches to timing. The implications of these data are then discussed with respect to the hypothesized role of the cerebellum in timing.
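The two classes of model can be caricatured in a few lines: an interval clock stores a fixed duration, whereas an entrainment model continuously pulls its period toward the input and so tracks tempo change. The gain value below is illustrative, not a fitted parameter.

```python
def entrain(onsets, period, gain=0.3):
    """Track onset times (ms) with an oscillator that adapts its period."""
    t = onsets[0]
    for onset in onsets[1:]:
        error = onset - (t + period)   # was this beat early or late?
        period += gain * error         # entrain the period toward the input
        t = onset                      # phase resets at each onset
    return period

print(entrain([0, 500, 1000, 1500], period=450))  # adapts toward 500 ms
print(entrain([0, 480, 960, 1440], period=500))   # tracks the faster tempo
```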
Emergent Letter Perception: Implementing the Role Hypothesis
Empirical psychological experimentation (very briefly reviewed here) has provided evidence of top-down conceptual constraints on letter perception. The role hypothesis suggests that these conceptual constraints take the form of structural subcomponents (roles) and relations between subcomponents (r-roles). In this paper, we present a fully-implemented computer model based on the role hypothesis of letter recognition. The emergent model of letter perception discussed below offers a cogent explanation of human letter-perception data — especially with regard to error-making. The model goes beyond simple categorization by parsing a letter-form into its constituent parts. As it runs, the model dynamically builds (and destroys) a context-sensitive internal representation of the letter that it is perceiving. The representation emerges as a by-product of a parallel exploration of possible categories. The model is able to successfully recognize (i.e., conceptually parse) many diverse letters at the extremes of their categories.
Deafness Drives Development of Attention to Change in the Visual Field
Deaf (n = 37) and hearing (n = 37) subjects ages 6-7, 9-10, and 18+ participated in a visual attention experiment designed to test the hypothesis that vision in the deaf becomes specialized over developmental time to detect change in the visual field. All children, regardless of hearing status, should attend to change in the visual field. However, the differing developmental experiences and sensory "tools" between deaf and hearing create different demands on their visual systems. Hearing individuals may become capable of ignoring many changes in the visual field because they can simultaneously monitor the world auditorially and attend to task-relevant information visually. If so, then deaf individuals may find it difficult to ignore change in the visual field because their visual system must both monitor the world and attend to task-relevant information without simultaneous auditory input. Subjects in this experiment completed two attentional capture tasks in which they searched for a uniquely shaped target in the presence of two irrelevant stimulus manipulations (color or motion). This manipulation was applied to the target on half the task trials and to a distractor on the other half. Attention to the irrelevant manipulations will create differential reaction times (RTs) when the target is manipulated versus when a distractor is manipulated. Results indicated divergent development between the two groups. Both deaf and hearing children produced differential RTs in the two tasks, while only deaf adults attended to the task-irrelevant changes. Further, while hearing subjects were more affected by motion than color, deaf subjects were more equally affected by both. Results are discussed as compensatory changes in visual processing as a result of auditory deprivation.
Backward Masking Reflects the Processing Demand of the Masking Stimulus
Backward masking is often used to limit visual processing in studies of word recognition, semantic priming, and text processing. However, the manner in which the masking stimulus interferes with perception of the target is not well understood. Several explanations of the backward masking effect are considered: a termination hypothesis, an attention capture hypothesis, and a capacity sharing hypothesis. A point of distinction among them, the effect of manipulating the processing demands of the masking stimulus, is tested in two experiments. Frequency in print of the masking stimulus is manipulated in the first experiment, and both frequency and repetition of the masking stimulus are tested in the second. The results disconfirm two of the hypotheses, termination and attention capture, and support the capacity sharing hypothesis.
The Emergence of Perceptual Category Representations During Early Development: A Connectionist Analysis
A number of recent studies on early categorization suggest that young infants form category representations for stimuli at both global and basic levels of exclusiveness (e.g., mammal, cat). A set of computational models designed to analyze the factors responsible for the emergence of these representations is presented. The models (1) simulated the formation of global-level and basic-level representations, (2) yielded a global-to-basic order of category emergence, and (3) revealed the formation of two distinct global-level representations: an initial "self-organizing" perceptual global level and a subsequently "trained" arbitrary (i.e., non-perceptual) global level. Information from the models is used to make a number of testable predictions concerning category development in infants.
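The global-to-basic ordering has a simple error-driven intuition: dimensions that vary more between categories are learned first. The sketch below, a linear autoencoder trained by gradient descent on invented stimuli (large between-global variance, small between-basic variance), is an assumption-laden illustration of that principle rather than a reconstruction of the paper's models.

```python
# Global-to-basic differentiation in a toy error-driven network.
import numpy as np

rng = np.random.default_rng(0)
g_mean = {"mammal": rng.normal(0, 1.0, 20), "furniture": rng.normal(0, 1.0, 20)}
b_off = {b: rng.normal(0, 0.3, 20) for b in ("cat", "dog", "chair", "table")}
items = {"cat": "mammal", "dog": "mammal", "chair": "furniture", "table": "furniture"}
X = np.array([g_mean[g] + b_off[b] for b, g in items.items()])

W = rng.normal(0, 0.01, (3, 20))                 # 3 hidden units, tied weights
for epoch in range(1001):
    H = X @ W.T                                  # encode
    Xhat = H @ W                                 # decode
    W += 0.01 * (H.T @ (X - Xhat)) / len(X)      # decoder-side gradient, for brevity
    if epoch % 200 == 0:
        d_global = np.linalg.norm(H[0] - H[2])   # cat vs chair
        d_basic = np.linalg.norm(H[0] - H[1])    # cat vs dog
        print(epoch, round(d_global, 3), round(d_basic, 3))
# Global-level separation in the hidden layer grows well before
# basic-level separation: a global-to-basic order of emergence.
```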
Improving the Use of Analogies by Learning to Encode Their Causal Structure
We investigated whether training in how to encode the causal structure of problems would improve individuals' use of analogies to previously encountered cases. Subjects were either trained in how to encode the causal structure of business cases or given a lecture of equal length on a variety of decision-making procedures. They were then asked to study several business cases and their successful solutions. One week later, when asked to solve new problems, subjects who were trained in causal analysis, compared to the control group, were more likely to use an appropriate analogy from the previously studied cases (positive transfer) and less likely to use an inappropriate analogy (negative transfer). Further analyses showed that training in causal analysis increased subjects' ability to encode the causal structure of the problem and increased the likelihood of their being reminded of the analogy. Thus, the ability to encode causal structure and use analogies appropriately responds not only to increasing domain knowledge, but can also be improved by general training in the identification and encoding of the central components of the causal structure of problems.
Confidence Judgements, Performance, and Practice in Artificial Grammar Learning
Artificial grammar learning is noted for the claim that subjects are unaware of their knowledge. Chan (1992) and Dienes et al. (in press) have demonstrated that subjects are unaware in the sense that they lack metaknowledge. Dissociations between subjects' performance and their confidence in their decisions suggest that the learning mechanism may be in some sense encapsulated from the "confidence system". Here we tested the alternative hypothesis that the confidence system is initially poorly calibrated, or does not know which aspects of the learning mechanism to attend to, by training and testing subjects over four weekly sessions. In all four weeks we found a strong, near-perfect association between confidence and performance for trained subjects, but a dissociation for untrained control subjects. We discuss possible explanations for these results and for previously observed dissociations.
A Symbolic Model of Cognitive Transition
Study of cognitive development on the balance scale task has inspired a wide range of human and computational work. The task requires that children predict the outcome of placing a discrete number of weights at various distances on either side of a fulcrum. The current project examined the adequacy of the symbolic learning algorithm C4.5 as a model of cognitive transition on this task. Based on a set of novel assumptions, our C4.5 simulations were able to exhibit regularities found in the human data, including orderly stage progression, U-shaped development, and the torque difference effect. Unlike previous successful models of the task, the current model uses a single free parameter, is not restricted in the size of the balance scale that it can accommodate, and does not require the assumption of a highly structured output representation or a training environment biased towards weight or distance information. The model makes a number of predictions differing from those of previous computational efforts.
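C4.5 itself is not widely packaged today, so the sketch below uses scikit-learn's CART-style DecisionTreeClassifier as a stand-in on a standard encoding of balance-scale items; the depth limit and item set are illustrative choices, not the paper's single free parameter.

```python
# Decision-tree learning on the balance-scale task (CART as a C4.5 stand-in).
from itertools import product
from sklearn.tree import DecisionTreeClassifier

# Items: (left weight, left distance, right weight, right distance), values 1-5.
items = list(product(range(1, 6), repeat=4))
X = [list(item) for item in items]
y = ["left" if lw * ld > rw * rd else "right" if lw * ld < rw * rd else "balance"
     for lw, ld, rw, rd in items]                # label by torque comparison

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(round(tree.score(X, y), 3))                # shallow tree: partial competence
```

Loosening the depth limit lets distance information enter the tree alongside weight, loosely mirroring a progression from weight-dominated to torque-sensitive performance.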
Practice Effects and Learning Control on Acquisition, Outcome, and Efficiency
This paper presents results from a study that attempted to replicate unexpected findings from a previous study (Shute & Gawlick, 1995) which investigated the effects of differential practice opportunities on skill acquisition, outcome, efficiency, and retention. These same variables were examined in a new study (N = 380), and the following results were replicated: (1) learners receiving fewer practice opportunities completed the curriculum significantly faster than those in the other practice conditions, but at the expense of more errors; and (2) despite acquisition differences, all groups performed comparably on the outcome measure. This study also examines the effects of learner control (LC) on these same parameters. We included a condition where students chose their degree of practice per problem set. Overall, this group completed the curriculum faster and showed the highest outcome efficiencies relative to the other conditions. Preliminary results from the retention part of this study (n = 76) continue to show an overall LC advantage, as well as a significant condition × gender interaction: the LC condition is optimal for males, while the extended practice condition is best for females. We discuss the implications of these findings for the design of efficacious instruction.
A Connectionist Model of Reflective Reasoning Using Temporal Properties of Node Firing
This paper presents a connectionist model of human reasoning that uses temporal relations among node firings. Temporal synchrony is used for representing variable bindings and concepts. Temporal succession serves to represent rules by linking the antecedent to the consequent parts of a rule. The number of successive synchronies is affected by two well-known neurobiological parameters, the frequency of neural rhythmic activity and the precision of neural synchronization. Reasoning is predicted to be constrained by these variables. An experiment manipulating the number of successive synchronies is presented. The experimental results appear to confirm the predictions.
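The capacity limit follows from simple arithmetic: the number of distinct phases (hence bindings) per cycle is roughly the rhythmic period divided by the synchronization window. The numbers and the phase-copying representation below are illustrative assumptions.

```python
# Back-of-envelope capacity limit implied by temporal synchrony.
period_ms = 25          # one cycle of ~40 Hz rhythmic activity
precision_ms = 5        # window within which firings count as synchronous
print(period_ms // precision_ms)   # ~5 distinct phases (bindings) per cycle

# A rule links antecedent to consequent; applying it copies the firing
# phase, so the variable binding is preserved across the inference.
bindings = {"x": 0, "y": 5}                  # variable -> firing phase (ms)
rule = ("owns", "can_sell")                  # antecedent -> consequent predicate
facts = {("owns", bindings["x"])}
facts.add((rule[1], bindings["x"]))          # consequent inherits x's phase
print(facts)
```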
Culture Enhances the Evolvability of Cognition
This paper discusses the role of culture in the evolution of cognitive systems. We define "culture" as any information transmitted between individuals and between generations by nongenetic means. Experiments are presented that use genetic programming systems that include special mechanisms for cultural transmission of information. These systems evolve computer programs that perform cognitive tasks including mathematical function mapping and action selection in a virtual world. The data show that the presence of culture-supporting mechanisms can have a clear beneficial impact on the evolvability of correct programs. The implications that these results may have for cognitive science are briefly discussed.
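A minimal sketch of the mechanism, with heavy simplifications: real genetic programming evolves programs, whereas this toy evolves coefficient vectors for a function-mapping task, and the culture mechanism (a shared pool of posted solutions that individuals may imitate) is an assumption standing in for the paper's transmission mechanisms.

```python
# Toy evolutionary loop with optional cultural transmission.
import random

TARGET = [2.0, -3.0, 0.5]
def fitness(ind):
    return -sum((a - b) ** 2 for a, b in zip(ind, TARGET))

def evolve(use_culture, gens=100, pop_size=30):
    random.seed(1)                                        # same start for both runs
    pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(pop_size)]
    culture = []                                          # nongenetic store
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        if use_culture:
            culture = (culture + [pop[0]])[-5:]           # post the current best
        parents = pop[: pop_size // 2]
        children = []
        for p in parents:
            imitate = use_culture and culture and random.random() < 0.3
            child = list(random.choice(culture)) if imitate else list(p)
            child[random.randrange(3)] += random.gauss(0, 0.2)   # mutate
            children.append(child)
        pop = parents + children
    return round(fitness(max(pop, key=fitness)), 4)

print(evolve(use_culture=False), evolve(use_culture=True))
```

Runs with the culture pool typically reach high fitness at least as fast as purely genetic runs, which is the qualitative effect the paper reports for culture-supporting mechanisms.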
Quantifier Interpretation and Syllogistic Reasoning: an Individual Differences Account
It is frequently assumed that interpretational errors can explain reasoning errors. However, the evidence for this position has heretofore been less than convincing. Newstead (1995) failed to show expected relations between Gricean implicatures (Grice, 1975) and reasoning errors, and different measures of illicit conversion (Begg & Denny, 1969; Chapman & Chapman, 1959) frequently fail to correlate in the expected fashion (Newstead, 1989; 1990). This paper examines the relation between interpretation and reasoning using the more configurational approach to classifying subjects' interpretation patterns, described in Stenning & Cox (1995). There it is shown that subjects' interpretational errors tend to fall into clusters of properties defined in terms of rashness, hesitancy and the subject/predicate structure of inferences. First we show that interpretations classified by illicit conversion errors, though correlated with fallacious reasoning, are equally correlated with errors which cannot be due to conversion of premises. Then we explore how the alternative method of subject profiling in terms of hesitancy, rashness and subject/predicate affects syllogistic reasoning performance, through analysis in terms of both general reasoning accuracy and the Figural Effect (Johnson-Laird & Bara, 1984). We show that subjects assessed as rash on the interpretation tasks show consistent characteristic error patterns on the syllogistic reasoning task, and that hesitancy, and possibly rashness, interact with the Figural Effect.
Bottom-up Skill Learning in Reactive Sequential Decision Tasks
This paper introduces a hybrid model that unifies connectionist, symbolic, and reinforcement learning into an integrated architecture for bottom-up skill learning in reactive sequential decision tasks. The model is designed for an agent to learn continuously from on-going experience in the world, without the use of preconceived concepts and knowledge. Both procedural skills and high-level knowledge are acquired through an agent's experience interacting with the world. Computational experiments with the model in two domains are reported.
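The bottom-up direction can be illustrated with a reactive learner that first acquires procedural skill and only afterwards extracts explicit knowledge from it. The sketch below uses tabular Q-learning in a one-dimensional world with rule extraction by a Q-value-gap threshold; the world, the threshold, and the parameters are assumptions, not the paper's architecture.

```python
# Procedural learning first, explicit rule extraction second.
import random

N = 6                               # states 0..5; reward at state 5
Q = {(s, a): 0.0 for s in range(N) for a in (-1, 1)}
random.seed(0)

for _ in range(500):                # episodes of on-going experience
    s = random.randrange(N - 1)
    while s != N - 1:
        a = random.choice((-1, 1))  # random exploration; learn off-policy
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == N - 1 else 0.0
        Q[(s, a)] += 0.1 * (r + 0.9 * max(Q[(s2, -1)], Q[(s2, 1)]) - Q[(s, a)])
        s = s2

# Bottom-up extraction: explicit state -> action rules where one
# action's learned value clearly dominates the other's.
rules = {s: max((-1, 1), key=lambda a: Q[(s, a)])
         for s in range(N - 1) if abs(Q[(s, 1)] - Q[(s, -1)]) > 0.02}
print(rules)                        # e.g. "move right" rules such as {0: 1, 1: 1, ...}
```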
A Dynamical System for Language Processing
A dynamical systems model of language processing suggests a resolution of the debate about the influences of syntactic and lexical constraints on processing. Syntactic hypotheses are modeled as attractors which compete for the processor's trajectory. When accumulating evidence puts the processor close to an attractor, processing is quick and lexical differences are hard to detect. When the processor lands between several attractors, multiple hypotheses compete and lexical differences can tip the balance one way or the other. This approach allows us to be more explicit about the emergent properties of lexicalist models that are hypothesized to account for syntactic effects (MacDonald, Pearlmutter & Seidenberg, 1994; Trueswell & Tanenhaus, 1994).
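The competition can be pictured as motion in a potential landscape with one well per syntactic hypothesis. The double-well potential below and its parameters are illustrative assumptions; the point is the qualitative prediction that settling is fast near an attractor and slow, and sensitive to small lexical biases, between attractors.

```python
# Two syntactic attractors as the wells of V(x) = (x^2 - 1)^2 - bias * x.
def settle(x, bias=0.0, dt=0.01, tol=1e-3, max_steps=100_000):
    """Follow the gradient until the processor settles; return the
    final state (which parse won) and the settling time in steps."""
    for step in range(max_steps):
        grad = 4 * x * (x * x - 1) - bias
        if abs(grad) < tol:
            return round(x, 3), step
        x -= dt * grad
    return round(x, 3), max_steps

print(settle(0.9))               # near an attractor: settles quickly
print(settle(0.01))              # between attractors: slow competition
print(settle(-0.01, bias=0.05))  # a small lexical bias tips the outcome
```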
A Connectionist Model of Metaphor by Pattern Completion
In this paper we present a simple process model (based on connectionist pattern completion) of "A is B" metaphor comprehension. The Metaphor by Pattern Completion (MPC) model capitalizes on an existing semantic memory mechanism. Metaphorical enhancement is produced by presenting a semantic vector representation of the target word (A) to a connectionist network storing the knowledge base (B). Effects found in human data such as meaning enhancement, asymmetric processing, context sensitivity and compound indexing all fall naturally out of the pattern completion mechanism. The MPC model suggests a simple way of separating literal from metaphorical statements. It provides a means of predicting when a metaphor will appear to fail. Moreover, we suggest that the mechanism can form the basis of a comparison procedure that supports analogy. The MPC mechanism avoids the problem of identifying which features of a concept are relevant for similarity matching in analogies, because the prior metaphor stage naturally enhances relevant features and suppresses irrelevant ones. The MPC model is both domain general (in that it does not depend on the structure of the metaphor domain) and parsimonious (in that it does not posit metaphor-specific mechanisms).
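The core mechanism is ordinary pattern completion, so a Hopfield-style network makes a minimal sketch. The four features, the single stored pattern, and the cue below are toy assumptions: the vehicle's (B's) knowledge base is stored, the topic's (A's) vector is presented, and completion enhances the B-relevant features.

```python
# Hopfield-style pattern completion as a sketch of the MPC idea.
import numpy as np

# Invented features: [predatory, fast, fierce, finned]
shark = np.array([1, 1, 1, 1])
W = np.outer(shark, shark) - np.eye(4)   # Hebbian storage, no self-connections

lawyer = np.array([1, 1, 1, -1])         # topic shares all features but "finned"
state = lawyer.copy()
for _ in range(5):                       # synchronous updates to a fixed point
    state = np.sign(W @ state)
print(state)                             # -> [1. 1. 1. 1.]: B-relevant features enhanced
```

Reversing the direction, storing the lawyer knowledge base and presenting the shark vector, completes toward different features, which is one way asymmetric processing falls out of the same mechanism.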
Multi-Level Analysis of Memory Dissociations
Dissociations between explicit and implicit memory tests, between recollective and automatic retrieval processes, and between memorial states of awareness of past events all suggest that human memory is not a unitary faculty. Memory dissociations reflect the complex relationship between consciousness and memory. To understand such a complex relationship, no single level of analysis is sufficient, and any single level may be misleading. A multi-level analysis is proposed. One of the most serious problems with the process-dissociation procedure is its failure to separate the process level of analysis from the memorial-awareness level of analysis. An experiment is reported that supports these arguments.
Order Effects and Frequency Learning in Belief Updating
This paper examines order effects and frequency learning in belief updating. We present an experiment that tests for the existence of order effects for actual decisions during frequency learning and for belief evaluations after frequency learning in a realistic tactical decision making task. The experiment revealed that (a) subjects showed order effects for actual decisions during frequency learning—an effect not reported previously and (b) subjects still showed order effects for belief evaluations even after having correctly learned most of the frequency information. We also present a simulation for the frequency learning behavior and some preliminary results of a simulation for the order effect, and suggest networks for potential combinations of the order effect and frequency learning.
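For the simulation side, a step-by-step updating rule in the spirit of Hogarth and Einhorn's (1992) anchoring-and-adjustment model shows how order effects can arise even from identical evidence; the weighting parameter and evidence values below are illustrative assumptions, not the paper's fitted model.

```python
# Anchoring-and-adjustment belief updating (illustrative parameters).
def update(belief, evidence, w=0.6):
    """belief in [0, 1]; evidence in [-1, 1]."""
    if evidence >= 0:
        return belief + w * evidence * (1 - belief)
    return belief + w * evidence * belief

for order in ([0.8, -0.3], [-0.3, 0.8]):     # same evidence, two orders
    b = 0.5
    for e in order:
        b = update(b, e)
    print(order, round(b, 3))
# Output: 0.607 vs 0.693 -- a recency-driven order effect.
```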
Direct Visual Access is the Only Way to Access the Chinese Mental Lexicon
We argue for the view that, for written Chinese, direct visual access is the only way to access information stored in the mental lexicon. Phonology plays no role in initial lexical access and has limited effect on access to lexical semantics. Evidence supporting this view is adduced from three sets of experiments that either failed to detect any phonological effect in lexical access, failed to prove that the phonological effects obtained are pre-lexical in nature, or successfully demonstrated the presence of an orthographic effect in lexical access. We conclude that words in the lexicon can be accessed in different ways, depending on the general configurations of the writing systems of different languages.
Society Member Abstracts
A Semantic Markov Field Model of Text Recall
A probabilistic model of text recall is proposed which assigns a probability mass to a given recall protocol. Knowledge analyses of semantic relationships among events identified in the text are used to specify the architecture of the probability model. Twelve subjects (the training data group) were then asked to recall twelve texts from memory. The recall protocols generated by the twelve subjects were then used to estimate the strengths of the semantic relationships in the probabilistic model. The Gibbs Sampler algorithm (a connectionist-like algorithm) was then used to sample from the probabilistic model in order to generate synthesized recall protocols. These synthesized recall protocols were then compared with the original set of recall data and recall data collected from an additional group of twelve human subjects (the test data group).
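A stripped-down version of the sampling step: recall of each event is a binary variable whose conditional probability depends on its own recallability and on the recall of semantically related events, and the Gibbs sampler visits the variables one at a time. The events, weights, and biases below are invented for illustration, standing in for the strengths estimated from the training group's protocols.

```python
# Gibbs sampling of synthetic recall protocols from a tiny Markov field.
import math, random

random.seed(0)
events = ["setup", "conflict", "action", "outcome"]
bias = [0.5, 0.0, -0.5, 0.2]             # per-event recallability
W = [[0.0, 1.2, 0.0, 0.0],               # symmetric semantic-link strengths
     [1.2, 0.0, 1.5, 0.3],
     [0.0, 1.5, 0.0, 1.0],
     [0.0, 0.3, 1.0, 0.0]]

def gibbs_protocol(steps=200):
    x = [random.randint(0, 1) for _ in events]
    for _ in range(steps):
        i = random.randrange(len(events))
        field = bias[i] + sum(W[i][j] * x[j] for j in range(len(events)) if j != i)
        x[i] = 1 if random.random() < 1 / (1 + math.exp(-field)) else 0
    return [e for e, recalled in zip(events, x) if recalled]

for _ in range(3):
    print(gibbs_protocol())              # three synthesized recall protocols
```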
Vertical Foreshortening Effect and the L Illusion
Researchers have found that a perceptual error (an overestimation of the vertical line in comparison to the horizontal) usually occurs with a range of about 11-15% for an inverted letter T (IT) figure and about 3-9% for a letter L (L) figure (Avery & Day, 1969; Brosvic & Cohen, 1988; Collani, 1985; Finger & Spelt, 1947; Kunnapas, 1955, 1957, 1958; McBride, Risser, & Slotnick, 1987; Post & Chaderjian, 1987; Ritter, 1917; Rivers, 1901; Schiffman & Thompson, 1974; Wundt, 1859, 1898). Although these two illusory effects are obviously different, they have been considered the same illusion, namely the vertical-horizontal illusion. Kunnapas (1955) explicitly hypothesized that part of the illusory effect of the IT figure is caused by the bisection illusion. In other words, the difference between the two figures' illusory effects can be explained by the fact that the horizontal line in the IT figure is bisected by the vertical line. Classifying the two illusions as one type therefore became logically acceptable, and this classification has never been challenged. According to the viewerness-thatness-thereness (VTT) model (Hui, 1996), the L and IT figures represent two different spatial relationships and are therefore caused by different inferential contents as well as processes. The present paper focuses on the L illusion. According to the VTT model, an L figure evokes a two-dimensional object representation in which the vertical line represents the object's vertical dimension and the horizontal line its horizontal dimension, with the two-dimensional object facing a self-assigned viewer. This resembles a situation such as a wall standing in front of a viewer, whose left edge and foot line correspond to the two lines of the L figure. Thus, the L illusion might be caused by a vertical foreshortening effect. To support this hypothesis, the present researcher reinterpreted the empirical data from an experiment by Collani (1985). Three figures were then designed, named the Trapezoid, Triangle, and Fence-like figures. Although each of these figures contains a vertical line (which bisects the horizontal line, just as in an IT figure), they most likely evoke two-dimensional object representations, such as a trapezoid, a triangle, and a fence. A vertical foreshortening process should therefore operate as well, producing an illusory effect of about 3-9%, like the L figure. In other words, the fact that each of their horizontal lines was bisected should not make their illusory effects the same as that of the IT illusion (about 11-15%). The results confirmed the predictions.
A New Model for the Stroop Effect
In general, the Stroop effect demonstrates our inability to ignore meaningful but irrelevant information. Typically, this effect is explained in terms of speed of processing. For instance, in the color-word Stroop task, words are considered to be processed faster than colors; therefore, the word, which is a valid response, either facilitates or interferes with naming the color. In order to examine which dimension (i.e., color or word) is processed faster in the Stroop task, researchers have varied the stimulus onset asynchrony between the color and word dimensions. This research suggests that maximum interference and facilitation occur when the two dimensions are presented within 100 msec of each other. Interestingly, Stroop interference can be found when the word precedes the color and when the color precedes the word. Although these findings do not support the typical explanation of Stroop processing described above, this research was conducted using non-integrated color-word stimuli. A non-integrated color-word stimulus consists of a color word with a color block; an integrated color-word stimulus is a color word printed in a color. The processing of non-integrated stimuli may not be the same as the processing of integrated stimuli. In one experiment, integrated color-word stimuli were presented for varying durations (40 to 1000 msec) and then masked. Stimuli consisted of color congruent, color incongruent, and color neutral words (e.g., BOOK, CHAIR, LADDER, TOP). Results show that color incongruent stimuli produce significantly longer RTs than color congruent words at the shortest durations of 40 and 60 msec. Therefore, the Stroop effect appears to occur only when processing time is limited. A second study attempted to replicate these findings in the parafovea. However, parafoveal presentation of integrated color-word stimuli failed to produce Stroop interference. In order to assess whether the lack of Stroop interference was due to spatially distributing attention over an area, which limited the attentional resources available to a given stimulus, or due to the retinal location of the stimulus (i.e., acuity issues, etc.), a third study was conducted in which the location of the color-word stimulus was validly cued on 67% of the trials. The results show Stroop interference for validly cued locations. Therefore, the failure to find Stroop interference in the second experiment was due to the spreading of attention. These three experiments suggest that Stroop interference occurs during the initial stages of processing and depends upon attentional resources. In a fourth study, integrated color-word stimuli were presented in the fovea. Stimuli consisted of color words and nonwords. Subjects were asked to respond either "word" or "nonword" instead of responding to the color. Results show that color congruent stimuli were identified as words significantly faster than color incongruent words and nonwords. Therefore, color enhanced word processing. Again, this finding questions the relative-speed-of-processing account of Stroop processing. Finally, a fifth experiment used a color-color version of the Stroop task. Subjects were presented two blocks of color. The two blocks were either the same color (congruent) or different colors (incongruent). Single blocks of color were presented as the neutral condition. The results show that incongruent color blocks produce Stroop interference. This finding demonstrates Stroop interference with information within the same domain (color) rather than two separate domains (color and word). Thus, these findings suggest that the Stroop effect not only occurs during the initial stages of processing and depends on attentional resources, but also that information within the same domain as the target dimension can cause interference and facilitation. A new model of Stroop processing is presented to accommodate these findings. Implications for neural network accounts of the Stroop effect are also discussed.