Abstract
This paper examines whether a classical model can be translated into a PDP network using a standard connectionist training technique called extra output learning. In Study 1, standard machine learning techniques were used to create a decision tree that classifies 8124 different mushrooms as edible or poisonous on the basis of 21 different features (Schlimmer, 1987). In Study 2, extra output learning was used to insert this decision tree into a PDP network being trained on the identical problem. An interpretation of the trained network revealed a perfect mapping from its internal structure to the decision tree, representing a precise translation of the classical theory into the connectionist model. In Study 3, a second network was trained on the mushroom problem without extra output learning. An interpretation of this second network revealed a different algorithm for solving the mushroom problem, demonstrating that the Study 2 network was indeed a proper theory translation.
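The core idea of extra output learning can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's actual network or dataset: a tiny feedforward network is trained by backpropagation on a toy binary classification task, while an extra output unit is simultaneously trained to reproduce the intermediate decision of a hand-specified one-split "decision tree", so the hint shapes the hidden-layer representations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the mushroom task: 4 binary features, target is
# "edible" iff (f0 AND f1). A one-split "decision tree" first tests f0;
# that intermediate decision is the extra output (the training hint).
X = np.array([[a, b, c, d] for a in (0, 1) for b in (0, 1)
              for c in (0, 1) for d in (0, 1)], dtype=float)
y_main = (X[:, 0] * X[:, 1]).reshape(-1, 1)   # final classification
y_extra = X[:, [0]]                            # tree's first split outcome

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer; two output units: the main decision plus the extra
# (hint) output. Both are trained with ordinary backpropagation.
W1 = rng.normal(0, 0.5, (4, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 0.5, (4, 2)); b2 = np.zeros(2)
targets = np.hstack([y_main, y_extra])

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)
    o = sigmoid(h @ W2 + b2)
    # Backpropagate the error on BOTH outputs; the extra output's error
    # pushes the hidden layer toward the tree's intermediate decision.
    d_o = (o - targets) * o * (1 - o)
    d_h = (d_o @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_o; b2 -= 0.5 * d_o.sum(0)
    W1 -= 0.5 * X.T @ d_h; b1 -= 0.5 * d_h.sum(0)

pred = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)[:, 0] > 0.5).astype(float)
accuracy = (pred == y_main.ravel()).mean()
```

After training, the main output should classify the toy patterns correctly; the point of the sketch is that the hidden units are pressured to encode the tree's first split, which is what makes the trained network interpretable in the tree's terms.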
References
Abu-Mostafa, Y. S. (1990) 'Learning from hints in neural networks', Journal of Complexity, 6, pp. 192-198.
Aldenderfer, M. S. and Blashfield, R. K. (1984), Cluster Analysis. (Vol. 07-044), Beverly Hills, CA: Sage Publications.
Andrews, R., Diederich, J. and Tickle, A. B. (1995), 'A survey and critique of techniques for extracting rules from trained artificial neural networks', Knowledge-Based Systems, 8, pp. 373-389.
Bechtel, W., and Abrahamsen, A. (1991), Connectionism and the Mind, Cambridge, MA: Basil Blackwell.
Berkeley, I. S. N., Dawson, M. R. W., Medler, D. A., Schopflocher, D. P. and Hornsby, L. (1995), 'Density plots of hidden value unit activations reveal interpretable bands', Connection Science, pp. 167-186.
Born, R. (1987), Artificial intelligence: The case against, London: Croom Helm.
Broadbent, D. (1985), 'A question of levels: Comment on McClelland and Rumelhart', Journal of Experimental Psychology: General 114, pp. 189-192.
Caruana, R. and de Sa, V. R. (1997), 'Promoting poor features to supervisors: Some inputs work better as outputs', in M. C. Mozer, M. I. Jordan and T. Petsche (eds.), Advances in Neural Information Processing Systems 9, Cambridge, MA: MIT Press.
Churchland, P. M. (1985), 'Reduction, qualia, and the direct introspection of brain states', The Journal of Philosophy LXXXII, pp. 8-28.
Churchland, P. M. (1988), Matter and consciousness. Revised edition, Cambridge, MA: MIT Press.
Churchland, P. M. (1995), The Engine of Reason, the Seat of the Soul, Cambridge, MA: MIT Press.
Churchland, P. S., Koch, C. and Sejnowski, T. J. (1990), 'What is computational neuroscience?', in E. L. Schwartz (ed.), Computational Neuroscience, Cambridge, MA: MIT Press.
Churchland, P. S. and Sejnowski, T. J. (1989), 'Neural representation and neural computation', in L. Nadel, L. A. Cooper, P. Culicover, and R. M. Harnish (eds.), Neural Connections, Mental Computation, Cambridge, MA: MIT Press, pp. 15-48.
Churchland, P. S. and Sejnowski, T. J. (1992), The computational brain, Cambridge,MA: MIT Press.
Clark, A. (1989), Microcognition, Cambridge, MA: MIT Press.
Clark, A. (1993), Associative engines, Cambridge, MA: MIT Press.
Crick, F. and Asanuma, C. (1986), 'Certain aspects of the anatomy and physiology of the cerebral cortex', in J. McClelland and D. E. Rumelhart (eds.), Parallel Distributed Processing (Vol. 2), Cambridge, MA: MIT Press.
Dawson, M. R. W. (1990), 'Training networks of value units: Learning in PDP systems with nonmonotonic activation functions', Canadian Psychology 31(4), p. 391.
Dawson, M. R. W. (1991), 'The how and why of what went where in apparent motion: Modeling solutions to the motion correspondence process', Psychological Review 98, pp 569-603.
Dawson, M. R. W. (1998), Understanding Cognitive Science, Oxford, UK: Blackwell.
Dawson, M. R. W., Medler, D. A. and Berkeley, I. S. N. (1997), 'PDP networks can provide models that are not mere implementations of classical theories', Philosophical Psychology 10, pp. 25-40.
Dawson, M. R. W. and Schopflocher, D. P. (1992a), 'Autonomous processing in PDP networks', Philosophical Psychology 5, pp. 199-219.
Dawson, M. R. W. and Schopflocher, D. P. (1992b), 'Modifying the generalized delta rule to train networks of nonmonotonic processors for pattern classification', Connection Science 4, pp. 19-31.
Dawson, M. R. W. and Shamanski, K. S. (1994), 'Connectionism, confusion and cognitive science', Journal of Intelligent Systems 4, pp. 215-262.
Dawson, M. R. W., Shamanski, K. S. and Medler, D. A. (1993), From connectionism to cognitive science. Paper presented at the Fifth University of New Brunswick Symposium on Artificial Intelligence, Fredericton, NB.
Douglas, R. J. and Martin, K. A. C. (1991), 'Opening the grey box', Trends In Neuroscience 14, pp. 286-293.
Dreyfus, H. L. and Dreyfus, S. E. (1988), 'Making a mind versus modeling the brain. Artificial intelligence back at the branchpoint', in S. Graubard (ed.), The Artificial Intelligence Debate, Cambridge, MA: MIT Press.
Elman, J. (1990), 'Finding structure in time', Cognitive science 14, pp. 179-211.
Everitt, B. (1980), Cluster Analysis, New York: Halsted.
Fodor, J. A. and McLaughlin, B. P. (1990), 'Connectionism and the problem of systematicity: Why Smolensky's solution doesn't work', Cognition 35, pp. 183-204.
Fodor, J. A. and Pylyshyn, Z. W. (1988), 'Connectionism and cognitive architecture', Cognition 28, pp. 3-71.
Gallant, S. I. (1993), Neural network learning and expert systems, Cambridge, MA: MIT Press.
Gailmo, O. and Carlstrom, J. (1995), 'Some experiments using extra output learning to hint multilayer perceptrons', in L. F. Niklasson and M. B. Boden (eds.), Current Trends in Connectionism-Proceedings of the 1995 Swedish Conference on Connectionism, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 179-190.
Garson, J. W. (1994), 'No representations without rules: The prospects for a compromise between paradigms in cognitive science', Mind and Language 9, pp. 25-37.
Graubard, S. (1988), The artificial intelligence debate, Cambridge, MA: MIT Press.
Hadley, R. F. (1994a), 'Systematicity in connectionist language learning', Minds and Machines 3, pp. 183-200.
Hadley, R. F. (1994b), 'Systematicity revisited: Reply to Christiansen and Chater and Niklasson and van Gelder', Mind and Language 9, pp. 431-444.
Hadley, R. F. (1997), 'Cognition, systematicity, and nomic necessity', Mind and Language 12, pp. 137-153.
Hadley, R. F. and Hayward, M. B. (1997), 'Strong semantic systematicity from Hebbian connectionist learning', Minds and Machines 7, pp. 1-37.
Hanson, S. J. and Burr, D. J. (1990), 'What connectionist models learn: Learning and representation in connectionist networks', Behavioral and Brain Sciences 13, pp. 471-518.
Haugeland, J. (1985), Artificial intelligence: The very idea, Cambridge, MA: MIT Press.
Hecht-Nielsen, R. (1987), Neurocomputing, Reading, MA: Addison-Wesley.
Hinton, G. E. (1986), Learning distributed representations of concepts. Paper presented at the 8th Annual Meeting of the Cognitive Science Society, Ann Arbor, MI.
Hooker, C. A. (1979), 'Critical notice: R.M. Yoshida's Reduction in the Physical Sciences', Dialogue 18, pp. 81-99.
Hooker, C. A. (1981), 'Towards a general theory of reduction', Dialogue 20, pp. 38-59, 201-236, 496-529.
Hopcroft, J. E. and Ullman, J. D. (1979), Introduction to Automata Theory, Languages, and Computation, Reading, MA: Addison-Wesley.
Horgan, T. and Tienson, J. (1996), Connectionism and the philosophy of psychology, Cambridge, MA: MIT Press.
Kilian, J. and Siegelmann, H. T. (1993), On the power of sigmoid neural networks. Paper presented at the Proceedings of the Sixth ACM Workshop on Computational Learning Theory.
Kremer, S. C. (1995), 'On the computational powers of Elman-style recurrent networks', IEEE Transactions on neural networks 6, pp. 1000-1004.
Lachter, J. and Bever, T. G. (1988), 'The relation between linguistic structure and associative theories of language learning-A constructive critique of some connectionist learning models', Cognition 28, pp. 195-247.
Lincoff, G. H. (1981), National Audubon Society Field Guide to North American Mushrooms, New York: Alfred A. Knopf Publishers.
Marr, D. (1982), Vision, San Francisco, CA: W.H. Freeman.
McCaughan, D. B. (1997, June 9-12), On the properties of periodic perceptrons. Paper presented at the IEEE/INNS International Conference on Neural Networks (ICNN'97), Houston, TX.
McClelland, J. (1992), 'Can connectionist models discover the structure of natural language?', in R. Morelli, W. M. Brown, D. Anselmi, K. Haberlandt, and D. Lloyd (eds.), Minds, Brains. and Computers: Perspectives in Cognitive Science and Artificial Intelligence, Norwood, NJ: Ablex.
McClelland, J. L., Rumelhart, D. F. and Hinton, G. E. (1986), 'The appeal of parallel distributed processing', in D. Rumelhart and J. McClelland (eds.), Parallel Distributed Processing (Vol. 1), Cambridge, MA: MIT Press.
McCloskey, M. (1991), 'Networks and theories: The place of connectionism in cognitive science', Psychological Science 2, pp. 387-395.
McCulloch, W. S. and Pitts, W. (1943), 'A logical calculus of the ideas immanent in nervous activity', Bulletin of Mathematical Biophysics 5, pp. 115-133.
Medler, D. A. (1998), The crossroads of connectionism: Where do we go from here? Unpublished Doctoral dissertation, University of Alberta, Edmonton, AB.
Michie, D., Spiegelhalter, D. J. and Taylor, C. C. (1994), Machine Learning, Neural and Statistical Classification, New York, NY: Ellis Horwood.
Milligan, G. W. and Cooper, M. C. (1985), 'An examination of procedures for determining the number of clusters in a data set', Psychometrika 50, pp. 159-179.
Minsky, M. (1972), Computation: finite and infinite machines. London: Prentice-Hall International.
Mozer, M. C. and Smolensky, P. (1989), 'Using relevance to reduce network size automatically', Connection Science 1, pp. 3-16.
Omlin, C. W. and Giles, C. L. (1996), 'Extraction of rules from discrete-time recurrent neural networks', Neural networks 9, pp. 41-52.
Pinker, S. and Prince, A. (1988), 'On language and connectionism: Analysis of a parallel distributed processing model of language acquisition', Cognition 28, pp. 73-193.
Pylyshyn, Z. W. (1984), Computation and cognition, Cambridge, MA.: MIT Press.
Pylyshyn, Z. W. (1991), 'The role of cognitive architectures in theories of cognition', in K. VanLehn (ed.), Architectures For Intelligence, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 189-223.
Quinlan, J. R. (1986), 'Induction of decision trees', Machine Learning 1, pp. 81-106.
Ramsey, W., Stich, S. P. and Rumelhart, D. E. (1991), Philosophy and connectionist theory, Hillsdale, NJ: Lawrence Erlbaum Associates.
Ripley, B. D. (1996), Pattern recognition and neural networks. Cambridge, UK: Cambridge University Press.
Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986), 'Learning representations by back-propagating errors', Nature 323, pp. 533-536.
Rumelhart, D. E. and McClelland, J. L. (1985), 'Levels indeed! A response to Broadbent', Journal of Experimental Psychology: General 114, pp. 193-197.
Schlimmer, J. S. (1987), Concept acquisition through representational adjustment. Unpublished Doctoral dissertation, University of California Irvine, Irvine, CA.
Schneider, W. (1987), 'Connectionism: Is it a paradigm shift for psychology?', Behavior Research Methods, Instruments and Computers 19, pp. 73-83.
Seidenberg, M. (1993), 'Connectionist models and cognitive theory', Psychological Science 4, pp. 228-235.
Siegelmann, H. T. and Sontag, E. D. (1991), 'Turing computability with neural nets', Applied Mathematics Letters 4, pp. 77-80.
Siegelmann, H. T. and Sontag, E. D. (1995), 'On the computational power of neural nets', Journal of Computer and System Sciences 50, pp. 132-150.
Siegelmann, H. T. (1999), Neural Networks and Analog Computation: Beyond the Turing Limit, Boston, MA: Birkhauser.
Smith, B. C. (1996), On the Origin of Objects, Cambridge, MA: MIT Press.
Smolensky, P. (1988), 'On the proper treatment of connectionism', Behavioural and Brain Sciences 11, pp. 1-74.
Stork, D. G. (1997), 'Scientist on the set: An interview with Marvin Minsky', in D. G. Stork (ed.), HAL's Legacy: 2001's Computer as Dream and Reality, Cambridge, MA: MIT Press, pp. 15-32.
Suddarth, S. C. and Kergosien, Y. L. (1990), 'Rule-injection hints as a means of improving network performance and learning time', in L. B. Almeida and C. J. Wellekens (eds.), Neural Networks. Lecture Notes in Computer Science (Vol. 412), Berlin: Springer Verlag, pp. 120-129.
Suddarth, S. C., Sutton, S. A. and Holden, A. D. C. (1988), A symbolic-neural method for solving control problems, Paper presented at the IEEE International Conference on Neural Networks, San Diego, CA.
VanLehn, K. (1991), Architectures for intelligence, Hillsdale, NJ: Lawrence Erlbaum Associates.
Von Eckardt, B. (1993), What is cognitive science?, Cambridge, MA: MIT Press.
Williams, R. and Zipser, D. (1989), 'A learning algorithm for continually running fully recurrent neural networks', Neural Computation 1, pp. 270-280.
Yu, Y.-H. and Simmons, R. F. (1990), 'Extra output based learning', Proceedings of the International Joint Conference on Neural Networks (IJCNN-90) 3, pp. 161-166.
Cite this article
Dawson, M., Medler, D., McCaughan, D. et al. Using Extra Output Learning to Insert a Symbolic Theory into a Connectionist Network. Minds and Machines 10, 171–201 (2000). https://doi.org/10.1023/A:1008313828824