Abstract
Attribute-oriented generalization (AOG) is a knowledge discovery method that uses generalization to simplify the descriptions of patterns in database data. AOG repeatedly replaces specific values of an attribute with more general concepts, according to concept hierarchies defined by domain experts. The degree of generalization is controlled by two user-defined thresholds. As presented by other researchers, the AOG process does not consider how interesting the results will be to the user. Given a relation retrieved from a database, many different relations can be created by generalization, some of which will be more interesting to the user than others. The attribute selection strategy, that is, the method of choosing the next attribute to generalize, determines which of the many possible relations will be generated, and thus can be used to direct the user towards the most interesting relations. We evaluate the performance of ten attribute selection strategies, some previously proposed and some new, by applying them to a 10,000-tuple public domain database and an 8,000,000-tuple commercial database. The strategies are compared using criteria that consider their ability to produce interesting results efficiently. We use measures of interestingness that consider the structure of the concept hierarchies used to guide generalization. Based on a comparison of the experimental results, a strategy that considers the complexity of the concept hierarchies was found to provide efficient and effective guidance towards interesting results.
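The generalization loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The hierarchies, the attribute names, and the function names are all hypothetical. A single `max_tuples` threshold stands in for the two user-defined thresholds, which the abstract does not name. The selection strategy shown (choose the attribute with the most distinct values) is only one simple possibility and is not claimed to be one of the ten strategies evaluated in the paper.

```python
from collections import Counter

# Hypothetical concept hierarchies: each maps a value to its more general parent.
HIERARCHIES = {
    "city": {"Regina": "Saskatchewan", "Saskatoon": "Saskatchewan",
             "Calgary": "Alberta", "Saskatchewan": "Canada", "Alberta": "Canada"},
    "age": {"23": "20s", "27": "20s", "34": "30s", "20s": "adult", "30s": "adult"},
}


def generalize(relation, attr_index, hierarchy):
    """Climb one hierarchy level for one attribute, then merge identical tuples."""
    merged = Counter()
    for row, count in relation.items():
        value = row[attr_index]
        parent = hierarchy.get(value, value)  # values at the top stay unchanged
        new_row = row[:attr_index] + (parent,) + row[attr_index + 1:]
        merged[new_row] += count  # duplicates collapse; counts are summed
    return merged


def most_distinct_values(relation, attrs, hierarchies):
    """Illustrative strategy (an assumption): pick the attribute that can still
    be generalized and has the most distinct values; ties go to the first."""
    best, best_n = None, -1
    for i, name in enumerate(attrs):
        values = {row[i] for row in relation}
        if not any(v in hierarchies[name] for v in values):
            continue  # every value is already at the top of its hierarchy
        if len(values) > best_n:
            best, best_n = i, len(values)
    return best


def aog(rows, attrs, hierarchies, max_tuples, select=most_distinct_values):
    """Generalize until the relation has at most max_tuples distinct tuples."""
    relation = Counter(rows)
    while len(relation) > max_tuples:
        i = select(relation, attrs, hierarchies)
        if i is None:
            break  # nothing left to generalize
        relation = generalize(relation, i, hierarchies[attrs[i]])
    return relation


rows = [("Regina", "23"), ("Saskatoon", "27"), ("Calgary", "34"), ("Regina", "34")]
result = aog(rows, ("city", "age"), HIERARCHIES, max_tuples=2)
for row, count in sorted(result.items()):
    print(row, count)
```

Passing a different function as `select` changes the order in which attributes are generalized, and therefore which of the possible generalized relations is produced. That is the choice the attribute selection strategies compare.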
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Barber, B., Hamilton, H.J. (1997). A comparison of attribute selection strategies for attribute-oriented generalization. In: Raś, Z.W., Skowron, A. (eds) Foundations of Intelligent Systems. ISMIS 1997. Lecture Notes in Computer Science, vol 1325. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63614-5_10
DOI: https://doi.org/10.1007/3-540-63614-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63614-4
Online ISBN: 978-3-540-69612-4
eBook Packages: Springer Book Archive