Abstract
Association rule extraction from operational datasets often produces several tens of thousands, and even millions, of association rules. Moreover, many of these rules are redundant and thus useless. Using a semantic based on the closure of the Galois connection, we define a condensed representation for association rules. This representation is characterized by frequent closed itemsets and their generators. It contains the non-redundant association rules having minimal antecedent and maximal consequent, called min-max association rules. We think that these rules are the most relevant since they are the most general non-redundant association rules. Furthermore, this representation is a basis, i.e., a generating set for all association rules, their supports and their confidences, and all of them can be retrieved needless accessing the data. We introduce algorithms for extracting this basis and for reconstructing all association rules. Results of experiments carried out on real datasets show the usefulness of this approach. In order to generate this basis when an algorithm for extracting frequent itemsets—such as Apriori for instance—is used, we also present an algorithm for deriving frequent closed itemsets and their generators from frequent itemsets without using the dataset.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining Association Rules Between Sets of Items in Large Databases. In Proceedings of the SIGMOD Conference, pp. 207–216.
Agrawal, R. and Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the VLDB Conference, pp. 478–499.
Armstrong, W.W. (1974). Dependency Structures of Data Base Relationships. In Proceedings of the IFIP Congress, pp. 580–583.
Baralis, E. and Psaila, G. (1997). Designing Templates for Mining Association Rules. Journal of Intelligent Information Systems, 9(1), 7–32; L. Kerschberg, Z. Ras, and M. Zemankova (Eds.), Kluwer Academic Publishers.
Bastide, Y., Taouil, Y., Pasquier, N., Stumme, G., and Lakhal, L. (2000). Mining Frequent Patterns with Counting Inference. SIGKDD Explorations, 2(2), 66–75; U. Fayyad, S. Sarawagi and P. Bradley (Eds.), ACMComputer Press.
Bayardo, R.J. (1998). Efficiently Mining Long Patterns from Databases. In Proceedings of the SIGMOD Conference, pp. 85–93.
Bayardo, R.J. and Agrawal, R. (1999). Mining the Most Interesting Rules.In Proceedings of the KDD Conference, pp. 145–154.
Bayardo, R.J., Agrawal, R., and Gunopulos, D. (2000).Constraint-Based Rule Mining in Large, Dense Databases. Data Mining and Knowledge Discovery, 4(2/3), 217–240; S. Chaudhuri (Ed.), Kluwer Academic Publishers.
Beeri, C. and Bernstein, P.A. (1979). Computational Problems Related to the Design of Normal form Relational Schemas. Transactions on Database Systems, 4(1), 30–59.
Blake, C.L. and Merz, C.J. (1998). UCI Machine Learning Databases Repository. University of California, Irvine, Department of Information and Computer Science, http://www.ics.uci.edu/~mlearn/MLRepository.html.
Brin, S., Motwani, R., and Silverstein, C. (1997.) Beyond Market Baskets: Generalizing Association Rules to Correlation. In Proceedings of the SIGMOD Conference, pp. 265–276.
Brin, S., Motwani, R., Ullman, J.D., and Tsur, S. (1997). Dynamic Itemset Counting and Implication Rules for Market Basket Data. In Proceedings of the SIGMOD Conference, pp. 255–264.
Duquenne, V. and Guigues, J.-L. (1986). Famille Minimale d’implications Informatives résultant d’un Tableau de données Binaires.Mathématiques et Sciences Humaines, 24(95), 5–18.
Ganter, B. and Wille, R. (1999.) Formal Concept Analysis: Mathematical Foundations. Springer-Verlag.
Han, J. and Fu, Y. (1999). Mining Multiple-Level Association Rules in Large Databases. Transactions on Knowledge and Data Engineering, 11(5), 798–804; P.-S. Yu (Ed.), IEEE Computer Science
Hettich, S. and Bay, S.D. (1999.) UCI Knowledge Discovery in Databases Archive. University of California, Irvine, Department of Information and Computer Science,http://kdd.ics.uci.edu
Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., and Verkamo, A.I. (1994). Finding interesting Rules from Large Sets of Discovered Association Rules. In Proceedings of the CIKM conference, pp. 401–407.
Lin, D. and Kedem, Z.M. (1998). Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set. In Proceedings of the EDBT Conference, pp. 105–119.
Liu, B., Hsu, W., and Ma, Y. (1999). Pruning and Summarizing the Discovered Association Rules. In Proceedings of the KDD Conference, pp. 125–134.
Luxenburger, M. (1991). Implications Partielles Dans un Contexte. Mathématiques, Informatique et Sciences Humaines, 29(113), 35–55.
Maier, D. (1980). Minimum Covers in Relational Database Model. Journal of the ACM, 27(4), 664–674; ACM Computer Press.
Mannila, H. and Toivonen, H. (1996). Multiple Uses of Frequent Sets and Condensed Representations. In Proceedings of the KDD Conference, pp. 189–194.
Mannila, H. and Toivonen, H. (1997). Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery, 1(3), 241–258; U. Fayyad (Ed.), Kluwer Academic Publishers.
Mannila, H., Toivonen, H., and Verkamo, A.I. (1994). Efficient Algorithms for Discovering Association Rules. In Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pp. 181–192.
Meo, R., Psaila, G., and Ceri, S. (1998). An Extension to SQL for MiningAssociation Rules. Data Mining and Knowledge Discovery, 2(2), 195–224, U. Fayyad (Ed.), Kluwer Academic Publishers.
Morimoto, Y., Fukuda, T., Matsuzawa, H., Tokuyama, T., and Yoda, K. (1998). Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proceedings of the VLDB Conference, pp. 380–391.
Ng, R.T., Lakshmanan, V.S., Han, J., and Pang, A. (1998). Exploratory Mining and Pruning Optimizations of Constrained Association Rules. In Proceedings ofthe SIGMOD Conference, pp. 13–24.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1998). Pruning Closed Itemset Lattices for Association Rules. In Proceedings of the BDA Conference, pp. 177–196.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Discovering Frequent Closed Itemsets for Association Rules. In Proceedings of the ICDT Conference, LNCS 1540, pp. 398–416.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Efficient Mining of Association Rules Using Closed Itemset Lattices. Information Systems, 24(1), 25–46; M. Jarke and D. Shasha (Eds.), Elsevier Science.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Closed SetBased Discovery of Small Covers for Association Rules. In Proceedings of the BDA Conference, pp. 361–381.
Pei, J., Han, J., and Mao, R. (2000). Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets. In Proceedings of the DMKD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21–30.
Piatetsky-Shapiro, G. and Matheus, C.J. (1994). The Interestingness of Deviations. In Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pp. 25–36.
Silberschatz, A. and Tuzhilin, A. (1996). What Makes Patterns Interesting in Knowledge Discovery Systems. Transactions on Knowledge and Data Engineering, 8(6), 970–974; IEEE Computer Science.
Silverstein, C., Brin, S., and Motwani, R. (1998). Beyond Market Baskets: Generalizing Association Rules to Dependence Rules. Data Miningand Knowledge Discovery, 2(1), 39–68; U. Fayyad (Ed.), Kluwer AcademicPublishers.
Srikant, R. and Agrawal, R. (1995). Mining Generalized Association Rules. In Proceedings of the VLDB Conference, pp. 407–419.
Srikant, R. and Agrawal, R. (1996). Mining Quantitative Association Rules in Large Relational Tables. In Proceedings of the SIGMOD Conference, pp. 1–12.
Srikant, R., Vu, Q., and Agrawal, R. (1997). Mining Association Rules with Item Constraints. In Proceedings of the KDD Conference, pp. 67–73.
Taouil, R., Pasquier, N., Bastide, Y., and Lakhal, L. (2000). Mining Bases for Association Rules Using Closed Sets. In Proceedings of theICDE Conference, p. 307.
Toivonen, H., Klemettinen, M., Ronkainen, P., Hätönen, K.,and Mannila, H. (1995). Pruning and Grouping Discovered Association Rules. In Proceedings of the ECML Workshop, pp. 47–52.
Zaki, M.J. (2000). Generating Non-Redundant Association Rules. In Proceedings of the KDD Conference, pp. 34–43.
Zaki, M.J. and Hsiao, C.-J. (1999). ChARM: An Efficient Algorithm for Closed Association Rule Mining. Technical Report 99-10.
Zaki, M.J. and Ogihara, M. (1998). Theoretical Foundations of Association Rules. In Proceedings of the DMKD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 7:1–7:8.
Zaki, M.J., Parthasarathy, S., Ogihara, M., and Li, W. (1997). New Algorithms for Fast Discovery of Association Rules. In Proceedings of the KDD Conference, pp. 283–286.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Pasquier, N., Taouil, R., Bastide, Y. et al. Generating a Condensed Representation for Association Rules. J Intell Inf Syst 24, 29–60 (2005). https://doi.org/10.1007/s10844-005-0266-z
Received:
Revised:
Accepted:
Issue Date:
DOI: https://doi.org/10.1007/s10844-005-0266-z