Generating a Condensed Representation for Association Rules

Nicolas Pasquier¹,
Rafik Taouil²,
Yves Bastide³,
Gerd Stumme⁴ &
…
Lotfi Lakhal⁵

250 Accesses
Explore all metrics

Abstract

Association rule extraction from operational datasets often produces several tens of thousands, and even millions, of association rules. Moreover, many of these rules are redundant and thus useless. Using a semantic based on the closure of the Galois connection, we define a condensed representation for association rules. This representation is characterized by frequent closed itemsets and their generators. It contains the non-redundant association rules having minimal antecedent and maximal consequent, called min-max association rules. We think that these rules are the most relevant since they are the most general non-redundant association rules. Furthermore, this representation is a basis, i.e., a generating set for all association rules, their supports and their confidences, and all of them can be retrieved needless accessing the data. We introduce algorithms for extracting this basis and for reconstructing all association rules. Results of experiments carried out on real datasets show the usefulness of this approach. In order to generate this basis when an algorithm for extracting frequent itemsets—such as Apriori for instance—is used, we also present an algorithm for deriving frequent closed itemsets and their generators from frequent itemsets without using the dataset.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Artificial Intelligence

References

Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining Association Rules Between Sets of Items in Large Databases. In Proceedings of the SIGMOD Conference, pp. 207–216.
Agrawal, R. and Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the VLDB Conference, pp. 478–499.
Armstrong, W.W. (1974). Dependency Structures of Data Base Relationships. In Proceedings of the IFIP Congress, pp. 580–583.
Baralis, E. and Psaila, G. (1997). Designing Templates for Mining Association Rules. Journal of Intelligent Information Systems, 9(1), 7–32; L. Kerschberg, Z. Ras, and M. Zemankova (Eds.), Kluwer Academic Publishers.
Google Scholar
Bastide, Y., Taouil, Y., Pasquier, N., Stumme, G., and Lakhal, L. (2000). Mining Frequent Patterns with Counting Inference. SIGKDD Explorations, 2(2), 66–75; U. Fayyad, S. Sarawagi and P. Bradley (Eds.), ACMComputer Press.
Bayardo, R.J. (1998). Efficiently Mining Long Patterns from Databases. In Proceedings of the SIGMOD Conference, pp. 85–93.
Bayardo, R.J. and Agrawal, R. (1999). Mining the Most Interesting Rules.In Proceedings of the KDD Conference, pp. 145–154.
Bayardo, R.J., Agrawal, R., and Gunopulos, D. (2000).Constraint-Based Rule Mining in Large, Dense Databases. Data Mining and Knowledge Discovery, 4(2/3), 217–240; S. Chaudhuri (Ed.), Kluwer Academic Publishers.
Beeri, C. and Bernstein, P.A. (1979). Computational Problems Related to the Design of Normal form Relational Schemas. Transactions on Database Systems, 4(1), 30–59.
Google Scholar
Blake, C.L. and Merz, C.J. (1998). UCI Machine Learning Databases Repository. University of California, Irvine, Department of Information and Computer Science, http://www.ics.uci.edu/~mlearn/MLRepository.html.
Brin, S., Motwani, R., and Silverstein, C. (1997.) Beyond Market Baskets: Generalizing Association Rules to Correlation. In Proceedings of the SIGMOD Conference, pp. 265–276.
Brin, S., Motwani, R., Ullman, J.D., and Tsur, S. (1997). Dynamic Itemset Counting and Implication Rules for Market Basket Data. In Proceedings of the SIGMOD Conference, pp. 255–264.
Duquenne, V. and Guigues, J.-L. (1986). Famille Minimale d’implications Informatives résultant d’un Tableau de données Binaires.Mathématiques et Sciences Humaines, 24(95), 5–18.
Google Scholar
Ganter, B. and Wille, R. (1999.) Formal Concept Analysis: Mathematical Foundations. Springer-Verlag.
Han, J. and Fu, Y. (1999). Mining Multiple-Level Association Rules in Large Databases. Transactions on Knowledge and Data Engineering, 11(5), 798–804; P.-S. Yu (Ed.), IEEE Computer Science
Hettich, S. and Bay, S.D. (1999.) UCI Knowledge Discovery in Databases Archive. University of California, Irvine, Department of Information and Computer Science,http://kdd.ics.uci.edu
Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., and Verkamo, A.I. (1994). Finding interesting Rules from Large Sets of Discovered Association Rules. In Proceedings of the CIKM conference, pp. 401–407.
Lin, D. and Kedem, Z.M. (1998). Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set. In Proceedings of the EDBT Conference, pp. 105–119.
Liu, B., Hsu, W., and Ma, Y. (1999). Pruning and Summarizing the Discovered Association Rules. In Proceedings of the KDD Conference, pp. 125–134.
Luxenburger, M. (1991). Implications Partielles Dans un Contexte. Mathématiques, Informatique et Sciences Humaines, 29(113), 35–55.
Google Scholar
Maier, D. (1980). Minimum Covers in Relational Database Model. Journal of the ACM, 27(4), 664–674; ACM Computer Press.
Google Scholar
Mannila, H. and Toivonen, H. (1996). Multiple Uses of Frequent Sets and Condensed Representations. In Proceedings of the KDD Conference, pp. 189–194.
Mannila, H. and Toivonen, H. (1997). Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery, 1(3), 241–258; U. Fayyad (Ed.), Kluwer Academic Publishers.
Mannila, H., Toivonen, H., and Verkamo, A.I. (1994). Efficient Algorithms for Discovering Association Rules. In Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pp. 181–192.
Meo, R., Psaila, G., and Ceri, S. (1998). An Extension to SQL for MiningAssociation Rules. Data Mining and Knowledge Discovery, 2(2), 195–224, U. Fayyad (Ed.), Kluwer Academic Publishers.
Morimoto, Y., Fukuda, T., Matsuzawa, H., Tokuyama, T., and Yoda, K. (1998). Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proceedings of the VLDB Conference, pp. 380–391.
Ng, R.T., Lakshmanan, V.S., Han, J., and Pang, A. (1998). Exploratory Mining and Pruning Optimizations of Constrained Association Rules. In Proceedings ofthe SIGMOD Conference, pp. 13–24.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1998). Pruning Closed Itemset Lattices for Association Rules. In Proceedings of the BDA Conference, pp. 177–196.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Discovering Frequent Closed Itemsets for Association Rules. In Proceedings of the ICDT Conference, LNCS 1540, pp. 398–416.
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Efficient Mining of Association Rules Using Closed Itemset Lattices. Information Systems, 24(1), 25–46; M. Jarke and D. Shasha (Eds.), Elsevier Science.
Google Scholar
Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Closed SetBased Discovery of Small Covers for Association Rules. In Proceedings of the BDA Conference, pp. 361–381.
Pei, J., Han, J., and Mao, R. (2000). Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets. In Proceedings of the DMKD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21–30.
Piatetsky-Shapiro, G. and Matheus, C.J. (1994). The Interestingness of Deviations. In Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pp. 25–36.
Silberschatz, A. and Tuzhilin, A. (1996). What Makes Patterns Interesting in Knowledge Discovery Systems. Transactions on Knowledge and Data Engineering, 8(6), 970–974; IEEE Computer Science.
Google Scholar
Silverstein, C., Brin, S., and Motwani, R. (1998). Beyond Market Baskets: Generalizing Association Rules to Dependence Rules. Data Miningand Knowledge Discovery, 2(1), 39–68; U. Fayyad (Ed.), Kluwer AcademicPublishers.
Srikant, R. and Agrawal, R. (1995). Mining Generalized Association Rules. In Proceedings of the VLDB Conference, pp. 407–419.
Srikant, R. and Agrawal, R. (1996). Mining Quantitative Association Rules in Large Relational Tables. In Proceedings of the SIGMOD Conference, pp. 1–12.
Srikant, R., Vu, Q., and Agrawal, R. (1997). Mining Association Rules with Item Constraints. In Proceedings of the KDD Conference, pp. 67–73.
Taouil, R., Pasquier, N., Bastide, Y., and Lakhal, L. (2000). Mining Bases for Association Rules Using Closed Sets. In Proceedings of theICDE Conference, p. 307.
Toivonen, H., Klemettinen, M., Ronkainen, P., Hätönen, K.,and Mannila, H. (1995). Pruning and Grouping Discovered Association Rules. In Proceedings of the ECML Workshop, pp. 47–52.
Zaki, M.J. (2000). Generating Non-Redundant Association Rules. In Proceedings of the KDD Conference, pp. 34–43.
Zaki, M.J. and Hsiao, C.-J. (1999). ChARM: An Efficient Algorithm for Closed Association Rule Mining. Technical Report 99-10.
Zaki, M.J. and Ogihara, M. (1998). Theoretical Foundations of Association Rules. In Proceedings of the DMKD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 7:1–7:8.
Zaki, M.J., Parthasarathy, S., Ogihara, M., and Li, W. (1997). New Algorithms for Fast Discovery of Association Rules. In Proceedings of the KDD Conference, pp. 283–286.

Download references

Author information

Authors and Affiliations

I3S (CNRS UMR 6070)—Université de Nice-Sophia Antipolis, 06903, Sophia Antipolis, France
Nicolas Pasquier
LI—Université Francois Rabelais de Tours, 3 place Jean Jaurès, 41000, Blois, France
Rafik Taouil
IRISA—INRIA Rennes, Campus Universitaire de Beaulieu, 35042, Rennes, France
Yves Bastide
Fachbereich Mathematik/Informatik, Universität Kassel, 34121, Kassel, Germany
Gerd Stumme
LIM (CNRS FRE 2246)—Université de la Méditerranée, case 901, 13288, Marseille, France
Lotfi Lakhal

Authors

Nicolas Pasquier
View author publications
You can also search for this author in PubMed Google Scholar
Rafik Taouil
View author publications
You can also search for this author in PubMed Google Scholar
Yves Bastide
View author publications
You can also search for this author in PubMed Google Scholar
Gerd Stumme
View author publications
You can also search for this author in PubMed Google Scholar
Lotfi Lakhal
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nicolas Pasquier.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Pasquier, N., Taouil, R., Bastide, Y. et al. Generating a Condensed Representation for Association Rules. J Intell Inf Syst 24, 29–60 (2005). https://doi.org/10.1007/s10844-005-0266-z

Download citation

Received: 25 April 2000
Revised: 02 March 2004
Accepted: 04 March 2004
Issue Date: January 2005
DOI: https://doi.org/10.1007/s10844-005-0266-z

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Efficiently mining association rules based on maximum single constraints

An Efficient Approach for Extraction Positive and Negative Association Rules from Big Data

Post–mining on Association Rule Bases

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Generating a Condensed Representation for Association Rules

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Efficiently mining association rules based on maximum single constraints

An Efficient Approach for Extraction Positive and Negative Association Rules from Big Data

Post–mining on Association Rule Bases

Explore related subjects

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now