Abstract
Formal concept analysis has been successfully applied as a data mining framework whereby target patterns come in the form of intent families and implication bases. Since their extraction is a challenging task, especially for large datasets, parallel techniques should be helpful in reducing the computational effort and increasing the scalability of the approach. In this paper we describe a way to parallelize a recent divide-and-conquer method computing both the intents and the Duquenne-Guiges implication basis of dataset. Wile intents admit a straightforward computation, adding the basis — whose definition is recursive — poses harder problems, in particular, for parallel design. A first, and by no means final, solution relies on a partition of the basis that allows the crucial and inherently sequential step of redundancy removal to be nevertheless split into parallel subtasks. A prototype implementation of our method, called ParCIM, shows a nearly linear acceleration w.r.t. its sequential counter-part.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 1st edn. Morgan Kaufmann Publishers, San Francisco, California, USA (2001)
Wang, J., Han, J., Pei, J.: Closet+: searching for the best strategies for mining frequent closed itemsets. In: KDD 2003. Proceedings of the ninth ACM SIGKDD, ACM Press, New York (2003)
Zaki, M.J., Hsiao, C.J.: Charm: An efficient algorithm for closed itemset mining. In: Proceedings of the Second SIAM International Conference on Data Mining, SIAM (2002)
Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. Springer, Heidelberg (1997)
Duquenne, V., Guigues, J.L.: Famille minimale d’implications informatives résultant d’un tableau de données binaires. Mathématiques et Sciences Sociales 95, 5–18 (1986)
Carpineto, C., Romano, G.: A lattice conceptual clustering system and its application to browsing retrieval. Mach. Learn. 24, 95–122 (1996)
Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Inf. Syst. 24, 25–46 (1999)
Zaki, M.J.: Parallel and distributed association mining: A survey. IEEE Concurrency 7, 14–25 (1999)
Ganter, B.: Two basic algorithms in concept analysis. Technical Report Preprint 831, Technische Hochschule, Darmstadt, Germany (1984)
Valtchev, P., Missaoui, R., Lebrun, P.: A partition-based approach towards constructing galois (concept) lattices. Discrete Math. 256, 801–829 (2002)
Valtchev, P., Duquenne, V.: Towards scalable divide-and-conquer methods for computing concepts and implications. Preprint accepted to Discrete Applied Mathematics (2006)
Djoufak, J.F.K., Valtchev, P., Djamegni, C.T.: A parallel algorithm for lattice construction. In: Ganter, B., Godin, R. (eds.) ICFCA 2005. LNCS (LNAI), vol. 3403, pp. 249–264. Springer, Heidelberg (2005)
Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cambridge University Press, Cambridge (1990)
Maier, D.: The Theory of Relational Databases. Computer Science Press (1983)
Armstrong, W.W.: Dependency structures of data base relationships. In: IFIP Congress, pp. 580–583 (1974)
Goetz, S.: Algorithms in cgm, bsp and bsp* model: A survey. Technical report, Carleton Unviversity, Ottawa (1997)
Dehne, F., Fabri, A., Rau-Chaplin, A.: Scalable parallel computational geometry for coarse grained multicomputers. Journal of Computational Geometry and Applications 6, 379–400 (1996)
Robson, R.: Using the STL: the C++ standard template library, 2nd edn. Springer, Heidelberg (2000)
Gropp, W., Lusk, E., Skjellum, A.: Using MPI: Portable Parallel Programming with the Message Passing Interface, 2nd edn. MIT Press, Cambridge, MA (1999)
Parka, H.K., Chi, D.H., Lee, D.K., Ryu, K.W.: An efficient parallel algorithm for merging in the postal model. ETRI Journal 21, 31–39 (1999)
Djamegni, C.T.: Mapping rectangular mesh algorithms onto asymptotically space-optimal arrays. J. Parallel Distrib. Comput. 64, 345–359 (2004)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Djoufak Kengue, J.F., Valtchev, P., Tayou Djamegni, C. (2007). Parallel Computation of Closed Itemsets and Implication Rule Bases. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_34
Download citation
DOI: https://doi.org/10.1007/978-3-540-74742-0_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74741-3
Online ISBN: 978-3-540-74742-0
eBook Packages: Computer ScienceComputer Science (R0)