Improving Parallelism in Structural Data Mining

Min Cai, Istvan Jonyer & Marcin Paprzycki

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3911)

Included in the following conference series:

International Conference on Parallel Processing and Applied Mathematics

Abstract

Large amounts of data are collected daily, and processing them requires efficient algorithms. The SUBDUE data mining system discovers substructures in structurally complex data using the minimum description length principle. Its parallel implementation, MPI-SUBDUE, was created in 2001 to reduce computation time and/or to handle larger datasets. This paper introduces a new, more efficient implementation of MPI-SUBDUE. Experimental results show that, for the mutagenesis dataset, the new implementation outperforms the original by up to 33% and that the performance gain increases with the number of processors used.
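
To make the compression idea behind the minimum description length principle concrete, the following is a minimal sketch and not the SUBDUE implementation. It assumes a crude size proxy (vertices plus edges) in place of a true bit-level description length, and it assumes that the instances of a substructure do not overlap and that each instance is collapsed into a single vertex. The names `size` and `compression_value` are hypothetical.

```python
def size(num_vertices: int, num_edges: int) -> int:
    """Crude stand-in for description length: count vertices and edges."""
    return num_vertices + num_edges


def compression_value(graph, substructure, instances: int) -> float:
    """Score a candidate substructure by how well it compresses the graph.

    graph, substructure: (num_vertices, num_edges) tuples.
    instances: number of non-overlapping occurrences of the substructure.
    Returns size(G) / (size(S) + size(G compressed by S)); values above 1
    mean that describing S once and referring to it saves space.
    """
    g_v, g_e = graph
    s_v, s_e = substructure
    # Assumption: each instance is replaced by one placeholder vertex,
    # so it removes s_v - 1 vertices and all s_e internal edges.
    c_v = g_v - instances * (s_v - 1)
    c_e = g_e - instances * s_e
    if c_v < 0 or c_e < 0:
        raise ValueError("instances do not fit in the graph")
    return size(g_v, g_e) / (size(s_v, s_e) + size(c_v, c_e))


if __name__ == "__main__":
    # A 20-vertex, 30-edge graph containing 4 copies of a triangle.
    print(compression_value((20, 30), (3, 3), 4))
```

Under these assumptions, a substructure that occurs often scores above 1, while one with no instances scores below 1 because its own description is pure overhead.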

Author information

Authors and Affiliations

Department of Computer Science, Oklahoma State University, Stillwater, Oklahoma, 74078, U.S.A.
Min Cai & Istvan Jonyer
Computer Science Institute, SWPS, 03-815, Warsaw, Poland
Marcin Paprzycki

Editor information

Editors and Affiliations

Institute of Computational and Information Sciences, Czestochowa University of Technology, Poland
Roman Wyrzykowski
Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra
Poznan Supercomputing and Networking Center, Poland
Norbert Meyer
Informatics & Mathematical Modeling, Technical University of Denmark, 2800, Lyngby, DK, Denmark
Jerzy Waśniewski

Cite this paper

Cai, M., Jonyer, I., Paprzycki, M. (2006). Improving Parallelism in Structural Data Mining. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2005. Lecture Notes in Computer Science, vol 3911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752578_55

DOI: https://doi.org/10.1007/11752578_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34141-3
Online ISBN: 978-3-540-34142-0
eBook Packages: Computer Science, Computer Science (R0)