Data Summarization Approach to Relational Domain Learning Based on Frequent Pattern to Support the Development of Decision Making

Rayner Alfred & Dimitar Kazakov

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4093)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

A new approach is needed to handle huge datasets stored in multiple tables in a very large database. Data mining and Knowledge Discovery in Databases (KDD) promise to play a crucial role in the way people interact with databases, especially decision support databases, where analysis and exploration operations are essential. In this paper, we review related work in relational data mining and define the basic notions of data mining for decision support, together with the types of data aggregation used to categorize or summarize data. We then present a novel approach to relational domain learning that supports the development of decision-making models through the automated construction of hierarchical multi-attribute models. We describe how a relational dataset can be handled naturally to support the construction of such a model by using relational aggregation based on the distance between patterns. We present a prototype, "Dynamic Aggregation of Relational Attributes" (henceforth DARA), that is capable of supporting the construction of hierarchical multi-attribute models for decision making. Experiments in a multi-relational domain show a higher percentage of correctly classified instances, and we illustrate a set of rules extracted from the relational domains to support decision making.
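As a rough illustration of what aggregating a one-to-many relation and comparing records by pattern distance might look like, the sketch below groups the rows of a hypothetical non-target table into one bag of attribute=value patterns per target record and compares two records by cosine distance. The table layout, the function names `summarize` and `cosine_distance`, and the choice of cosine distance are all assumptions made for illustration; the abstract does not specify how DARA encodes patterns or measures distance.

```python
from collections import Counter
from math import sqrt


def summarize(rows):
    """Aggregate (record_id, attribute, value) rows into one bag of patterns per record.

    Hypothetical encoding: each pattern is the string "attribute=value".
    """
    bags = {}
    for record_id, attribute, value in rows:
        bags.setdefault(record_id, Counter())[f"{attribute}={value}"] += 1
    return bags


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two pattern-count bags (assumed distance measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty bag shares no patterns with anything
    return 1.0 - dot / (norm_a * norm_b)


# Toy one-to-many table: several rows may belong to the same target record.
rows = [
    (1, "item", "bread"), (1, "item", "milk"),
    (2, "item", "bread"), (2, "item", "milk"),
    (3, "item", "wine"),
]
bags = summarize(rows)
```

Under these assumptions, records 1 and 2 carry the same patterns and sit at distance close to zero, while record 3 shares no pattern with record 1 and sits at the maximum distance. Records summarized this way could then be grouped by any distance-based clustering method, and the group label used as a single aggregated attribute of the target table.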

Author information

Authors and Affiliations

Computer Science Department, University of York, Heslington, York, YO10 5DD, United Kingdom
Rayner Alfred & Dimitar Kazakov
School of Engineering and Information Technology, On Study Leave from Universiti Malaysia Sabah, 88999, Kota Kinabalu, Sabah, Malaysia
Rayner Alfred

Editor information

Editors and Affiliations

School of Information Technology and Electronic Engineering, The University of Queensland, Queensland, Australia
Xue Li
University of Alberta, Canada
Osmar R. Zaïane
Northwest Polytechnical University, China
Zhanhuai Li

About this paper

Cite this paper

Alfred, R., Kazakov, D. (2006). Data Summarization Approach to Relational Domain Learning Based on Frequent Pattern to Support the Development of Decision Making. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_97

DOI: https://doi.org/10.1007/11811305_97
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)
