Density estimation trees

DOI: 10.1145/2020408.2020507
Published: 21 August 2011

Abstract

In this paper we develop density estimation trees (DETs), the natural analog of classification trees and regression trees, for the task of density estimation. We consider the estimation of the joint probability density function of a d-dimensional random vector X and define a piecewise-constant estimator structured as a decision tree. The tree is learned by minimizing the integrated squared error. We show that the method is nonparametric: under the standard conditions of nonparametric density estimation, DETs are asymptotically consistent. In addition, being decision trees, DETs perform automatic feature selection. Empirically, they exhibit the interpretability, adaptability, and feature-selection properties of supervised decision trees while incurring only a slight loss in accuracy relative to other nonparametric density estimators. Hence they may be able to avoid the curse of dimensionality when the true density is sparse in its dimensions. We believe that density estimation trees provide a new tool for exploratory data analysis with unique capabilities.
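To make the idea concrete, here is a minimal 1-D sketch of a greedy, piecewise-constant density tree. It is not the authors' implementation: the function names (`fit_det`, `predict`) and the depth/leaf-size stopping rules are illustrative assumptions; the node error surrogate R(t) = -(n_t/N)^2 / V_t is the standard integrated-squared-error criterion used for tree-structured density estimators, where n_t is the number of points in the node, N the total sample size, and V_t the node's volume (width, in 1-D).

```python
import numpy as np

def fit_det(x, lo, hi, n_total, depth, max_depth=3, min_leaf=10):
    """Greedily build a 1-D density estimation tree (illustrative sketch).

    Each leaf estimates the density as (fraction of points) / (leaf width);
    splits are chosen to minimize the ISE surrogate R(t) = -(n_t/N)^2 / V_t.
    """
    n = len(x)

    def risk(cnt, width):
        # Negated squared mass per unit volume: lower is better.
        return -(cnt / n_total) ** 2 / width

    node_risk = risk(n, hi - lo)
    if depth >= max_depth or n < 2 * min_leaf:
        return {"lo": lo, "hi": hi, "density": n / (n_total * (hi - lo))}

    xs = np.sort(x)
    best = None
    for s in xs[min_leaf:-min_leaf]:  # candidate split points from the data
        if s <= lo or s >= hi:
            continue
        n_left = np.searchsorted(xs, s)  # points strictly below the split
        r = risk(n_left, s - lo) + risk(n - n_left, hi - s)
        if best is None or r < best[0]:
            best = (r, s)

    if best is None or best[0] >= node_risk:  # no split improves the risk
        return {"lo": lo, "hi": hi, "density": n / (n_total * (hi - lo))}

    s = best[1]
    return {"split": s,
            "left": fit_det(x[x < s], lo, s, n_total, depth + 1, max_depth, min_leaf),
            "right": fit_det(x[x >= s], s, hi, n_total, depth + 1, max_depth, min_leaf)}

def predict(tree, v):
    """Walk down to the leaf containing v and return its constant density."""
    while "split" in tree:
        tree = tree["left"] if v < tree["split"] else tree["right"]
    return tree["density"]

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 2000)
tree = fit_det(data, data.min() - 1e-9, data.max() + 1e-9, len(data), 0)
print(predict(tree, 0.0))   # higher density near the mode...
print(predict(tree, 2.5))   # ...than out in the tail
```

By construction the leaf estimates integrate to one over the root interval, since the leaf masses n_t/N sum to one; the real DETs extend this to axis-aligned boxes in d dimensions and choose the tree size by cross-validation rather than a fixed depth.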




Published In

KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2011
1446 pages
ISBN:9781450308137
DOI:10.1145/2020408
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. decision trees
  2. density estimation

Qualifiers

  • Research-article

Conference

KDD '11

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Article Metrics

  • Downloads (last 12 months): 89
  • Downloads (last 6 weeks): 8

Reflects downloads up to 20 Nov 2024


Cited By

  • (2024) Enhancing SMT-based Weighted Model Integration by Structure Awareness. Artificial Intelligence, 104067. DOI: 10.1016/j.artint.2024.104067. Online: Jan 2024.
  • (2023) End-to-end differentiable clustering with associative memories. Proceedings of the 40th International Conference on Machine Learning, 29649-29670. DOI: 10.5555/3618408.3619639. Online: 23 Jul 2023.
  • (2023) Decision trees: from efficient prediction to responsible AI. Frontiers in Artificial Intelligence, 6. DOI: 10.3389/frai.2023.1124553. Online: 26 Jul 2023.
  • (2023) The Student Zipf Theory: Inferring Latent Structures in Open-Ended Student Work To Help Educators. LAK23: 13th International Learning Analytics and Knowledge Conference, 464-475. DOI: 10.1145/3576050.3576116. Online: 13 Mar 2023.
  • (2023) Unsupervised discretization by two-dimensional MDL-based histogram. Machine Learning, 112(7), 2397-2431. DOI: 10.1007/s10994-022-06294-6. Online: 16 Feb 2023.
  • (2023) Advances in Cytometry Gating Based on Statistical Distances and Dissimilarities. Statistical Methods at the Forefront of Biomedical Advances, 115-141. DOI: 10.1007/978-3-031-32729-2_6. Online: 15 Apr 2023.
  • (2023) Future Trends. Multiscale Modelling in Biomedical Engineering, 331-354. DOI: 10.1002/9781119819028.ch11. Online: 5 May 2023.
  • (2022) Stochastic Induction of Decision Trees with Application to Learning Haar Trees. 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 825-830. DOI: 10.1109/ICMLA55696.2022.00137. Online: Dec 2022.
  • (2022) TreeKDE: clustering multivariate data based on decision tree and using one-dimensional kernel density estimation. Journal of Applied Statistics, 1-19. DOI: 10.1080/02664763.2022.2159339. Online: 22 Dec 2022.
  • (2022) A mathematical assessment of the isolation random forest method for anomaly detection in big data. Mathematical Methods in the Applied Sciences, 46(1), 1156-1177. DOI: 10.1002/mma.8570. Online: 17 Jul 2022.
