Density estimation trees

DOI: 10.1145/2020408.2020507
Published: 21 August 2011

Abstract

In this paper we develop density estimation trees (DETs), the natural analog of classification trees and regression trees, for the task of density estimation. We consider the estimation of the joint probability density function of a d-dimensional random vector X and define a piecewise-constant estimator structured as a decision tree. The tree is learned by minimizing the integrated squared error. We show that the method is nonparametric: under the standard conditions of nonparametric density estimation, DETs are asymptotically consistent. In addition, being decision trees, DETs perform automatic feature selection. Empirically, they exhibit the interpretability, adaptability, and feature-selection properties of supervised decision trees while incurring only a slight loss in accuracy relative to other nonparametric density estimators. Hence they may be able to avoid the curse of dimensionality when the true density is sparse in its dimensions. We believe that density estimation trees provide a new tool for exploratory data analysis with unique capabilities.
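To make the idea concrete, here is a minimal 1-D sketch of a greedy, piecewise-constant density tree. It is not the authors' implementation: the function names (`fit_det`, `predict`) and the depth/leaf-size stopping rules are illustrative assumptions; the node error surrogate R(t) = -(n_t/N)^2 / V_t is the standard integrated-squared-error criterion used for tree-structured density estimators, where n_t is the number of points in the node, N the total sample size, and V_t the node's volume (width, in 1-D).

```python
import numpy as np

def fit_det(x, lo, hi, n_total, depth, max_depth=3, min_leaf=10):
    """Greedily build a 1-D density estimation tree (illustrative sketch).

    Each leaf estimates the density as (fraction of points) / (leaf width);
    splits are chosen to minimize the ISE surrogate R(t) = -(n_t/N)^2 / V_t.
    """
    n = len(x)

    def risk(cnt, width):
        # Negated squared mass per unit volume: lower is better.
        return -(cnt / n_total) ** 2 / width

    node_risk = risk(n, hi - lo)
    if depth >= max_depth or n < 2 * min_leaf:
        return {"lo": lo, "hi": hi, "density": n / (n_total * (hi - lo))}

    xs = np.sort(x)
    best = None
    for s in xs[min_leaf:-min_leaf]:  # candidate split points from the data
        if s <= lo or s >= hi:
            continue
        n_left = np.searchsorted(xs, s)  # points strictly below the split
        r = risk(n_left, s - lo) + risk(n - n_left, hi - s)
        if best is None or r < best[0]:
            best = (r, s)

    if best is None or best[0] >= node_risk:  # no split improves the risk
        return {"lo": lo, "hi": hi, "density": n / (n_total * (hi - lo))}

    s = best[1]
    return {"split": s,
            "left": fit_det(x[x < s], lo, s, n_total, depth + 1, max_depth, min_leaf),
            "right": fit_det(x[x >= s], s, hi, n_total, depth + 1, max_depth, min_leaf)}

def predict(tree, v):
    """Walk down to the leaf containing v and return its constant density."""
    while "split" in tree:
        tree = tree["left"] if v < tree["split"] else tree["right"]
    return tree["density"]

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 2000)
tree = fit_det(data, data.min() - 1e-9, data.max() + 1e-9, len(data), 0)
print(predict(tree, 0.0))   # higher density near the mode...
print(predict(tree, 2.5))   # ...than out in the tail
```

By construction the leaf estimates integrate to one over the root interval, since the leaf masses n_t/N sum to one; the real DETs extend this to axis-aligned boxes in d dimensions and choose the tree size by cross-validation rather than a fixed depth.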




Published In

KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2011
1446 pages
ISBN:9781450308137
DOI:10.1145/2020408
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. decision trees
  2. density estimation

Qualifiers

  • Research-article

Conference

KDD '11

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Article Metrics

  • Downloads (last 12 months): 89
  • Downloads (last 6 weeks): 8

Reflects downloads up to 20 Nov 2024


Cited By

  • (2024) Enhancing SMT-based Weighted Model Integration by Structure Awareness. Artificial Intelligence, 104067. DOI: 10.1016/j.artint.2024.104067. Online: Jan 2024.
  • (2023) End-to-end differentiable clustering with associative memories. Proceedings of the 40th International Conference on Machine Learning, 29649-29670. DOI: 10.5555/3618408.3619639. Online: 23 Jul 2023.
  • (2023) Decision trees: from efficient prediction to responsible AI. Frontiers in Artificial Intelligence, 6. DOI: 10.3389/frai.2023.1124553. Online: 26 Jul 2023.
  • (2023) The Student Zipf Theory: Inferring Latent Structures in Open-Ended Student Work To Help Educators. LAK23: 13th International Learning Analytics and Knowledge Conference, 464-475. DOI: 10.1145/3576050.3576116. Online: 13 Mar 2023.
  • (2023) Unsupervised discretization by two-dimensional MDL-based histogram. Machine Learning, 112(7), 2397-2431. DOI: 10.1007/s10994-022-06294-6. Online: 16 Feb 2023.
  • (2023) Advances in Cytometry Gating Based on Statistical Distances and Dissimilarities. Statistical Methods at the Forefront of Biomedical Advances, 115-141. DOI: 10.1007/978-3-031-32729-2_6. Online: 15 Apr 2023.
  • (2023) Future Trends. Multiscale Modelling in Biomedical Engineering, 331-354. DOI: 10.1002/9781119819028.ch11. Online: 5 May 2023.
  • (2022) Stochastic Induction of Decision Trees with Application to Learning Haar Trees. 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 825-830. DOI: 10.1109/ICMLA55696.2022.00137. Online: Dec 2022.
  • (2022) TreeKDE: clustering multivariate data based on decision tree and using one-dimensional kernel density estimation. Journal of Applied Statistics, 1-19. DOI: 10.1080/02664763.2022.2159339. Online: 22 Dec 2022.
  • (2022) A mathematical assessment of the isolation random forest method for anomaly detection in big data. Mathematical Methods in the Applied Sciences, 46(1), 1156-1177. DOI: 10.1002/mma.8570. Online: 17 Jul 2022.
