Nothing Special   »   [go: up one dir, main page]

Skip to main content

Fast Randomized Algorithms for Robust Estimation of Location

  • Conference paper
  • First Online:
Temporal, Spatial, and Spatio-Temporal Data Mining (TSDM 2000)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 2007))

Abstract

A fundamental procedure appearing within such clustering methods as k-Means, Expectation Maximization, Fuzzy-C-Means and Minimum Message Length is that of computing estimators of location. Most estimators of location exhibiting useful robustness properties require at least quadratic time to compute, far too slow for large data mining applications. In this paper, we propose O(Dn√n)-time randomized algorithms for computing robust estimators of location, where n is the size of the data set, and D is the dimension.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. P. S. Bradley, O. L. Mangasarian, and W. N. Street. Clustering via concave minimization. Advances in Neural Information Processing Systems, 9:368–374, 1997.

    Google Scholar 

  2. V. Cherkassky and F. Muller. Learning from Data — Concept, Theory and Methods. John Wiley, NY, 1998.

    Google Scholar 

  3. A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statistical Society B, 39:1–38, 1977.

    MATH  MathSciNet  Google Scholar 

  4. V. Estivill-Castro. Hybrid genetic algorithms are better for spatial clustering. In Proc. 6th Pacific Rim Intern. Conf. Artificial Intelligence PRICAI 2000, Melbourne, 2000. Springer-Verlag Lecture Notes in AI, to appear.

    Google Scholar 

  5. V. Estivill-Castro and J. Yang. A fast and robust general purpose clustering algorithm. In Proc. 6th Pacific Rim Intern. Conf. Artificial Intelligence PRICAI 2000, Melbourne, 2000. Springer-Verlag Lecture Notes in AI, to appear.

    Google Scholar 

  6. U. Fayyad and R. Uthurusamy. Data mining and knowledge discovery in databases. Comm. ACM, 39(11):24–26, Nov. 1996. Special issue on Data Mining.

    Article  Google Scholar 

  7. F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel. Robust Statistics: The Approach Based on Influence Functions. John Wiley, NY, 1986.

    MATH  Google Scholar 

  8. P. J. Huber. Robust Regression. John Wiley, NY, 1981.

    Google Scholar 

  9. M. I. Jordan and R. I. Jacobs. Supervised learning and divide-and-conquer: a statistical approach. In Proc. 10th Intern. Conf. Machine Learning, San Mateo, CA, Morgan Kaufmann, pp. 159–166, 1993.

    Google Scholar 

  10. K. Koperski, J. Han, and J. Adhikary. Mining knowledge in geographical data. Communications of the ACM, to appear.

    Google Scholar 

  11. R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, UK, 1995.

    MATH  Google Scholar 

  12. N. J. Nilsson. Artificial Intelligence — A New Synthesis. Morgan Kaufmann, 1998.

    Google Scholar 

  13. E. M. Reingold, J. Nievergelt, and N. Deo. Combinatorial Algorithms: Theory and Practice. Prentice-Hall, Englewood Cliffs, NJ, 1977.

    Google Scholar 

  14. P. J. Rousseeuw and A. M. Leroy. Robust Regression and Outlier Detection. John Wiley, NY, 1987.

    MATH  Google Scholar 

  15. M. A. Tanner. Tools for Statistical Inference. Springer-Verlag, NY, USA, 1993.

    MATH  Google Scholar 

  16. D. M. Titterington, A. F. M. Smith, and U. E. Makov. Statistical Analysis of Finite Mixture Distributions. John Wiley, UK, 1985.

    MATH  Google Scholar 

  17. D. E. Tyler. Some issues in the robust estimation of multivariate location and scatter. In W. Stahel and S. Weisberg, eds., Directions in Robust Statistics and Diagnostics — Part II, Springer-Verlag, Berlin, pp. 327–336, 1991.

    Google Scholar 

  18. C. S. Wallace. Intrinsic classfication of spatially correlated data. The Computer Journal, 41(8):602–611, 1998.

    Article  MATH  Google Scholar 

  19. R. R. Wilcox. Introduction to Robust Estimation and Hypothesis Testing. Academic Press, San Diego, CA, 1997.

    MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Estivill-Castro, V., Houle, M.E. (2001). Fast Randomized Algorithms for Robust Estimation of Location. In: Roddick, J.F., Hornsby, K. (eds) Temporal, Spatial, and Spatio-Temporal Data Mining. TSDM 2000. Lecture Notes in Computer Science(), vol 2007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45244-3_7

Download citation

  • DOI: https://doi.org/10.1007/3-540-45244-3_7

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41773-6

  • Online ISBN: 978-3-540-45244-7

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics