Abstract
We consider some consimilar problems of searching for disjoint clusters in the finite set of points in Euclidean space. The goal is to maximize the minimum subset size so that the value of each intracluster quadratic variation would not exceed a given constant. We prove that all considered problems are NP-hard even on a line.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)
Rao, M.: Cluster analysis and mathematical programming. J. Amer. Stat. Assoc. 66, 622–626 (1971)
Hansen, P., Jaumard, B., Mladenovich, N.: Minimum sum of squares clustering in a low dimensional space. J. Classif. 15, 37–55 (1998)
Hansen, P., Jaumard, B.: Cluster analysis and mathematical programming. Math. Program. 79, 191–215 (1997)
Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75(2), 245–248 (2009)
Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering large graphs via the singular value decomposition. Mach. Learn. 56, 9–33 (2004)
Arora, S., Raghavan, P., Rao, S.: Approximation schemes for Euclidean k-medians and related problems. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pp. 106–113 (1998)
Kariv, O., Hakimi, S.: An algorithmic approach to network location problems. Part 1: the \(p\)-centers. SIAM J. Appl. Math. 37, 513–538 (1979)
Feder, T., Greene, D.: Optimal algorithms for approximate clustering. In: Proceedings of 20th Annual ACM Symposium on Theory of Computing, pp. 434–444 (1988)
Hochbaum, D.S., Shmoys, D.B.: A best possible heuristic for the k-center problem. Math. Oper. Res. 10(2), 180–184 (1985)
Kaufman, L., Rousseeuw, P.J.: Clustering by means of Medoids, in statistical data analysis based on the \(L_{1}\)-Norm and related methods. Edited by Dodge, Y., pp. 405–416. Dodge, North-Holland (1987)
Krarup, J., Pruzan, P.: The simple plant location problem: survey and synthesis. Eur. J. Oper. Res. 12(1), 36–81 (1983)
Mirchandani, P., Francis, R. (eds.): Discrete Location Theory. Wiley-Intersience, New York (1990)
Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: Proceedings of 12th ACM-SIAM Symposium on Discrete Algorithms, pp. 642–651 (2001)
Agarwal, P.K., Phillips, J.M.: An efficient algorithm for 2D Euclidean 2-center with outliers. In: Proceedings of 16th Annual European Symposium Algorithms, pp. 64–75 (2008)
McCutchen, R.M., Khuller, S.: Streaming algorithms for k-center clustering with outliers and with anonymity. In: Goel, A., Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX/RANDOM -2008. LNCS, vol. 5171, pp. 165–178. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85363-3_14
Hatami, B., Zarrabi-Zade, H.: A streaming algorithm for 2-center with outliers in high dimensions. Comput. Geom. 60, 26–36 (2017)
Jain, A.K.: Data clustering: 50 years beyond \(k\)-Means. Patt. Recogn. Lett. 31(8), 651–666 (2010)
Shirkhorshidi, A.S., Aghabozorgi, S., Wah, T.Y., Herawan, T.: Big data clustering: a review. In: Murgante, B., et al. (eds.) ICCSA 2014, Part V. LNCS, vol. 8583, pp. 707–720. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09156-3_49
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning. Springer, New York (2013)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)
Aggarwal, C.C.: Data Mining: The Textbook. Springer, Switzerland (2015)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning (Adaptive Computation and Machine Learning series). The MIT Press (2017)
Fu, T.-C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)
Kuenzer, C., Dech, S., Wagner, W.: Remote Sensing Time Series. Remote Sensing and Digital Image Processing, vol. 22. Springer, Switzerland (2015)
Liao, T.W.: Clustering of time series data – a survey. Patt. Recogn. 38(11), 1857–1874 (2005)
de Waal, T., Pannekoek, J., Scholtus, S.: Handbook of Statistical Data Editing and Imputation. Wiley, New Jersey (2011)
Osborne, J.W.: Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data, 1st edn. SAGE Publication, Inc., Los Angeles (2013)
Farcomeni, A., Greco, L.: Robust Methods for Data Reduction. Chapman and Hall/CRC (2015)
Ageev, A.A., Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Shenmaier, V.V.: Approximation polynomial algorithm for the data editing and data cleaning problem. Patt. Recogn. Image Anal. 27(3), 365–370 (2017)
Ageev, A.A., Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Shenmaier, V.V.: 1/2-Approximation polynomial-time algorithm for a problem of searching a subset. In: Proceedings of 2017 International MultiConference on Engineering, Computer and Information Sciences (SIBIRCON), 18–22 September 2017, Novosibirsk, Russia, pp. 8–12 (2017)
Kel’manov, A.V., Khamidullin, S.A., Khandeev, V.I., Pyatkin, A.V.: Exact algorithms for two quadratic Euclidean problems of searching for the largest subset and longest subsequence. In: Battiti, R., Brunato, M., Kotsireas, I., Pardalos, P.M. (eds.) LION 2018. LNCS, vol. 11353, pp. 308–318. Springer (2018). https://doi.org/10.1007/978-3-030-05348-2_28
Kel’manov, A.V., Khandeev, V.I., Panasenko A.V.: Exact algorithms for the special cases of two hard to solve problems of searching for the largest subset. In: van der Aalst, W.M.P., et al. (eds.) AIST 2018. LNCS, vol. 11179, pp. 294–304. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-11027-7_28
Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Khandeev, V.I., Shenmaier, V.V., Shamardin, Yu.V.: An approximation polynomial algorithm for a problem of searching for the longest subsequence in a finite sequence of points in Euclidean space. In: Eremeev, A., Khachay, M., Kochetov, Y., Pardalos, P. (eds.) OPTA 2018. CCIS, vol. 871, pp. 120–130. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93800-4_10
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco (1979)
Acknowledgments
The study presented was supported by the Russian Science Foundation, project 16-11-10041.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kel’manov, A., Khandeev, V., Pyatkin, A. (2019). NP-hardness of Some Max-Min Clustering Problems. In: Evtushenko, Y., Jaćimović, M., Khachay, M., Kochetov, Y., Malkova, V., Posypkin, M. (eds) Optimization and Applications. OPTIMA 2018. Communications in Computer and Information Science, vol 974. Springer, Cham. https://doi.org/10.1007/978-3-030-10934-9_11
Download citation
DOI: https://doi.org/10.1007/978-3-030-10934-9_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10933-2
Online ISBN: 978-3-030-10934-9
eBook Packages: Computer ScienceComputer Science (R0)