Abstract
Motivated by the method for solving center-based Least Squares—clustering problem (Kogan in Introduction to clustering large and high-dimensional data, Cambridge University Press, 2007; Teboulle in J Mach Learn Res 8:65–102, 2007) we construct a very efficient iterative process for solving a one-dimensional center-based l 1—clustering problem, on the basis of which it is possible to determine the optimal partition. We analyze the basic properties and convergence of our iterative process, which converges to a stationary point of the corresponding objective function for each choice of the initial approximation. Given is also a corresponding algorithm, which in only few steps gives a stationary point and the corresponding partition. The method is illustrated and visualized on the example of looking for an optimal partition with two clusters, where we check all stationary points of the corresponding minimizing functional. Also, the method is tested on the basis of large numbers of data points and clusters and compared with the method for solving the center-based Least Squares—clustering problem described in Kogan (2007) and Teboulle (2007).
Similar content being viewed by others
References
Banerjee A., Merugu S., Dhillon I.S., Ghosh J.: Clustering with Bregman divergences. J. Mach. Learn. Res. 6, 1705–1749 (2005)
Ben-Israel, A., Iyigun, C.: Probabilistic D-clustering. J. Classif. 25 (2007). doi:10.1007/s00357-007-0021-y
Boley D.L.: Principal direction divisive partitioning. Data Min. Knowl. Discov. 2, 325–344 (1998)
Boyd D.L., Vandenberghe L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Chaovalitwongse, W.A., Butenko, S., Pardalos, P.M.: Clustering Challenges in Biological Networks. World Scientific (2009)
Clarke F.H.: Generalized gradients an applications. Trans. Am. Math. Soc. 205, 247–262 (1975)
Clarke F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)
Cupec R., Grbić R., Sabo K., Scitovski R.: Three points method for searching the best least absolute deviations plane. Appl. Math. Comput. 215, 983–994 (2009)
Demidenko E.: Criteria for unconstrained global optimization. J. Optim. Theory Appl. 136, 375–395 (2008)
Dhillon I.S., Mallela S., Kumar R.: A divisive information-theoretic feature clustering algorithm for text classification. J. Mach. Learn. Res. 3, 1265–1287 (2003)
Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means, spectral clustering and normalized cuts. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), August 22–25, 2004, Seattle, Washington, USA, pp. 551–556 (2004)
Domínguez, E., Muñoz, J.: Applying bio-inspired techniques to the p-median problem. IWANN 2005; Computational Intelligence Bioinspired Syst., 8th Int. Workshop Artificial Neural Networks. Lecture Notes in Computer Science, pp. 67–74. Springer, Berlin (2005)
Finkel D.E., Kelley C.T.: Additive scaling and the \({\tt DIRECT}\) algorithm. J. Glob. Optim. 36, 597–608 (2006)
Floudas C.A., Gounaris C.E.: A review of recent advances in global optimization. J. Glob. Optim. 45, 3–38 (2009)
Gaviano M., Lera D.: A global minimization algorithm for Lipschitz functions. Optim. Lett. 2, 1–13 (2008)
Gan G., Ma C., Wu J.: Data Clustering: Theory, Algorithms, and Applications. SIAM, Philadelphia (2007)
Hansen E.R., Walster G.W.: Global Optimization Using Interval Analysis, 2nd edn, Revised and Expanded. Marcel Dekker, New York (2004)
Iyigun, C.: Probabilistic Distance Clustering. Dissertation, Graduate School, New Brunswick, Rutgers (2007)
Iyigun C., Ben-Israel A.: A generalized Weiszfeld method for the multi-facility location problem. Oper. Res. Lett. 38, 207–214 (2010)
Jones D.R., Perttunen C.D., Stuckman B.E.: Lipschitzian optimization without the Lipschitz constant. JOTA 79, 157–181 (1993)
Kogan J.: Introduction to Clustering Large and High-Dimensional Data. Cambridge University Press, Cambridge (2007)
Kogan, J., Nicholas, C., Wiacek, M.: Hybrid Clustering of large high dimensional data. In: Castellanos, M., Berry, M.W. (eds.) Proceedings of the Workshop on Text Mining. SIAM (2007)
Kogan, J., Teboulle, M.: Scaling clustering algorithms with Bregman distances. In: Berry, M.W., Castellanos, M. (eds.) Proceedings of the Workshop on Text Mining at the Sixth SIAM International Conference on Data Mining (2006)
Kogan J., Nicholas C., Wiacek M.: Hybrid clustering with divergences. In: Berry, M.W., Castellanos, M. (eds.) Survey of Text Mining: Clustering, Classification, and Retrieval, 2nd edn, Springer, Berlin (2007)
Leisch F.: A toolbox for K-centroids cluster analysis. Comput. Stat. Data Anal. 51, 526–544 (2006)
Littau D., Boley D.L.: Clustering very large data sets with PDDP. In: Kogan, J., Nicholas, C., Teboulle, M. (eds.) Grouping Multidimensional Data: Recent Advances in Clustering, pp. 99–126. Springer, New York (2006)
Pan S., Chen J.S.: Two unconstrained optimization approaches for the Euclidean k-centrum location problem. Appl. Math. Comput. 189, 1368–1383 (2007)
Pardalos P.M., Hansen P.: Data Mining and Mathematical Programming. American Mathematical Society, Providence (2008)
Petrić J., Zlobec S.: Nonlinear Programming (in Croatian). Naučna knjiga, Beograd (1989)
Piyavskij, S. A.: An algorithm for finding the absolute extremum of a function. J. Comput. Math. Math. Phys. 12, 888–896 (1972, in Russian)
Reese, J.: Solution Methods for the p-Median Problem: An Annotated Bibliography. Wiley (2006)
Rodrígues-Chia A.M., Espejo I., Drezner Z.: On solving the planar k-centrum problem with Euclidean distances. Eur. J. Oper. Res. 207, 1169–1186 (2010)
Sabo K., Scitovski R., Vazler I.: Searching for a best LAD-solution of an overdetermined system of linear equations motivated by searching for a best LAD-hyperplane on the basis of given data. J. Optim. Theory Appl. 149, 293–314 (2011)
Sabo K., Scitovski R., Vazler I., Zekić-Sušac M.: Mathematical models of natural gas consumption. Energy Convers. Manag. 52, 1721–1727 (2011)
Sabo K., Scitovski R.: The best least absolute deviations line—properties and two efficient methods. ANZIAM J. 50, 185–198 (2008)
Schöbel A., Scholz D.: The big cube small cube solution method for multidimensional facility location problems. Comput. Oper. Res. 37, 115–122 (2010)
Späth, H.: Cluster-Formation und Analyse. R. Oldenburg Verlag, München (1983)
Su, Z., Kogan, J., Nicholas, C.: Constrained clustering with k-means type algorithms. In: Berry, M.W., Kogan, J. (eds.) Text Mining Applications and Theory, pp. 81–103. Wiley, Chichester (2010)
Teboulle M.: A unified continuous optimization framework for center-based clustering methods. J. Mach. Learn. Res. 8, 65–102 (2007)
Vazler, I., Sabo, K., Scitovski, R.: Weighted median of the data in solving least absolute deviations problems. Commun. Stat. Theory Methods, Art ID 539750 (LSTA-2010-0354) (2011, to appear)
Vivek E.P., Sudha N.: Robust Hausdorff distance measure for face recognition. Pattern Recognit. 40, 431–442 (2007)
Author information
Authors and Affiliations
Corresponding author
Additional information
This work is supported by the Ministry of Science, Education and Sports, Republic of Croatia, through research grant 235-2352818-1034.
Rights and permissions
About this article
Cite this article
Sabo, K., Scitovski, R. & Vazler, I. One-dimensional center-based l 1-clustering method. Optim Lett 7, 5–22 (2013). https://doi.org/10.1007/s11590-011-0389-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11590-011-0389-9