Abstract
Clusterwise linear regression consists of finding a number of linear regression functions each approximating a subset of the data. In this paper, the clusterwise linear regression problem is formulated as a nonsmooth nonconvex optimization problem and an algorithm based on an incremental approach and on the discrete gradient method of nonsmooth optimization is designed to solve it. This algorithm incrementally divides the whole dataset into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate good starting points for solving global optimization problems at each iteration of the incremental algorithm. The algorithm is compared with the multi-start Späth and the incremental algorithms on several publicly available datasets for regression analysis.
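The abstract does not state the objective function, so the following is only a minimal sketch under an assumed formulation: the commonly used least-squares form, in which each data point contributes the squared error of its best-fitting regression function. The pointwise minimum over clusters is what makes such an objective nonsmooth and nonconvex. The names `clr_objective`, `assign_clusters`, `A`, `b`, `X`, and `y` are illustrative and do not come from the paper.

```python
import numpy as np

def clr_objective(A, b, X, y):
    """Assumed least-squares clusterwise regression objective.

    A: (m, n) input points, b: (m,) outputs,
    X: (k, n) regression coefficients, y: (k,) intercepts.
    Each point is charged the squared error of its best-fitting line.
    """
    residuals = A @ X.T + y - b[:, None]      # shape (m, k)
    return (residuals ** 2).min(axis=1).sum()

def assign_clusters(A, b, X, y):
    """Index of the regression function with the smallest squared error per point."""
    residuals = A @ X.T + y - b[:, None]
    return (residuals ** 2).argmin(axis=1)

# Toy data drawn exactly from two lines: b = 2a and b = -a + 1.
A = np.array([[0.0], [1.0], [2.0], [3.0]])
b = np.array([0.0, 2.0, -1.0, -2.0])
X = np.array([[2.0], [-1.0]])
y = np.array([0.0, 1.0])

value = clr_objective(A, b, X, y)
labels = assign_clusters(A, b, X, y)
```

In this toy case the two lines fit their groups exactly, so the objective is zero, whereas the single function with zero slope and zero intercept leaves a positive error on the same data.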
Acknowledgments
The research by A.M. Bagirov was supported under the Australian Research Council’s Discovery Projects funding scheme (Project No. DP140103213).
Cite this article
Bagirov, A.M., Ugon, J. & Mirzayeva, H.G. Nonsmooth Optimization Algorithm for Solving Clusterwise Linear Regression Problems. J Optim Theory Appl 164, 755–780 (2015). https://doi.org/10.1007/s10957-014-0566-y
DOI: https://doi.org/10.1007/s10957-014-0566-y