Quadratic mixed integer programming and support vectors for deleting outliers in robust regression

Annals of Operations Research

Abstract

We consider the problem of deleting bad influential observations (outliers) in linear regression models. The problem is formulated as a Quadratic Mixed Integer Programming (QMIP) problem, in which penalty costs for discarding outliers are incorporated into the objective function. The optimum solution defines a robust regression estimator called penalized trimmed squares (PTS). Because of the high computational complexity of the resulting QMIP problem, the proposed robust procedure is computationally suitable only for small samples. The computational performance and the effectiveness of the new procedure are improved significantly by using the ε-insensitive loss function from support vector machine regression: small errors are ignored, and the mathematical formulation gains the sparseness property. The good performance of the ε-insensitive PTS (IPTS) estimator allows identification of multiple outliers while avoiding masking and swamping effects. The computational effectiveness and successful outlier detection of the proposed method are demonstrated via simulated experiments.
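
To make the idea of trading residuals against deletion penalties concrete, the following is a minimal sketch and not the authors' QMIP formulation. It assumes an objective of the form "sum of squared residuals over kept observations plus a fixed penalty per deleted observation", and it solves that objective by brute-force enumeration for a one-predictor model, which is only feasible for very small samples. The function names, the constant penalty, the toy data, and the unsquared form of the ε-insensitive loss are all assumptions made for illustration.

```python
from itertools import combinations


def ols_line(points):
    """Closed-form least squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b


def pts_brute_force(points, penalty, max_deleted):
    """Enumerate deletion sets and keep the cheapest one.

    Assumed cost: squared residuals of kept points + penalty * (number deleted).
    For a fixed deletion set, the best coefficients are the OLS fit on the rest.
    """
    best = None
    for k in range(max_deleted + 1):
        for deleted in combinations(range(len(points)), k):
            kept = [p for i, p in enumerate(points) if i not in deleted]
            a, b = ols_line(kept)
            rss = sum((y - a - b * x) ** 2 for x, y in kept)
            cost = rss + penalty * k
            if best is None or cost < best[0]:
                best = (cost, set(deleted), (a, b))
    return best


def eps_insensitive(residual, eps):
    """Assumed epsilon-insensitive loss: residuals within eps cost nothing."""
    return max(0.0, abs(residual) - eps)


# Hypothetical toy data: roughly y = 1 + 2x, with one gross outlier at index 5.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.1, 2.9, 5.0, 7.1, 8.9, 40.0, 13.0, 15.1]
cost, deleted, (a, b) = pts_brute_force(list(zip(xs, ys)), penalty=4.0, max_deleted=2)
print(sorted(deleted), round(a, 2), round(b, 2))
```

The enumeration grows combinatorially with the sample size and the number of allowed deletions, which mirrors the abstract's remark that the exact procedure is practical only for small samples.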

Corresponding author

Correspondence to L. Pitsoulis.

Additional information

This research has been partially funded by the Greek Ministry of Education under the program Pythagoras II.

About this article

Cite this article

Zioutas, G., Pitsoulis, L. & Avramidis, A. Quadratic mixed integer programming and support vectors for deleting outliers in robust regression. Ann Oper Res 166, 339–353 (2009). https://doi.org/10.1007/s10479-008-0412-4

  • DOI: https://doi.org/10.1007/s10479-008-0412-4
