
Research article · KDD Conference Proceedings · DOI: 10.1145/2487575.2487667

Robust sparse estimation of multiresponse regression and inverse covariance matrix via the L2 distance

Published: 11 August 2013

Abstract

We propose a robust framework to jointly perform two key modeling tasks involving high dimensional data: (i) learning a sparse functional mapping from multiple predictors to multiple responses while taking advantage of the coupling among responses, and (ii) estimating the conditional dependency structure among responses while adjusting for their predictors. The traditional likelihood-based estimators lack resilience with respect to outliers and model misspecification. This issue is exacerbated when dealing with high dimensional noisy data. In this work, we propose instead to minimize a regularized distance criterion, which is motivated by the minimum distance functionals used in nonparametric methods for their excellent robustness properties. The proposed estimates can be obtained efficiently by leveraging a sequential quadratic programming algorithm. We provide theoretical justification, such as estimation consistency, for the proposed estimator. Additionally, we shed light on the robustness of our estimator through its linearization, which yields a combination of weighted lasso and graphical lasso, with the sample weights providing an intuitive explanation of the robustness. We demonstrate the merits of our framework through a simulation study and the analysis of real financial and genetic data.
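To make the distance criterion concrete: for a Gaussian error model, the minimum L2-distance (L2E) objective has a closed form, and adding an l1 penalty yields a robust sparse regression. The sketch below is a minimal single-response illustration under those assumptions, not the paper's multiresponse SQP algorithm; the function names and the use of a generic Nelder-Mead solver are illustrative choices.

```python
# Hedged sketch: L2E criterion for a single-response linear model with a
# lasso penalty. For N(0, sigma^2) errors the L2E loss is
#   1/(2*sigma*sqrt(pi)) - (2/n) * sum_i phi(r_i; 0, sigma^2),
# where r_i are residuals and phi is the normal density. Gross outliers
# contribute phi(r_i) ~ 0, so they are effectively down-weighted.
import numpy as np
from scipy.optimize import minimize

def l2e_lasso_loss(params, X, y, lam):
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)          # parameterize sigma > 0 via its log
    r = y - X @ beta
    phi = np.exp(-r**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    l2e = 1.0 / (2 * sigma * np.sqrt(np.pi)) - 2.0 * phi.mean()
    return l2e + lam * np.abs(beta).sum()

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.3 * rng.standard_normal(n)
y[:20] += 10.0                         # contaminate 10% of responses

x0 = np.zeros(p + 1)                   # beta = 0, sigma = 1
res = minimize(l2e_lasso_loss, x0, args=(X, y, 0.05),
               method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})
beta_hat = res.x[:p]
```

The key contrast with least squares is that the L2E loss is bounded in each residual, so the contaminated observations cannot dominate the fit; this is the same mechanism the abstract describes via the sample weights in the linearized weighted-lasso form.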


Cited By

  • (2024) Robust Estimation of Multivariate Time Series Data Based on Reduced Rank Model. Journal of Forecasting, 44(2):474-484. DOI: 10.1002/for.3205
  • (2021) Robustness with respect to class imbalance in artificial intelligence classification algorithms. Journal of Quality Technology, 1-21. DOI: 10.1080/00224065.2021.1963200
  • (2019) Dynamic Structure Embedded Online Multiple-Output Regression for Streaming Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2):323-336. DOI: 10.1109/TPAMI.2018.2794446
  • (2018) Sparse estimation of multivariate Poisson log-normal models from count data. Statistical Analysis and Data Mining: The ASA Data Science Journal, 11(2):66-77. DOI: 10.1002/sam.11370


    Published In

    KDD '13: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    August 2013, 1534 pages
    ISBN: 9781450321747
    DOI: 10.1145/2487575

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. high dimensional data
    2. inverse covariance
    3. l2e
    4. multiresponse regression
    5. robust estimation
    6. sparse learning
    7. variable selection

    Qualifiers

    • Research-article

    Conference

    KDD '13

    Acceptance Rates

    KDD '13 paper acceptance rate: 125 of 726 submissions, 17%
    Overall acceptance rate: 1,133 of 8,635 submissions, 13%

