
Research article · KDD Conference Proceedings · DOI: 10.1145/2487575.2487667

Robust sparse estimation of multiresponse regression and inverse covariance matrix via the L2 distance

Published: 11 August 2013

Abstract

We propose a robust framework to jointly perform two key modeling tasks involving high dimensional data: (i) learning a sparse functional mapping from multiple predictors to multiple responses while taking advantage of the coupling among responses, and (ii) estimating the conditional dependency structure among responses while adjusting for their predictors. The traditional likelihood-based estimators lack resilience with respect to outliers and model misspecification. This issue is exacerbated when dealing with high dimensional noisy data. In this work, we propose instead to minimize a regularized distance criterion, which is motivated by the minimum distance functionals used in nonparametric methods for their excellent robustness properties. The proposed estimates can be obtained efficiently by leveraging a sequential quadratic programming algorithm. We provide theoretical justification, such as estimation consistency, for the proposed estimator. Additionally, we shed light on the robustness of our estimator through its linearization, which yields a combination of weighted lasso and graphical lasso, with the sample weights providing an intuitive explanation of the robustness. We demonstrate the merits of our framework through a simulation study and the analysis of real financial and genetic data.
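To make the distance criterion concrete: for a Gaussian error model, the minimum L2-distance (L2E) objective has a closed form, and adding an l1 penalty yields a robust sparse regression. The sketch below is a minimal single-response illustration under those assumptions, not the paper's multiresponse SQP algorithm; the function names and the use of a generic Nelder-Mead solver are illustrative choices.

```python
# Hedged sketch: L2E criterion for a single-response linear model with a
# lasso penalty. For N(0, sigma^2) errors the L2E loss is
#   1/(2*sigma*sqrt(pi)) - (2/n) * sum_i phi(r_i; 0, sigma^2),
# where r_i are residuals and phi is the normal density. Gross outliers
# contribute phi(r_i) ~ 0, so they are effectively down-weighted.
import numpy as np
from scipy.optimize import minimize

def l2e_lasso_loss(params, X, y, lam):
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)          # parameterize sigma > 0 via its log
    r = y - X @ beta
    phi = np.exp(-r**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    l2e = 1.0 / (2 * sigma * np.sqrt(np.pi)) - 2.0 * phi.mean()
    return l2e + lam * np.abs(beta).sum()

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.3 * rng.standard_normal(n)
y[:20] += 10.0                         # contaminate 10% of responses

x0 = np.zeros(p + 1)                   # beta = 0, sigma = 1
res = minimize(l2e_lasso_loss, x0, args=(X, y, 0.05),
               method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})
beta_hat = res.x[:p]
```

The key contrast with least squares is that the L2E loss is bounded in each residual, so the contaminated observations cannot dominate the fit; this is the same mechanism the abstract describes via the sample weights in the linearized weighted-lasso form.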


Cited By

  • (2024) Robust Estimation of Multivariate Time Series Data Based on Reduced Rank Model. Journal of Forecasting, 44(2):474-484. DOI: 10.1002/for.3205
  • (2021) Robustness with respect to class imbalance in artificial intelligence classification algorithms. Journal of Quality Technology, 1-21. DOI: 10.1080/00224065.2021.1963200
  • (2019) Dynamic Structure Embedded Online Multiple-Output Regression for Streaming Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2):323-336. DOI: 10.1109/TPAMI.2018.2794446
  • (2018) Sparse estimation of multivariate Poisson log-normal models from count data. Statistical Analysis and Data Mining: The ASA Data Science Journal, 11(2):66-77. DOI: 10.1002/sam.11370


    Published In

    KDD '13: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    August 2013, 1534 pages
    ISBN: 9781450321747
    DOI: 10.1145/2487575

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. high dimensional data
    2. inverse covariance
    3. l2e
    4. multiresponse regression
    5. robust estimation
    6. sparse learning
    7. variable selection

    Qualifiers

    • Research-article

    Conference

    KDD '13

    Acceptance Rates

    KDD '13 paper acceptance rate: 125 of 726 submissions, 17%
    Overall acceptance rate: 1,133 of 8,635 submissions, 13%

