Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3406325.3451115acmconferencesArticle/Chapter ViewAbstractPublication PagesstocConference Proceedingsconference-collections
research-article
Public Access

Robust testing of low dimensional functions

Published: 15 June 2021 Publication History

Abstract

A natural problem in high-dimensional inference is to decide if a classifier f:ℝn → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝn → {−1,1}, a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between:  (1) f: ℝn → {−1,1} which is a linear k-junta with surface area s.   (2) f is є-far from any linear k-junta with surface area (1+є)s.   The query complexity of the algorithm is independent of the ambient dimension n.
Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which given any c>0, є>0, distinguishes between:   (1) f: ℝn → {−1,1} has correlation at least c with some linear k-junta with surface area s.   (2) f has correlation at most c−є with any linear k-junta with surface area at most s.   The query complexity of our tester is kpoly(s/є). Using our techniques, we also obtain a fully noise tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise tolerant tester with query complexity kO(poly(logk/є)) for the class of intersection of k-halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no non-trivial noise tolerant testers were known even for a single halfspace.

References

[1]
Jimmy Ba and Rich Caruana. 2014. Do deep nets really need to be deep? In Advances in Neural Information Processing Systems. Pages 2654–2662.
[2]
Maria-Florina Balcan, Eric Blais, Avrim Blum, and Liu Yang. 2012. Active property testing. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science. Pages 21–30.
[3]
Mihir Bellare, Oded Goldreich, and Madhu Sudan. 1998. Free bits, PCPs, and nonapproximability–towards tight results. SIAM J. Comput., 27, 3, 1998. Pages 804–915.
[4]
E. Blais. 2008. Improved bounds for testing juntas. In Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques. Springer. Pages 317–330.
[5]
E. Blais. 2009. Testing juntas nearly optimally. In Proceedings of the forty-first annual ACM symposium on Theory of computing. Pages 151–158.
[6]
E. Blais, C. Canonne, T. Eden, A. Levi, and D. Ron. 2018. Tolerant junta testing and the connection to submodular optimization and function isomorphism. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms. Pages 2113–2132.
[7]
A. Blum. 1994. Relevant examples and relevant features: Thoughts from computational learning theory. 1994.
[8]
A. Blum, A. Frieze, R. Kannan, and S. Vempala. 1997. A polynomial time algorithm for learning noisy linear threshold functions. Algorithmica, 22, 1/2, 1997. Pages 35–52.
[9]
A. Blum and P. Langley. 1997. Selection of Relevant Features and Examples in Machine Learning. Artificial Intelligence, 97, 1-2, 1997. Pages 245–271.
[10]
Nader H Bshouty. 2019. Almost Optimal Distribution-Free Junta Testing. In 34th Computational Complexity Conference (CCC 2019).
[11]
Cristian Buciluä, Rich Caruana, and Alexandru Niculescu-Mizil. 2006. Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. Pages 535–541.
[12]
S. Chakraborty, E. Fischer, D. García-Soriano, and A. Matsliah. 2012. Junto-symmetric functions, hypergraph isomorphism and crunching. In 27th Annual Conference on Computational Complexity (CCC). Pages 148–158.
[13]
Xi Chen, Adam Freilich, Rocco A Servedio, and Timothy Sun. 2017. Sample-Based High-Dimensional Convexity Testing. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017).
[14]
X. Chen, Z. Liu, Rocco A. Servedio, Y. Sheng, and J. Xie. 2018. Distribution free junta testing. In Proceedings of the ACM STOC 2018.
[15]
Xi Chen, Rocco A. Servedio, Li-Yang Tan, Erik Waingarten, and Jinyu Xie. 2017. Settling the Query Complexity of Non-adaptive Junta Testing. In Proceedings of the 32Nd Computational Complexity Conference. Pages 26:1–26:19.
[16]
Amit Daniely. 2016. Complexity theoretic limitations on learning halfspaces. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing. Pages 105–117.
[17]
Anindya De, Elchanan Mossel, and Joe Neeman. 2019. Is your function low dimensional? In Conference on Learning Theory, COLT 2019. Proceedings of Machine Learning Research. 99, Pages 979–993.
[18]
Anindya De, Elchanan Mossel, and Joe Neeman. 2019. Junta correlation is testable. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS). Pages 1549–1563.
[19]
Ilias Diakonikolas, Themis Gouleakis, and Christos Tzamos. 2019. Distribution-independent PAC learning of halfspaces with Massart noise. In Advances in Neural Information Processing Systems. Pages 4749–4760.
[20]
Ilias Diakonikolas, Daniel M Kane, and Alistair Stewart. 2018. Learning geometric concepts with nasty noise. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. Pages 1061–1073.
[21]
I. Diakonikolas, H. Lee, K. Matulef, K. Onak, R. Rubinfeld, R. Servedio, and A. Wan. 2007. Testing for Concise Representations. In Proc. 48th Ann. Symposium on Computer Science (FOCS). Pages 549–558.
[22]
E. Fischer, G. Kindler, D. Ron, S. Safra, and A. Samorodnitsky. 2004. Testing juntas. J. Computer & System Sciences, 68, 4, 2004. Pages 753–787.
[23]
V. Guruswami and P. Raghavendra. 2006. Hardness of learning halfspaces with noise. In Proc. 47th IEEE Symposium on Foundations of Computer Science (FOCS). Pages 543–552.
[24]
S. Halevy and E. Kushilevitz. 2004. Distribution-Free Connectivity Testing for Sparse Graphs. Algorithmica, 51, 1, 2004. Pages 24–48.
[25]
S. Halevy and E. Kushilevitz. 2007. Distribution-Free Property Testing. SIAM J. Comput., 37, 4, 2007. Pages 1107–1138.
[26]
Prahladh Harsha, Adam Klivans, and Raghu Meka. 2013. An invariance principle for polytopes. Journal of the ACM (JACM), 59, 6, 2013. Pages 1–25.
[27]
M. Kearns, R. Schapire, and L. Sellie. 1994. Toward Efficient Agnostic Learning. Machine Learning, 17, 2/3, 1994. Pages 115–141.
[28]
A. Klivans, R. O'Donnell, and R. Servedio. 2008. Learning Geometric Concepts via Gaussian Surface Area. In Proc. 49th IEEE Symposium on Foundations of Computer Science (FOCS). Pages 541–550.
[29]
P. Kothari, A. Nayyeri, R. O'Donnell, and C. Wu. 2014. Testing Surface Area. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014. Pages 1204–1214.
[30]
M. Ledoux. 1994. Semigroup proofs of the isoperimetric inequality in Euclidean and Gauss space. Bull. Sci. Math., 118, 1994. Pages 485–510.
[31]
Michel Ledoux. 2000. The geometry of Markov diffusion generators. In Annales de la Faculté des sciences de Toulouse: Mathématiques. 9, Pages 305–366.
[32]
Pascal Massart and \'Elodie Nédélec. 2006. Risk bounds for statistical learning. The Annals of Statistics, 34, 5, 2006. Pages 2326–2366.
[33]
K. Matulef, R. O'Donnell, R. Rubinfeld, and R. Servedio. 2010. Testing Halfspaces. SIAM J. on Comput., 39, 5, 2010. Pages 2004–2047.
[34]
Kevin Matulef, Ryan O'Donnell, Ronitt Rubinfeld, and Rocco A. Servedio. 2009. Testing \pm 1-weight halfspace. In APPROX-RANDOM. Pages 646–657.
[35]
E. H. Moore. 1920. On the Reciprocal of the General Algebraic Matrix. Bull. Amer. Math. Soc., 26, 1920. Pages 394–395.
[36]
J. Neeman. 2014. Testing surface area with arbitrary accuracy. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014. Pages 393–397.
[37]
M. Parnas, D. Ron, and R. Rubinfeld. 2006. Tolerant property testing and distance approximation. Journal of Computer and System Sciences, 72, 6, 2006. Pages 1012–1042.
[38]
M. Parnas, D. Ron, and A. Samorodnitsky. 2002. Testing Basic Boolean Formulae. SIAM J. Disc. Math., 16, 2002. Pages 20–46. citeseer.ifi.unizh.ch/parnas02testing.html
[39]
R. Penrose. 1955. A generalized inverse for matrices. Mathematical Proceedings of the Cambridge Philosophical Society, 51, 3, 1955. Pages 406–413. https://doi.org/10.1017/S0305004100030401
[40]
G. Pisier. 1986. Probabilistic methods in the geometry of Banach spaces. In Lecture notes in Math. Springer. Pages 167–241.
[41]
Dana Ron and Rocco A Servedio. 2015. Exponentially improved algorithms and lower bounds for testing signed majorities. Algorithmica, 72, 2, 2015. Pages 400–429.
[42]
Mark Rudelson and Roman Vershynin. 2007. Sampling from large matrices: An approach through geometric functional analysis. Journal of the ACM (JACM), 54, 4, 2007. Pages 21–es.
[43]
M. Sa\v glam. 2018. Near Log-Convexity of Measured Heat in (Discrete) Time and Consequences. In 59th IEEE Annual Symposium on Foundations of Computer Science. Pages 967–978.
[44]
R. Servedio, L-Y. Tan, and J. Wright. 2015. Adaptivity helps for testing juntas. In Proceedings of CCC. 33,
[45]
Santosh Vempala and Ying Xiao. 2013. Complexity of learning subspace juntas and ICA. In 2013 Asilomar Conference on Signals, Systems and Computers. Pages 320–324.
[46]
Santosh S Vempala. 2010. Learning convex concepts from gaussian distributions with PCA. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science. Pages 124–130.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
STOC 2021: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing
June 2021
1797 pages
ISBN:9781450380539
DOI:10.1145/3406325
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2021

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Gaussian analysis
  2. Junta testing
  3. Property testing

Qualifiers

  • Research-article

Funding Sources

Conference

STOC '21
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 175
    Total Downloads
  • Downloads (Last 12 months)52
  • Downloads (Last 6 weeks)9
Reflects downloads up to 21 Nov 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media