Interpolation of sparse high-dimensional data

Published: 01 September 2021

Abstract

Increases in the quantity of available data have allowed all fields of science to generate more accurate models of multivariate phenomena. Regression and interpolation become challenging when the dimension of data is large, especially while maintaining tractable computational complexity. Regression is a popular approach to solving approximation problems with high dimension; however, there are often advantages to interpolation. This paper presents a novel and insightful error bound for (piecewise) linear interpolation in arbitrary dimension and contrasts the performance of some interpolation techniques with popular regression techniques. Empirical results demonstrate the viability of interpolation for moderately high-dimensional approximation problems, and encourage broader application of interpolants to multivariate approximation in science.
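The contrast the abstract draws can be sketched in a few lines. The example below is illustrative only, not the paper's experimental setup: it fits a Delaunay-based piecewise linear interpolant and an ordinary least-squares linear regression to the same scattered 3-dimensional sample, then compares their errors on interior test points. The test function, sample size, and dimension are all assumptions made for the sketch.

```python
# A minimal sketch (not the paper's experiments): piecewise linear interpolation
# over a Delaunay triangulation vs. least-squares linear regression on scattered
# multivariate data. Test function and sample sizes are illustrative choices.
import numpy as np
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
d = 3                                    # input dimension
f = lambda x: np.sin(x.sum(axis=-1))     # smooth multivariate test function

X = rng.random((200, d))                 # scattered samples in [0, 1]^d
y = f(X)

# Piecewise linear interpolant over the Delaunay triangulation of X;
# it reproduces the data exactly at the sample points.
interp = LinearNDInterpolator(X, y)

# Ordinary least-squares linear regression on the same data.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Evaluate both on interior test points (inside the convex hull of X,
# where the interpolant is defined).
Xt = 0.25 + 0.5 * rng.random((100, d))
err_interp = np.abs(interp(Xt) - f(Xt)).max()
err_reg = np.abs(Xt @ coef[:d] + coef[d] - f(Xt)).max()
print(f"max abs error, interpolation: {err_interp:.3f}, regression: {err_reg:.3f}")
```

Note that, unlike the regression fit, the piecewise linear interpolant is exact at every sample point and is only defined inside the convex hull of the data, which is why the sketch evaluates on interior test points.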


Cited By

  • (2024) Algorithm 1049: The Delaunay Density Diagnostic. ACM Transactions on Mathematical Software 50(4), 1–21. https://doi.org/10.1145/3700134. Online publication date: 14 Oct 2024.
  • (2024) Remark on Algorithm 1012: Computing Projections with Large Datasets. ACM Transactions on Mathematical Software 50(2), 1–8. https://doi.org/10.1145/3656581. Online publication date: 22 Apr 2024.



Published In

Numerical Algorithms  Volume 88, Issue 1
Sep 2021
513 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Published: 01 September 2021
Accepted: 28 October 2020
Received: 30 March 2019

Author Tags

  1. Approximation
  2. Regression
  3. Interpolation
  4. High dimension
  5. Error bound

Qualifiers

  • Research-article
