Interpolation of sparse high-dimensional data

Published: 01 September 2021

Abstract

Increases in the quantity of available data have allowed all fields of science to generate more accurate models of multivariate phenomena. Regression and interpolation become challenging when the dimension of data is large, especially while maintaining tractable computational complexity. Regression is a popular approach to solving approximation problems with high dimension; however, there are often advantages to interpolation. This paper presents a novel and insightful error bound for (piecewise) linear interpolation in arbitrary dimension and contrasts the performance of some interpolation techniques with popular regression techniques. Empirical results demonstrate the viability of interpolation for moderately high-dimensional approximation problems, and encourage broader application of interpolants to multivariate approximation in science.
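The contrast the abstract draws can be sketched in a few lines. The example below is illustrative only, not the paper's experimental setup: it fits a Delaunay-based piecewise linear interpolant and an ordinary least-squares linear regression to the same scattered 3-dimensional sample, then compares their errors on interior test points. The test function, sample size, and dimension are all assumptions made for the sketch.

```python
# A minimal sketch (not the paper's experiments): piecewise linear interpolation
# over a Delaunay triangulation vs. least-squares linear regression on scattered
# multivariate data. Test function and sample sizes are illustrative choices.
import numpy as np
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
d = 3                                    # input dimension
f = lambda x: np.sin(x.sum(axis=-1))     # smooth multivariate test function

X = rng.random((200, d))                 # scattered samples in [0, 1]^d
y = f(X)

# Piecewise linear interpolant over the Delaunay triangulation of X;
# it reproduces the data exactly at the sample points.
interp = LinearNDInterpolator(X, y)

# Ordinary least-squares linear regression on the same data.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Evaluate both on interior test points (inside the convex hull of X,
# where the interpolant is defined).
Xt = 0.25 + 0.5 * rng.random((100, d))
err_interp = np.abs(interp(Xt) - f(Xt)).max()
err_reg = np.abs(Xt @ coef[:d] + coef[d] - f(Xt)).max()
print(f"max abs error, interpolation: {err_interp:.3f}, regression: {err_reg:.3f}")
```

Note that, unlike the regression fit, the piecewise linear interpolant is exact at every sample point and is only defined inside the convex hull of the data, which is why the sketch evaluates on interior test points.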


Cited By

  • (2024) Algorithm 1049: The Delaunay Density Diagnostic. ACM Transactions on Mathematical Software 50(4), 1–21. https://doi.org/10.1145/3700134. Online publication date: 14 Oct 2024.
  • (2024) Remark on Algorithm 1012: Computing Projections with Large Datasets. ACM Transactions on Mathematical Software 50(2), 1–8. https://doi.org/10.1145/3656581. Online publication date: 22 Apr 2024.



Published In

Numerical Algorithms  Volume 88, Issue 1
Sep 2021
513 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Published: 01 September 2021
Accepted: 28 October 2020
Received: 30 March 2019

Author Tags

  1. Approximation
  2. Regression
  3. Interpolation
  4. High dimension
  5. Error bound

Qualifiers

  • Research-article
