Abstract
We address the solution of large-scale Bayesian optimal experimental design (OED) problems governed by partial differential equations (PDEs) with infinite-dimensional parameter fields. The OED problem seeks sensor locations that maximize the expected information gain (EIG) in the solution of the underlying Bayesian inverse problem. Computing the EIG is usually prohibitive for PDE-based OED problems. To make its evaluation tractable, we approximate the (PDE-based) parameter-to-observable map with a derivative-informed projected neural network (DIPNet) surrogate, which exploits the geometry, smoothness, and intrinsic low dimensionality of the map using a small, dimension-independent number of PDE solves. The surrogate is then deployed within a greedy algorithm for solving the OED problem, so that no further PDE solves are required. We analyze the EIG approximation error in terms of the generalization error of the DIPNet and show that the two are of the same order. Finally, the efficiency and accuracy of the method are demonstrated via numerical experiments on OED problems governed by inverse scattering and inverse reactive transport with up to 16,641 uncertain parameters and 100 experimental design variables, where we observe up to three orders of magnitude speedup relative to a reference double loop Monte Carlo method.
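For context, the EIG maximized over designs is the expected Kullback–Leibler divergence from prior to posterior, and the reference double loop Monte Carlo (DLMC) baseline estimates it with the standard nested estimator below. The notation (ξ for the design, m for the uncertain parameter, y for the observations) is generic, not quoted from the paper.

```latex
% Standard EIG definition and its nested (double loop) Monte Carlo
% estimator; generic notation, shown here only for illustration.
\begin{align*}
\mathrm{EIG}(\xi)
  &= \mathbb{E}_{y \mid \xi}\left[ D_{\mathrm{KL}}\big(
     \pi_{\mathrm{post}}(\cdot \mid y, \xi) \,\|\, \pi_{\mathrm{prior}} \big) \right] \\
  &\approx \frac{1}{N} \sum_{i=1}^{N} \log
     \frac{p(y_i \mid m_i, \xi)}
          {\tfrac{1}{M} \sum_{j=1}^{M} p(y_i \mid m_{i,j}, \xi)},
\end{align*}
% with m_i, m_{i,j} drawn from the prior and y_i drawn from p(. | m_i, xi).
```

Since every likelihood evaluation p(y | m, ξ) requires one evaluation of the parameter-to-observable map, i.e., one PDE solve, the nested estimator costs O(NM) PDE solves per candidate design; replacing the map with the DIPNet surrogate removes PDE solves from this loop entirely. The greedy deployment can then be sketched as below. This is a minimal illustration, in which the callable eig_of_design (assumed to estimate the EIG of a candidate design via the surrogate) is hypothetical, not the authors' exact algorithm.

```python
import numpy as np

def greedy_sensor_selection(num_candidates, eig_of_design, budget):
    """Greedily build a sensor design: at each step, add the candidate
    location whose inclusion yields the largest surrogate-based EIG.

    eig_of_design: callable mapping a list of candidate indices to an
    EIG estimate (assumed to be evaluated with the DIPNet surrogate,
    so no PDE solves are needed inside this loop).
    """
    selected, remaining = [], list(range(num_candidates))
    for _ in range(budget):
        # Score every remaining candidate appended to the current design.
        scores = [eig_of_design(selected + [j]) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```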
Data Availability
Enquiries about data availability should be directed to the authors.
Funding
This research was partially funded by DOE ASCR DE-SC0019303 and DE-SC0021239, DOD MURI FA9550-21-1-0084, and NSF DMS-2012453.
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
About this article
Cite this article
Wu, K., O’Leary-Roseberry, T., Chen, P. et al. Large-Scale Bayesian Optimal Experimental Design with Derivative-Informed Projected Neural Network. J Sci Comput 95, 30 (2023). https://doi.org/10.1007/s10915-023-02145-1