Abstract
Stochastic optimization problems with an objective function that is additive over a finite number of stages are addressed. Although Dynamic Programming allows one to formally solve such problems, closed-form solutions can be derived only in particular cases. The search for suboptimal solutions via two approaches is addressed: approximation of the value functions and approximation of the optimal decision policies. The approximations take on the form of linear combinations of basis functions containing adjustable parameters to be optimized together with the coefficients of the combinations. Two kinds of basis functions are considered: Gaussians with varying centers and widths and sigmoids with varying weights and biases. The accuracies of such suboptimal solutions are investigated via estimates of the error propagation through the stages. Upper bounds are derived on the differences between the optimal value of the objective functional and its suboptimal values corresponding to the use at each stage of approximate value functions and approximate policies. Conditions under which the number of basis functions required for a desired approximation accuracy does not grow “too fast” with respect to the dimensions of the state and random vectors are provided. As an example of application, a multidimensional problem of optimal consumption under uncertainty is investigated, where consumers aim at maximizing a social utility function. Numerical simulations are provided, emphasizing computational pros and cons of the two approaches (i.e., value-function approximation and optimal-policy approximation) using the above-mentioned two kinds of basis functions. To investigate the dependencies of the performances on dimensionality, the numerical analysis is performed for various numbers of consumers. In the simulations, discretization techniques exploiting low-discrepancy sequences are used. Both theoretical and numerical results give insights into the possibility of coping with the curse of dimensionality in stochastic optimization problems whose decision strategies depend on large numbers of variables.
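For readers who prefer a concrete picture of the two families of approximators mentioned above, the following minimal sketch (not taken from the paper; all names, shapes, and numerical values are illustrative assumptions) shows a linear combination of Gaussians with adjustable centers and widths and a linear combination of sigmoids with adjustable weights and biases.

```python
import numpy as np

def gaussian_model(y, coeffs, centers, widths):
    """sum_i coeffs[i] * exp(-||y - centers[i]||^2 / widths[i]^2)."""
    d2 = ((y[None, :] - centers) ** 2).sum(axis=1)   # squared distances to the centers
    return float(coeffs @ np.exp(-d2 / widths ** 2))

def sigmoidal_model(y, coeffs, weights, biases):
    """sum_i coeffs[i] * sigma(weights[i] . y + biases[i]), sigma = logistic function."""
    pre = weights @ y + biases                       # inner products plus biases
    return float(coeffs @ (1.0 / (1.0 + np.exp(-pre))))

# Example: models with n = 50 basis functions on a 4-dimensional argument.
rng = np.random.default_rng(0)
n, dim = 50, 4
y = rng.uniform(0.0, 1.0, size=dim)
coeffs = rng.normal(size=n)
print(gaussian_model(y, coeffs, rng.uniform(0.0, 1.0, size=(n, dim)), rng.uniform(0.1, 1.0, size=n)))
print(sigmoidal_model(y, coeffs, rng.normal(size=(n, dim)), rng.normal(size=n)))
```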
Notes
The expression “curse of dimensionality”, introduced by Bellman [6], refers to an unacceptably fast growth, with respect to a suitably-defined “dimension” of the problem at hand, of the computational time or memory requirements necessary to solve an instance of the problem itself.
I.e., with a finite number of variables and an infinite number of constraints.
In general, such expectations and maximizations cannot be performed exactly. These additional sources of error are due to the replacements of the conditional expectations by empirical means (computed on the basis of a sufficiently large number of samples) and to the use of nonlinear programming algorithms. Tools from statistical learning theory [28, 59, 85] and deterministic learning (see, e.g., [21] and the references therein) can be used to derive upper bounds on the required number of samples for a desired degree of accuracy. However, in this paper we do not address these issues, so in the following we shall simplify the analysis by assuming a large number of samples and neglecting the error in performing the maximizations and conditional expectations.
To see how \(\tilde{J}_{t}\) is chosen, see the proof of Proposition 5, in particular formula (52).
In particular, for deterministic infinite-horizon problems, in [72] it was shown that high-order differentiability may be lost at unstable stationary points of the dynamical system \(x_{t+1}=g(x_{t})\), where \(g\) is the (stationary) optimal policy.
References
Adda, J., Cooper, R.: Dynamic Economics: Quantitative Methods and Applications. MIT Press, Cambridge (2003)
Alessandri, A., Gaggero, M., Zoppoli, R.: Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes. IEEE Trans. Neural Netw. Learn. Syst. 23(6), 984–996 (2012)
Altman, E., Nain, P.: Optimal control of the M/G/1 queue with repeated vacations of the server. IEEE Trans. Autom. Control 38, 1766–1775 (1993)
Anderson, E.J., Nash, P.: Linear Programming in Infinite-Dimensional Spaces. Wiley, New York (1987)
Baird, L.: Residual algorithms: reinforcement learning with function approximation. In: Proc. 12th Int. Conf. on Machine Learning, pp. 30–37 (1995)
Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)
Bellman, R., Dreyfus, S.: Functional approximations and dynamic programming. Math. Tables Other Aids Comput. 13, 247–251 (1959)
Bellman, R., Kalaba, R., Kotkin, B.: Polynomial approximation—a new computational technique in dynamic programming. Math. Comput. 17, 155–161 (1963)
Benveniste, L.M., Scheinkman, J.A.: On the differentiability of the value function in dynamic models of economics. Econometrica 47, 727–732 (1979)
Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. 1. Athena Scientific, Belmont (2005)
Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. 2. Athena Scientific, Belmont (2007)
Bertsekas, D.P., Tsitsiklis, J.: Neuro-Dynamic Programming. Athena Scientific, Belmont (1996)
Bertsekas, D.P., Borkar, V.S., Nedic, A.: Improved temporal difference methods with linear function approximation. In: Si, J., Barto, A.G., Powell, W.B., Wunsch, D. (eds.) Handbook of Learning and Approximate Dynamic Programming, pp. 233–257. IEEE Press, New York (2004)
Bhattacharya, R., Majumdar, M.: Random Dynamical Systems: Theory and Applications. Cambridge University Press, Cambridge (2007)
Blume, L., Easley, D., O’Hara, M.: Characterization of optimal plans for stochastic dynamic programs. J. Econ. Theory 28, 221–234 (1982)
Buhmann, M.D.: Radial Basis Functions. Cambridge University Press, Cambridge (2003)
Busoniu, L., Babuska, R., Schutter, B.D., Ernst, D.: Reinforcement Learning and Dynamic Programming Using Function Approximators. CRC Press, Boca Raton (2010)
Cervellera, C., Muselli, M.: Efficient sampling in approximate dynamic programming algorithms. Comput. Optim. Appl. 38, 417–443 (2007)
Cervellera, C., Gaggero, M., Macciò, D.: Efficient kernel models for learning and approximate minimization problems. Neurocomputing 97, 74–85 (2012)
Cervellera, C., Gaggero, M., Macciò, D., Marcialis, R.: Quasi-random sampling for approximate dynamic programming. In: Proc. 2013 Int. Joint Conf. on Neural Networks, pp. 2567–2574 (2013)
Cervellera, C., Gaggero, M., Macciò, D.: Low-discrepancy sampling for approximate dynamic programming with local approximators. Comput. Oper. Res. 43, 108–115 (2014)
Chen, V.C.P., Ruppert, D., Shoemaker, C.A.: Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming. Oper. Res. 47, 38–53 (1999)
Cruz-Suárez, H., Montes-de-Oca, R.: Discounted Markov control processes induced by deterministic systems. Kybernetika 42, 647–664 (2006)
Cruz-Suárez, H., Montes-de-Oca, R.: An envelope theorem and some applications to discounted Markov decision processes. Math. Methods Oper. Res. 67, 299–321 (2008)
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989)
Ekeland, I., Turnbull, T.: Infinite-Dimensional Optimization and Convexity. University of Chicago Press, Chicago (1983)
Fang, K.T., Wang, Y.: Number-Theoretic Methods in Statistics. Chapman & Hall, London (1994)
Farahmand, A., Ghavamzadeh, M., Szepesvári, C., Mannor, S.: Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems. In: Proc. American Control Conf., pp. 725–730 (2009)
Foufoula-Georgiou, E., Kitanidis, P.K.: Gradient dynamic programming for stochastic optimal control of multidimensional water resources systems. Water Resour. Res. 24, 1345–1359 (1988)
Gaggero, M., Gnecco, G., Sanguineti, M.: Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results. J. Optim. Theory Appl. 156, 380–416 (2013)
Gaggero, M., Gnecco, G., Sanguineti, M.: Suboptimal policies for stochastic N-stage optimization problems: accuracy analysis and a case study from optimal consumption. In: El Ouardighi, F., Kogan, K. (eds.) Models and Methods in Economics and Management Science. International Series in Operations Research and Management Science, vol. 198, pp. 27–50. Springer, Berlin (2014)
Gelfand, I.M., Fomin, S.V.: Calculus of Variations. Prentice Hall, Englewood Cliffs (1963)
Giulini, S., Sanguineti, M.: Approximation schemes for functional optimization problems. J. Optim. Theory Appl. 140, 33–54 (2009)
Gnecco, G., Sanguineti, M.: Approximation error bounds via Rademacher’s complexity. Appl. Math. Sci. 2, 153–176 (2008)
Gnecco, G., Sanguineti, M.: Suboptimal solutions to dynamic optimization problems via approximations of the policy functions. J. Optim. Theory Appl. 146, 764–794 (2010)
Gnecco, G., Sanguineti, M., Gaggero, M.: Suboptimal solutions to team optimization problems with stochastic information structure. SIAM J. Optim. 22, 212–243 (2012)
Gribonval, R., Vandergheynst, P.: On the exponential convergence of matching pursuits in quasi-incoherent dictionaries. IEEE Trans. Inf. Theory 52, 255–261 (2006)
Grisvard, P.: Elliptic Problems in Nonsmooth Domains. Pitman, Boston (1985)
Hammersley, J.M., Handscomb, D.C.: Monte Carlo Methods. Methuen, London (1964)
Hernandez-Lerma, O., Lasserre, J.: Approximation schemes for infinite linear programs. SIAM J. Optim. 8, 973–988 (1998)
Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I. Springer, Berlin (1996)
Ito, S., Wu, S.Y., Shiu, T.J., Teo, K.L.: A numerical approach to infinite-dimensional linear programming in L1 spaces. J. Ind. Manag. Optim. 6, 15–18 (2010)
Johnson, S., Stedinger, J., Shoemaker, C., Li, Y., Tejada-Guibert, J.: Numerical solution of continuous-state dynamic programs using linear and spline interpolation. Oper. Res. 41, 484–500 (1993)
Judd, K.: Numerical Methods in Economics. MIT Press, Cambridge (1998)
Kainen, P., Kůrková, V., Sanguineti, M.: Minimization of error functionals over variable-basis functions. SIAM J. Optim. 14, 732–742 (2003)
Kainen, P.C., Kůrková, V., Sanguineti, M.: Dependence of computational models on input dimension: tractability of approximation and optimization tasks. IEEE Trans. Inf. Theory 58, 1203–1214 (2012)
Karp, L., Lee, I.H.: Learning-by-doing and the choice of technology: the role of patience. J. Econ. Theory 100, 73–92 (2001)
Kuhn, D.: Generalized Bounds for Convex Multistage Stochastic Programs. Springer, Berlin (2005)
Kůrková, V., Sanguineti, M.: Bounds on rates of variable-basis and neural-network approximation. IEEE Trans. Inf. Theory 47, 2659–2665 (2001)
Kůrková, V., Sanguineti, M.: Comparison of worst-case errors in linear and neural network approximation. IEEE Trans. Inf. Theory 48, 264–275 (2002)
Kůrková, V., Sanguineti, M.: Error estimates for approximate optimization by the extended Ritz method. SIAM J. Optim. 18, 461–487 (2005)
Kůrková, V., Sanguineti, M.: Approximate minimization of the regularized expected error over kernel models. Math. Oper. Res. 33, 747–756 (2008)
Kůrková, V., Sanguineti, M.: Geometric upper bounds on rates of variable-basis approximation. IEEE Trans. Inf. Theory 54, 5681–5688 (2008)
Lendaris, G.G., Neidhoefer, J.C.: Guidance in the choice of adaptive critics for control. In: Si, J., Barto, A.G., Powell, W.B., Wunsch, D. (eds.) Handbook of Learning and Approximate Dynamic Programming, pp. 97–124. IEEE Press, New York (2004)
Loomis, L.H.: An Introduction to Abstract Harmonic Analysis. Van Nostrand, Princeton (1953)
Mhaskar, H.N.: Neural networks for optimal approximation of smooth and analytic functions. Neural Comput. 8, 164–177 (1996)
Montrucchio, L.: Lipschitz continuous policy functions for strongly concave optimization problems. J. Math. Econ. 16, 259–273 (1987)
Montrucchio, L.: Thompson metric, contraction property and differentiability of policy functions. J. Econ. Behav. Organ. 33, 449–466 (1998)
Munos, R., Szepesvári, C.: Finite-time bounds for fitted value iteration. J. Mach. Learn. Res. 1, 815–857 (2008)
Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM, Philadelphia (1992)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006)
Philbrick, C.R., Jr., Kitanidis, P.K.: Improved dynamic programming methods for optimal control of lumped-parameter stochastic systems. Oper. Res. 49, 398–412 (2001)
Pinkus, A.: Approximation theory of the MLP model in neural networks. Acta Numer. 8, 143–195 (1999)
Poggio, T., Smale, S.: The mathematics of learning: dealing with data. Not. Am. Math. Soc. 50, 537–544 (2003)
Powell, W.B.: Approximate Dynamic Programming. Wiley-Interscience, Hoboken (2007)
Puterman, M.L., Shin, M.C.: Modified policy iteration algorithms for discounted Markov decision processes. Manag. Sci. 41, 1127–1137 (1978)
Rapaport, A., Sraidi, S., Terreaux, J.: Optimality of greedy and sustainable policies in the management of renewable resources. Optim. Control Appl. Methods 24, 23–44 (2003)
Rincon-Zapatero, J.P., Santos, M.S.: Differentiability of the value function without interiority assumptions. J. Econ. Theory 144, 1948–1964 (2009)
Sahinidis, N.V.: Optimization under uncertainty: state-of-the-art and opportunities. Comput. Chem. Eng. 28, 971–983 (2004)
Santos, M.S.: Smoothness of policy function in discrete time economic models. Econometrica 59, 1365–1382 (1991)
Santos, M.S.: On high-order differentiability of the policy function. Econ. Theory 3, 565–570 (1993)
Santos, M.S., Vila, J.L.: Smoothness of the policy function in continuous-time economic models: the one-dimensional case. J. Econ. Dyn. Control 15, 741–753 (1988)
Schweitzer, P.J., Seidmann, A.: Generalized polynomial approximations in Markovian decision processes. J. Math. Anal. Appl. 110, 568–582 (1985)
Secomandi, N.: Optimal commodity trading with a capacitated storage asset. Manag. Sci. 56, 449–467 (2010)
Semmler, W., Sieveking, M.: Critical debt and debt dynamics. J. Econ. Dyn. Control 24, 1121–1144 (2000)
Si, J., Barto, A.G., Powell, W.B., Wunsch, D. (eds.): Handbook of Learning and Approximate Dynamic Programming. IEEE Press, New York (2004)
Singer, I.: Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces. Springer, Berlin (1970)
Sobol’, I.M.: The distribution of points in a cube and the approximate evaluation of integrals. Ž. Vyčisl. Mat. Mat. Fiz. 7, 784–802 (1967)
Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton University Press, Princeton (1970)
Stokey, N.L., Lucas, R.E., Prescott, E.: Recursive Methods in Economic Dynamics. Harvard University Press, Cambridge (1989)
ten Hagen, S., Kröse, B.: Neural Q-learning. Neural Comput. Appl. 12, 81–88 (2003)
Tesauro, G.: Practical issues in temporal difference learning. Mach. Learn. 8, 257–277 (1992)
Tsitsiklis, J.N.: Perspectives on stochastic optimization over time. INFORMS J. Comput. 22, 18–19 (2010)
Tsitsiklis, J.N., Roy, B.V.: Feature-based methods for large scale dynamic programming. Mach. Learn. 22, 59–94 (1996)
Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)
White, D.J.: Markov Decision Processes. Wiley, New York (1993)
Zoppoli, R., Parisini, T., Sanguineti, M.: Approximating networks and extended Ritz method for the solution of functional optimization problems. J. Optim. Theory Appl. 112, 403–439 (2002)
Acknowledgements
The authors would like to thank the anonymous referees for their fruitful comments and suggestions, and for having pointed out the related literature [5, 28, 59, 74], and [82]. Part of the discussion contained in Sect. 1.4 is due to one of the reviewers, to whom we are grateful. G. Gnecco and M. Sanguineti were partially supported by the Progetto di Ricerca di Ateneo 2012 “Models and Computational Methods for High-Dimensional Data”, granted by University of Genoa.
Appendix: Proofs and technical lemmas
Proof of Proposition 5
(i) We use a backward induction argument. For \(t=N-1,\ldots,0\), assume that at stage \(t+1\), \(\tilde{J}_{t+1} \in \mathcal {F}_{V,t+1}\) is such that \(\| J_{t+1}-\tilde{J}_{t+1}\|_{{\mathrm{sup}}(\bar{Y}_{t+1})} \leq {\eta}_{t+1}\) for some \(\eta_{t+1}\geq 0\). In particular, as \(\tilde{J}_{N} = J_{N}\), one has \(\eta_{N}=0\). Let \(\hat{J}_{t}\) be defined as in formula (11). It can be written as \(\hat{J}_{t}=T_{t} \tilde{J}_{t+1}\), where \(T_{t}\) is Bellman’s operator, defined for every bounded and continuous function \(k_{t+1}\) on \(X_{t+1}\times Z_{t+1}\) as
Thanks to Assumption 3, it follows by an application of [80, Theorem 9.6] that
Before moving to the \(t\)-th stage, one has to find an approximation \(\tilde{J}_{t} \in \mathcal {F}_{V,t}\) of \(J_{t}=T_{t} J_{t+1}\). Such an approximation has to be obtained from \(\hat{J}_{t}=T_{t} \tilde{J}_{t+1}\) (which, in general, may not belong to \(\mathcal {F}_{V,t}\)), because \(J_{t}=T_{t} J_{t+1}\) itself is unknown.
For a bounded and continuous function \(f\) on \(\bar{Y}_{t}\) and \(r>0\), denote by \(B_{r}(f)\) the closed ball of radius \(r\) in the sup-norm over \(\bar{Y}_{t}\) centered at \(f\), i.e., \(B_{r}(f):=\{g \ \mathrm{bounded} \ \mathrm{and}\ \mathrm{continuous}\ \mathrm{function}\ \mathrm{on}\ \bar{Y}_{t}\,|\, \|f-g\|_{{\mathrm{sup}}(\bar{Y}_{t})} \leq r \}\). Let \(J_{t} \in B_{\beta\eta_{t+1}}(\hat{J}_{t})\). By assumption, there exists \(f_{t} \in \mathcal {F}_{V,t}\) such that \(\|J_{t}-f_{t}\|_{{\mathrm{sup}}(\bar{Y}_{t})} \leq \varepsilon_{t}\), i.e., \(f_{t} \in B_{\varepsilon_{t}}(J_{t}) \cap \mathcal {F}_{V,t}\). Since \(J_{t}\) is unknown, one cannot simply set \(\tilde{J}_{t}:=f_{t}\), but one can define \(\tilde{J}_{t}\) as any element of the set
Hence, by the assumption on \(f_{t}\), the definition of \(\tilde{J}_{t}\), and the triangle inequality, we get
Moreover, by (51) we have
Starting from \(\tilde{J}_{N} = J_{N}\) and making the dependence of \(\eta_{t}\) on \(\varepsilon_{t},\ldots,\varepsilon_{N-1}\) explicit, after \(N-t\) iterations (\(t=0,\ldots,N-1\)) we obtain
Then, we obtain \(\|J-\tilde{J}\|_{{\mathrm{sup}}(\bar{Y}_{0})}=\|J_{0}-\tilde{J}_{0}\|_{{\mathrm{sup}}(\bar{Y}_{0})} \leq \sum_{i=0}^{N-1}{(2\beta)^{i}\varepsilon_{i}}\).
(ii) In order to prove (19), we need to find, for each \(t\), suitable nonnegative constants \(\eta_{t}\), \(\delta_{t}\), and \(\chi_{t}\) such that
In the following we show how one can define such constants in a recursive way and we prove that
from which (19) follows. We divide this part of the proof into several steps.
Step 1: stage \(N-1\).
We recall that the policy \(\tilde{g}_{V,N-1}(x_{N-1},z_{N-1})\) is defined as
$$\begin{aligned} & \tilde{g}_{V,N-1}(x_{N-1},z_{N-1}) \\ &\quad :=\arg\max_{y \in \mathcal{D}_{N-1}(x_{N-1},z_{N-1})} \Bigl[h_{N-1}(x_{N-1},y,z_{N-1}) + \beta \mathop {{\mathbb{E}}}_{z_N|z_{N-1}} \bigl\{ \tilde{J}_N (y,z_{N}) \bigr\} \Bigr]. \end{aligned}$$
(54)
Since \(\tilde{J}_{N}=J_{N}\), the policy \(\tilde{g}_{V,N-1}\) is optimal also for the problem (54) with \(\tilde{J}_{N}\) replaced by \(J_{N}\). Then we obtain \(\tilde{J}_{V,N-1}=J_{N-1}\). In conclusion, we can set \(\delta_{N-1}:=0\) and \(\chi_{N-1}:=\eta_{N-1}\).
Step 2: stage \(N-2\).
The reoptimized policy \(\tilde{g}_{V,N-2}(x_{N-2},z_{N-2})\) is defined as
$$\begin{aligned} &\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}) \\ &\quad := \arg\max_{y \in \mathcal{D}_{N-2}(x_{N-2},z_{N-2})} \Bigl[h_{N-2}(x_{N-2},y,z_{N-2}) + \beta \mathop {{\mathbb{E}}}_{z_{N-1}|z_{N-2}} \bigl\{ \tilde{J}_{N-1} (y,z_{N-1}) \bigr\} \Bigr]. \end{aligned}$$
(55)
Now, we evaluate the distance of \(\tilde{J}_{V,N-2}\) from \(J_{N-2}\) in the sup-norm, and then we determine \(\delta_{N-2}\). Since \(\tilde{J}_{V,N-2}\) corresponds to the application of the suboptimal policies \(\tilde{g}_{V,N-2}\) and \(\tilde{g}_{V,N-1}\) at the stages \(N-2\) and \(N-1\), we get
$$ \tilde{J}_{V,N-2}(x_{N-2},z_{N-2}) \leq J_{N-2}(x_{N-2},z_{N-2}). $$
(56)
Moreover,
$$\begin{aligned} &\tilde{J}_{V,N-2}(x_{N-2},z_{N-2}) \\ &\quad =h_{N-2} \bigl(x_{N-2},\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-2}\bigr) \\ &\qquad {}+ \beta \mathop {{\mathbb{E}}}_{z_{N-1}|z_{N-2}} \bigl\{ \tilde{J}_{V,N-1} \bigl(\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1}\bigr) \bigr\} \\ &\quad =h_{N-2}\bigl(x_{N-2},\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-2}\bigr) \\ &\qquad {}+ \beta \mathop {{\mathbb{E}}}_{z_{N-1}|z_{N-2}} \bigl\{ \tilde{J}_{N-1} \bigl( \tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1}\bigr) \bigr\} \\ &\qquad {}+\beta \mathop {{\mathbb{E}}}_{z_{N-1}|z_{N-2}} \bigl\{ \tilde{J}_{V,N-1} \bigl( \tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1}\bigr) \\ &\qquad {}-\tilde{J}_{N-1}\bigl(\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1} \bigr) \bigr\} \\ &\quad \geq J_{N-2}(x_{N-2},z_{N-2}) - \beta \eta_{N-1} - \beta \chi_{N-1}, \end{aligned}$$
(57)
where the last inequality in (57) is obtained by observing that
$$\begin{aligned} &\bigl\vert \tilde{J}_{V,N-1} \bigl(\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1}\bigr) \\ &\quad {}-\tilde{J}_{N-1}\bigl(\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1} \bigr)\bigr\vert \leq\chi_{N-1}, \\ & h_{N-2}\bigl(x_{N-2},\tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-2} \bigr) \\ &\qquad {}+ \beta \mathop {{\mathbb{E}}}_{z_{N-1}|z_{N-2}} \bigl\{ \tilde{J}_{N-1} \bigl( \tilde{g}_{V,N-2}(x_{N-2},z_{N-2}),z_{N-1}\bigr) \bigr\} \\ &\quad =(T_{N-2} \tilde{J}_{N-1}) (x_{N-2},z_{N-2}), \\ & h_{N-2}\bigl(x_{N-2},g_{N-2}(x_{N-2},z_{N-2}),z_{N-2}\bigr) \\ &\qquad {}+\beta \mathop {{\mathbb{E}}}_{z_{N-1}|z_{N-2}} \bigl\{ J_{N-1}\bigl(g_{N-2}(x_{N-2},z_{N-2}),z_{N-1}\bigr) \bigr\} \\ &\quad =(T_{N-2} J_{N-1}) (x_{N-2},z_{N-2})= J_{N-2} (x_{N-2},z_{N-2}) \end{aligned}$$
(where \(T_{N-2}\) is Bellman’s operator at the stage \(N-2\)), and
$$ \|T_{N-2} \tilde{J}_{N-1}-T_{N-2} J_{N-1}\|_{{\mathrm{sup}}(\bar{Y}_{N-2})} \leq \beta \|\tilde{J}_{N-1}-J_{N-1} \|_{{\mathrm{sup}}(\bar{Y}_{N-1})} \leq \beta \eta_{N-1}, $$
(58)
and the first inequality in (58) follows by an application of the contraction property of Bellman’s operator [80, Theorem 9.6].
Concluding, we obtain \(\|\tilde{J}_{V,N-2}-J_{N-2}\|_{{\mathrm{sup}}(\bar{Y}_{N-2})} \leq \delta_{N-2}:= \beta(\eta_{N-1}+\chi_{N-1})\) and, by an application of the triangle inequality, \(\|\tilde{J}_{V,N-2}-\tilde{J}_{N-2}\|_{{\mathrm{sup}}(\bar{Y}_{N-2})} \leq \chi_{N-2}:=\eta_{N-2}+\delta_{N-2}\).
Step 3: other stages.
Similarly, we get the following recursive formulas for the other stages (\(t=N-3,\ldots,0\)): \(\|\tilde{J}_{V,t}-J_{t}\|_{{\mathrm{sup}}(\bar{Y}_{t})} \leq \delta_{t}:= \beta(\eta_{t+1}+\chi_{t+1})\) and \(\|\tilde{J}_{V,t}-\tilde{J}_{t}\|_{{\mathrm{sup}}(\bar{Y}_{t})} \leq \chi_{t}:=\eta_{t}+\delta_{t}\).
Step 4: dependence of \(\delta_{0}\) on the errors \(\eta_{0},\ldots,\eta_{N-1}\).
$$ \begin{aligned} &\delta_{N-1}=0, \\ &\chi_{N-1}=\eta_{N-1}, \\ &\delta_{N-2}=\beta(\eta_{N-1}+\chi_{N-1})=2\beta \eta_{N-1}, \\ &\chi_{N-2}=\eta_{N-2}+\delta_{N-2}= \eta_{N-2}+2\beta \eta_{N-1}, \\ &\delta_{N-3}=\beta(\eta_{N-2}+\chi_{N-2})=2\beta \eta_{N-2} + 2 \beta^2 \eta_{N-1}, \\ &\chi_{N-3}=\eta_{N-3}+\delta_{N-3}= \eta_{N-3}+2\beta \eta_{N-2} + 2 \beta^2 \eta_{N-1}, \\ &\delta_{N-4}=\beta(\eta_{N-3}+\chi_{N-3})=2\beta \eta_{N-3} + 2 \beta^2 \eta_{N-2} + 2 \beta^3 \eta_{N-1}, \\ &\vdots \\ &\delta_{0}=2 \beta \eta_1+2 \beta^2 \eta_2 + 2 \beta^3 \eta_3 + \cdots + 2 \beta^{N-1} \eta_{N-1}=2 \sum_{i=1}^{N-1} \beta^i \eta_i. \end{aligned} $$(59)
Step 5: dependence of \(\delta_{0}\) on the errors \(\varepsilon_{1},\ldots,\varepsilon_{N-1}\).
By (59) and the expressions (53) for \(\eta_{0},\ldots,\eta_{N-1}\), we make the dependence of \(\delta_{0}\) on the errors \(\varepsilon_{1},\ldots,\varepsilon_{N-1}\) explicit, as shown in the following:
$$\begin{aligned} \delta_{0} =& 2 \sum_{i=1}^{N-1} \sum_{j=0}^{N-1-i} \beta^i (2\beta)^j \varepsilon_{i+j} \\ =& 2 \bigl[\beta \varepsilon_1 + \bigl(\beta^2 + 2 \beta^2\bigr) \varepsilon_2 + \bigl(\beta^3 + 2 \beta^3 + 4 \beta^3\bigr) \varepsilon_3 + \cdots \\ &{}+ \bigl(\beta^{N-1} + 2 \beta^{N-1} + \cdots + 2^{N-2} \beta^{N-1}\bigr) \varepsilon_{N-1} \bigr] \\ =& 2 \sum_{i=1}^{N-1} \Biggl(\sum_{j=0}^{i-1} 2^j \Biggr) \beta^i \varepsilon_i = 2 \sum_{i=1}^{N-1} \bigl(2^i-1\bigr) \beta^i \varepsilon_i =2 \sum_{i=1}^{N-1} (2 \beta)^i \varepsilon_i - 2 \sum_{i=1}^{N-1} \beta^i \varepsilon_i, \end{aligned}$$which concludes the proof. □
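The recursions derived in Steps 1–5 are easy to check numerically. The following small sketch (an illustration only, not part of the proof) runs the recursions \(\delta_{t}=\beta(\eta_{t+1}+\chi_{t+1})\), \(\chi_{t}=\eta_{t}+\delta_{t}\) with \(\eta_{t}=\sum_{j=0}^{N-1-t}(2\beta)^{j}\varepsilon_{t+j}\) and compares \(\delta_{0}\) with the closed form obtained above; the horizon, the discount factor, and the errors are arbitrary assumed values.

```python
import numpy as np

N, beta = 6, 0.9
eps = np.random.default_rng(1).uniform(0.0, 0.1, size=N)          # eps_0, ..., eps_{N-1}
eta = np.array([sum((2 * beta) ** j * eps[t + j] for j in range(N - t)) for t in range(N)])

delta, chi = 0.0, eta[N - 1]                                       # stage N-1
for t in range(N - 2, -1, -1):                                     # stages N-2, ..., 0
    delta = beta * (eta[t + 1] + chi)
    chi = eta[t] + delta

closed_form = 2 * sum(((2 * beta) ** i - beta ** i) * eps[i] for i in range(1, N))
print(abs(delta - closed_form) < 1e-12)                            # True
```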
Proof of Proposition 6
We compare the behavior of the two dynamical systems \(x_{t+1}=g_{t}(x_{t},z_{t})\) and \(\tilde{x}_{t+1}=\tilde{g}_{P,t}(\tilde{x}_{t},z_{t})\), for the same initial state \(\tilde{x}_{0}=x_{0}\) and the same sequence of random vectors \(z_{t}\). By the Lipschitz continuity of \(h_{t}\) and \(h_{N}\), we get
The Lipschitz continuity of \(g_{t}\) and the triangle inequality give
Let \(\tilde{\theta}_{t} :=\|x_{t}-\tilde{x}_{t}\|\). Then \(\tilde{\theta}_{0}=0\) and Eq. (61) together with \(\|g_{t} -\tilde{g}_{P,t}\|_{{\mathrm{sup}}(\bar{Y}_{t})} \leq {\varepsilon}_{t}\) provide \(\tilde{\theta}_{t+1} \leq L_{g_{t}}\tilde{\theta}_{t}+{\varepsilon}_{t}\) for \(t=0,\ldots,N-1\). So, the upper bound \(\|J -\tilde{J}_{P}\|_{{\mathrm{sup}}(\bar{Y}_{0})}\leq \sum_{t=0}^{N-1} \beta^{t} L_{h_{t}} \sqrt{\theta_{t}^{2}+\theta_{t+1}^{2}}+\beta^{N} L_{h_{N}}\theta_{N}\) is obtained by (60) and the definition of \(\theta_{t}\) (which does not depend on the particular sequence of random vectors \(z_{t}\)). As \(\theta_{t}\) is nonnegative for \(t=0,\ldots,N-1\), we have \(\sqrt{\theta_{t}^{2}+\theta_{t+1}^{2}} \leq \theta_{t} +\theta_{t+1}\). This, together with the upper bound \(\|J -\tilde{J}_{P}\|_{{\mathrm{sup}}(\bar{Y}_{0})}\leq \sum_{t=0}^{N-1} \beta^{t} L_{h_{t}} \sqrt{\theta_{t}^{2}+\theta_{t+1}^{2}}+\beta^{N} L_{h_{N}}\theta_{N}\) and the linear difference equation \(\theta_{t+1}=L_{g_{t}}\theta_{t}+{\varepsilon}_{t}\), yields the estimate \(\|J -\tilde{J}_{P}\|_{{\mathrm{sup}}(\bar{Y}_{0})} \leq \sum_{t=0}^{N-1} \varGamma_{t} {\varepsilon}_{t}\). □
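A minimal numerical illustration of the argument above (with hypothetical Lipschitz constants and per-stage policy errors) is the following: it propagates the state-error recursion \(\theta_{0}=0\), \(\theta_{t+1}=L_{g_{t}}\theta_{t}+\varepsilon_{t}\) and evaluates the intermediate bound \(\sum_{t=0}^{N-1}\beta^{t}L_{h_{t}}(\theta_{t}+\theta_{t+1})+\beta^{N}L_{h_{N}}\theta_{N}\).

```python
import numpy as np

N, beta = 5, 0.95
L_g = np.full(N, 1.1)              # Lipschitz constants of the optimal policies (assumed)
L_h = np.full(N + 1, 2.0)          # Lipschitz constants of h_0, ..., h_N (assumed)
eps = np.full(N, 1e-2)             # sup-norm policy-approximation errors (assumed)

theta = np.zeros(N + 1)
for t in range(N):
    theta[t + 1] = L_g[t] * theta[t] + eps[t]          # linear difference equation

bound = sum(beta ** t * L_h[t] * (theta[t] + theta[t + 1]) for t in range(N)) \
        + beta ** N * L_h[N] * theta[N]
print(theta, bound)
```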
The next two lemmas provide smoothness properties of value functions and optimal policies. Lemma 17 (which gives high-order continuity properties) will be used in the proof of Lemma 18, which, in turn, will be exploited in the proofs of Propositions 7 and 8.
Lemma 17
Under Assumption 2, \(J_{t} \in \mathcal {C}^{m}(X_{t} \times Z_{t})\), \(t=0,\ldots,N\), and the optimal policy \(g_{t}\) satisfies \(g_{t} \in \mathcal {C}^{m-1}(X_{t} \times Z_{t}, \mathbb {R}^{d})\), \(t=0,\ldots,N-1\).
Proof of Lemma 17
The regularity conditions stated in Assumption 2 (among these, Assumption 2(iii), which is used only in this part of the proof) guarantee that solving Problem \(\varSigma_{N}^{\mathrm{stoch}}\) is equivalent to iteratively solving Bellman’s equations (this follows, e.g., by an application of [48, Theorem 2.6]). The remainder of the proof proceeds by backward induction, as described in the following.
Suppose that, for \(t=0,\ldots,N-1\) and the value of \(\alpha_{t+1}>0\) given in Assumption 2(iv) for \(t=N-1\) and in Assumption 2(v) for \(t=0,\ldots,N-2\), one knows that \(J_{t+1}(x_{t+1},z_{t+1}) \in \mathcal {C}^{m}(X_{t+1} \times Z_{t+1})\) and is \(\alpha_{t+1}\)-concave in \(x_{t+1}\) (i.e., if one fixes \(z_{t+1}\), then \(J_{t+1}(x_{t+1},z_{t+1})\) is \(\alpha_{t+1}\)-concave as a function of \(x_{t+1}\)). For \(t=N-1\) this is guaranteed by Assumption 2(iv). We show that then \(J_{t}(x_{t},z_{t}) \in \mathcal {C}^{m}(X_{t} \times Z_{t})\), too, and that it is \(\alpha_{t}\)-concave in its first argument (i.e., the function \(J_{t}(\cdot,z_{t})\) is \(\alpha_{t}\)-concave for every fixed \(z_{t} \in Z_{t}\)), with the same value of \(\alpha_{t}>0\) given in Assumption 2(v).
Let \((x_{t}, z_{t}) \in \operatorname {int}(X_{t} \times Z_{t})\) and \(J^{e}_{t+1}(y,z_{t}):=\mathop {{\mathbb{E}}}_{z_{t+1}|z_{t}} \{J_{t+1}(y,z_{t+1}) \}\). By computing the conditional expected value, it is easy to see that \(J^{e}_{t+1}(y,z_{t})\) is \(\alpha_{t+1}\)-concave in \(y\). As \(J_{t+1} \in \mathcal {C}^{m}(X_{t+1} \times Z_{t+1})\) and \(p_{z_{t+1}|{z_{t}}} \in \mathcal {C}^{m}(Z_{t+1} \times Z_{t})\), by interchanging derivatives and conditional expectations we get \(J^{e}_{t+1} \in \mathcal {C}^{m}(X_{t+1} \times Z_{t})\). Since by hypothesis the optimal policy \(g_{t}\) is interior, the first-order optimality condition
holds. Then, by the Implicit Function Theorem we get
and
where the matrix \(\nabla^{2}_{2,2} (h_{t}(x_{t},g_{t}(x_{t},z_{t}),z_{t}) )+ \beta \nabla^{2}_{1,1} J^{e}_{t+1}(g_{t}(x_{t},z_{t}),z_{t})\) is non-singular. Indeed, by the concavity of \(h_{t}\) with respect to its second argument (which holds by Assumption 2(v)), the matrix \(\nabla^{2}_{2,2} (h_{t}(x_{t},g_{t}(x_{t},z_{t}),z_{t}))\) is negative semi-definite, and the matrix \(\nabla^{2}_{1,1} J^{e}_{t+1}(g_{t}(x_{t},z_{t}),z_{t})\) is negative definite since \(J^{e}_{t+1}\) is \(\alpha_{t+1}\)-concave in its first argument and \(\alpha_{t+1}>0\). Hence, the matrix \(\nabla^{2}_{2,2} (h_{t}(x_{t},g_{t}(x_{t},z_{t}),z_{t}) )+ \beta \nabla^{2}_{1,1} J^{e}_{t+1}(g_{t}(x_{t},z_{t}),z_{t})\) is negative definite, thus non-singular, and so (63) and (64) are well defined.
By differentiating (63) and (64) up to the order \(m\), we get \(g_{t} \in C^{m-1}(\operatorname {int}(X_{t} \times Z_{t}), \mathbb {R}^{d})\). As the expressions obtained for its partial derivatives up to the order \(m-1\) are bounded and continuous not only on \(\operatorname {int}(X_{t} \times Z_{t})\) but on the whole \(X_{t} \times Z_{t}\), one has \(g_{t} \in C^{m-1}(X_{t} \times Z_{t}, \mathbb {R}^{d})\).
Since \(J_{t}(x_{t},z_{t})=h_{t}(x_{t},g_{t}(x_{t},z_{t}),z_{t})+\beta J^{e}_{t+1}(g_{t}(x_{t},z_{t}),z_{t})\), by differentiation we get
where the last equality holds by the first-order optimality condition (62). Similarly, we have
By differentiating (65) and (66) up to the order \(m\), we get \(J_{t} \in \mathcal {C}^{m}(\operatorname {int}(X_{t} \times Z_{t}))\). As in the case of the optimal policy \(g_{t}\), this extends to \(J_{t} \in \mathcal {C}^{m}(X_{t} \times Z_{t})\).
In order to conclude the backward induction step, it remains to show that \(J_{t}\) is \(\alpha_{t}\)-concave in its first argument for the same value of \(\alpha_{t}>0\) in Assumption 2(v). By [58, Proposition 1], for every \(a_{0} \in \mathbb {R}^{d}\) and \(z_{t}\) fixed we get
By Assumption 2(v) and the concavity of \(J^{e}_{t+1}\) with respect to its first argument, it follows that \(a_{0}^{T}\nabla^{2}_{1,1} J_{t}(x_{t},z_{t})a_{0} \leq -\alpha_{t} \|a_{0}\|^{2}\). As a twice continuously differentiable function is \(\alpha\)-concave if and only if the maximum eigenvalue of its Hessian is at most \(-\alpha\), it follows that \(J_{t}\) is \(\alpha_{t}\)-concave in its first argument. □
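The equivalence used in the last step (a twice continuously differentiable function is \(\alpha\)-concave if and only if the largest eigenvalue of its Hessian is at most \(-\alpha\)) can be illustrated numerically; the quadratic below is a hypothetical example, unrelated to the value functions of the paper.

```python
import numpy as np

# Hessian of the quadratic f(x) = 0.5 * x^T A x; f is alpha-concave exactly for
# alpha <= -lambda_max(A), i.e., as long as A + alpha*I stays negative semi-definite.
A = np.array([[-3.0, 0.5],
              [ 0.5, -2.0]])
alpha = -np.max(np.linalg.eigvalsh(A))                                    # largest admissible alpha
print(alpha, np.max(np.linalg.eigvalsh(A + alpha * np.eye(2))) <= 1e-12)  # ~1.79, True
```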
Recall that for \(r>0\) and a positive integer \(l\), the Bessel potential of order \(r\) on \(\mathbb {R}^{l}\), denoted by \(G_{r}(x)\), is the inverse Fourier transform of the function \({\hat {G}_{r}}(\omega):= (2\pi)^{-\frac{l}{2}}(1+\|{\omega}\|^{2})^{-r/2}\). It is continuous when \(r>l\) [79, p. 132]. For \(r>0\) and \(1\leq p\leq\infty\), \(\mathcal {B}^{r}_{p}(\mathbb {R}^{l})\) is the Bessel potential space, whose elements are functions \(u\) such that \(u=f\ast G_{r}\), where \(f \in \mathcal {L}_{p}(\mathbb {R}^{l})\) [79, p. 134].
For an open set \(\varOmega \subseteq \mathbb {R}^{l}\), a positive integer m, and 1≤p≤∞, by \(\mathcal {W}_{p}^{m}(\varOmega)\) we denote the Sobolev space of functions whose partial derivatives up to the order m are in \(\mathcal {L}_{p}(\varOmega)\).
Lemma 18
Let Assumption 2 be satisfied. Then the following hold.
Case 1: \(m\geq 2\) and \(1<p<\infty\).
(i) For every \(t=0,\ldots,N\) there exists a function \(\bar{J}^{p}_{t} \in \mathcal {B}^{m}_{p}(\mathbb {R}^{d+s})\) such that \(J_{t}=\bar{J}^{p}_{t}|_{X_{t} \times Z_{t}}\).
(ii) For every \(t=0,\ldots,N-1\) and \(j=1,\ldots,d\) there exists a function \(\bar{g}_{t,j}^{p} \in \mathcal {B}^{m-1}_{p}(\mathbb {R}^{d+s})\) such that, for the component \(g_{t,j}\) of the optimal policy \(g_{t}\), one has \(g_{t,j}=\bar{g}^{p}_{t,j}|_{X_{t} \times Z_{t}}\).
Case 2: \(m\geq 2\) and \(p=1\).
(iii) If \(m=2\), then (i) holds with \(\mathcal {B}^{m}_{p}(\mathbb {R}^{d+s})\) replaced by \(\mathcal {B}^{2}_{1}(\mathbb {R}^{d+s})\).
(iv) If \(m>2\) is even, then (i) holds with \(\mathcal {B}^{m}_{p}(\mathbb {R}^{d+s})\) replaced by \(\mathcal {B}^{m}_{1}(\mathbb {R}^{d+s})\) and (ii) holds with \(\mathcal {B}^{m-1}_{p}(\mathbb {R}^{d+s})\) replaced by \(\mathcal {B}^{m-2}_{1}(\mathbb {R}^{d+s})\).
(v) If \(m\) is odd, then (i) holds with \(\mathcal {B}^{m}_{p}(\mathbb {R}^{d+s})\) replaced by \(\mathcal {B}^{m-1}_{1}(\mathbb {R}^{d+s})\) and (ii) holds with \(\mathcal {B}^{m-1}_{p}(\mathbb {R}^{d+s})\) replaced by \(\mathcal {B}^{m-1}_{1}(\mathbb {R}^{d+s})\).
Proof of Lemma 18
(i) By Lemma 17 and the compactness of \(X_{t} \times Z_{t}\), we get \(J_{t} \in \mathcal {C}^{m}(X_{t} \times Z_{t}) \subset \mathcal {W}^{m}_{p}(\operatorname {int}(X_{t} \times Z_{t}))\). As \(X_{t}\) is bounded and convex, by [38, Corollary 1.2.2.3, p. 12] it has a Lipschitz boundary. By Assumption 2(i), (ii), \(X_{t} \times Z_{t}\) has a Lipschitz boundary, too. So, by Sobolev’s extension theorem [79, Theorem 5, p. 181], which holds thanks to the regularity of the boundary, \(J_{t}\) can be extended, for every \(1\leq p\leq\infty\), to the whole \(\mathbb {R}^{d+s}\), obtaining a function \(\bar{J}_{t}^{p} \in \mathcal {W}^{m}_{p}(\mathbb {R}^{d+s})\). By the equivalence between Sobolev spaces and Bessel potential spaces [79, Theorem 3, p. 135], which holds for \(1<p<\infty\), we have \(\bar{J}^{p}_{t} \in \mathcal {B}^{m}_{p}(\mathbb {R}^{d+s})\), which, when restricted to \(X_{t} \times Z_{t}\), proves (i).
(ii) By Lemma 17, \(g_{t,j} \in \mathcal {C}^{m-1}(X_{t} \times Z_{t}) \subset \mathcal {W}^{m-1}_{p}(\operatorname {int}(X_{t} \times Z_{t}))\). Then the statement follows by arguments similar to those used in the last part of the proof of (i).
(iii), (iv), and (v) As \(d+s>1\), for \(l\) even the inclusion \(\mathcal {W}^{l}_{1}(\mathbb {R}^{d+s}) \subset \mathcal {B}^{l}_{1}(\mathbb {R}^{d+s})\) holds [79, Remark 6.6(b), p. 160]. Then (iii), (iv), and (v) follow by combining this with \(\bar{J}^{1}_{t} \in \mathcal {W}^{m}_{1}(\mathbb {R}^{d+s})\) and \(\bar{g}^{1}_{t,j} \in \mathcal {W}^{m-1}_{1}(\mathbb {R}^{d+s})\). □
The next lemma, which is proven in [34, Sect. 5] and will be used in the proof of Proposition 7, states an approximation property that holds for functions belonging to the Bessel potential space \(\mathcal {B}^{r}_{1}(\mathbb {R}^{l})\), when r>l.
Lemma 19
[34, Corollary 5.2]
Let \(l\) be a positive integer and \(r>l\). Then for every \(f \in \mathcal {B}^{r}_{1}(\mathbb {R}^{l})\) and every positive integer \(n\) there exist \(t_{1}, \ldots, t_{n} \in \mathbb {R}^{l}\), \(b_{1},\ldots,b_{n}>0\), \(\delta_{1},\ldots,\delta_{n} \in \{-1,+1\}\), and an absolute constant \(C>0\) such that
where \(K(r,l):= 2^{-l/2}\,\frac{\varGamma(r/2-l/2)}{\varGamma(r/2)}\|\lambda\|_{\mathcal {L}_{1}(\mathbb {R}^{l})}\) and \(\varGamma(z) := \int_{0}^{\infty}y^{z-1}e^{-y}\,dy\) is the Gamma function.
Proof of Proposition 7
(i) Let \(d+s\) be even. By Lemma 18, there exists a function \(\bar{J}^{o,1}_{t} \in \mathcal {B}^{d+s+2}_{1}(\mathbb {R}^{d+s})\) such that \(J_{t}=\bar{J}^{o,1}_{t}|_{X_{t} \times Z_{t}}\). Since the order \(r=d+s+2\) of the Bessel potential associated with \(\mathcal {B}^{r}_{1}(\mathbb {R}^{d+s})\) is larger than the number of variables \(d+s\), we conclude the proof by Lemma 19.
For \(d+s\) odd, by Lemma 18 there exists a function \(\bar{J}^{o,1}_{t} \in \mathcal {B}^{d+s+1}_{1}(\mathbb {R}^{d+s})\) such that \(J_{t}=\bar{J}^{o,1}_{t}|_{X_{t} \times Z_{t}}\). As the order \(r=d+s+1\) of the Bessel potential associated with \(\mathcal {B}^{r}_{1}(\mathbb {R}^{d+s})\) is larger than the number of variables \(d+s\), we conclude the proof by Lemma 19.
(ii) It is proven like (i), by replacing \(m\), \(J_{t}\), and \(\bar{J}^{o,1}_{t}\) with \(m-1\), \(g_{t}\), and \(\bar{g}^{o,1}_{t}\), respectively. □
The next lemma, which is an immediate consequence of [56, Theorem 2.1], will be used in the proof of Proposition 8. Recall that in Sect. 4 we have defined
Lemma 20
Let r and l be positive integers and \(\psi \in \mathcal {S}^{\prime}\). Then for every \(f \in \mathcal {C}^{r}(\mathbb {R}^{l})\) and every positive integer n there exist \(a_{1}, \ldots, a_{n} \in \mathbb {R}^{l}\), \(b_{1}, \ldots, b_{n} \in \mathbb {R}\), \(\delta_{1}, \ldots, \delta_{n} \in \mathbb {R}\), and a constant C>0 (we emphasize that C does not depend on f) such that
Proof of Proposition 8
One proceeds as in the proof of Proposition 7, using Lemma 20 instead of Lemma 18 and noting that, for \(r\geq \lfloor l/2 \rfloor +1\), the right-hand side of (68) can be replaced by \(Cn^{-1/2}\). □
Proof of Theorem 9
One combines Proposition 7(i) with Proposition 5(i) and (ii). □
Proof of Theorem 10
Proceeding as in the proof of Lemma 17, we obtain that the \(t\)-th value function \(J_{t}\) and the \(t\)-th optimal policy \(g_{t}\) are Lipschitz continuous. Hence, for \(t=0,\ldots,N-1\) and any sequence of \(\varepsilon_{t}>0\), Proposition 6(ii) gives
for any sequence of functions \(\gamma_{t} \in \mathcal {G}(n_{t,1},\ldots,n_{t,d}; \bar{Y}_{t}, X_{t+1})\) such that \(\|g_{t} -\gamma_{t}\|_{{\mathrm{sup}}(\bar{Y}_{t})} \leq {\varepsilon}_{t}\) (t=0,…,N−1).
By Proposition 7(ii), for every \(t=0,\ldots,N-1\) and \(j=1,\ldots,d\), there exists \(C_{t,j}>0\) such that, for every positive integer \(n_{t,j}\), there exist \(\tau_{t,j,1}, \ldots,\tau_{t,j,n_{t,j}} \in \mathbb {R}^{d}\), \(b_{t,j,1}, \ldots, b_{t,j,n_{t,j}}>0\), and \(\delta_{t,j,1}, \ldots, \delta_{t,j,n_{t,j}} \in \mathbb {R}\) for which
By (16), for every continuous function \(\gamma_{t}: \bar{Y}_{t} \to \mathbb {R}^{d}\) we have
Since \(g_{t}(\bar{y}_{t}) \in X_{t+1}\) and \(\operatorname {Prj}_{\mathcal {D}_{t}(x_{t},z_{t})}\) is a nonexpansive map, from (70) and (71) we get
Hence, for every t=0,…,N−1, there exists \({\tilde{g}}_{t} \in \mathcal {G}(n_{t,1},\ldots,n_{t,d}; \bar{Y}_{t}, X_{t+1})\) such that
We conclude by taking \({\varepsilon}_{t} = \sqrt{\sum_{j=1}^{d} \frac{C_{t,j}^{2}}{n_{t,j}}}\) in (69). □
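As a small illustration of the last step (with hypothetical values for the constants \(C_{t,j}\) and the numbers \(n_{t,j}\) of basis functions), the per-stage errors \(\varepsilon_{t}=\sqrt{\sum_{j=1}^{d}C_{t,j}^{2}/n_{t,j}}\) can be evaluated as follows.

```python
import numpy as np

N, d = 4, 3
C = np.full((N, d), 1.0)                  # placeholder constants C_{t,j} from Proposition 7(ii)
n = np.full((N, d), 100)                  # numbers of basis functions n_{t,j} per component
eps = np.sqrt((C ** 2 / n).sum(axis=1))   # eps_t = sqrt(sum_j C_{t,j}^2 / n_{t,j})
print(eps)                                # each eps_t decreases as O(n_{t,j}^{-1/2})
```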
Proofs of Theorems 11 and 12
They follow along the lines of the proofs of Theorems 9 and 10, respectively, by exploiting Proposition 8 instead of Proposition 7. □
Proof of Proposition 14
We first derive some conditions on the sets \(A_{t,j}\); then we show that the budget constraints (46) are satisfied if and only if the sets \(A_{t,j}\) are chosen as in Assumption 13 (or are suitable subsets thereof).
As the labour incomes \(y_{t,j}\) and the stochastic interest rates \(r_{t,j}\) are bounded from above, for \(t=1,\ldots,N\) we have
(the upper bound is achieved when all the \(r_{t,j}\) are equal to \(r_{t,j}^{\mathrm{max}}\) and all the consumptions \(c_{t,j}\) are equal to 0), so the corresponding feasible sets \(A_{t,j}\) are bounded from above by \(a_{t,j}^{\mathrm{max}}\). Boundedness from below of each \(A_{t,j}\) follows from the budget constraints (46), which for \(c_{k,j}=0\) (\(k=t,\ldots,N\)) are equivalent to
for \(t=N\) and
i.e.,
for \(t=0,\ldots,N-1\). For \(k=t,\ldots,N-1\), one can show that the maximum value of the right-hand side in the lower bound (74) occurs when each \(r_{k,j}\) is equal to \(r_{k,j}^{\mathrm{max}}\) (indeed, by computing its partial derivatives with respect to \(r_{k,j}\) it is easy to see that they are nonnegative). So, in order to satisfy the budget constraints (46) independently of the specific sequence of stochastic interest rates, the constraints (73) and
have to be satisfied. Then the maximal sets \(A_{t}\) that satisfy the budget constraints (46) have the form described in Assumption 13. □
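The upper bounds \(a_{t,j}^{\mathrm{max}}\) can be computed by a simple forward recursion. The sketch below assumes the asset dynamics implied by the consumption term in the proof of Proposition 16, namely \(a_{t+1,j}=(1+r_{t,j})(a_{t,j}+y_{t,j}-c_{t,j})\), with zero consumptions and maximal interest rates; all numerical values are illustrative assumptions.

```python
import numpy as np

N, d = 5, 2
a0 = np.array([1.0, 2.0])               # initial assets a_{0,j}
y = np.full((N, d), 0.5)                # labour incomes y_{t,j} (assumed deterministic here)
r_max = np.full((N, d), 0.05)           # upper bounds r^max_{t,j} on the stochastic rates

a_max = np.zeros((N + 1, d))
a_max[0] = a0
for t in range(N):
    # zero consumption and maximal rates give the largest reachable assets
    a_max[t + 1] = (1.0 + r_max[t]) * (a_max[t] + y[t])
print(a_max)
```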
Proof of Proposition 16
(a) Assumption 2(i), (ii), and (iii). These follow immediately from the choices of \(A_{t}\) and \(R_{t}\), and from Assumption 15(i). By construction, the sets \(\bar{A}_{t}\) are compact, convex, and have nonempty interiors, since they are Cartesian products of nonempty closed intervals. The same holds for the sets \(\bar{D}_{t}\), since by (48) they are the intersections between \(\bar{A}_{t} \times R_{t} \times \bar{A}_{t+1}\) and the sets \(D_{t}\), which are compact, convex, and have nonempty interiors, too.
(b) Assumption 2(iv). Recall that for Problem OCUd we have
$$h_N(a_N)= u (a_N+y_N ). $$
Then, \(h_{N} \in \mathcal {C}^{m}(\bar{A}_{N})\) and, by Assumption 15(ii), it is \(\alpha_{N}\)-concave with \(\alpha_{N}=\alpha\).
(c) Assumption 2(v). Recall that for Problem OCUd and \(t=0,\ldots,N-1\), we have
$$ h_t(a_t,a_{t+1})=u \biggl(\frac{(1+r_t) \circ\,(a_t+y_t)-a_{t+1}}{1+r_t} \biggr)+\sum_{j=1}^d v_{t,j}(a_{t,j}). $$
Then, \(h_{t} \in \mathcal {C}^{m}(\bar{D}_{t})\) by Assumption 15(ii), (iii). As \(u\) and \(v_{t,j}\) are twice continuously differentiable, the second part of Assumption 2(v) means that there exists some \(\alpha_{t}>0\) such that, for each fixed \(r_{t}\), the function
$$u \biggl(\frac{(1+r_t) \circ\,(a_t+y_t)-a_{t+1}}{1+r_t} \biggr)+\sum_{j=1}^d v_{t,j}(a_{t,j})+ \frac{1}{2}\alpha_t \|a_t\|^2 $$
has negative semi-definite Hessian with respect to the variables \(a_{t}\) and \(a_{t+1}\). Assumption 15(ii) and straightforward computations show that, for fixed \(r_{t}\), the function
$$u \biggl(\frac{(1+r_t) \circ\,(a_t+y_t)-a_{t+1}}{1+r_t} \biggr) $$
has negative semi-definite Hessian (indeed, its argument is an affine function of \((a_{t},a_{t+1})\) for fixed \(r_{t}\), and the composition of a concave function with an affine map is concave). By Assumption 15(iii), for each \(j=1,\ldots,d\) and \(\alpha_{t,j} \in(0,\beta_{t,j}]\), \(v_{t,j}(a_{t,j})+ \frac{1}{2}\alpha_{t,j} a_{t,j}^{2}\) has negative semi-definite Hessian, too. So, Assumption 2(v) is satisfied for every \(\alpha_{t} \in(0,\min_{j=1,\ldots,d}\{\beta_{t,j}\}]\).
(d) Assumption 2(vi). This is Assumption 15(i), with the obvious replacements of \(X_{t}\) and \(D_{t}\). □
Cite this article
Gaggero, M., Gnecco, G. & Sanguineti, M. Approximate dynamic programming for stochastic N-stage optimization with application to optimal consumption under uncertainty. Comput Optim Appl 58, 31–85 (2014). https://doi.org/10.1007/s10589-013-9614-z