Abstract
The aim of this work is to present a sampling-based algorithm designed to solve various classes of stochastic differential games. The foundation of the proposed approach lies in the formulation of the game solution in terms of a decoupled pair of forward and backward stochastic differential equations (FBSDEs). In light of the nonlinear version of the Feynman–Kac lemma, probabilistic representations of solutions to the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class are obtained. These representations are in the form of decoupled systems of FBSDEs, which may be solved numerically.
Notes
A process \(H_s\) is called square-integrable if \( \mathbb {E}\big [\int _{t}^{T}H_s^2\mathrm {d}s\big ] < \infty \) for any \(T>t\).
The Isaacs condition renders the viscosity solutions of the upper and lower value functions equal, thus making the order of maximization/minimization inconsequential.
While X is a function of s and \(\omega \), we shall use \(X_s\) for notational brevity.
Here, \(Y_i^m\) denotes the quantity \(Y^m_{i+1} +\varDelta t_ih(t_{i+1},X^m_{i+1},Y^m_{i+1},Z^m_{i+1})\), which is the \(Y^m_i\) sample value before the conditional expectation operator has been applied.
Whenever the index m is omitted, the quantity is understood to refer to the entire collection of samples over m.
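The backward step described in the notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the state is scalar, the conditional expectation is approximated by least-squares regression on a polynomial basis in \(X_i\) (one common choice for such schemes), and the helper name `backward_y_step`, the driver `h`, and the toy data are all hypothetical. Arrays without an m index hold all M samples at once.

```python
import numpy as np


def backward_y_step(x_i, x_next, y_next, z_next, t_next, dt, h, degree=2):
    """One backward step for Y on a scalar state, over all M samples at once.

    Hypothetical helper: the name, the polynomial basis and the scalar state
    are illustrative assumptions.
    """
    # Sample values before the conditional expectation, as in the note:
    # Y_{i+1}^m + dt_i * h(t_{i+1}, X_{i+1}^m, Y_{i+1}^m, Z_{i+1}^m)
    y_pre = y_next + dt * h(t_next, x_next, y_next, z_next)

    # Approximate E[ . | X_i] by least-squares regression of the M samples
    # on polynomial features of X_i (columns 1, x, x^2, ...).
    basis = np.polynomial.polynomial.polyvander(x_i, degree)
    coeffs, *_ = np.linalg.lstsq(basis, y_pre, rcond=None)

    # The fitted values play the role of Y_i^m after the conditional expectation.
    return y_pre, basis @ coeffs


# Toy usage with made-up data: M samples of a driftless forward step.
rng = np.random.default_rng(0)
M, dt = 5000, 0.01
x_i = rng.normal(size=M)
x_next = x_i + np.sqrt(dt) * rng.normal(size=M)
y_next = x_next**2                # stand-in for values at the later time step
z_next = np.zeros(M)              # Z is not used by the toy driver below
h = lambda t, x, y, z: -0.5 * y   # hypothetical driver, for illustration only

y_pre, y_i = backward_y_step(x_i, x_next, y_next, z_next, 1.0, dt, h)
```

Returning both arrays makes the distinction drawn in the note explicit: `y_pre` holds the raw per-sample values, while `y_i` holds the regressed values that depend on the samples only through \(X_i\).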
Acknowledgements
Funding was provided by Army Research Office (W911NF-16-1-0390) and National Science Foundation (CMMI-1662523).
Cite this article
Exarchos, I., Theodorou, E. & Tsiotras, P. Stochastic Differential Games: A Sampling Approach via FBSDEs. Dyn Games Appl 9, 486–505 (2019). https://doi.org/10.1007/s13235-018-0268-4
DOI: https://doi.org/10.1007/s13235-018-0268-4