Stochastic Differential Games: A Sampling Approach via FBSDEs

Dynamic Games and Applications

Abstract

The aim of this work is to present a sampling-based algorithm designed to solve various classes of stochastic differential games. The proposed approach is founded on formulating the game solution in terms of a decoupled pair of forward and backward stochastic differential equations (FBSDEs). Using the nonlinear version of the Feynman–Kac lemma, probabilistic representations are obtained for the solutions of the nonlinear Hamilton–Jacobi–Isaacs equations that arise for each class. These representations take the form of decoupled systems of FBSDEs, which may be solved numerically.
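For orientation, the nonlinear Feynman–Kac correspondence mentioned in the abstract is commonly stated along the following lines. This is a sketch in generic notation: the symbols \(b\), \(\varSigma\), \(g\) and \(v\) are placeholders assumed here and are not confirmed by this preview, while \(h\), \(X\), \(Y\) and \(Z\) follow the notes below. Regularity and growth conditions are omitted.

```latex
% Semilinear parabolic PDE with terminal condition (generic form, assumed notation)
\[
  v_t + \tfrac{1}{2}\,\mathrm{tr}\!\big(v_{xx}\,\varSigma\varSigma^{\top}\big)
      + v_x^{\top} b(t,x)
      + h\big(t,x,v,\varSigma^{\top} v_x\big) = 0,
  \qquad v(T,x) = g(x).
\]
% Decoupled FBSDE: the forward equation does not depend on (Y, Z)
\[
  \mathrm{d}X_s = b(s,X_s)\,\mathrm{d}s + \varSigma(s,X_s)\,\mathrm{d}W_s,
  \qquad X_t = x,
\]
\[
  \mathrm{d}Y_s = -h(s,X_s,Y_s,Z_s)\,\mathrm{d}s + Z_s^{\top}\,\mathrm{d}W_s,
  \qquad Y_T = g(X_T).
\]
% Probabilistic representation of the PDE solution
\[
  Y_s = v(s,X_s), \qquad Z_s = \varSigma^{\top}(s,X_s)\, v_x(s,X_s).
\]
```

Under this sign convention, discretizing the backward equation gives \(Y_i \approx \mathbb{E}\big[\,Y_{i+1} + \varDelta t_i\, h(\cdot)\,\big|\,X_i\big]\), which agrees with the pre-expectation quantity described in Note 4.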


Notes

  1. A process \(H_s\) is called square-integrable if \( \mathbb {E}\big [\int _{t}^{T}H_s^2\mathrm {d}s\big ] < \infty \) for any \(T>t\).

  2. The Isaacs condition renders the viscosity solutions of the upper and lower value functions equal, thus making the order of maximization/minimization inconsequential.

  3. While X is a function of s and \(\omega \), we shall use \(X_s\) for notational brevity.

  4. Here, \(Y_i^m\) denotes the quantity \(Y^m_{i+1} +\varDelta t_ih(t_{i+1},X^m_{i+1},Y^m_{i+1},Z^m_{i+1})\), which is the \(Y^m_i\) sample value before the conditional expectation operator has been applied.

  5. Whenever the index m is omitted, the quantity is to be understood as the whole collection of samples over this index.
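The backward step described in Notes 4 and 5 can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the article: the function name `solve_fbsde`, the scalar driftless forward dynamics, the use of least-squares polynomial regression to approximate the conditional expectation, and the linear test driver. For each sample m it forms \(Y^m_{i+1} + \varDelta t_i\,h(t_{i+1},X^m_{i+1},Y^m_{i+1},Z^m_{i+1})\) and then projects onto functions of \(X_i\).

```python
import numpy as np


def solve_fbsde(x0, T, n_steps, n_paths, sigma, h, g, dg, degree=2, seed=0):
    """Estimate Y_0 for a scalar decoupled FBSDE (illustrative sketch only).

    Forward:  dX = sigma dW, X_0 = x0        (zero drift is an assumption)
    Backward: dY = -h(s, X, Y, Z) ds + Z dW,  Y_T = g(X_T)
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps

    # Forward pass: sample all paths at once (exact for constant sigma).
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_paths))
    X = np.empty((n_steps + 1, n_paths))
    X[0] = x0
    X[1:] = x0 + sigma * np.cumsum(dW, axis=0)

    # Terminal values: Y_T = g(X_T), Z_T = sigma * g'(X_T).
    Y = g(X[-1])
    Z = sigma * dg(X[-1])

    # Backward pass over the time grid.
    for i in range(n_steps - 1, -1, -1):
        t_next = (i + 1) * dt
        # Per-sample value before the conditional expectation is applied.
        Y_pre = Y + dt * h(t_next, X[i + 1], Y, Z)
        if i == 0:
            # All paths share X_0 = x0, so the conditional expectation
            # reduces to a plain sample mean.
            return float(Y_pre.mean())
        # Approximate E[Y_pre | X_i] by polynomial least squares (assumed choice).
        coeffs = np.polyfit(X[i], Y_pre, degree)
        Y = np.polyval(coeffs, X[i])
        # Z_i = sigma * d/dx of the fitted value function.
        Z = sigma * np.polyval(np.polyder(coeffs), X[i])


if __name__ == "__main__":
    a, sigma, T, x0 = 0.5, 1.0, 1.0, 1.0
    estimate = solve_fbsde(
        x0, T, n_steps=50, n_paths=50_000, sigma=sigma,
        h=lambda t, x, y, z: a * y,   # linear driver, chosen for checkability
        g=lambda x: x ** 2,
        dg=lambda x: 2.0 * x,
    )
    # Closed form for this test case: v(0, x0) = exp(a T) (x0^2 + sigma^2 T).
    exact = np.exp(a * T) * (x0 ** 2 + sigma ** 2 * T)
    print("estimate:", estimate, "exact:", exact)
```

The linear driver is used only because it admits a closed-form value to compare against. The drivers arising from game problems are nonlinear and depend on Z, which is why the sketch carries Z through the recursion even though this test case ignores it.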


Acknowledgements

Funding was provided by the Army Research Office (W911NF-16-1-0390) and the National Science Foundation (CMMI-1662523).

Author information

Corresponding author

Correspondence to Ioannis Exarchos.

About this article

Cite this article

Exarchos, I., Theodorou, E. & Tsiotras, P. Stochastic Differential Games: A Sampling Approach via FBSDEs. Dyn Games Appl 9, 486–505 (2019). https://doi.org/10.1007/s13235-018-0268-4

  • DOI: https://doi.org/10.1007/s13235-018-0268-4
