Abstract
The asset management of an insurance company is more complex than traditional portfolio management due to the presence of obligations that the insurance company must fulfill toward its clients. These obligations, commonly referred to as liabilities, are payments whose magnitude and timing depend both on the insurance contracts signed with the clients and on portfolio performance.
In particular, while clients must be refunded in case of adverse events, such as car accidents or death, they also contribute to a common financial portfolio to earn annual returns. Customer withdrawals might increase whenever these returns are too low, and, in the presence of an annual minimum guarantee, the company might have to cover the difference. Hence, in this context, any investment strategy cannot ignore the interdependence between financial assets and liabilities.
To deal with this problem, we present a stochastic model that combines portfolio returns with the liabilities generated by the insurance products offered by the company. Furthermore, we propose a risk-adjusted optimization problem to maximize the capital of the company over a pre-determined time horizon.
Since traditional financial tools are inadequate for such a setting, we develop the model as a Markov Decision Process. In this way, we can use Reinforcement Learning algorithms to solve the underlying optimization problem. Finally, we provide experiments showing how the optimal asset allocation can be found by training an agent with the Deep Deterministic Policy Gradient (DDPG) algorithm.
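To make the formulation concrete, the sketch below casts asset-liability management as a Markov Decision Process in the style described above. All dynamics here (Gaussian asset returns, a single client reserve credited at a minimum guaranteed rate, a variance-penalised surplus reward) are illustrative assumptions for exposition, not the model from the paper.

```python
import numpy as np

class InsuranceALMEnv:
    """Toy MDP for insurance asset-liability management.

    State:  (company capital, client liability, remaining horizon).
    Action: portfolio weights over the assets.
    Reward: risk-adjusted surplus (capital minus liability, minus a
            variance penalty). All parameters are hypothetical.
    """

    def __init__(self, n_assets=3, horizon=10, guarantee=0.01, seed=0):
        self.n_assets = n_assets
        self.horizon = horizon
        self.guarantee = guarantee  # annual minimum guaranteed return
        self.rng = np.random.default_rng(seed)
        # illustrative drift and volatility per asset
        self.mu = np.linspace(0.01, 0.05, n_assets)
        self.sigma = np.linspace(0.02, 0.15, n_assets)
        self.reset()

    def reset(self):
        self.t = 0
        self.capital = 1.0    # company assets
        self.liability = 0.8  # reserve owed to clients
        return self._state()

    def _state(self):
        return np.array([self.capital, self.liability, self.horizon - self.t])

    def step(self, action):
        # normalise the action into non-negative portfolio weights
        w = np.clip(np.asarray(action, dtype=float), 0.0, None)
        w = w / w.sum() if w.sum() > 0 else np.full(self.n_assets, 1.0 / self.n_assets)
        returns = self.rng.normal(self.mu, self.sigma)
        portfolio_return = float(w @ returns)
        # minimum guarantee: the company credits the clients at least the
        # guaranteed rate, topping up the reserve when returns fall short
        credited = max(portfolio_return, self.guarantee)
        self.capital *= 1.0 + portfolio_return
        self.liability *= 1.0 + credited
        self.t += 1
        done = self.t >= self.horizon
        # risk-adjusted reward: surplus penalised by portfolio variance
        surplus = self.capital - self.liability
        reward = surplus - 0.5 * float(w @ (self.sigma ** 2 * w))
        return self._state(), reward, done
```

An environment with this interface can be plugged into any continuous-action agent, such as a DDPG implementation, with the actor network outputting the allocation weights.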
Acknowledgements
The research was conducted under a cooperative agreement between ISI Foundation, Intesa Sanpaolo Innovation Center, and Intesa Sanpaolo Vita. The authors would like to thank Lauretta Filangieri, Antonino Galatà, Giuseppe Loforese, Pietro Materozzi and Luigi Ruggerone for their useful comments.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Abrate, C. et al. (2021). Continuous-Action Reinforcement Learning for Portfolio Allocation of a Life Insurance Company. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science(), vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_15