Abstract
Portfolio selection, an important topic in the finance community, has attracted increasing attention from artificial intelligence practitioners. Recently, the reinforcement learning (RL) paradigm, with its self-learning and model-free properties, has provided a promising candidate for solving complex portfolio selection tasks. Traditional research on RL-based portfolio selection focuses on batch-mode stationary problems, where all the market data is assumed to be available for a one-time training process. Real-world financial markets, however, are dynamic: streaming data increments carrying new patterns keep emerging continually. In this paper, we address the continual portfolio selection problem in such dynamic environments. We propose an incremental RL approach with a two-step solution for efficiently adjusting the existing portfolio policy when the market changes with the arrival of a new data increment. The first step, policy relaxation, forces the agent to execute a relaxed policy, encouraging sufficient exploration of the new market. The second step, importance weighting, emphasizes learning samples that carry more new information, driving the existing portfolio policy to adapt more rapidly to the new market. Evaluation results on real-world portfolio tasks verify the effectiveness and superiority of our method for continual portfolio selection in dynamic environments.
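To make the two-step idea concrete, the sketch below shows one plausible realization in Python: policy relaxation is rendered as temperature softening of a softmax portfolio policy, and importance weighting as exponential up-weighting of samples by a novelty score. This is a minimal sketch of the idea, not the paper's actual implementation; all names (`relax_policy`, `tau`, `beta`, the novelty scores) are illustrative assumptions.

```python
import numpy as np

def relax_policy(logits, tau=5.0):
    """Step 1 (policy relaxation): soften the learned softmax policy with a
    high temperature `tau`, pushing portfolio weights toward uniform so the
    agent explores the changed market sufficiently."""
    z = logits / tau
    z -= z.max()                       # numerical stability
    p = np.exp(z)
    return p / p.sum()

def importance_weights(novelty, beta=1.0):
    """Step 2 (importance weighting): up-weight learning samples whose
    `novelty` score suggests they carry more new market information."""
    w = np.exp(beta * np.asarray(novelty, dtype=float))
    return w / w.sum()

def adapt_step(theta, sample_grads, weights, lr=1e-2):
    """One importance-weighted policy-gradient step: novel samples pull the
    existing policy parameters `theta` harder toward the new market."""
    g = sum(w * g_i for w, g_i in zip(weights, sample_grads))
    return theta + lr * g

# Toy usage: relax a 4-asset policy, then adapt it on 3 replayed samples.
logits = np.array([2.0, 0.5, -1.0, 0.1])
print(relax_policy(logits))                   # flatter than softmax(logits)
w = importance_weights([0.9, 0.1, 0.4])       # novel sample dominates
grads = [np.ones(4), np.zeros(4), 0.5 * np.ones(4)]
print(adapt_step(logits, grads, w))
```

For large `tau`, the relaxed policy approaches a uniform allocation, which is one simple way to encode "sufficient exploration" before the weighted updates specialize the policy to the new market regime.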
Notes
Poloniex’s official API: https://poloniex.com/support/api/.
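Since the experiments draw on Poloniex market data, the following is a minimal sketch of retrieving candle data through the exchange's legacy public `returnChartData` endpoint. The endpoint and parameters reflect the public API as it existed around the time of publication and may since have been deprecated or changed; treat this as an illustration of the data source, not current API documentation.

```python
import requests

def fetch_chart_data(pair="BTC_ETH", period=1800, start=0, end=9_999_999_999):
    """Fetch OHLCV candles from Poloniex's legacy public API.
    `period` is the candle width in seconds (1800 = 30 minutes).
    NOTE: this legacy endpoint may no longer be available."""
    params = {
        "command": "returnChartData",
        "currencyPair": pair,
        "period": period,
        "start": start,
        "end": end,
    }
    resp = requests.get("https://poloniex.com/public", params=params, timeout=10)
    resp.raise_for_status()
    # Each entry is a dict with date, open, high, low, close, volume, ...
    return resp.json()
```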
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant no. 62006111, and the Natural Science Foundation of Jiangsu Province of China under Grant BK20200330.
About this article
Cite this article
Liu, S., Wang, B., Li, H. et al. Continual portfolio selection in dynamic environments via incremental reinforcement learning. Int. J. Mach. Learn. & Cyber. 14, 269–279 (2023). https://doi.org/10.1007/s13042-022-01639-y