Abstract
Alleviating human suffering in disasters is one of the main objectives of humanitarian logistics. The lack of emergency rescue materials is the root cause of this suffering and must be considered when making emergency supply distribution decisions. Large-scale disasters often cause varying degrees of damage across the affected areas, leading to differences in both human suffering and the demand for emergency supplies among those areas. This paper considers a novel emergency supply distribution scenario in humanitarian logistics that takes these differences into account. In this scenario, besides economic goals such as minimizing cost, the humanitarian goal of alleviating survivors' suffering is treated as one of the main bases for emergency supply distribution decisions. We first formulate the emergency supply distribution problem as a Markov Decision Process. Then, to obtain a resource allocation policy that reduces economic cost while decreasing the suffering of survivors, we develop a Deep Q-Network-based approach for emergency supply distribution in Humanitarian Logistics (DHL). Numerical results demonstrate that DHL outperforms baseline methods while incurring lower time complexity.
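To make the MDP-plus-DQN idea above concrete, below is a minimal illustrative sketch in Python/PyTorch; it is not the paper's actual DHL implementation. It assumes a toy setting in which the state is the remaining stock plus the unmet demand of each affected area, an action ships one unit of supply to one area, and the reward penalizes a flat transport cost plus a deprivation-style suffering term. All names and parameters (N_AREAS, STOCK, the 0.5 suffering weight, network sizes) are hypothetical, and standard DQN components such as experience replay and a target network are omitted for brevity.

    import random
    import torch
    import torch.nn as nn

    N_AREAS, STOCK = 3, 12            # hypothetical problem size
    GAMMA, EPS, LR = 0.95, 0.1, 1e-3  # discount, exploration, learning rate

    class QNet(nn.Module):
        """Maps a state vector (stock + per-area unmet demand) to one
        Q-value per candidate destination area."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(1 + N_AREAS, 64), nn.ReLU(),
                nn.Linear(64, N_AREAS))
        def forward(self, s):
            return self.net(s)

    def step(stock, demand, action):
        """Toy transition: ship one unit to `action`. The reward penalizes
        a flat transport cost plus the remaining unmet demand, a crude
        proxy for survivor suffering (deprivation)."""
        demand = demand.copy()
        demand[action] = max(0, demand[action] - 1)
        cost, suffering = 1.0, 0.5 * sum(demand)
        return stock - 1, demand, -(cost + suffering)

    q = QNet()
    opt = torch.optim.Adam(q.parameters(), lr=LR)
    for episode in range(200):
        stock = STOCK
        demand = [random.randint(1, 5) for _ in range(N_AREAS)]
        while stock > 0:
            s = torch.tensor([float(stock)] + [float(d) for d in demand])
            # epsilon-greedy action selection
            a = random.randrange(N_AREAS) if random.random() < EPS \
                else int(q(s).argmax())
            stock, demand, r = step(stock, demand, a)
            s2 = torch.tensor([float(stock)] + [float(d) for d in demand])
            with torch.no_grad():                  # bootstrapped TD target
                target = r + GAMMA * q(s2).max()
            loss = (q(s)[a] - target) ** 2         # squared TD error
            opt.zero_grad(); loss.backward(); opt.step()

The sketch only shows the shape of a Q-learning training loop for a supply-allocation MDP; the paper's state, action, and reward definitions follow the authors' own formulation.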
Availability of data and material
Not applicable.
Code availability
Not applicable.
Funding
We would like to thank the editors and anonymous reviewers for their helpful comments. The work of J. Fan, X. Chang, and H. Kang was supported by the Fundamental Research Funds for the Central Universities under Grant 2020YJS042 and by the Beijing Municipal Natural Science Foundation (No. M22037). The work of J. Mišić and V. B. Mišić was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada through their respective Discovery Grants.
Ethics declarations
Conflicts of interest
Not applicable.
About this article
Cite this article
Fan, J., Chang, X., Mišić, J. et al. DHL: Deep reinforcement learning-based approach for emergency supply distribution in humanitarian logistics. Peer-to-Peer Netw. Appl. 15, 2376–2389 (2022). https://doi.org/10.1007/s12083-022-01353-0