Research article
DOI: 10.1145/3292500.3330968

Efficient and Effective Express via Contextual Cooperative Reinforcement Learning

Published: 25 July 2019

Abstract

Express systems are widely deployed in many major cities. Couriers in an express system load parcels at a transit station and deliver them to customers; meanwhile, they also try to serve pick-up requests that arrive stochastically in real time during the delivery process. While express systems have brought much convenience and promoted the development of e-commerce, they face challenges in courier management to complete the massive number of tasks per day. To address this problem, we propose a reinforcement learning based framework to learn a courier management policy. First, we divide the city into independent regions, in each of which a constant number of couriers deliver parcels and serve requests cooperatively. Second, we propose a soft-label clustering algorithm named Balanced Delivery-Service Burden (BDSB) to dispatch parcels to couriers in each region. BDSB guarantees that each courier has an almost even delivery and expected request-service burden when departing from the transit station, giving a reasonable initialization for the subsequent online management. As pick-up requests arrive in real time, a Contextual Cooperative Reinforcement Learning (CCRL) model is proposed to guide where each courier should deliver and serve in each short period. Formulated in a multi-agent way, CCRL focuses on the cooperation among couriers while also considering the system context. Experiments on real-world data from Beijing confirm that our model outperforms competitive baselines.
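To make the balanced-dispatch objective concrete, the following is a minimal greedy sketch: each parcel (with a hypothetical workload weight) is assigned to whichever courier currently carries the lowest total burden. This is an illustration of the balance goal only, not the paper's BDSB soft-label clustering algorithm; the parcel weights and the `balanced_dispatch` helper are assumptions for the example.

```python
import heapq

def balanced_dispatch(parcels, n_couriers):
    """Greedily assign parcels to couriers so total burdens stay even.

    parcels: list of (parcel_id, workload_weight) tuples.
    Returns a dict mapping courier id -> list of assigned parcel ids.
    """
    # Min-heap of (current_burden, courier_id): the least-loaded
    # courier is always at the top.
    heap = [(0.0, c) for c in range(n_couriers)]
    heapq.heapify(heap)
    assignment = {c: [] for c in range(n_couriers)}
    # Placing heavier parcels first tightens the greedy balance.
    for pid, weight in sorted(parcels, key=lambda p: -p[1]):
        burden, c = heapq.heappop(heap)
        assignment[c].append(pid)
        heapq.heappush(heap, (burden + weight, c))
    return assignment
```

With four parcels of weights 3, 2, 2, 1 and two couriers, the sketch yields burdens of 4 and 4, mirroring the "almost even burden" initialization that BDSB provides before the online CCRL policy takes over.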




Published In

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019, 3305 pages
ISBN: 9781450362016
DOI: 10.1145/3292500

Publisher: Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. constrained clustering
    2. express system
    3. reinforcement learning


Acceptance Rates

KDD '19 paper acceptance rate: 110 of 1,200 submissions (9%).
Overall acceptance rate: 1,133 of 8,635 submissions (13%).

Article Metrics

• Downloads (last 12 months): 54
• Downloads (last 6 weeks): 5
Reflects downloads up to 09 Dec 2024.

Cited By
• (2024) DECO: Cooperative Order Dispatching for On-Demand Delivery with Real-Time Encounter Detection. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 4734-4742. DOI: 10.1145/3627673.3680084
• (2024) Prediction-Aware Adaptive Task Assignment for Spatial Crowdsourcing. IEEE Transactions on Mobile Computing 23(12): 13048-13061. DOI: 10.1109/TMC.2024.3423396
• (2024) Time-Constrained Actor-Critic Reinforcement Learning for Concurrent Order Dispatch in On-Demand Delivery. IEEE Transactions on Mobile Computing 23(8): 8175-8192. DOI: 10.1109/TMC.2023.3342815
• (2024) Auction-Based Crowdsourced First and Last Mile Logistics. IEEE Transactions on Mobile Computing 23(1): 180-193. DOI: 10.1109/TMC.2022.3219881
• (2024) An End-to-End Predict-Then-Optimize Clustering Method for Stochastic Assignment Problems. IEEE Transactions on Intelligent Transportation Systems 25(9): 12605-12620. DOI: 10.1109/TITS.2024.3385029
• (2024) A Clustering-Based Multi-Agent Reinforcement Learning Framework for Finer-Grained Taxi Dispatching. IEEE Transactions on Intelligent Transportation Systems 25(9): 11269-11281. DOI: 10.1109/TITS.2024.3370820
• (2024) Learned Unmanned Vehicle Scheduling for Large-Scale Urban Logistics. IEEE Transactions on Intelligent Transportation Systems 25(7): 7933-7944. DOI: 10.1109/TITS.2024.3351687
• (2024) C-SPPO: A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem. Chinese Journal of Aeronautics. DOI: 10.1016/j.cja.2024.09.005
• (2024) A survey on applications of reinforcement learning in spatial resource allocation. Computational Urban Science 4(1). DOI: 10.1007/s43762-024-00127-z
• (2023) Towards Equitable Assignment: Data-Driven Delivery Zone Partition at Last-mile Logistics. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 4078-4088. DOI: 10.1145/3580305.3599915
