Research article
DOI: 10.1145/3292500.3330968

Efficient and Effective Express via Contextual Cooperative Reinforcement Learning

Published: 25 July 2019

Abstract

Express systems are widely deployed in many major cities. Couriers in an express system load parcels at a transit station and deliver them to customers; meanwhile, they also try to serve pick-up requests that arrive stochastically in real time during the delivery process. While express systems have brought much convenience and promoted the development of e-commerce, they face challenges in courier management to complete the massive number of tasks per day. To address this problem, we propose a reinforcement learning based framework to learn a courier management policy. First, we divide the city into independent regions, in each of which a constant number of couriers deliver parcels and serve requests cooperatively. Second, we propose a soft-label clustering algorithm named Balanced Delivery-Service Burden (BDSB) to dispatch parcels to couriers in each region. BDSB guarantees that each courier has an almost even delivery and expected request-service burden when departing from the transit station, giving a reasonable initialization for the subsequent online management. As pick-up requests arrive in real time, a Contextual Cooperative Reinforcement Learning (CCRL) model is proposed to guide where each courier should deliver and serve in each short period. Formulated in a multi-agent way, CCRL focuses on the cooperation among couriers while also considering the system context. Experiments on real-world data from Beijing confirm that our model outperforms competitive baselines.
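To make the balanced-dispatch objective concrete, the following is a minimal greedy sketch: each parcel (with a hypothetical workload weight) is assigned to whichever courier currently carries the lowest total burden. This is an illustration of the balance goal only, not the paper's BDSB soft-label clustering algorithm; the parcel weights and the `balanced_dispatch` helper are assumptions for the example.

```python
import heapq

def balanced_dispatch(parcels, n_couriers):
    """Greedily assign parcels to couriers so total burdens stay even.

    parcels: list of (parcel_id, workload_weight) tuples.
    Returns a dict mapping courier id -> list of assigned parcel ids.
    """
    # Min-heap of (current_burden, courier_id): the least-loaded
    # courier is always at the top.
    heap = [(0.0, c) for c in range(n_couriers)]
    heapq.heapify(heap)
    assignment = {c: [] for c in range(n_couriers)}
    # Placing heavier parcels first tightens the greedy balance.
    for pid, weight in sorted(parcels, key=lambda p: -p[1]):
        burden, c = heapq.heappop(heap)
        assignment[c].append(pid)
        heapq.heappush(heap, (burden + weight, c))
    return assignment
```

With four parcels of weights 3, 2, 2, 1 and two couriers, the sketch yields burdens of 4 and 4, mirroring the "almost even burden" initialization that BDSB provides before the online CCRL policy takes over.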




Published In

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019, 3305 pages
ISBN: 9781450362016
DOI: 10.1145/3292500

Publisher: Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. constrained clustering
    2. express system
    3. reinforcement learning


Acceptance Rates

KDD '19 paper acceptance rate: 110 of 1,200 submissions (9%).
Overall acceptance rate: 1,133 of 8,635 submissions (13%).

Article Metrics

• Downloads (last 12 months): 54
• Downloads (last 6 weeks): 5
Reflects downloads up to 09 Dec 2024.

Cited By
• (2024) DECO: Cooperative Order Dispatching for On-Demand Delivery with Real-Time Encounter Detection. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 4734-4742. DOI: 10.1145/3627673.3680084
• (2024) Prediction-Aware Adaptive Task Assignment for Spatial Crowdsourcing. IEEE Transactions on Mobile Computing 23(12): 13048-13061. DOI: 10.1109/TMC.2024.3423396
• (2024) Time-Constrained Actor-Critic Reinforcement Learning for Concurrent Order Dispatch in On-Demand Delivery. IEEE Transactions on Mobile Computing 23(8): 8175-8192. DOI: 10.1109/TMC.2023.3342815
• (2024) Auction-Based Crowdsourced First and Last Mile Logistics. IEEE Transactions on Mobile Computing 23(1): 180-193. DOI: 10.1109/TMC.2022.3219881
• (2024) An End-to-End Predict-Then-Optimize Clustering Method for Stochastic Assignment Problems. IEEE Transactions on Intelligent Transportation Systems 25(9): 12605-12620. DOI: 10.1109/TITS.2024.3385029
• (2024) A Clustering-Based Multi-Agent Reinforcement Learning Framework for Finer-Grained Taxi Dispatching. IEEE Transactions on Intelligent Transportation Systems 25(9): 11269-11281. DOI: 10.1109/TITS.2024.3370820
• (2024) Learned Unmanned Vehicle Scheduling for Large-Scale Urban Logistics. IEEE Transactions on Intelligent Transportation Systems 25(7): 7933-7944. DOI: 10.1109/TITS.2024.3351687
• (2024) C-SPPO: A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem. Chinese Journal of Aeronautics. DOI: 10.1016/j.cja.2024.09.005
• (2024) A survey on applications of reinforcement learning in spatial resource allocation. Computational Urban Science 4(1). DOI: 10.1007/s43762-024-00127-z
• (2023) Towards Equitable Assignment: Data-Driven Delivery Zone Partition at Last-mile Logistics. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 4078-4088. DOI: 10.1145/3580305.3599915
