research-article

ORSuite: Benchmarking Suite for Sequential Operations Models

Published: 20 January 2022

Abstract

Reinforcement learning (RL) has received widespread attention across multiple communities, but experiments have focused primarily on large-scale game playing and robotics tasks. In this paper we introduce ORSuite, an open-source library containing environments, algorithms, and instrumentation for operational problems. Our package is designed to motivate researchers in the reinforcement learning community to develop and evaluate algorithms on operational tasks, and to account for the true multi-objective nature of these problems by considering metrics beyond cumulative reward.
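
As a rough illustration of what evaluating an algorithm on "metrics beyond cumulative reward" might look like, the sketch below runs a simple policy on a toy operational environment and records wall-clock time and peak memory alongside reward. Everything in it is an assumption made for illustration: `ToyInventoryEnv`, `base_stock_policy`, and `evaluate` are hypothetical names, the Gym-style `reset`/`step` interface is assumed, and the choice of time and memory as extra metrics is ours. It does not use or reproduce the ORSuite API.

```python
import random
import time
import tracemalloc


class ToyInventoryEnv:
    """Hypothetical Gym-style inventory environment (not part of ORSuite)."""

    def __init__(self, horizon=5, capacity=10, seed=0):
        self.horizon = horizon
        self.capacity = capacity
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.stock = 0
        return self.stock

    def step(self, order):
        # Restock up to capacity, then serve a random demand in [0, 5].
        self.stock = min(self.capacity, self.stock + order)
        demand = self.rng.randint(0, 5)
        sold = min(self.stock, demand)
        self.stock -= sold
        self.t += 1
        reward = sold - 0.1 * self.stock  # units sold minus a holding cost
        done = self.t >= self.horizon
        return self.stock, reward, done, {}


def base_stock_policy(stock, target=4):
    """Order enough to bring the stock back up to a fixed target level."""
    return max(0, target - stock)


def evaluate(env, policy, episodes=20):
    """Report mean episode reward plus two assumed extra metrics: time and memory."""
    tracemalloc.start()
    start = time.perf_counter()
    total_reward = 0.0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            state, reward, done, _ = env.step(policy(state))
            total_reward += reward
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "mean_reward": total_reward / episodes,
        "seconds": elapsed,
        "peak_bytes": peak,
    }


metrics = evaluate(ToyInventoryEnv(), base_stock_policy)
print(metrics)
```

Reporting the three numbers together makes the trade-off visible: two algorithms with similar reward may differ substantially in computation time or memory footprint, which can matter when they are deployed on operational tasks.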

Published In

ACM SIGMETRICS Performance Evaluation Review, Volume 49, Issue 2
September 2021
73 pages
ISSN: 0163-5999
DOI: 10.1145/3512798
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States
