
DOI: 10.1145/3535782.3535803

Research on Order Acceptance Strategy for Military Manufacturers Based on Semi-Markov Average Reward Reinforcement Learning

Published: 18 July 2022

Abstract

Order acceptance is an important decision-making problem for military manufacturers: it is crucial both to completing their production tasks and to improving their profits. In this paper, we first analyze the customer order patterns and production patterns of military manufacturers, and study an order acceptance decision algorithm for military product orders, which are characterized by small order sizes, many batches, strong dynamics, an emphasis on customer priority, and complex inventory requirements. Order acceptance models for traditional manufacturing enterprises mainly consider factors such as delay penalty cost, rejection cost, and production cost under static conditions. Building on these factors, we additionally take into account, under dynamic demand, the inventory cost of orders completed before their lead time and multiple customer priority levels. Combining these with the idea of revenue management, we formulate the order acceptance problem as a semi-Markov decision process (SMDP) and solve it with a model-free reinforcement learning method, the Semi-Markov Average Reward Technique (SMART). Finally, we present a detailed study of this algorithm on a military order acceptance problem in a simulation environment. We compare the average reward, order acceptance rate, and product acceptance rate of different algorithms, and verify that the order acceptance strategy model proposed in this paper outperforms traditional decision models.
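To make the method concrete, below is a minimal sketch of the SMART update rule (Das and Gosavi, Management Science, 1999) applied to a toy order acceptance loop. Everything about the environment is an illustrative assumption rather than the paper's model: the state (backlog level, customer priority), the accept/reject action set, the revenue, holding, and rejection-penalty parameters, and the simulate_step function are all hypothetical. Only the update itself follows the published algorithm: R(s,a) += alpha * (r - rho*tau + max_a' R(s',a') - R(s,a)), where tau is the sojourn time of the transition and rho is the running estimate of the average reward rate.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = reject the arriving order, 1 = accept it

def simulate_step(backlog, priority, action, capacity=5):
    """Hypothetical toy transition: returns (reward, sojourn_time, next_state)."""
    tau = random.expovariate(1.0)                 # time until the next order arrives
    if action == 1 and backlog < capacity:
        revenue = 10.0 * priority                 # assumed revenue, scaled by priority
        holding = 0.5 * backlog                   # assumed inventory/holding cost
        reward = revenue - holding
        backlog += 1
    else:                                         # rejected, or accepted at full capacity
        reward = -2.0 * priority                  # assumed rejection penalty
    backlog = max(0, backlog - int(tau))          # crude model of production progress
    return reward, tau, (backlog, random.choice((1, 2, 3)))

def smart(steps=50_000, alpha=0.05, eps=0.1):
    R = defaultdict(float)        # relative action value R(s, a)
    rho = 0.0                     # running estimate of the average reward rate
    total_reward = total_time = 0.0
    state = (0, random.choice((1, 2, 3)))
    for _ in range(steps):
        greedy = max(ACTIONS, key=lambda a: R[(state, a)])
        explore = random.random() < eps
        action = random.choice(ACTIONS) if explore else greedy
        reward, tau, next_state = simulate_step(*state, action)
        best_next = max(R[(next_state, a)] for a in ACTIONS)
        # SMART update: the sojourn time tau is charged at the rate rho.
        R[(state, action)] += alpha * (reward - rho * tau + best_next - R[(state, action)])
        if not explore:           # rho is re-estimated only on greedy transitions
            total_reward += reward
            total_time += tau
            rho = total_reward / total_time
        state = next_state
    return R, rho

if __name__ == "__main__":
    R, rho = smart()
    print(f"estimated average reward rate: {rho:.3f}")
```

Note that rho is re-estimated only on greedy (non-exploratory) transitions, the standard SMART convention, so that exploration noise does not bias the average reward rate that prices time in the update.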




Published In

MSIE '22: Proceedings of the 4th International Conference on Management Science and Industrial Engineering
April 2022
497 pages
ISBN:9781450395816
DOI:10.1145/3535782
© 2022 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MSIE 2022


Article Metrics

  • Total Citations: 0
  • Total Downloads: 52
  • Downloads (last 12 months): 20
  • Downloads (last 6 weeks): 7

Reflects downloads up to 22 Nov 2024


