Abstract
This study investigates the application of reinforcement learning for the adaptive tuning of neighborhood probabilities in stochastic multi-neighborhood search. The aim is to provide a more flexible and robust tuning method for heterogeneous scenarios than traditional offline tuning. We propose a novel mix of learning components for multi-neighborhood Simulated Annealing, which considers both cost- and time-effectiveness of moves. To assess the performance of our approach we employ two real-world case studies in timetabling, namely examination timetabling and sports timetabling, for which multi-neighborhood Simulated Annealing has already obtained remarkable results using offline tuning techniques. Experimental data show that our approach obtains better results than the analogous algorithm that uses state-of-the-art offline tuning on benchmarking datasets while requiring less tuning effort.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
The running time was scaled based on the ratio of performance on the different machines.
References
Adriaensen, S., et al.: Automated dynamic algorithm configuration. J. Artif. Intell. Res. 75, 1633–1699 (2022)
Alicastro, M., Ferone, D., Festa, P., Fugaro, S., Pastore, T.: A reinforcement learning iterated local search for makespan minimization in additive manufacturing machine scheduling problems. Comput. Oper. Res. 131, 105272 (2021)
Anagnostopoulos, A., Michel, L., Van Hentenryck, P., Vergados, Y.: A simulated annealing approach to the traveling tournament problem. J. Sched. 9(2), 177–193 (2006)
Bartz-Beielstein, T., et al.: Benchmarking in optimization: best practice and open issues. CoRR, arXiv abs/2007.03488 (2020)
Bellio, R., Ceschia, S., Di Gaspero, L., Schaerf, A.: Two-stage multi-neighborhood simulated annealing for uncapacitated examination timetabling. Comput. Oper. Res. 132, 105300 (2021)
Bellio, R., Ceschia, S., Di Gaspero, L., Schaerf, A., Urli, T.: Feature-based tuning of simulated annealing applied to the curriculum-based course timetabling problem. Comput. Oper. Res. 65, 83–92 (2016)
Bengio, Y., Lodi, A., Prouvost, A.: Machine learning for combinatorial optimization: a methodological tour d’horizon. Eur. J. Oper. Res. 290(2), 405–421 (2021)
Burke, E.K., Hyde, M.R., Kendall, G., Ochoa, G., Özcan, E., Woodward, J.: A classification of hyper-heuristic approaches: revisited. In: Gendreau, M., Potvin, J.Y. (eds.) Handbook of Metaheuristics. International Series in Operations Research & Management Science, vol. 272, pp. 453–477. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91086-4_14
Carter, M., Laporte, G., Lee, S.: Examination timetabling: algorithmic strategies and applications. J. Oper. Res. Soc. 74, 373–383 (1996)
Ceschia, S., Di Gaspero, L., Schaerf, A.: Educational timetabling: problems, benchmarks, and state-of-the-art results. Eur. J. Oper. Res. 308(1), 1–18 (2023)
Doerr, B., Doerr, C.: Theory of parameter control for discrete black-box optimization: provable performance gains through dynamic parameter choices. In: Doerr, B., Neumann, F. (eds.) Theory of Evolutionary Computation. NCS, pp. 271–321. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29414-4_6
Fialho, A., Da Costa, L., Schoenauer, M., Sebag, M.: Analyzing bandit-based adaptive operator selection mechanisms. Ann. Math. Artif. Intell. 60(1–2), 25–64 (2010)
Franzin, A., Stützle, T.: Revisiting simulated annealing: a component-based analysis. Comput. Oper. Res. 104, 191–206 (2019)
Glover, F., Kochenberger, G.: Handbook of Metaheuristics, vol. 57. Springer, Berlin, Heidelberg (2006). ISBN 978-1402072635
Gunawan, A., Lau, H.C., Lu, K.: Adopt: combining parameter tuning and adaptive operator ordering for solving a class of orienteering problems. Comput. Ind. Eng. 121, 82–96 (2018)
Huang, C., Li, Y., Yao, X.: A survey of automatic parameter tuning methods for metaheuristics. IEEE Trans. Evolut. Comput. 24(2), 201–216 (2020)
Hutter, F., Hoos, H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (eds.) Learning and Intelligent Optimization. LION 2011. LNCS, vol. 6683, pp. 507–523. Springer, Berlin, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25566-3_40
Kadioglu, S., Malitsky, Y., Sellmann, M., Tierney, K.: ISAC-instance-specific algorithm configuration. In: ECAI 2010, pp. 751–756. IOS Press (2010)
Kallestad, J., Hasibi, R., Hemmati, A., Sörensen, K.: A general deep reinforcement learning hyperheuristic framework for solving combinatorial optimization problems. Eur. J. Oper. Res. 309(1), 446–468 (2023)
Karimi-Mamaghan, M., Mohammadi, M., Meyer, P., Karimi-Mamaghan, A., Talbi, E.G.: Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: a state-of-the-art. Eur. J. Oper. Res. 296(2), 393–422 (2022)
Kheiri, A., Gretsista, A., Keedwell, E., Lulli, G., Epitropakis, M., Burke, E.: A hyper-heuristic approach based upon a hidden Markov model for the multi-stage nurse rostering problem. Comput. Oper. Res. 130, 105221 (2021)
KhudaBukhsh, A., Xu, L., Hoos, H., Leyton-Brown, K.: SATenstein: automatically building local search SAT solvers from components. Artif. Intell. 232, 20–42 (2016)
Lamaz-Fernandez, C., Martinez-Sykora, A., Potts, C.: Scheduling double round-robin sports tournaments. In: Proceedings of the 13th International Conference on the Practice and Theory of Automated Timetabling, vol. 2 (2021)
Li, J., Pardalos, P.M., Sun, H., Pei, J., Zhang, Y.: Iterated local search embedded adaptive neighborhood selection approach for the multi-depot vehicle routing problem with simultaneous deliveries and pickups. Expert Syst. Appl. 42(7), 3551–3561 (2015)
López-Ibáñez, M., Dubois-Lacoste, J., Cáceres, L.P., Birattari, M., Stützle, T.: The irace package: iterated racing for automatic algorithm configuration. Oper. Res. Perspect. 3, 43–58 (2016)
Lü, Z., Hao, J.K.: Adaptive neighborhood search for nurse rostering. Eur. J. Oper. Res. 218(3), 865–876 (2012)
Mara, S., Norcahyo, R., Jodiawan, P., Lusiantoro, L., Rifai, A.P.: A survey of adaptive large neighborhood search algorithms and applications. Comput. Oper. Res. 146(105903) (2022)
Maron, O., Moore, A.: The racing algorithm: model selection for lazy learners. Artif. Intell. Rev. 11(1), 193–225 (1997)
Mischek, F., Musliu, N.: Reinforcement learning for cross-domain hyper-heuristics. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence IJCAI-22, pp. 4793–4799 (2022)
Mosadegh, H., Ghomi, S., Süer, G.: Stochastic mixed-model assembly line sequencing problem: mathematical modeling and Q-learning based simulated annealing hyper-heuristics. Eur. J. Oper. Res. 282(2), 530–544 (2020)
Pisinger, D., Ropke, S.: Large neighborhood search. In: Gendreau, M., Potvin, J.Y. (eds.) Handbook of Metaheuristics. International Series in Operations Research & Management Science, vol. 272, pp. 99–127. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91086-4_4
Rosati, R.M., Petris, M., Di Gaspero, L., Schaerf, A.: Multi-neighborhood simulated annealing for the sports timetabling competition ITC2021. J. Sched. 25(3), 301–319 (2022)
dos Santos, J., de Melo, J., Neto, A., Aloise, D.: Reactive search strategies using reinforcement learning, local search algorithms and variable neighborhood search. Expert Syst. Appl. 41(10), 4939–4949 (2014)
Schneider, M., Hoos, H.: Quantifying homogeneity of instance sets for algorithm configuration. In: Hamadi, Y., Schoenauer, M. (eds.) Learning and Intelligent Optimization. LION 2012. LNCS, vol. 7219, pp. 190–204. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34413-8_14
Shahmardan, A., Sajadieh, M.: Truck scheduling in a multi-door cross-docking center with partial unloading-reinforcement learning-based simulated annealing approaches. Comput. Ind. Eng. 139, 106134 (2020)
Smith-Miles, K., Baatar, D., Wreford, B., Lewis, R.: Towards objective measures of algorithm performance across instance space. Comput. Oper. Res. 45, 12–24 (2014)
Song, H., Triguero, I., Özcan, E.: A review on the self and dual interactions between machine learning and optimisation. Progress Artif. Intell. 8(2), 143–165 (2019)
Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Talbi, E.G.: Machine learning into metaheuristics: a survey and taxonomy. ACM Comput. Surv. (CSUR) 54(6), 1–32 (2021)
Toffolo, T.A., Christiaens, J., Van Malderen, S., Wauters, T., Vanden Berghe, G.: Stochastic local search with learning automaton for the swap-body vehicle routing problem. Comput. Oper. Res. 89, 68–81 (2018)
Van Bulck, D., Goossens, D.: The international timetabling competition on sports timetabling (ITC2021). Eur. J. Oper. Res. 308(3), 1249–1267 (2023)
Acknowledgments
This research has been funded by the Italian Ministry of University and Research under the action PRIN 2020, project “Models and algorithms for the optimization of integrated healthcare management”.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ceschia, S., Di Gaspero, L., Rosati, R.M., Schaerf, A. (2024). Reinforcement Learning for Multi-Neighborhood Local Search in Combinatorial Optimization. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14506. Springer, Cham. https://doi.org/10.1007/978-3-031-53966-4_16
Download citation
DOI: https://doi.org/10.1007/978-3-031-53966-4_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53965-7
Online ISBN: 978-3-031-53966-4
eBook Packages: Computer ScienceComputer Science (R0)