Abstract
In this paper, we study the use of reinforcement learning in adaptive operator selection within the Iterated Local Search metaheuristic for solving the well-known NP-Hard Traveling Salesman Problem. This metaheuristic basically employs single local search and perturbation operators for finding the (near-) optimal solution. In this paper, by incorporating multiple local search and perturbation operators, we explore the use of reinforcement learning, and more specifically Q-learning as a machine learning technique, to intelligently select the most appropriate search operator(s) at each stage of the search process. The Q-learning is separately used for both local search operator selection and perturbation operator selection. The performance of the proposed algorithms is tested through a comparative analysis against a set of benchmark algorithms. Finally, we show that intelligently selecting the search operators not only provides better solutions with lower optimality gaps but also accelerates the convergence of the algorithms toward promising solutions.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Ahmadi, E., Goldengorin, B., Süer, G.A., Mosadegh, H.: A hybrid method of 2-TSP and novel learning-based GA for job sequencing and tool switching problem. Appl. Soft Comput. 65, 214–229 (2018)
Bengio, Y., Lodi, A., Prouvost, A.: Machine learning for combinatorial optimization: a methodological tour d’horizon. Eur. J. Oper. Res. 290(2), 405–421 (2021)
Burke, E.K., Hyde, M.R., Kendall, G., Ochoa, G., Özcan, E., Woodward, J.R.: A classification of hyper-heuristic approaches: revisited. In: Gendreau, M., Potvin, J.-Y. (eds.) Handbook of Metaheuristics. ISORMS, vol. 272, pp. 453–477. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91086-4_14
Calvet, L., de Armas, J., Masip, D., Juan, A.A.: Learnheuristics: hybridizing metaheuristics with machine learning for optimization with dynamic inputs. Open Math. 15(1), 261–280 (2017)
El Krari, M., El Benani, B., et al.: Breakout local search for the travelling salesman problem. Comput. Inform. 37(3), 656–672 (2018)
Fialho, Á.: Adaptive operator selection for optimization. Ph.D. thesis, Université Paris Sud - Paris XI (2010)
Karimi-Mamaghan, M., Mohammadi, M., Jula, P., Pirayesh, A., Ahmadi, H.: A learning-based metaheuristic for a multi-objective agile inspection planning model under uncertainty. Eur. J. Oper. Res. 285(2), 513–537 (2020)
Karimi-Mamaghan, M., Mohammadi, M., Pasdeloup, B., Billot, R., Meyer, P.: An online learning-based metaheuristic for solving combinatorial optimization problems. In: 21ème congrès annuel de la société Française de Recherche Opérationnelle et d’Aide à la Décision (ROADEF) (2020)
Karimi-Mamaghan, M., Mohammadi, M., Pirayesh, A., Karimi-Mamaghan, A.M., Irani, H.: Hub-and-spoke network design under congestion: a learning based metaheuristic. Transp. Res. Part E: Logist. Transp. Rev. 142, 102069 (2020)
Karimi-Mamaghan, M., Mohammadi, M., Meyer, P., Karimi-Mamaghan, A.M., Talbi, E.G.: Machine Learning at the service of Meta-heuristics for solving Combinatorial Optimization Problems: A state-of-the-art. Eur. J. Oper. Res. (2021)
Karp, R.M.: On the computational complexity of combinatorial problems. Networks 5(1), 45–68 (1975)
Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search. In: Glover, F., Kochenberger, G.A. (eds.) Handbook of Metaheuristics. ISOR, vol. 57, pp. 320–353. Springer, Heidelberg (2003). https://doi.org/10.1007/0-306-48056-5_11
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
Mohammadi, M., Jula, P., Tavakkoli-Moghaddam, R.: Reliable single-allocation hub location problem with disruptions. Transp. Res. Part E: Logist. Transp. Rev. 123, 90–120 (2019)
Mohammadi, M., Tavakkoli-Moghaddam, R., Siadat, A., Dantan, J.Y.: Design of a reliable logistics network with hub disruption under uncertainty. Appl. Math. Model. 40(9–10), 5621–5642 (2016)
Mosadegh, H., Ghomi, S.F., Süer, G.A.: Stochastic mixed-model assembly line sequencing problem: mathematical modeling and q-learning based simulated annealing hyper-heuristics. Eur. J. Oper. Res. 282(2), 530–544 (2020)
Pasdeloup, B., Karimi-Mamaghan, M., Mohammadi, M., Meyer, P.: Autoencoder-based generation of individuals in population-based metaheuristics. In: ROADEF 2020: 21ème Congrès Annuel de la Société Française de Recherche Opérationnelle et d’Aide à la Décision (2020)
Peng, B., Zhang, Y., Gajpal, Y., Chen, X.: A memetic algorithm for the green vehicle routing problem. Sustainability 11(21), 6055 (2019)
Ridge, E., Kudenko, D.: Tuning an algorithm using design of experiments. In: Bartz-Beielstein, T., Chiarandini, M., Paquete, L., Preuss, M. (eds.) Experimental Methods for the Analysis of Optimization Algorithms, pp. 265–286. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-02538-9_11
Sakurai, Y., Takada, K., Kawabe, T., Tsuruta, S.: A method to control parameters of evolutionary algorithms by using reinforcement learning. In: 2010 Sixth International Conference on Signal-Image Technology and Internet Based Systems, pp. 74–79. IEEE (2010)
dos Santos, J.P.Q., de Melo, J.D., Neto, A.D.D., Aloise, D.: Reactive search strategies using reinforcement learning, local search algorithms and variable neighborhood search. Expert Syst. Appl. 41(10), 4939–4949 (2014)
Song, H., Triguero, I., Özcan, E.: A review on the self and dual interactions between machine learning and optimisation. Progr. Artif. Intell. 8(2), 143–165 (2019). https://doi.org/10.1007/s13748-019-00185-z
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Talbi, E.G.: Metaheuristics: From Design to Implementation, vol. 74. Wiley, Hoboken (2009)
Talbi, E.G.: Combining metaheuristics with mathematical programming, constraint programming and machine learning. Ann. Oper. Res. 240(1), 171–215 (2016). https://doi.org/10.1007/s10479-015-2034-y
Turkeš, R., Sörensen, K., Hvattum, L.M.: Meta-analysis of metaheuristics: quantifying the effect of adaptiveness in adaptive large neighborhood search. Eur. J. Oper. Res. 292(2), 423–442 (2021)
Watkins, C.J.C.H.: Learning from delayed rewards (1989)
Wauters, T., Verbeeck, K., De Causmaecker, P., Berghe, G.V.: Boosting metaheuristic search using reinforcement learning. In: Talbi, E.G. (ed.) Hybrid Metaheuristics. Studies in Computational Intelligence, vol. 434, pp. 433–452. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-30671-6_17
Zhalechian, M., Tavakkoli-Moghaddam, R., Zahiri, B., Mohammadi, M.: Sustainable design of a closed-loop location-routing-inventory supply chain network under mixed uncertainty. Transp. Res. Part E: Logist. Transp. Rev. 89, 182–214 (2016)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Karimi-Mamaghan, M., Pasdeloup, B., Mohammadi, M., Meyer, P. (2021). A Learning-Based Iterated Local Search Algorithm for Solving the Traveling Salesman Problem. In: Dorronsoro, B., Amodeo, L., Pavone, M., Ruiz, P. (eds) Optimization and Learning. OLA 2021. Communications in Computer and Information Science, vol 1443. Springer, Cham. https://doi.org/10.1007/978-3-030-85672-4_4
Download citation
DOI: https://doi.org/10.1007/978-3-030-85672-4_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85671-7
Online ISBN: 978-3-030-85672-4
eBook Packages: Computer ScienceComputer Science (R0)