Abstract
Artificial Intelligence systems have been in widespread use for a long time. Increasingly, problems are shifting from purely logical solutions into statistical domains, which requires machine learning algorithms to mine useful information from statistical datasets and in turn demands high-end computing. Machine learning algorithms generally use Gradient Descent as a tool to find the optimal solution of computationally expensive problems. This gave rise to optimization algorithms such as Momentum, RMSProp and Adam, which speed up convergence to the global optimum while improving learning accuracy. However, supervised machine learning models have become increasingly data intensive, raising their computational cost and putting the efficiency of these algorithms into question. In this context, a new optimization algorithm, the Tangent-Cut Optimizer (TC-Opt), is proposed, which converges faster than traditional optimization algorithms for supervised machine learning models. Furthermore, the proposed work intertwines statistical and logical decision-making into a single model, shedding light on a new heuristic approach named "Hybrid Heuristics". The proposed algorithm has been implemented on the standard Boston House Pricing dataset for linear regression and the MNIST dataset of handwritten digits (0–9) for logistic regression, and its performance has been compared with existing algorithms. Finally, the robustness and high accuracy of the proposed optimizer are demonstrated.
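For context, the baseline update rules the abstract compares against can be illustrated with a short sketch. The snippet below is a minimal illustration, using plain NumPy, synthetic data and hypothetical hyperparameter choices, of vanilla gradient descent and the Adam update for linear regression; it does not reproduce the proposed TC-Opt, whose update rule is described in the body of the paper.

```python
import numpy as np

# Illustrative baselines only (not the proposed TC-Opt): plain gradient
# descent and Adam updates for linear regression on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # 200 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)    # noisy linear target

def mse_grad(w, X, y):
    """Gradient of the mean squared error loss (1/2n)*||Xw - y||^2 w.r.t. w."""
    return X.T @ (X @ w - y) / len(y)

# Plain gradient descent with a fixed learning rate
w_gd = np.zeros(3)
for _ in range(500):
    w_gd -= 0.1 * mse_grad(w_gd, X, y)

# Adam: bias-corrected first and second moment estimates of the gradient
w_adam, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
beta1, beta2, eps, lr = 0.9, 0.999, 1e-8, 0.1  # assumed, commonly used values
for t in range(1, 501):
    g = mse_grad(w_adam, X, y)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat, v_hat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)
    w_adam -= lr * m_hat / (np.sqrt(v_hat) + eps)

print("GD weights:  ", np.round(w_gd, 3))
print("Adam weights:", np.round(w_adam, 3))
```

Both loops recover the true weights on this toy problem; the paper's experiments instead use the Boston House Pricing and MNIST datasets and compare convergence speed against these baselines.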
Data availability
No separate dataset is available for this work. All experimental data are provided in the article itself, and further information related to the work is provided in the supplementary material.
Code availability
Custom code is available. No external software was used for the experiments.
Funding
None.
Ethics declarations
Conflict of interest
All the authors of this article declare no competing interests related to this research work.
Supplementary Information
Below is the link to the electronic supplementary material.
About this article
Cite this article
Biswas, S., Nath, S., Dey, S. et al. Tangent-cut optimizer on gradient descent: an approach towards Hybrid Heuristics. Artif Intell Rev 55, 1121–1147 (2022). https://doi.org/10.1007/s10462-021-09984-0