The Performance of Various Optimizers in Machine Learning

© 2021 by IJETT Journal
Volume-69 Issue-7
Year of Publication : 2021
Authors : Rajendra.P, Pusuluri V.N.H, Gunavardhana Naidu.T
DOI :  10.14445/22315381/IJETT-V69I7P209

How to Cite?

Rajendra.P, Pusuluri V.N.H, Gunavardhana Naidu.T, "The Performance of Various Optimizers in Machine Learning," International Journal of Engineering Trends and Technology, vol. 69, no. 7, pp. 64-68, 2021. Crossref, https://doi.org/10.14445/22315381/IJETT-V69I7P209

The primary goal of an optimizer is to speed up training and boost the efficiency of the resulting model. Optimization methods are the engines underlying deep neural networks, enabling them to learn from data. When training a neural network, the choice of optimizer often seems shrouded in mystery, since the general literature on optimizers demands considerable mathematical background. To establish a practical selection criterion, the authors carried out a series of experiments comparing the performance of different optimizers on canonical machine learning problems, so that an optimizer can be chosen easily in practice.
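The kind of comparison the abstract describes can be sketched in a few lines of NumPy. The snippet below is not the paper's experimental setup; it is a minimal illustration, on a synthetic least-squares problem, of how plain gradient descent and Adam (one of the adaptive methods mentioned in the keywords) can be run side by side and compared by final loss. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Synthetic least-squares problem: recover true_w from noiseless data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = rng.normal(size=5)
y = X @ true_w

def loss(w):
    # Mean squared residual.
    r = X @ w - y
    return float(r @ r) / len(y)

def grad(w):
    # Gradient of the mean squared residual.
    return 2.0 * X.T @ (X @ w - y) / len(y)

def gradient_descent(steps=500, lr=0.05):
    w = np.zeros(5)
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(steps=500, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    # Standard Adam update with bias-corrected moment estimates.
    w = np.zeros(5)
    m, v = np.zeros(5), np.zeros(5)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

print(f"GD   final loss: {loss(gradient_descent()):.2e}")
print(f"Adam final loss: {loss(adam()):.2e}")
```

On a well-conditioned quadratic like this, both optimizers converge; the differences the paper is concerned with show up on harder, non-convex problems such as the image and text tasks listed in its references.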

Keywords: Optimizers, Machine Learning, Neural Network, Gradient Descent, Adaptive methods.
