Abstract
In this work, we carry out hyper-parameter tuning of a Machine Learning (ML) Recommender System (RS), called CATA++, which utilizes Artificial Neural Networks (ANNs). We tune the activation function, the weight initialization and the number of training epochs of CATA++ in order to improve both training time and recommendation performance. During the experiments, a variety of state-of-the-art activation functions have been tested: ReLU, LeakyReLU, ELU, SineReLU, GELU, Mish, Swish and Flatten-T Swish. Additionally, several weight initializers have been tested: Xavier/Glorot, Orthogonal, He and LeCun. Moreover, we ran experiments with different numbers of training epochs, ranging from 10 to 150. We have used data from CiteULike and the AMiner Citation Network. The recorded metrics (Recall, nDCG) indicate that hyper-parameter tuning can notably reduce the necessary training time, while significantly improving recommendation performance (up to +44.2% Recall).
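To make the described sweep concrete, the sketch below shows one way such an experiment could be wired up in TensorFlow/Keras (an assumption; the paper does not prescribe this code). It is not the CATA++ implementation: `build_autoencoder`, the layer sizes and the toy data are illustrative placeholders, SineReLU and Flatten-T Swish are hand-defined because they are not built into Keras, and LeakyReLU and Mish are omitted for brevity (LeakyReLU is a Keras layer rather than a string activation, and Mish would be defined analogously to the custom functions below). The paper selects configurations by Recall and nDCG on held-out recommendations; the sketch compares only reconstruction loss.

```python
# Minimal, hypothetical sketch of sweeping activation functions, weight
# initializers and epoch counts on a plain autoencoder (TensorFlow 2.x assumed).
# This is NOT the CATA++ code; names and sizes are illustrative only.
import itertools
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, initializers


def sine_relu(x, epsilon=0.0025):
    # SineReLU: identity for x > 0, epsilon * (sin(x) - cos(x)) otherwise.
    return tf.where(x > 0.0, x, epsilon * (tf.sin(x) - tf.cos(x)))


def flatten_t_swish(x, t=-0.20):
    # Flatten-T Swish: x * sigmoid(x) + T for x >= 0, the constant T otherwise.
    return tf.where(x >= 0.0, x * tf.sigmoid(x) + t, tf.zeros_like(x) + t)


ACTIVATIONS = {
    "relu": "relu",
    "elu": "elu",
    "gelu": tf.keras.activations.gelu,
    "swish": tf.keras.activations.swish,
    "sine_relu": sine_relu,
    "flatten_t_swish": flatten_t_swish,
}

INITIALIZERS = {
    "glorot": initializers.GlorotUniform,
    "he": initializers.HeNormal,
    "orthogonal": initializers.Orthogonal,
    "lecun": initializers.LecunNormal,
}


def build_autoencoder(input_dim, activation, init_cls):
    # Plain symmetric autoencoder standing in for a single CATA++ branch.
    inp = layers.Input(shape=(input_dim,))
    h = layers.Dense(200, activation=activation, kernel_initializer=init_cls())(inp)
    z = layers.Dense(50, activation=activation, kernel_initializer=init_cls())(h)
    h = layers.Dense(200, activation=activation, kernel_initializer=init_cls())(z)
    out = layers.Dense(input_dim, activation="sigmoid", kernel_initializer=init_cls())(h)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model


# Toy stand-in for item-content vectors (e.g., bag-of-words of article text).
X = np.random.rand(1000, 300).astype("float32")

results = {}
for (act_name, act), (init_name, init), epochs in itertools.product(
        ACTIVATIONS.items(), INITIALIZERS.items(), (10, 50, 150)):
    model = build_autoencoder(X.shape[1], act, init)
    model.fit(X, X, epochs=epochs, batch_size=128, verbose=0)
    results[(act_name, init_name, epochs)] = model.evaluate(X, X, verbose=0)

best = min(results, key=results.get)
print("Best (activation, initializer, epochs):", best)
```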
References
Alfarhood, M., Cheng, J.: CATA++: a collaborative dual attentive autoencoder method for recommending scientific articles. IEEE Access 8, 183633–183648 (2020)
Tang, J., et al.: ArnetMiner: extraction and mining of academic social networks. In: Proceedings of 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 990–998 (2008)
Stergiopoulos, V., Tsianaka, T., Tousidou, E.: AMiner citation-data preprocessing for recommender systems on Scientific Publications. In: Proceedings of 25th Pan-Hellenic Conference on Informatics, pp. 23–27. ACM (2021)
Stergiopoulos, V., Vassilakopoulos, M., Tousidou, E., Corral, A.: An application of ANN hyper-parameters tuning in the field of recommender systems. Technical report, Data Structuring & Engineering Laboratory, University of Thessaly, Volos, Greece (2022). https://faculty.e-ce.uth.gr/mvasilako/techrep2022.pdf
Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of 27th International Conference on Machine Learning, pp. 807–814, Haifa (2010)
Pedamonti, D.: Comparison of non-linear activation functions for deep neural networks on MNIST classification task. CoRR:1804.02763 (2018)
Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of 30th International Conference on Machine Learning (2013)
Clevert, D., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (ELUs). arXiv:1511.07289v5 and ICLR (Poster) (2016)
Rodrigues, W.: SineReLU - An Alternative to the ReLU Activation Function (2018). https://wilder-rodrigues.medium.com/sinerelu-an-alternative-to-the-relu-activation-function-e46a6199997d. Accessed 5 Mar 2022
Hendrycks, D., Gimpel, K.: Gaussian error linear units (GELUs). arXiv:1606.08415v4 (2020)
Misra, D.: Mish: a self regularized non-monotonic neural activation function. CoRR:1908.08681 (2019)
Ramachandran, P., Zoph, B., Le, Q.: Searching for activation functions. CoRR:1710.05941 (2017) and ICLR 2018 (Workshop) (2018)
Chieng, H., Wahid, N., Pauline, O., Perla, S.: Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning. Int. J. Adv. Intell. Inform. 4(2), 76–86 (2018)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, Chia Laguna Resort, Sardinia, Italy. JMLR: W&CP, vol. 9 (2010)
He, K., et al.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
Saxe, A.M., McClelland, J.L., Ganguli, S.: Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv:1312.6120 and ICLR 2014 (2014)
LeCun, Y.A., Bottou, L., Orr, G.B., Müller, K.-R.: Efficient BackProp. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. LNCS, vol. 7700, pp. 9–48. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_3
Eger, S., Youssef, P., Gurevych, I.: Is it time to swish? Comparing deep learning activation functions across NLP tasks. CoRR:1901.02671 (2019) and EMNLP 2018, pp. 4415–4424 (2018)
Kumar, S.K.: On weight initialization in deep neural networks. CoRR:1704.08863 (2017)
Acknowledgements
The work of M. Vassilakopoulos and A. Corral was funded by the MINECO research project [TIN2017-83964-R] and the Junta de Andalucia research project [P20_00809].
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Stergiopoulos, V., Vassilakopoulos, M., Tousidou, E., Corral, A. (2022). Hyper-parameters Tuning of Artificial Neural Networks: An Application in the Field of Recommender Systems. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_25
DOI: https://doi.org/10.1007/978-3-031-15743-1_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15742-4
Online ISBN: 978-3-031-15743-1