Abstract
Regression on sequential data (RSD) is widely used across disciplines. This paper proposes DeepRSD, an end-to-end deep learning method that combines several neural network components to address RSD problems. While many variants of deep recurrent neural networks (RNNs) have been developed for classification problems, the main functional part of DeepRSD is a stack of bidirectional RNNs, which we argue is the deep RNN architecture best suited to sequential data. We explore several conditions that ensure stable training of DeepRSD and, more importantly, propose an alternative dropout scheme to improve its generalization. We apply DeepRSD to two real-world problems and achieve state-of-the-art performance. Through comparisons with state-of-the-art methods, we conclude that DeepRSD is a competitive method for RSD problems.
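As a rough illustration of the architecture described above, the sketch below builds a small stacked bidirectional RNN that regresses a target at every time step. It is a minimal PyTorch sketch under assumed hyperparameters (LSTM cells, two layers, 64 hidden units, standard between-layer dropout); it is not the authors' implementation of DeepRSD or of its alternative dropout scheme.

```python
# Minimal sketch of a stacked bidirectional RNN regressor (PyTorch).
# Layer sizes, LSTM cells, and the per-step linear head are illustrative
# assumptions, not the authors' exact DeepRSD configuration.
import torch
import torch.nn as nn

class StackedBiRNNRegressor(nn.Module):
    def __init__(self, input_size, hidden_size=64, num_layers=2, dropout=0.3):
        super().__init__()
        # Stacked bidirectional recurrent layers: the core component of DeepRSD.
        # `dropout` here is the standard between-layer dropout of nn.LSTM;
        # the paper proposes an alternative dropout scheme instead.
        self.rnn = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            bidirectional=True,
            dropout=dropout,
        )
        # Map each time step's concatenated forward/backward state to a scalar.
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, x):
        # x: (batch, time, features) -> per-step predictions (batch, time)
        outputs, _ = self.rnn(x)
        return self.head(outputs).squeeze(-1)

# Usage example: regress a value at every step of a 24-step sequence.
model = StackedBiRNNRegressor(input_size=10)
x = torch.randn(8, 24, 10)
y_hat = model(x)                               # shape: (8, 24)
loss = nn.MSELoss()(y_hat, torch.randn(8, 24))
loss.backward()
```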
X. Wang: Work done at the University of Wollongong. The author is now with Bitmain Technologies Inc.
Notes
- 2. https://www.npowerjobs.com/graduates/forecasting-challenge. Data are publicly available; competition results are also published on this webpage.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, X., Zhang, M., Ren, F. (2018). DeepRSD: A Deep Regression Method for Sequential Data. In: Geng, X., Kang, B.H. (eds.) PRICAI 2018: Trends in Artificial Intelligence. Lecture Notes in Computer Science, vol. 11012. Springer, Cham. https://doi.org/10.1007/978-3-319-97304-3_9
DOI: https://doi.org/10.1007/978-3-319-97304-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97303-6
Online ISBN: 978-3-319-97304-3