Abstract
Recent research has found that deep learning architectures show significant improvements over traditional shallow algorithms when mining high-dimensional datasets. Because the choice of algorithm, the hyper-parameter settings, the number of hidden layers and the number of nodes within each layer must all be decided together, identifying an optimal configuration can be a lengthy process. Our work provides a framework for building deep learning architectures via a stepwise approach, together with an evaluation methodology to quickly identify poorly performing architectural configurations. Using a dataset with high dimensionality, we illustrate how different architectures perform and how one algorithm configuration can provide input for fine-tuning more complex models.
Research funded by In-MINDD, an EU FP7 project, Grant Agreement Number 304979.
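The stepwise idea in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the function `stepwise_search`, its parameters, the screening rule (keep the best half after a short run) and the toy error function `toy_error` are all hypothetical choices made for illustration. The sketch grows an architecture one hidden layer at a time, discards weak configurations after a cheap evaluation, and carries the best configuration at each depth forward as the starting point for the next, deeper model.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple

Config = Dict[str, Any]


def stepwise_search(
    evaluate: Callable[[Config, int], float],
    learning_rates: List[float],
    layer_sizes: List[int],
    max_layers: int = 3,
    quick_epochs: int = 2,
    full_epochs: int = 10,
    keep_fraction: float = 0.5,
) -> Tuple[Config, float]:
    """Grow an architecture layer by layer (illustrative only).

    `evaluate(config, epochs)` is a user-supplied callable returning a
    validation error (lower is better). It is an assumption of this sketch.
    """
    best_config: Config = {"layers": [], "lr": None}
    best_error = float("inf")
    for _depth in range(1, max_layers + 1):
        # Candidates extend the best layers found so far by one layer.
        candidates = [
            {"layers": best_config["layers"] + [n], "lr": lr}
            for n, lr in product(layer_sizes, learning_rates)
        ]
        # Cheap screening pass to spot poor configurations early.
        screened = sorted(candidates, key=lambda c: evaluate(c, quick_epochs))
        survivors = screened[: max(1, int(len(screened) * keep_fraction))]
        # Spend the full budget only on the survivors.
        depth_error, depth_best = min(
            ((evaluate(c, full_epochs), c) for c in survivors),
            key=lambda pair: pair[0],
        )
        if depth_error >= best_error:
            break  # the extra layer did not help, so stop growing
        best_config, best_error = depth_best, depth_error
    return best_config, best_error


def toy_error(config: Config, epochs: int) -> float:
    """Hypothetical stand-in for training a model and measuring its error."""
    size_penalty = abs(sum(config["layers"]) - 30) / 100
    return 1 / epochs + abs(config["lr"] - 0.1) + size_penalty


if __name__ == "__main__":
    config, error = stepwise_search(toy_error, [0.01, 0.1], [10, 20])
    print(config, error)
```

In practice `evaluate` would train a real model for the given number of epochs and return its validation error. The screening fraction and epoch budgets trade search time against the risk of discarding a configuration that learns slowly but ends up well.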
Notes
1. Parameters not learned by the algorithm but instead passed as input.
2. Ensures features with large data values do not overly impact the model.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
O'Donoghue, J., Roantree, M. (2015). A Framework for Selecting Deep Learning Hyper-parameters. In: Maneth, S. (eds) Data Science. BICOD 2015. Lecture Notes in Computer Science, vol 9147. Springer, Cham. https://doi.org/10.1007/978-3-319-20424-6_12
DOI: https://doi.org/10.1007/978-3-319-20424-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20423-9
Online ISBN: 978-3-319-20424-6
eBook Packages: Computer Science (R0)