A Framework for Selecting Deep Learning Hyper-parameters

  • Conference paper
Data Science (BICOD 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9147)

Abstract

Recent research has found that deep learning architectures show significant improvements over traditional shallow algorithms when mining high-dimensional datasets. However, because the choice of algorithm, the hyper-parameter settings, the number of hidden layers and the number of nodes within each layer must all be selected together, identifying an optimal configuration can be a lengthy process. Our work provides a framework for building deep learning architectures via a stepwise approach, together with an evaluation methodology to quickly identify poorly performing architectural configurations. Using a high-dimensional dataset, we illustrate how different architectures perform and how one algorithm configuration can provide input for fine-tuning more complex models.
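
To illustrate why combining these choices makes the search lengthy, the following minimal sketch enumerates every configuration in a small search space. The option names and values (`algorithm`, `learning_rate`, `hidden_layers`, `nodes_per_layer`) are hypothetical and chosen only for illustration; they are not taken from the paper, and the exhaustive enumeration shown here is not the authors' stepwise framework.

```python
from itertools import product

# Hypothetical search space: every name and value below is an assumption
# made for illustration, not a setting reported in the paper.
search_space = {
    "algorithm": ["rbm", "autoencoder"],
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_layers": [1, 2, 3],
    "nodes_per_layer": [10, 30, 100],
}


def enumerate_configs(space):
    """Return one dict per combination of the options in `space`."""
    keys = list(space)
    value_lists = [space[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


configs = enumerate_configs(search_space)

# The number of candidates is the product of the option counts:
# 2 * 3 * 3 * 3 = 54 configurations, each of which would need training.
print(len(configs))
```

Even with only two or three options per choice, the number of candidate architectures grows multiplicatively, which is the cost that a stepwise approach with early rejection of poor configurations aims to reduce.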

Research funded by In-MINDD, an EU FP7 project, Grant Agreement Number 304979.

Notes

  1. Parameters not learned by the algorithm but instead passed as input.

  2. Ensures features with large data values do not overly impact the model.

Author information

Corresponding author

Correspondence to Jim O’Donoghue.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

O’Donoghue, J., Roantree, M. (2015). A Framework for Selecting Deep Learning Hyper-parameters. In: Maneth, S. (eds) Data Science. BICOD 2015. Lecture Notes in Computer Science, vol 9147. Springer, Cham. https://doi.org/10.1007/978-3-319-20424-6_12

  • DOI: https://doi.org/10.1007/978-3-319-20424-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20423-9

  • Online ISBN: 978-3-319-20424-6

  • eBook Packages: Computer Science (R0)
