Abstract
In this paper, we present a Custom Parallel Architecture for Neural networks (CuPAN). CuPAN consists of streamlined nodes, each of which can integrate a single neuron or a group of neurons. It relies on a high-throughput, low-cost Clos on-chip interconnection network to handle inter-neuron communication efficiently. We show that the similarity between the traffic pattern of neural networks (multicast-based, multi-stage traffic) and the topological characteristics of multi-stage interconnection networks (MINs) makes neural networks naturally suited to MINs. The Clos network, one of the most important classes of MINs, provides a scalable, low-cost interconnection fabric composed of several stages of switches that connect two groups of nodes and, notably, can support multicast efficiently. Our evaluation results show that CuPAN manages the multicast-based traffic of neural networks better than the mesh-based topologies used in many parallel neural network implementations and yields lower average message latency, which directly translates to faster neural processing.
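As a rough illustration of why layer-to-layer traffic maps well onto a multi-stage network, the following sketch compares how many switches a message traverses in a three-stage Clos network with how many routers it traverses in a 2D mesh under dimension-order (XY) routing. It is not the CuPAN implementation: the three-stage configuration, the 4x4 mesh, the node numbering, and the function names are all assumptions made for this example, and the model counts hops only, ignoring contention and link latency.

```python
from itertools import product

def clos_switch_hops(src, dst):
    """Switches traversed in an assumed three-stage Clos network.

    Every source-to-destination path crosses one input-stage, one
    middle-stage, and one output-stage switch, regardless of the pair.
    """
    return 3

def mesh_router_hops(src, dst, width):
    """Routers traversed on a 2D mesh with XY routing.

    Nodes are numbered row by row (an assumption of this sketch), so
    node i sits at column i % width and row i // width.
    """
    sx, sy = src % width, src // width
    dx, dy = dst % width, dst // width
    # Manhattan distance counts links; add 1 to count routers on the path.
    return abs(sx - dx) + abs(sy - dy) + 1

def average_hops(layer_a, layer_b, hop_fn):
    """Mean hop count when every node in layer_a sends to every node in layer_b."""
    pairs = list(product(layer_a, layer_b))
    return sum(hop_fn(s, d) for s, d in pairs) / len(pairs)

# Hypothetical mapping: 16 nodes, the first 8 host one neural layer and
# the last 8 host the next layer, so each sender multicasts to 8 receivers.
layer_a = range(0, 8)
layer_b = range(8, 16)

clos_avg = average_hops(layer_a, layer_b, clos_switch_hops)
mesh_avg = average_hops(
    layer_a, layer_b, lambda s, d: mesh_router_hops(s, d, width=4)
)

print(f"Clos (3-stage) average switches per message: {clos_avg}")
print(f"Mesh (4x4, XY) average routers per message:  {mesh_avg}")
```

In the Clos case the hop count is constant for every sender and receiver pair, whereas in the mesh it grows with the distance between the nodes hosting consecutive layers. This hop-count view is only one contributor to message latency and is not a substitute for the evaluation reported in the paper.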
Acknowledgement
This work was partially supported by VINNOVA (Swedish Agency for Innovation Systems) within the CUBRIC project.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Yasoubi, A., Hojabr, R., Takshi, H., Modarressi, M., Daneshtalab, M. (2015). CuPAN – High Throughput On-chip Interconnection for Neural Networks. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9491. Springer, Cham. https://doi.org/10.1007/978-3-319-26555-1_63
DOI: https://doi.org/10.1007/978-3-319-26555-1_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26554-4
Online ISBN: 978-3-319-26555-1
eBook Packages: Computer Science, Computer Science (R0)