research-article
Open access

In-sensor classification with boosted race trees

Published: 24 May 2021

Abstract

When extremely low-energy processing is required, the choice of data representation makes a tremendous difference. Each representation (e.g., frequency domain, residue coded, and log-scale) embodies a different set of tradeoffs based on the algebraic operations that are either easy or hard to perform in that domain. We demonstrate the potential of a novel form of encoding, race logic, in which information is represented as the delay in the arrival of a signal. Under this encoding, the ways in which signal delays interact and interfere with one another define the operation of the system. Observations of the relative delays (for example, the outcome of races between signals) define the output of the computation. Interestingly, completely standard hardware logic elements can be repurposed to this end and the resulting embedded systems have the potential to be extremely energy efficient. To realize this potential in a practical design, we demonstrate two different approaches to the creation of programmable tree-based ensemble classifiers in an extended set of race logic primitives; we explore the trade-offs inherent to their operation across sensor, hardware architecture, and algorithm; and we compare the resulting designs against traditional state-of-the-art hardware techniques.
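As a rough illustration of the encoding described above, the sketch below models each signal as a single arrival time and expresses a few race-style operations on those times. It is a software analogy only, not the authors' hardware design: all function names are hypothetical, the mapping of "first arrival" to a minimum and "last arrival" to a maximum follows the usual description of race logic, and the tie-breaking rule in `inhibit` is an assumption made for this example.

```python
import math

# A signal that never arrives is modeled as an infinite arrival time.
NEVER = math.inf

def first_arrival(*times):
    """OR-style race: the output fires when the earliest input arrives."""
    return min(times)

def last_arrival(*times):
    """AND-style race: the output fires once every input has arrived."""
    return max(times)

def delay(t, d):
    """Delay element: shifts an arrival time by a constant d."""
    return t + d

def inhibit(blocker, t):
    """Pass t only if it arrives strictly before the blocker.

    Assumption: on a tie the signal is blocked.
    """
    return t if t < blocker else NEVER

def decision_node(x_time, threshold_time):
    """One tree node: the branch is chosen by the outcome of a race
    between the feature signal and a threshold signal."""
    left = inhibit(threshold_time, x_time)    # fires only if x wins
    right = inhibit(x_time, threshold_time)   # fires only if threshold wins
    return "left" if left < right else "right"

if __name__ == "__main__":
    print(first_arrival(3, 5))   # earliest of the two arrivals
    print(last_arrival(3, 5))    # latest of the two arrivals
    print(decision_node(2, 4))   # feature signal arrives before threshold
    print(decision_node(6, 4))   # threshold arrives before feature signal
```

In this model a smaller encoded value simply arrives earlier, so comparing a feature against a threshold reduces to observing which of two signals wins a race, which is the kind of observation the abstract refers to.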



Published In

Communications of the ACM, Volume 64, Issue 6
June 2021
106 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/3467845
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed
