research-article

Open access

In-sensor classification with boosted race trees

Authors:

Georgios Tzimpragos,

Advait Madhavan,

Dilip Vasudevan,

Dmitri Strukov,

Timothy Sherwood

Communications of the ACM, Volume 64, Issue 6

Pages 99 - 105

https://doi.org/10.1145/3460223

Published: 24 May 2021

Abstract

When extremely low-energy processing is required, the choice of data representation makes a tremendous difference. Each representation (e.g., frequency domain, residue coded, and log-scale) embodies a different set of tradeoffs based on the algebraic operations that are either easy or hard to perform in that domain. We demonstrate the potential of a novel form of encoding, race logic, in which information is represented as the delay in the arrival of a signal. Under this encoding, the ways in which signal delays interact and interfere with one another define the operation of the system. Observations of the relative delays (for example, the outcome of races between signals) define the output of the computation. Interestingly, completely standard hardware logic elements can be repurposed to this end and the resulting embedded systems have the potential to be extremely energy efficient. To realize this potential in a practical design, we demonstrate two different approaches to the creation of programmable tree-based ensemble classifiers in an extended set of race logic primitives; we explore the trade-offs inherent to their operation across sensor, hardware architecture, and algorithm; and we compare the resulting designs against traditional state-of-the-art hardware techniques.
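The encoding described in the abstract can be illustrated with a small software model. The sketch below is an illustration only, not the authors' hardware design: it treats each signal as an arrival time, models a race as a comparison of arrival times, and evaluates a toy two-level decision tree whose thresholds (`T1`, `T2`), helper names (`before`, `not_before`, `classify`), and leaf labels are all hypothetical. Taking the minimum and maximum of arrival times corresponds to observing the first and last arrival among a set of signals.

```python
import math

NEVER = math.inf  # models a signal that never arrives

def first_arrival(*times):
    """Earliest arrival among the inputs (a min over delays)."""
    return min(times)

def last_arrival(*times):
    """Latest arrival among the inputs (a max over delays)."""
    return max(times)

def delay(t, d):
    """Shift an arrival time by a constant delay d."""
    return t + d

def before(a, b):
    """Fires at time a if a wins the race against b, otherwise never."""
    return a if a < b else NEVER

def not_before(a, b):
    """Fires at time b if a does not arrive before b, otherwise never."""
    return b if a >= b else NEVER

# Hypothetical thresholds, expressed as reference arrival times.
T1, T2 = 3, 2

def classify(x1, x2):
    """Toy depth-2 tree: a leaf fires only if every test on its path fires."""
    leaves = {
        "A": last_arrival(before(x1, T1), before(x2, T2)),
        "B": last_arrival(before(x1, T1), not_before(x2, T2)),
        "C": not_before(x1, T1),
    }
    # Exactly one leaf has a finite firing time; that leaf is the label.
    return next(label for label, t in leaves.items() if t != NEVER)

print(classify(1, 1), classify(1, 5), classify(4, 0))
```

In this model the feature values are the arrival times themselves, so a smaller value means an earlier edge. How ties between simultaneous arrivals are resolved is a modelling assumption made here for simplicity.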

Index Terms

In-sensor classification with boosted race trees
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Reconfigurable computing
2. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators

Published In

Communications of the ACM Volume 64, Issue 6

June 2021

106 pages

ISSN:0001-0782

EISSN:1557-7317

DOI:10.1145/3467845

Editor:
Andrew A. Chien
Association for Computing Machinery, New York, NY

Copyright © 2021 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed

Funding Sources

NSF (National Science Foundation)
