DOI: 10.1145/3470496.3527396

Rethinking programmable earable processors

Published: 11 June 2022

Abstract

Earables such as earphones [15, 16, 73], hearing aids [28], and smart glasses [2, 14] are poised to be a prominent programmable computing platform in the future. In this paper, we ask the question: what kind of programmable hardware would be needed to support earable computing in the future? To understand hardware requirements, we propose EarBench, a suite of representative emerging earable applications with diverse sensor-based inputs and computation requirements. Our analysis of EarBench applications shows that, on average, there is a 13.54× to 3.97× performance gap between the computational needs of EarBench applications and the performance of the microprocessors that several of today's programmable earable SoCs are based on; more complex microprocessors have unacceptable energy efficiency for earable applications. Our analysis also shows that EarBench applications are dominated by a small number of digital signal processing (DSP) and machine learning (ML)-based kernels that have significant computational similarity. We propose SpEaC, a coarse-grained reconfigurable spatial architecture, as an energy-efficient programmable processor for earable applications. SpEaC targets earable applications efficiently using (a) a reconfigurable fixed-point multiply-and-add augmented reduction tree-based substrate with support for vectorized complex operations that is optimized for the earable ML and DSP kernel code and (b) a tightly coupled control core for executing other code (including non-matrix computation and operations other than multiply or add in the earable DSP kernel code). Unlike other CGRAs that typically target general-purpose computations, the SpEaC substrate is optimized for energy-efficient execution of the earable kernels at the expense of generality. Across all our kernels, SpEaC outperforms programmable cores modeled after M4, M7, A53, and HiFi4 DSP by 99.3×, 32.5×, 14.8×, and 9.8×, respectively. At 63 mW in 28 nm, the energy efficiency benefits are 1.55×, 9.04×, 68.3×, and 32.7×, respectively; energy efficiency benefits are 15.7× to 1087× over a low-power Mali T628 MP6 GPU.
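To make the phrase "fixed-point multiply-and-add augmented reduction tree" more concrete, the following is a minimal software sketch of the general idea: a row of multipliers feeding a pairwise adder tree to compute one dot product. It is an illustration under stated assumptions, not the SpEaC design. The Q1.15 number format, the helper names (`to_fixed`, `to_float`, `reduction_tree_dot`), and the zero-padding of odd tree levels are all assumptions introduced here and do not come from the paper.

```python
from typing import List

# Assumption: Q1.15 fixed point (15 fractional bits). The abstract does not
# state the number format; this value is chosen only for illustration.
Q = 15


def to_fixed(x: float) -> int:
    """Convert a float to a fixed-point integer with Q fractional bits."""
    return int(round(x * (1 << Q)))


def to_float(x: int) -> float:
    """Convert a fixed-point integer with Q fractional bits back to a float."""
    return x / (1 << Q)


def reduction_tree_dot(a: List[int], b: List[int]) -> int:
    """Dot product of two fixed-point vectors via a pairwise adder tree.

    Hypothetical helper: models a multiplier stage followed by log2(n)
    adder levels. Python integers are unbounded, so accumulator width
    and saturation, which real hardware must handle, are not modeled.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Multiplier stage: each product carries 2*Q fractional bits.
    level = [x * y for x, y in zip(a, b)]
    if not level:
        return 0
    # Adder tree: sum neighbouring pairs until a single value remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-sized levels with a zero operand
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    # Rescale from 2*Q back to Q fractional bits.
    return level[0] >> Q


a = [to_fixed(v) for v in (0.5, 0.25, -0.5, 0.125)]
b = [to_fixed(v) for v in (0.5, 0.5, 0.25, -1.0)]
result = reduction_tree_dot(a, b)
print(to_float(result))
```

In this sketch the tree depth grows logarithmically with vector length, which is the usual reason reduction trees suit the dot-product-heavy DSP and ML kernels the abstract describes; how SpEaC reconfigures its tree or handles complex-valued operands is not covered here.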

References

[1]
Berkin Akın, Franz Franchetti, and James C. Hoe. 2014. Understanding the design space of DRAM-optimized hardware FFT accelerators. In 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors. IEEE, Manhattan, NY, 248--255.
[2]
Amazon. 2021. Echo Frames. Amazon. https://www.amazon.com
[3]
Edward Anderson, Zhaojun Bai, Christian Bischof, L Susan Blackford, James Demmel, Jack Dongarra, Jeremy Du Croz, Anne Greenbaum, Sven Hammarling, Alan McKenney, et al. 1999. LAPACK Users' guide. SIAM, Philadelphia, PA. https://www.netlib.org/lapack/lug/
[4]
Apple. 2021. Airpods Max. Apple. https://www.apple.com/airpods-max/
[5]
ARM. 2021. Cortex-A53. Arm Ltd. https://developer.arm.com/documentation/ddi0500/j/Cortex-A53
[6]
ARM. 2021. Cortex-M4. Arm Ltd. https://developer.arm.com/ip-products/processors/cortex-m/cortex-m4
[7]
ARM. 2021. Cortex-M7. Arm Ltd. https://developer.arm.com/ip-products/processors/cortex-m/cortex-m7
[8]
Hans-Kristian Arntzen. 2021. muFFT. https://github.com/Themaister/muFFT
[9]
Fabrice Bellard. 2005. QEMU, a fast and portable dynamic translator. In USENIX annual technical conference, FREENIX Track, Vol. 41. California, USA, USENIX Association, Boston, MA, 46. https://www.usenix.org/legacy/event/usenix05/tech/freenix/full_papers/bellard/bellard.pdf
[10]
K Swetha Bharati and Ashok Jhunjhunwala. 2015. Implementation of machine learning applications on a fixed-point DSP. In 2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE). IEEE, Manhattan, NY, 1458--1463.
[11]
K Swetha Bharati and Ashok Jhunjhunwala. 2015. Implementation of machine learning applications on a fixed-point DSP. In 2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE). IEEE, Manhattan, NY, 1458--1463.
[12]
Thomas Bible. 2016. Binaural Audio for Narrative VR. https://www.oculus.com/story-studio/blog/binaural-audio-for-narrative-vr/
[13]
Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R Hower, Tushar Krishna, Somayeh Sardashti, et al. 2011. The gem5 simulator. ACM SIGARCH computer architecture news 39, 2 (2011), 1--7.
[14]
Bose. 2021. Bose Frames Tenor. Bose. https://www.bose.com/en_us/products/frames/bose-frames-tenor.html
[15]
Bose. 2021. Bose QuietComfort Earbuds. Bose. https://www.bose.com/en_us/products/headphones/earbuds/quietcomfort-earbuds.html
[16]
Bose. 2021. Bose Sport Earbuds. Bose. https://www.bose.com/en_us/products/headphones/earbuds/bose-sport-earbuds.html
[17]
Cadence Design Systems, Inc 2020. Tensilica HiFi DSP Family. Cadence Design Systems, Inc.
[18]
Martin Campbell-Kelly, William Aspray, Daniel P Snowman, Susan R McKay, and Wolfgang Christian. 1997. Computer: A history of the information machine. Computers in Physics 11, 3 (1997), 256--257.
[19]
Wei-Hsin Chang and Truong Q Nguyen. 2008. On the fixed-point accuracy analysis of FFT algorithms. IEEE Transactions on Signal Processing 56, 10 (2008), 4673--4682. https://ieeexplore.ieee.org/abstract/document/4626107/
[20]
Arezki Abderrahim Chellal, José Lima, José Gonçalves, and Hicham Megnafi. 2021. Battery Management System For Mobile Robots based on an Extended Kalman Filter Approch. In 2021 29th Mediterranean Conference on Control and Automation (MED). IEEE, Manhattan, NY, 1131--1136. https://ieeexplore.ieee.org/abstract/document/9480196/
[21]
Yu-Hsin Chen, Tushar Krishna, Joel S Emer, and Vivienne Sze. 2017. Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE journal of solid-state circuits 52, 1 (2017), 127--138. https://ieeexplore.ieee.org/abstract/document/7738524
[22]
Ronan Collobert, Christian Puhrsch, and Gabriel Synnaeve. 2016. Wav2Letter: an End-to-End ConvNet-based Speech Recognition System. CoRR abs/1609.03193 (2016). arXiv:1609.03193 http://arxiv.org/abs/1609.03193
[23]
R De Lucia, G Zucchelli, V Barletta, A Di Cori, M Giannotti Santoro, M Parollo, L Segreti, S Viani, V Della Tommasina, L Paperini, et al. 2020. The in-ear region as a novel anatomical site for ECG signal detection: validation study on healthy volunteers. Journal of Interventional Cardiac Electrophysiology 60 (2020), 1--8. https://idp.springer.com/authorize/casa?redirect_uri=https://link.springer.com/article/10.1007/s10840-020-00709-x
[24]
Denis Demidov. 2012. VexCL: Vector expression template library for OpenCL. https://2013.nscf.ru/TesisAll/Section%203/09_1430_DemidovDE_S3.pdf
[25]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR abs/1810.04805 (2018). arXiv:1810.04805 http://arxiv.org/abs/1810.04805
[26]
Digi-Key. 2005. Speaker CMS-151125-076SP-67. https://www.puiaudio.com/media/SpecSheet/AR01032MR-2-R.pdf
[27]
EarableAI. 2021. Smart Brain Care Wearable. EarableAI. https://earable.ai/preorder-en/
[28]
Eargo. 2021. EARGO NEO. Eargo. https://shop.eargo.com/
[29]
Graham Gobieski, Ahmet Oguz Atli, Kenneth Mai, Brandon Lucia, and Nathan Beckmann. 2021. Snafu: An Ultra-Low-Power, Energy-Minimal CGRA-Generation Framework and Architecture. In 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA). IEEE, Manhattan, NY, 1027--1040.
[30]
Yifan Gong and Yu-Hung Kao. 2000. Implementing a high accuracy speaker-independent continuous speech recognizer on a fixed-point DSP. In 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 00CH37100), Vol. 6. IEEE, Manhattan, NY, 3686--3689.
[31]
Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep Learning with Limited Numerical Precision. In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 37), Francis Bach and David Blei (Eds.). PMLR, Lille, France, 1737--1746. https://proceedings.mlr.press/v37/gupta15.html
[32]
Awni Y. Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng. 2014. Deep Speech: Scaling up end-to-end speech recognition. CoRR abs/1412.5567 (2014). arXiv:1412.5567 http://arxiv.org/abs/1412.5567
[33]
Muhammad Huzaifa, Rishi Desai, Xutao Jiang, Joseph Ravichandran, Finn Sinclair, and Sarita V. Adve. 2020. Exploring Extended Reality with ILLIXR: A New Playground for Architecture Research. CoRR abs/2004.04643 (2020). arXiv:2004.04643 https://arxiv.org/abs/2004.04643
[34]
IntegrIT, Limited 2020. NatureDSP Signal for HiFi4. IntegrIT, Limited.
[35]
TDK InvenSense. 2020. Bottom Port PDM Low-Power Multi-Mode Microphone With High AOP Mode. TDK InvenSense. https://www.cdiweb.com/datasheets/invensense/ds-000357-t3902-v1.0.pdf
[36]
Jabra. 2021. Jabra Sport Pulse. Jabra. https://www.jabra.com/sports-headphones/jabra-sport-pulse-wireless#100-96100010-02
[37]
François Jarrier-Gellez. 2022. FragJage/SpeakerVoiceIdentifier. https://github.com/FragJage/SpeakerVoiceIdentifier
[38]
Fahim Kawsar, Chulhong Min, Akhil Mathur, and Alessandro Montanari. 2018. Earables for personal-scale behavior analytics. IEEE Pervasive Computing 17, 3 (2018), 83--89.
[39]
Sumin Kim, Seunghwan Oh, and Youngmin Yi. 2021. Minimizing GPU Kernel Launch Overhead in Deep Learning Inference on Mobile GPUs. In Proceedings of the 22nd International Workshop on Mobile Computing Systems and Applications (Virtual, United Kingdom) (HotMobile '21). Association for Computing Machinery, New York, NY, USA, 57--63.
[40]
Rakesh Kumar, Keith I Farkas, Norman P Jouppi, Parthasarathy Ranganathan, and Dean M Tullsen. 2003. Single-ISA heterogeneous multi-core architectures: The potential for processor power reduction. In Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36. IEEE, Manhattan, NY, 81--92.
[41]
Hyoukjun Kwon. 2021. MAERI GITHUB. maeri-project. https://github.com/maeri-project
[42]
Hyoukjun Kwon, Ananda Samajdar, and Tushar Krishna. 2018. Maeri: Enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnects. ACM SIGPLAN Notices 53, 2 (2018), 461--475.
[43]
Ruggero Donida Labati, Enrique Muñoz, Vincenzo Piuri, Roberto Sassi, and Fabio Scotti. 2019. Deep-ECG: Convolutional neural networks for ECG biometric recognition. Pattern Recognition Letters 126 (2019), 78--85.
[44]
Waverly Labs. 2022. https://www.waverlylabs.com/
[45]
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. CoRR abs/1909.11942 (2019). arXiv:1909.11942 http://arxiv.org/abs/1909.11942
[46]
Steven M LaValle, Anna Yershova, Max Katsev, and Michael Antonov. 2014. Head tracking for the Oculus Rift. In 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, Manhattan, NY, 187--194.
[47]
Charles E Leiserson. 1985. Fat-trees: universal networks for hardware-efficient supercomputing. IEEE transactions on Computers 100, 10 (1985), 892--901.
[48]
Bede Liu et al. 1976. Fixed-point fast Fourier transform error analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing 24, 6 (1976), 563--573.
[49]
Wenxin Liu, David Caruso, Eddy Ilg, Jing Dong, Anastasios I Mourikis, Kostas Daniilidis, Vijay Kumar, and Jakob Engel. 2020. TLIO: Tight Learned Inertial Odometry. IEEE Robotics and Automation Letters 5, 4 (2020), 5653--5660.
[50]
MaximIntegrated. 2014. Low-Power, Ultra-Accurate 6 DoF IMU. MaximIntegrated. https://datasheets.maximintegrated.com/en/ds/MAX21105.pdf
[51]
Jeffrey L. McKinstry, Steven K. Esser, Rathinakumar Appuswamy, Deepika Bablani, John V. Arthur, Izzet B. Yildiz, and Dharmendra S. Modha. 2018. Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference. CoRR abs/1809.04191 (2018). arXiv:1809.04191 http://arxiv.org/abs/1809.04191
[52]
Mediatek. 2021. Mediatek MT2533. Mediatek. https://www.mediatek.com/products/wearables/mt2533
[53]
Microchip. 2021. IS2062/64. Microchip. http://ww1.microchip.com/downloads/en/DeviceDoc/60001409D.pdf
[54]
mrDIMAS. 2022. mrDIMAS/rg3d. https://github.com/mrDIMAS/rg3d
[55]
Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P Jouppi. 2009. CACTI 6.0: A tool to model large caches. HP laboratories 27 (2009), 28.
[56]
Subhashini Narayan and E Sathiyamoorthy. 2019. A novel recommender system based on FFT with machine learning for predicting and identifying heart diseases. Neural Computing and Applications 31, 1 (2019), 93--102.
[57]
Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, and Karthikeyan Sankaralingam. 2017. Stream-dataflow acceleration. In 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA). IEEE, Manhattan, NY, 416--429.
[58]
Cedric Nugteren. 2018. CLBlast: A Tuned OpenCL BLAS Library. In Proceedings of the International Workshop on OpenCL (Oxford, United Kingdom) (IWOCL '18). Association for Computing Machinery, New York, NY, USA, Article 5, 10 pages.
[59]
Nuraphone. 2021. How It Works: Music in full colour: Personalized sound. https://www.nuraphone.com/pages/how-it-works
[60]
Nuvoton. 2021. Nuvoton audio DSP. Nuvoton. https://www.nuvoton.com/export/resource-files/TRM_ISD94100_Series_EN_Rev1.09.pdf
[61]
NXP. 2021. NXP LPC54114. NXP. https://www.nxp.com/docs/en/application-note/AN12593.pdf
[62]
James Peckham and Sharmishta Sarkar. 2019. Bose Frames review. https://www.techradar.com/uk/reviews/bose-frames-review
[63]
Petersn. 2020. petersn/tinysr. https://github.com/petersn/tinysr
[64]
Qualcomm. 2021. Qualcomm AptxHd. Qualcomm. https://www.aptx.com/aptx-hd
[65]
QuickLogic. 2021. QuickLogic. QuickLogic. https://www.marketwatch.com/press-release/quicklogics-amazon-qualified-reference-design-brings-alexa-to-hearables-2021-02-18?siteid=bigcharts&dist=bigcharts&tesla=y
[66]
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. CoRR abs/1606.05250 (2016). arXiv:1606.05250 http://arxiv.org/abs/1606.05250
[67]
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. CoRR abs/1606.05250 (2016). arXiv:1606.05250 http://arxiv.org/abs/1606.05250
[68]
Research and Markets. 2020. Hearables Market by Products, Type, Connectivity Technology, and End User: Global Opportunity Analysis and Industry Forecast, 2019-2026. https://www.researchandmarkets.com/reports/5021786/hearables-market-by-products-type-connectivity
[69]
Romit Roy Choudhury, Yu-Lin Wei, and Zhijian Yang. 2022. Private communication.
[70]
Scribd. 2019. Bragi pivot press release. https://www.scribd.com/document/404134059/Bragi-pivot-press-release
[71]
Matti Siekkinen, Markus Hiienkari, Jukka K Nurminen, and Johanna Nieminen. 2012. How low energy is bluetooth low energy? comparative measurements with zigbee/802.15.4. In 2012 IEEE wireless communications and networking conference workshops (WCNCW). IEEE, Manhattan, NY, 232--237.
[72]
Ruby Singh. 2021. Apple AirPods Pro: A Complete Review. WirelessEarBuds.best. https://www.wirelessearbuds.best/product/apple-airpods-pro-a-complete-review/
[73]
Sony. 2021. Sony WF-1000XM3 Wireless Noise-Canceling Headphones. Sony. https://www.sony.com/electronics/truly-wireless/wf-1000xm3
[74]
Sudarshan Srinivasan, Pradeep Janedula, Saurabh Dhoble, Sasikanth Avancha, Dipankar Das, Naveen Mellempudi, Bharat Daga, Martin Langhammer, Gregg Baeckler, and Bharat Kaul. 2019. High Performance Scalable FPGA Accelerator for Deep Neural Networks. CoRR abs/1908.11809 (2019). arXiv:1908.11809 http://arxiv.org/abs/1908.11809
[75]
Tensilica, Inc 2010. Xtensa Instruction Set Architecture. Tensilica, Inc.
[76]
Videolabs. 2020. videolabs/libspatialaudio. https://github.com/videolabs/libspatialaudio
[77]
Jian Weng, Sihao Liu, Vidushi Dadu, Zhengrong Wang, Preyas Shah, and Tony Nowatzki. 2020. Dsagen: Synthesizing programmable spatial accelerators. In 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA). IEEE, Manhattan, NY, 268--281.
[78]
Zhijian Yang, Yu-Lin Wei, Sheng Shen, and Romit Roy Choudhury. 2020. Ear-AR: Indoor Acoustic Augmented Reality on Earphones. Association for Computing Machinery, New York, NY, USA.
[79]
Hasan Erdem Yantir, Wenzhe Guo, Ahmed M Eltawil, Fadi J Kurdahi, and Khaled Nabil Salama. 2019. An ultra-area-efficient 1024-point in-memory fft processor. Micromachines 10, 8 (2019), 509.
[80]
SD You and Yo-Cheng Hou. 2004. Implementation of IMDCT for MPEG2/4 AAC on 16-bit fixed-point digital signal processors. In The 2004 IEEE Asia-Pacific Conference on Circuits and Systems, 2004. Proceedings., Vol. 2. IEEE, Manhattan, NY, 813--816.
[81]
Ofir Zafrir, Guy Boudoukh, Peter Izsak, and Moshe Wasserblat. 2019. Q8bert: Quantized 8bit bert. In 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing-NeurIPS Edition (EMC2-NIPS). IEEE, Manhattan, NY, 36--39.
[82]
ARM ML Zoo. 2021. ML-zoo - speech recognition - wav2letter. Arm Ltd. https://github.com/ARM-software/ML-zoo/tree/master/models/speech_recognition/wav2letter/tflite_int8

Published In

ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture
June 2022
1097 pages
ISBN: 9781450386104
DOI: 10.1145/3470496
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

  • IEEE CS TCAA: IEEE CS technical committee on architectural acoustics

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

ISCA '22
Acceptance Rates

ISCA '22 Paper Acceptance Rate 67 of 400 submissions, 17%;
Overall Acceptance Rate 543 of 3,203 submissions, 17%
