poster

Exploring IoT platform with technologically agnostic processing-in-memory framework

Authors:

Paulo Cesar Santos,

João Paulo C. de Lima,

Rafael F. de Moura,

Marco A. Z. Alves,

Antonio C. S. Beck,

Luigi Carro

INTESA '18: Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications

Pages 1 - 6

https://doi.org/10.1145/3285017.3285020

Published: 04 October 2018

Abstract

Modern Internet of Things (IoT) applications generate massive amounts of data, so they either stress the communication mechanism or need extra resources to process the data locally. This data is commonly collected by sensors and must be stored and processed before being sent over the Internet. These gathering and processing operations demand significant computational power and time, both of which are key design constraints in embedded systems. At the same time, Processing-in-Memory (PIM) has emerged as a solution for efficiently processing big data, and it can be applied to the IoT data management problem. With PIM, the generated data can be processed directly in the storage component that holds it. Simulating new architectures is, however, an essential step in a design life cycle for analyzing and improving new features. The ability to support software development for these new technologies is also crucial for exploiting experimental designs while reducing design time and cost. In this work, we propose a framework that simulates a state-of-the-art PIM mechanism and automatically compiles and generates binary code for the target PIM. We demonstrate that the framework can be made technologically agnostic simply by adjusting constraints related to the target memory technology. We also show how IoT devices can be connected to the PIM mechanism and use it efficiently.



Index Terms

1. Hardware
  1. Emerging technologies
    1. Analysis and design of emerging devices and systems
      1. Emerging architectures


Published In

INTESA '18: Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications

October 2018, 62 pages

ISBN: 9781450365987

DOI: 10.1145/3285017

General Chair: Maurizio Martina (POLITO, IT)

Program Chair: William Fornaciari (POLIMI, IT)

Copyright © 2018 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Poster


Conference

INTESA

INTESA: INTelligent Embedded Systems Architectures and Applications

October 4, 2018

Turin, Italy
