Nothing Special   »   [go: up one dir, main page]

skip to main content
article

An optimal memory allocation scheme for scratch-pad-based embedded systems

Published: 01 November 2002 Publication History

Abstract

This article presents a technique for the efficient compiler management of software-exposed heterogeneous memory. In many lower-end embedded chips, often used in microcontrollers and DSP processors, heterogeneous memory units such as scratch-pad SRAM, internal DRAM, external DRAM, and ROM are visible directly to the software, without automatic management by a hardware caching mechanism. Instead, the memory units are mapped to different portions of the address space. Caches are avoided due to their cost and power consumption, and because they make it difficult to guarantee real-time performance. For this important class of embedded chips, the allocation of data to different memory units to maximize performance is the responsibility of the software.Current practice typically leaves it to the programmer to partition the data among different memory units. We present a compiler strategy that automatically partitions the data among the memory units. We show that this strategy is optimal, relative to the profile run, among all static partitions for global and stack data. For the first time, our allocation scheme for stacks distributes the stack among multiple memory units. For global and stack data, the scheme is provably equal to or better than any other compiler scheme or set of programmer annotations. Results from our benchmarks show a 44.2% reduction in runtime from using our distributed stack strategy vs. using a unified stack, and a further 11.8% reduction in runtime from using a linear optimization strategy for allocation vs. a simpler greedy strategy; both in the case of the SRAM size being 20% of the total data size. For some programs, less than 5% of data in SRAM achieves a similar speedup.

References

[1]
Appel, A. and George, L. 2001. Optimal spilling for CISC machines with few registers. In Proceedings of the SIGPLAN '01 Conference on Program Language Design and Implementation (Snowbird, UT, June).]]
[2]
Banakar, R., Steinke, S., Lee, B.-S., Balakrishnan, M., and Marwedel, P. 2002. Scratchpad memory: A design alternative for cache on-chip memory in embedded systems. In 10th International Symposium on Hardware/Software Codesign (CODES) (Estes Park, CO, May 6--8). ACM, New York.]]
[3]
Barua, R., Lee, W., Amarasinghe, S., and Agarwal, A. 2001. Compiler support for scalable and efficient memory systems. IEEE Trans. Comput., Special Issue on Advances in High Performance Memory Systems (Nov.).]]
[4]
Bhattacharyya, S. S., Leupers, R., and Marwedel, P. 2000. Software synthesis and code generation for signal processing systems. IEEE Trans. Circuits Syst. 47, 9 (Sept.).]]
[5]
Consortium, T. T. 1999. The Trimaran benchmark suite. Available at http://www.trimaran.org/.]]
[6]
Guthaus, M. R., Ringenberg, J. S., Ernst, D., Austin, T. M., Mudge, T., and Brown, R. B. 2001. MiBench: A free, commercially representative embedded benchmark suite. Available at http://www.eecs.umich.edu/jringenb/mibench/.]]
[7]
Hallnor, G., and Reinhardt, S. K. 2000. A fully associative software-managed cache design. In Proceedings of the 27th International Symposium on Computer Architecture (ISCA) (Vancouver, B.C., Canada, June).]]
[8]
Hennessy, J. and Patterson, D. 1996. Computer Architecture A Quantitative Approach. 2nd ed. Morgan-Kaufmann, Palo Alto, Calif.]]
[9]
Lee, T.-C., Tiwari, V., Malik, S., and Fujita, M. 1997. Power analysis and minimization techniques for embedded DSP software. IEEE Trans. VLSI Syst. (Mar.).]]
[10]
Matlab 6.1. The Math Works, Inc., 2001. http://www.mathworks.com/products/matlab/.]]
[11]
Moritz, C. A., Frank, M., and Amarasinghe, S. 2001. FlexCache: A framework for flexible compiler generated data caching. In The 2d Workshop on Intelligent Memory Systems (Boston, MA, Nov. 12).]]
[12]
CPU12 Reference Manual. Motorola Corporation, 2000. http://e-www.motorola.com/brdata/PDFDB/MICROCONTROLLERS/16_BIT/68HC12_FAMILY/REF_MAT/CPU12RM.pdf.]]
[13]
M-CORE---MMC2001 Reference Manual. Motorola Corporation, 1998. http://www.motorola. com/SPS/MCORE/info_documentation.htm.]]
[14]
New York City, Office of Budget and Management. 1999. Website on frequently asked questions on linear programming. http://www.eden.rutgers.edu/∼pil/FAQ.html. New York, NY.]]
[15]
University of Toronto Digital Signal Processing (UTDSP). 1992. University of Toronto Digital Signal Processing (UTDSP) Benchmark Suite. Available at http://www.eecg.toronto.edu/.]]
[16]
Panda, P. R., Dutt, N. D., and Nicolau, A. 2000. On-chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems. ACM Trans. Des. Autom. Electron. Syst. 5, 3 (July).]]
[17]
Paulin, P., Liem, C., Cornero, M., Nacabal, F., and Goossens, G. 1997. Embedded software in real-time signal processing systems: Application and architecture trends. Invited paper, Proc. IEEE 85, 3 (Mar.).]]
[18]
Rutter, P., Orost, J., and Gloistein, D. BTOA: Binary to printable ASCII converter source code. Available at http://www.bookcase.com/library/software/msdos.devel.lang.c.html.]]
[19]
Sjodin, J., Froderberg, B., and Lindgren, T. 1998. Allocation of global data objects in on-chip RAM. Compiler and Architecture Support for Embedded Computing Systems. Dec.]]
[20]
Sjodin, J. and von Platen, C. 2001. Storage allocation for embedded processors. 2001 Compiler and Architecture Support for Embedded Computing Systems. Nov.]]
[21]
TMS370Cx7x 8-bit microcontroller. Texas Instruments, Revised Feb. 1997. http://www-s.ti.com/sc/psheets/spns034c/spns034c.pdf.]]

Cited By

View all
  • (2024)Optimal Arrangement and Rearrangement of Objects on Shelves to Minimize Robot Retrieval CostIEEE Transactions on Automation Science and Engineering10.1109/TASE.2023.333689021:3(2184-2198)Online publication date: Jul-2024
  • (2024)A New Generation Cyber-Physical System: A Comprehensive Review from Security PerspectiveComputers & Security10.1016/j.cose.2024.104095(104095)Online publication date: Sep-2024
  • (2023)MinUn: Accurate ML Inference on MicrocontrollersProceedings of the 24th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems10.1145/3589610.3596278(26-39)Online publication date: 13-Jun-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Embedded Computing Systems
ACM Transactions on Embedded Computing Systems  Volume 1, Issue 1
November 2002
140 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/581888
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

Publication History

Published: 01 November 2002
Published in TECS Volume 1, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Memory
  2. allocation
  3. embedded
  4. heterogeneous
  5. storage

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)44
  • Downloads (Last 6 weeks)2
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Optimal Arrangement and Rearrangement of Objects on Shelves to Minimize Robot Retrieval CostIEEE Transactions on Automation Science and Engineering10.1109/TASE.2023.333689021:3(2184-2198)Online publication date: Jul-2024
  • (2024)A New Generation Cyber-Physical System: A Comprehensive Review from Security PerspectiveComputers & Security10.1016/j.cose.2024.104095(104095)Online publication date: Sep-2024
  • (2023)MinUn: Accurate ML Inference on MicrocontrollersProceedings of the 24th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems10.1145/3589610.3596278(26-39)Online publication date: 13-Jun-2023
  • (2022)Optimal Shelf Arrangement to Minimize Robot Retrieval Time2022 IEEE 18th International Conference on Automation Science and Engineering (CASE)10.1109/CASE49997.2022.9926722(993-1000)Online publication date: 20-Aug-2022
  • (2022)Echtzeitfähige Ethernet-Kommunikation in automobilen Multicore-Systemen mit hierarchischem SpeicherlayoutEchtzeit 202110.1007/978-3-658-37751-9_10(83-92)Online publication date: 25-May-2022
  • (2021)Accelerating Data-Flow Analysis with Full-Partitioning2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)10.1109/ISPA-BDCloud-SocialCom-SustainCom52081.2021.00184(1345-1352)Online publication date: Sep-2021
  • (2020)Local Memory Mapping of Multicore Processors on an Automatic Parallelizing CompilerIEICE Transactions on Electronics10.1587/transele.2019LHP0010E103.C:3(98-109)Online publication date: 1-Mar-2020
  • (2020)SPECTRUMACM Transactions on Embedded Computing Systems10.1145/340003219:5(1-28)Online publication date: 26-Sep-2020
  • (2020)SoMMA: A software-managed memory architecture for multi-issue processorsMicroprocessors and Microsystems10.1016/j.micpro.2020.10313977(103139)Online publication date: Sep-2020
  • (2020)Identification of Safety and Security Vulnerabilities in Cyber Physical SystemsData Science and Analytics10.1007/978-981-15-5830-6_1(3-13)Online publication date: 28-May-2020
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media