
Automatic synthesis of physical system differential equation models to a custom network of general processing elements on FPGAs

Published: 30 September 2013

Abstract

Fast execution of physical system models has various uses, such as simulating physical phenomena or testing medical equipment in real time. Physical system models commonly consist of thousands of differential equations, and solving such equations in software on microprocessors may be slow. Several past efforts implemented such models as parallel circuits on Field-Programmable Gate Arrays (FPGAs) and demonstrated large speedups, owing to the excellent match between the massive fine-grained parallelism and local communication common in physical models and the fine-grained parallel compute elements and local connectivity of FPGAs. However, past implementation efforts were mostly manual or ad hoc. We present the first method for automatically converting a set of ordinary differential equations into circuits on FPGAs. The method uses a general Processing Element (PE) that we developed, designed to quickly solve a set of ordinary differential equations while using few FPGA resources. The method instantiates a network of general PEs, partitions equations among the PEs to minimize communication, generates each PE's custom program, creates custom connections among PEs, and maintains synchronization of all PEs in the network. Our experiments show that the method generates a 400-PE network on a commercial FPGA that executes four different models on average 15x faster than a 3 GHz Intel processor, 30x faster than a commercial 4-core ARM, 14x faster than a commercial 6-core Texas Instruments digital signal processor, and 4.4x faster than an NVIDIA 336-core graphics processing unit. We also show that the FPGA-based approach is reasonably cost effective compared to the other platforms. The method yields 2.1x faster circuits than a commercial high-level synthesis tool that uses the traditional method for converting behavior to circuits, while using 2x fewer lookup tables and 2x fewer hard-core multiplier (DSP) units, though 3.5x more block RAM because the PEs are programmable. Furthermore, the method does not generate just a single fastest design; it generates a range of designs that trade off size and performance by using different numbers of PEs.
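
The partition-and-synchronize idea in the abstract can be illustrated with a small software sketch. Everything below is an assumption made for illustration, not the paper's method: the example system (a chain of coupled first-order equations), the constant `K`, the forward-Euler solver, the two naive partitioners, and all function names are hypothetical. The sketch shows two things: how the assignment of equations to PEs changes the number of dependencies that cross PE boundaries (a rough proxy for communication), and how a lockstep update, in which every PE reads the same snapshot of the previous state, reproduces the result of solving the whole system in one place.

```python
from typing import List

K = 0.5  # hypothetical coupling constant for the example system


def derivative(i: int, x: List[float]) -> float:
    """dx_i/dt = K * (x_{i-1} - 2*x_i + x_{i+1}), with both chain ends held at 0."""
    left = x[i - 1] if i > 0 else 0.0
    right = x[i + 1] if i < len(x) - 1 else 0.0
    return K * (left - 2.0 * x[i] + right)


def dependencies(i: int, n: int) -> List[int]:
    """Indices of the other equations whose state equation i reads."""
    return [j for j in (i - 1, i + 1) if 0 <= j < n]


def partition_contiguous(n: int, num_pes: int) -> List[List[int]]:
    """Assign neighboring equations to the same PE (naive heuristic)."""
    base, extra = divmod(n, num_pes)
    parts, start = [], 0
    for p in range(num_pes):
        size = base + (1 if p < extra else 0)
        parts.append(list(range(start, start + size)))
        start += size
    return parts


def partition_round_robin(n: int, num_pes: int) -> List[List[int]]:
    """Deal equations out to PEs one at a time (ignores locality)."""
    return [list(range(p, n, num_pes)) for p in range(num_pes)]


def cut_edges(parts: List[List[int]], n: int) -> int:
    """Count dependency pairs whose two equations live on different PEs."""
    owner = {i: p for p, eqs in enumerate(parts) for i in eqs}
    return sum(
        1
        for i in range(n)
        for j in dependencies(i, n)
        if i < j and owner[i] != owner[j]
    )


def step_network(parts: List[List[int]], x: List[float], dt: float) -> List[float]:
    """One synchronized forward-Euler step across all PEs.

    Every PE reads the same snapshot of the old state, and the new values
    become visible only after all PEs have finished the step.
    """
    snapshot = list(x)
    new_x = list(x)
    for eqs in parts:  # each iteration models one PE; the order does not matter
        for i in eqs:
            new_x[i] = snapshot[i] + dt * derivative(i, snapshot)
    return new_x


def step_monolithic(x: List[float], dt: float) -> List[float]:
    """Reference forward-Euler step with no partitioning."""
    return [x[i] + dt * derivative(i, x) for i in range(len(x))]
```

For a chain of 8 equations on 4 PEs, `cut_edges(partition_contiguous(8, 4), 8)` is 3 while `cut_edges(partition_round_robin(8, 4), 8)` is 7, so keeping neighboring equations on the same PE more than halves the cross-PE dependencies in this toy case. Either partition gives the same state as `step_monolithic` after each step, because the snapshot acts as the synchronization point.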




          Published In

          ACM Transactions on Embedded Computing Systems, Volume 13, Issue 2
          Special issue on application-specific processors
          September 2013
          254 pages
          ISSN: 1539-9087
          EISSN: 1558-3465
          DOI: 10.1145/2514641

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 30 September 2013
          Accepted: 01 July 2012
          Revised: 01 February 2012
          Received: 01 February 2011
          Published in TECS Volume 13, Issue 2


          Author Tags

          1. FPGAs
          2. Physical system modeling
          3. application-specific processors
          4. custom processors
          5. cyber-physical systems
          6. differential equations
          7. embedded systems
          8. physical system simulation
          9. processor network
          10. synthesis

          Qualifiers

          • Research-article
          • Research
          • Refereed


