Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2435264.2435345acmconferencesArticle/Chapter ViewAbstractPublication PagesfpgaConference Proceedingsconference-collections
poster

Scalable high-throughput architecture for large balanced tree structures on FPGA (abstract only)

Published: 11 February 2013 Publication History

Abstract

Architectures for tree structures on FPGAs as well as ASICs have been proposed over the years. The exponential growth in the memory size with respect to the tree levels restricts the scalability of these architectures due to limited on-chip memory. For large trees, off-chip memory has to be used. We propose a pipeline architecture on FPGA for large balanced tree structures which achieves both scalability and high throughput. In the proposed architecture, each tree level is mapped onto a single or multiple Processing Elements (PEs) using dual-port distributed RAM, dual-port block RAM and off-chip RAM. We parameterize the pipeline architecture and optimize the performance with respect to scalability and throughput. The resulting architecture for the search tree is dual-threaded and deeply pipelined. It can accept two search requests per clock cycle and operates at a high clock rate of 280MHz. Post place-and-route results show that, by using only 17% of the logic resources and 9% of the BRAM available on a state-of-the-art FPGA, our dual-thread pipelined search tree can perform 560 million search operations per second in a tree containing 512K 64-bit keys.

References

[1]
M. AdelsonVelskii and E. Landis, An Algorithm for the Organization of Information, 2006.
[2]
R. Sedgewick, "Left-Leaning Red-Black Tree," September 2008.
[3]
T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson, Introduction to Algorithms, 2nd ed., 2001.
[4]
M. Fujita and R. Murgai, "Delay Estimation and Optimization of Logic Circuits: a survey," in Proc. of the ASP-DAC '97, pp. 25 - 30.
[5]
T. Kowsalya, "Tree Structured Arithmetic Circuit by using different CMOS Logic Styles," ICGST Intl. Journal on Prog. Dev., Cir. and Sys., PDCS, vol. 8, pp. 11 - 18, 2008.
[6]
J.-F. Jiang, Z.-G. Mao, W.-F. He, and Q. Wang, "A New Full Adder Design for Tree Structured Arithmetic Circuits," in Comp. Eng. and Tech. (ICCET), 2010 2nd Intl. Conf. on, vol. 4, pp. 246 - 249.
[7]
D. Sheldon and F. Vahid, "Don't Forget Memories: A Case Study Redesigning a Pattern Counting ASIC Circuit for FPGAs," in Proc. of CODES+ISSS '08, 2008, pp. 155 - 160.
[8]
Y.-H. E. Yang and V. K. Prasanna, "High Throughput and Large Capacity Pipelined Dynamic Search Tree on FPGA," in Proc. of the 18th annual ACM/SIGDA Intl. Sym. on FPGA, ser. FPGA '10, pp. 83 - 92.
[9]
H. Le, W. Jiang, and V. Prasanna, "Scalable High-throughput SRAM-based Architecture for IP-lookup using FPGA," in FPL '08, pp. 137 - 142.
[10]
H. Le and V. Prasanna, "Scalable Tree-based Architectures for IPv4/v6 Lookup using Prefix Partitioning," Computers, IEEE Transactions on, vol. 61, no. 7, pp. 1026 - 1039, July 2012.
[11]
M. Bando and J. Chao, "Flashtrie: Hash-based Prefix-compressed Trie for IP Route Lookup beyond 100Gbps," in INFOCOM '10, 2010.
[12]
P. Zhong, "An IPv6 Address Lookup Algorithm based on Recursive Balanced Multi-way Range Trees with Efficient Search and Update," in Proc. Int Computer Science and Service System (CSSS) Conf, 2011, pp. 2059 - 2063.
[13]
B. Agrawal and T. Sherwood, "Modeling TCAM power for next generation network devices," in Performance Analysis of Systems and Software, 2006 IEEE Intl. Sym. on, march 2006, pp. 120 - 129.
[14]
D. E. Knuth, The art of computer programming, volume 3: (2nd ed.) sorting and searching, 1998.
[15]
H. Parandeh-Afshar, P. Brisk, and P. Ienne, "Exploiting Fast Carry-chains of FPGAs for Designing Compressor Trees," in Field Programmable Logic and Applications, 2009. FPL '09, pp. 242 - 249.
[16]
F. Liu, F. Forouzandeh, O. Mohamed, G. Chen, X. Song, and Q. Tan, "A Comparative Study of Parallel Prefix Adders in FPGA Implementation of EAC," in DSD '09. 12th Euromicro Conference on, pp. 281 - 286.
[17]
B. Lampson, V. Srinivasan, and G. Varghese, "IP Lookups using Multiway and Multicolumn Search," in INFOCOM '98, May 1998.
[18]
P. Warkhede, S. Suri, and G. Varghese, "Multiway Range Trees: Scalable IP Lookup with Fast Updates," Computer Networks, vol. 44, no. 3, pp. 289 - 303, 2004.
[19]
H. Lu and S. Sahni, "A B-Tree Dynamic Router-Table Design," IEEE Trans. Comput., vol. 54, no. 7, pp. 813 -- 824, 2005.
[20]
I. Sourdis and S. Katamaneni, "Longest Prefix Match and Updates in Range Tries," in 2011 IEEE Intl. Conf. on Application-Specific Systems, Architectures and Processors (ASAP '11), Sept. 2011, pp. 51--58.
[21]
Virtex-6 FPGA Family," http://www.xilinx.com/products/virtex6.
[22]
G. Yuan and T. Aamodt, "A Hybrid Analytical DRAM Performance Model," in Proc. 5th Workshop on Modeling, Benchmarking and Simulation, 2009.
[23]
J. Reineke, I. Liu, H. D. Patel, S. Kim, and E. A. Lee, "PRET DRAM controller: bank privatization for predictability and temporal isolation," ser. CODES+ISSS '11, 2011, pp. 99--108.
[24]
DDR3 SDRAM MT41J128M16 - 16 Meg x 16 x 8 Banks," http://www.micron.com/products/dram/ddr3-sdram, Micron Technology, Inc., 2006.
[25]
DDR3 SDRAM System-Power Calculator," http://www.micron.com/products/support/power-calc.
[26]
20Mbit QUAD-Search Content Addressable Memory," http://am.renesas.com/products/memory/TCAM.
[27]
K. Zheng, C. Hu, H. Lu, and B. Liu, "A TCAM-based Distributed Parallel IP Lookup Scheme and Performance Analysis," IEEE/ACM Trans. Netw., vol. 14, no. 4, pp. 863--875, Aug. 2006.
[28]
M.-H. Ho, Y.-Q. Ai, T. C.-P. Chau, S. C. L. Yuen, C.-S. Choy, P. H. W. Leong, and K.-P. Pun, "Architecture and Design Flow for a Highly Efficient Structured ASIC," VLSI Systems, IEEE Transactions on, vol. PP, no. 99, p. 1, 2012.

Cited By

View all
  • (2019)A Novel FPGA-Based High Throughput Accelerator For Binary Search Trees2019 International Conference on High Performance Computing & Simulation (HPCS)10.1109/HPCS48598.2019.9188158(612-619)Online publication date: Jul-2019

Index Terms

  1. Scalable high-throughput architecture for large balanced tree structures on FPGA (abstract only)

        Recommendations

        Comments

        Please enable JavaScript to view thecomments powered by Disqus.

        Information & Contributors

        Information

        Published In

        cover image ACM Conferences
        FPGA '13: Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
        February 2013
        294 pages
        ISBN:9781450318877
        DOI:10.1145/2435264

        Sponsors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 11 February 2013

        Permissions

        Request permissions for this article.

        Check for updates

        Author Tags

        1. FPGA
        2. high-throughput
        3. scalable
        4. tree

        Qualifiers

        • Poster

        Conference

        FPGA '13
        Sponsor:

        Acceptance Rates

        Overall Acceptance Rate 125 of 627 submissions, 20%

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)0
        • Downloads (Last 6 weeks)0
        Reflects downloads up to 19 Nov 2024

        Other Metrics

        Citations

        Cited By

        View all
        • (2019)A Novel FPGA-Based High Throughput Accelerator For Binary Search Trees2019 International Conference on High Performance Computing & Simulation (HPCS)10.1109/HPCS48598.2019.9188158(612-619)Online publication date: Jul-2019

        View Options

        Login options

        View options

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media