Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1345206.1345214acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
research-article

Scalable packet classification using interpreting: a cross-platform multi-core solution

Published: 20 February 2008 Publication History

Abstract

Packet classification is an enabling technology to support advanced Internet services. It is still a challenge for a software solution to achieve 10Gbps (line-rate) classification speed. This paper presents a classification algorithm that can be efficiently implemented on a multi-core architecture with or without cache. The algorithm embraces the holistic notion of exploiting application characteristics, considering the capabilities of the CPU and the memory hierarchy, and performing appropriate data partitioning. The classification algorithm adopts two stages: searching on a reduction tree and searching on a list of ranges. This decision is made based on a classification heuristic: the size of the range list is limited after the first stage search. Optimizations are then designed to speed up the two-stage execution. To exploit the speed gap (1) between the CPU and external memory; (2) between internal memory (cache) and external memory, an interpreter is used to trade the CPU idle cycles with demanding memory access requirements. By applying the CISC style of instruction encoding to compress the range expressions, it not only significantly reduces the total memory requirement but also makes effective use of the internal memory (cache) bandwidth. We show that compressing data structures is an effective optimization across the multi-core architectures.
We implement this algorithm on both Intel IXP2800 network processor and Core 2 Duo X86 architecture, and experiment with the classification benchmark, ClassBench. By incorporating architecture-awareness in algorithm design and taking into account the memory hierarchy, data partitioning, and latency hiding in algorithm implementation, the resulting algorithm shows a good scalability on Intel IXP2800. By effectively using the cache system, the algorithm also runs faster than the previous fastest RFC on the Core 2 Duo architecture.

References

[1]
A. Alameldeen and D. A. Wood. Adaptive Cache Compression for High-performance Processors. ACM ISCA-31, Munich, Germany, June 19-23, 2004.
[2]
A. Mendelson and J. Mandelblat et. al. CMP Implementation in Systems Based on the Intel Core Duo Processor. Intel Technology Journal, Vol. 10(2), 2006.
[3]
A.S, Leon and K.W, Tam, et. al. A Power-Efficient High-Throughput 32-Thread SPARC Processor. IEEE Journal of Solid-State Circuits, Vol. 42 No. 1, Jan., 2007.
[4]
D. E. Taylor and J. S. Turner. ClassBench: A Packet Classification Benchmark. Technical Report, WUCSE-2004-28, Department of Computer Science & Engineering, Washington University in Saint Louis, May 2004.
[5]
Duo Liu and Bei Hua and Xianghui Hu and Xinan Tang. High-performance packet classification algorithm for many-core and multithreaded network processor. In Proceedings of ACM CASES'2006, Seoul, Korea, pp. 334--344.
[6]
F. Baboescu, S. Singh, and G. Varghese. Packet Classification for Core Routers: Is there an alternative to CAMs. Technical Report, University of California, San Diego, 2003.
[7]
F. Baboescu and G. Varghese. Scalable Packet Classification. In Proceedings of ACM SIGCOMM, 2001, pp.199--210.
[8]
J. A. Kahle and M. N. Day, et. al. Introduction to the Cell Multiprocessor. IBM Journal of RES. & DEV. VOL. 49 NO. 4/5, 2005.
[9]
M. Adiletta and Mark Rosenbluth, et. al. The Next Generation of Intel IXP Network Processors. Intel Technology Journal, Vol. 6 (3), August 2002.
[10]
M. Degermark, A. Brodnik, S. Carlsson, and S. Pink. Small Forwarding Tables for Fast Routing Lookups. In Proceedings of ACM SIGCOMM '97, Cannes, France, 1997, pp.3--14.
[11]
M. Kounavis et al. Directions in Packet Classification for Network Processors. In Proceedings of Second Workshop on Network Processors (NP2), Feb. 2003.
[12]
P. Gupta and N. McKeown. Packet Classification Using Hierarchical Intelligent Cuttings. IEEE Micro, Vol. 20, No. 1, Jan.-Feb. 2000, pp.34--41.
[13]
P. Gupta and N. McKeown. Packet Classification on Multiple Fields. In Proceedings of ACM SIGCOMM, Computation Communication Review, Vol. 29, Sep. 1999, pp.147--160.
[14]
S. Singh, F. Baboescu, G. Varghese, and Jia Wang. Packet Classification Using Multidimensional Cutting. In Proceedings of ACM SIGCOMM'03, ACM Press, 2003, pp.213--224.
[15]
T. Sherwood, G. Varghese and B. Calder. A Pipelined Memory Architecture for High Throughput Network Processors. In Proceedings of ACM ISCA'03, 2003.
[16]
T. V. Lakshman and D. Stiliadis. High-speed Policy-based Packet Forwarding Using Efficient Multi-dimensional Range Matching. In Proceedings of ACM SIGCOMM98, Sep. 1998, pp. 191--202.
[17]
V. Srinivasan, S. Suri, G. Varghese, and M. Waldvogel. Fast and Scalable Layer Four Switching. In Proceedings of ACM SIGCOMM'98, Sep. 1998, pp. 203--14.
[18]
W. Eatherton, G Varghese, and Z Dittia. Tree Bitmap: Hardware/Software IP Lookups with Incremental Updates. In Proceedings of ACM SIGCOMM on Computer Communication Review, Vol. 34, Issue 2, Apr. 2004, pp.97--122.
[19]
Xianghui Hu, Xinan Tang, and Bei Hua. A High-performance IPv6 Forwarding Algorithm for a Multi-core and Multithreaded Network Processor. In Proceedings of ACM PPoPP'06, Mar. 2006, pp.168--177.
[20]
Xinan Tang and Guang R. Gao. Automatically Partitioning Threads for Multithreaded Architectures. In Journal of Parallel Distributed Computing, 1999, 58(2) pp.159--189.
[21]
Xinan Tang and Guang R. Gao. How hard is thread partitioning and how bad is a list scheduling based partitioning algorithm. In Proceedings of the tenth annual ACM symposium on Parallel Algorithms and Architectures, pp. 159--189, 1998.
[22]
Xinan Tang, J. Wang, K. Theobald, and Guang R. Gao. Thread partitioning and scheduling based on cost model. In Proceedings of the ninth annual ACM symposium on Parallel Algorithms and Architectures, pp. 272--281, 1997.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
February 2008
308 pages
ISBN:9781595937957
DOI:10.1145/1345206
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 February 2008

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. architecture
  2. embedded system design
  3. multithreading
  4. network processor
  5. packet classification
  6. thread-level parallelism

Qualifiers

  • Research-article

Conference

PPoPP08
Sponsor:

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)11
  • Downloads (Last 6 weeks)2
Reflects downloads up to 23 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)Big data analysis for detection of web brute-force attackJournal of Shenzhen University Science and Engineering10.3724/SP.J.1249.2020.9904437:Z1(44-49)Online publication date: 14-Oct-2022
  • (2011)HaRPIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2010.19522:7(1105-1119)Online publication date: 1-Jul-2011
  • (2010)BSPM: A new mechanism for “overlap-matching expressions” in DPIJournal of Electronics (China)10.1007/s11767-010-0313-y27:3(289-297)Online publication date: 12-Oct-2010
  • (2009)Hashing round-down prefixes for rapid packet classificationProceedings of the 2009 conference on USENIX Annual technical conference10.5555/1855807.1855813(6-6)Online publication date: 14-Jun-2009
  • (2009)Three different designs for packet classificationWSEAS Transactions on Computers10.5555/1718109.17181178:9(1494-1503)Online publication date: 1-Sep-2009
  • (2009)Balanced HiCutsProceedings of the WSEAES 13th international conference on Computers10.5555/1627695.1627771(406-411)Online publication date: 23-Jul-2009
  • (2009)Practice of parallelizing network applications on multi-core architecturesProceedings of the 23rd international conference on Supercomputing10.1145/1542275.1542307(204-213)Online publication date: 8-Jun-2009
  • (2009)A Timesaving Recursive Flow Packet Classification Algorithm2009 International Conference on Networks Security, Wireless Communications and Trusted Computing10.1109/NSWCTC.2009.245(442-445)Online publication date: Apr-2009
  • (2008)On RTP filtering for network traffic reductionProceedings of the 6th International Conference on Advances in Mobile Computing and Multimedia10.1145/1497185.1497261(356-359)Online publication date: 24-Nov-2008

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media