Reviewing Traffic Classification

Silvio Valenti, Dario Rossi, Alberto Dainotti, Antonio Pescapè, Alessandro Finamore & Marco Mellia

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 7754)

Abstract

Traffic classification has received increasing attention in recent years. It aims to automatically recognize the application that generated a given stream of packets through direct, passive observation of the individual packets, or streams of packets, flowing in the network. This ability is instrumental to a number of activities of great interest to carriers, Internet service providers, and network administrators in general. Indeed, traffic classification is the basic building block required to enable any traffic management operation, from differentiated traffic pricing and treatment (e.g., policing, shaping) to security operations (e.g., firewalling, filtering, anomaly detection).

Until a few years ago, almost every Internet application used well-known transport-layer protocol ports, which made it easy to identify. More recently, the number of applications using random or non-standard ports has increased dramatically (e.g., Skype, BitTorrent, VPNs). Moreover, network applications are often configured to use well-known ports assigned to other applications (e.g., TCP port 80, originally reserved for Web traffic) in an attempt to disguise their presence.
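To make the limitation concrete, the following is a minimal sketch of port-based identification, not code from the chapter. The port table is a small hand-picked subset of well-known assignments, and the function name `classify_by_port` is hypothetical.

```python
# Hypothetical port table: a tiny subset of well-known assignments,
# included for illustration only.
WELL_KNOWN_PORTS = {
    ("tcp", 25): "smtp",
    ("tcp", 80): "http",
    ("tcp", 443): "https",
    ("udp", 53): "dns",
}


def classify_by_port(protocol, src_port, dst_port):
    """Return the label registered for either port, or "unknown"."""
    proto = protocol.lower()
    # Check the destination port first, then the source port.
    for port in (dst_port, src_port):
        label = WELL_KNOWN_PORTS.get((proto, port))
        if label is not None:
            return label
    return "unknown"


# A flow towards TCP port 80 is labeled "http" regardless of which
# application actually generated it.
print(classify_by_port("tcp", 51514, 80))
# A flow between two arbitrary high ports cannot be labeled at all.
print(classify_by_port("udp", 40211, 38456))
```

Both failure modes described above show up here: an application that picks random ports falls through to "unknown", and one that borrows port 80 is silently mislabeled.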

For these reasons, and given the importance of correctly classifying traffic flows, novel approaches based respectively on packet inspection, statistical and machine learning techniques, and behavioral methods have been investigated and are becoming standard practice. In this chapter, we discuss the main trends in the field of traffic classification and describe some of the main proposals of the research community.

We complete this chapter by developing two examples of behavioral classifiers: both use supervised machine learning algorithms for classification, but each relies on different features to describe the traffic. After presenting them, we compare their performance on a large dataset, showing the benefits and drawbacks of each approach.
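As a rough illustration of what supervised classification over behavioral features looks like, here is a minimal sketch. It is an assumption-laden toy and not the classifiers developed in the chapter: the learning algorithm is a simple nearest-centroid rule chosen for brevity, the two features are hypothetical, and the training data is synthetic.

```python
import math


def train_centroids(samples):
    """Compute one centroid per label from (feature_vector, label) pairs."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lbl: math.dist(centroids[lbl], features))


# Synthetic training set. Each vector holds two hypothetical behavioral
# features of an endpoint, both normalized to [0, 1].
TRAINING = [
    ((0.9, 0.1), "app_a"),
    ((0.8, 0.2), "app_a"),
    ((0.2, 0.7), "app_b"),
    ((0.1, 0.9), "app_b"),
]

centroids = train_centroids(TRAINING)
print(classify(centroids, (0.7, 0.3)))
print(classify(centroids, (0.3, 0.6)))
```

Swapping the feature vectors while keeping the same learning rule is what allows two classifiers of this kind to be compared on equal footing.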

Author information

Authors and Affiliations

Telecom ParisTech, France
Silvio Valenti & Dario Rossi
Università di Napoli Federico II, Italy
Alberto Dainotti & Antonio Pescapè
Politecnico di Torino, Italy
Alessandro Finamore & Marco Mellia
Google, Inc., USA
Silvio Valenti
CAIDA, UC San Diego, USA
Alberto Dainotti

Editor information

Editors and Affiliations

Networking and Security Department, Eurécom, 450 Route des Chappes, 06410, Biot, France
Ernst Biersack
Department of Information Engineering, University of Pisa, Via Caruso 16, 56122, Pisa, Italy
Christian Callegari
Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000, Zagreb, Croatia
Maja Matijasevic

About this chapter

Cite this chapter

Valenti, S., Rossi, D., Dainotti, A., Pescapè, A., Finamore, A., Mellia, M. (2013). Reviewing Traffic Classification. In: Biersack, E., Callegari, C., Matijasevic, M. (eds) Data Traffic Monitoring and Analysis. Lecture Notes in Computer Science, vol 7754. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36784-7_6

DOI: https://doi.org/10.1007/978-3-642-36784-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36783-0
Online ISBN: 978-3-642-36784-7
eBook Packages: Computer Science (R0)