Abstract
Machine-learning-based network traffic classification is a critical technique for network management and has attracted much attention. Recently, most researchers have focused on achieving high flow classification accuracy (FCA). However, because "mice" flows outnumber "elephant" flows in the Internet, such classifiers are better suited to "mice" flows and have low byte classification accuracy (BCA). To address this issue, the notion of byte misclassification is first explored. Based on the finding that most misclassified bytes belong to the minority class, a novel method of network traffic classification is proposed that combines data re-sampling with ensemble learning. To enhance the classification accuracy of the minority class, a data re-sampling algorithm is employed to increase the number of minority class flows. Data re-sampling, however, changes the data distribution and degrades the generalization of a classifier. A boosting-style ensemble learning algorithm that takes ensemble diversity into account is therefore employed to improve generalization. Experiments conducted on real-world traffic datasets show that the proposed method achieves over 90 % BCA and 96 % FCA on average, and improves BCA by about 7.15 % compared with existing methods.
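The gap between FCA and BCA can be illustrated with a small sketch. The abstract does not define the two metrics formally, so the code below assumes the usual reading: FCA is the fraction of flows labelled correctly, and BCA is the fraction of bytes carried by correctly labelled flows. The function names, class labels, and flow sizes are hypothetical and chosen only for illustration; they are not taken from the paper's datasets.

```python
def flow_accuracy(true_labels, predicted_labels):
    """Fraction of flows whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def byte_accuracy(true_labels, predicted_labels, flow_bytes):
    """Fraction of all bytes that belong to correctly classified flows."""
    correct_bytes = sum(
        b for t, p, b in zip(true_labels, predicted_labels, flow_bytes) if t == p
    )
    return correct_bytes / sum(flow_bytes)


# Hypothetical trace: nine small "mice" flows of a majority class and
# one large "elephant" flow of a minority class.
true_labels = ["web"] * 9 + ["p2p"]
flow_bytes = [1_000] * 9 + [91_000]

# A classifier biased toward the majority class labels every flow "web".
predicted_labels = ["web"] * 10

fca = flow_accuracy(true_labels, predicted_labels)
bca = byte_accuracy(true_labels, predicted_labels, flow_bytes)
print(f"FCA = {fca:.2f}, BCA = {bca:.2f}")
```

In this toy trace the single misclassified flow belongs to the minority class but carries most of the bytes, so flow accuracy stays high while byte accuracy collapses. This is the situation that motivates raising minority-class accuracy in the proposed method.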
Acknowledgments
This work is supported by the National Natural Science Fund, China (Grant No. 61300198), Guangdong Province Natural Science Foundation (No. S2013040016582), Guangdong Higher School Scientific Innovation Project (Nos. 2013KJCX0177 and 2014KTSCX188), Fundamental Research Funds for the Central Universities (SCUT 2014ZB0029) and China Postdoctoral Science Foundation (No. 2014M552199).
Cite this article
Liu, Z., Wang, R. & Tao, M. SmoteAdaNL: a learning method for network traffic classification. J Ambient Intell Human Comput 7, 121–130 (2016). https://doi.org/10.1007/s12652-015-0310-y
DOI: https://doi.org/10.1007/s12652-015-0310-y