research-article

Exploring Generative Adversarial Networks for Augmenting Network Intrusion Detection Tasks

Published: 23 December 2024

Abstract

The advent of generative networks and their adoption across numerous domains and communities have led to a wave of innovation and breakthroughs in AI and machine learning. Generative Adversarial Networks (GANs) have expanded the scope of what is possible with machine learning, enabling new applications in areas such as computer vision, natural language processing, and creative AI. They have been used for a wide range of tasks, including image and video generation, data augmentation, style transfer, and anomaly detection, as well as in medical imaging and drug discovery, where they can generate synthetic data to augment small datasets, reduce the need for expensive experiments, and lower the number of real patients that must be included in medical trials. Given these developments, we propose using GANs to create and augment flow-based network traffic datasets. We evaluate a series of GAN architectures, including Wasserstein, conditional, energy-based, gradient penalty, and LSTM-GANs, on flow-based network traffic data collected from 16 subjects who used their computers for home, work, and study purposes. We report their performance using metrics that capture networking principles, the data distribution across a collection of flows, and the temporal data distribution. Network intrusion detection datasets tend to be highly imbalanced, i.e., they contain a large number of samples in the “normal traffic” category and comparatively few samples in the “intrusion” categories. We therefore test our GANs by augmenting the intrusion data and checking whether this helps intrusion detection neural networks in their task. We publish the resulting UPBFlow dataset and code on GitHub.
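The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy flow features (duration, packets, bytes), and the target count are all assumptions made for illustration, and the "generator" is a simple jitter-based placeholder standing in for a trained GAN generator.

```python
import random
from collections import Counter


def augment_minority(samples, labels, minority_label, generate, target_count):
    """Return new lists in which the minority class has at least target_count rows.

    `generate` is any zero-argument callable that returns one synthetic sample.
    """
    current = sum(1 for y in labels if y == minority_label)
    needed = max(0, target_count - current)
    synthetic = [generate() for _ in range(needed)]
    return samples + synthetic, labels + [minority_label] * needed


def make_placeholder_generator(real_minority, noise=0.05, seed=0):
    """Hypothetical stand-in for a trained GAN generator (assumption).

    It picks a real minority sample and perturbs each feature by up to
    +/- `noise` (relative); a real setup would sample from the GAN instead.
    """
    rng = random.Random(seed)

    def generate():
        base = rng.choice(real_minority)
        return [v * (1.0 + rng.uniform(-noise, noise)) for v in base]

    return generate


# Toy flow records: [duration_s, packets, bytes] (illustrative values only).
flows = [
    [1.2, 10, 840],
    [0.8, 7, 560],
    [2.5, 30, 4200],
    [0.4, 3, 180],
    [9.0, 400, 52000],
]
labels = ["normal", "normal", "normal", "normal", "intrusion"]

intrusions = [f for f, y in zip(flows, labels) if y == "intrusion"]
gen = make_placeholder_generator(intrusions)

# Bring the "intrusion" class up to the size of the "normal" class.
aug_flows, aug_labels = augment_minority(flows, labels, "intrusion", gen, target_count=4)
print(Counter(aug_labels))
```

The augmented lists would then be used to train an intrusion detection classifier, and its results compared against a classifier trained on the original, imbalanced data.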



      Published In

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 21, Issue 1
      January 2025
      860 pages
      EISSN:1551-6865
      DOI:10.1145/3703004

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 23 December 2024
      Online AM: 18 September 2024
      Accepted: 07 August 2024
      Revised: 05 August 2024
      Received: 24 May 2023
      Published in TOMM Volume 21, Issue 1


      Author Tags

      1. generative networks
      2. network traffic
      3. network flow

      Qualifiers

      • Research-article

      Funding Sources

      • European Excellence Centre for Media, Society and Democracy, H2020 ICT-48-2020
      • National Association of Technical Universities - GNAC ARUT

