research-article

Open access

Lumen: a framework for developing and evaluating ML-based IoT network anomaly detection

Authors:

Rahul Anand Sharma,

Maria Apostolaki,

Vyas Sekar

CoNEXT '22: Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies

Pages 59 - 71

https://doi.org/10.1145/3555050.3569129

Published: 30 November 2022

Abstract

The rise of IoT devices brings many security risks. To mitigate them, researchers have introduced various promising network-based anomaly detection algorithms, which often leverage machine learning. Unfortunately, their deployment and further improvement by network operators and the research community are hampered. We believe this is due to three key reasons. First, known ML-based anomaly detection algorithms are evaluated, in the best case, on a couple of publicly available datasets, making it hard to compare across algorithms. Second, each ML-based IoT anomaly detection algorithm makes assumptions about attacker practices or classification granularity, which reduce its applicability. Finally, the implementation of those algorithms is often monolithic, prohibiting code reuse. To ease deployment and promote research in this area, we present Lumen. Lumen is a modular framework paired with a benchmarking suite that allows users to efficiently develop, evaluate, and compare ML-based IoT anomaly detection algorithms. We demonstrate the utility of Lumen by implementing state-of-the-art anomaly detection algorithms and faithfully evaluating them on various datasets. Among other insights that could inform real-world deployments and future research, Lumen allowed us to identify which algorithms are most suitable for detecting particular types of attacks. Lumen can also be used to construct new algorithms with better performance by combining the building blocks of competing efforts and improving the training setup.

Published In

CoNEXT '22: Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies

November 2022

431 pages

ISBN:9781450395083

DOI:10.1145/3555050

General Chairs: Giuseppe Bianchi (University of Rome Tor Vergata, Italy), Alessandro Mei (Sapienza University of Rome, Italy)

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGCOMM: ACM Special Interest Group on Data Communication

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CoNEXT '22

Sponsor:

SIGCOMM

CoNEXT '22: The 18th International Conference on emerging Networking EXperiments and Technologies

December 6 - 9, 2022

Roma, Italy

Acceptance Rates

CoNEXT '22 Paper Acceptance Rate 28 of 151 submissions, 19%;

Overall Acceptance Rate 198 of 789 submissions, 25%
