MalEfficient10%: A Novel Feature Reduction Approach for Android Malware Detection

Hemant Rathore, Ajay Kharat, Rashmi T, Adithya Manickavasakam, Sanjay K. Sahay & Mohit Sewak

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 511)

Included in the following conference series:

International Conference on Broadband Communications, Networks and Systems

Abstract

The Android OS has gained immense popularity among smartphone users, and this popularity has attracted many malware developers, leading to a large number of malicious applications in the ecosystem. Many recent reports suggest that conventional signature-based malware detection fails to protect Android smartphones from new and sophisticated malware attacks. Researchers are therefore exploring machine learning-based malware detection systems that can discriminate between malware and benign applications both effectively and efficiently. Existing literature suggests that many machine learning-based models use large feature sets for malware detection. However, classification models based on a large number of features are computationally expensive, time-consuming, and generalize poorly. This paper therefore proposes a reliable feature reduction approach that selects the most prominent features for effective and efficient malware detection. The proposed approach is tested on two different datasets, three distinct feature types, and twenty-six unique classifiers. The twenty-six baseline malware detection models, based on 724 features and thirteen classification algorithms, achieved an average accuracy of $94.73\%$ and an average AUC of $94.49\%$. We then performed feature reduction that works with mutually exclusive and merged feature spaces of Android permissions, intents, and opcodes. The proposed approach reduced the number of features from 724 to 72 ($10\%$ of the original features). We also list the reduced set of 72 features, comprising Android permissions, intents, and opcodes, used for malware detection. The twenty-six malware detection models based on the reduced features achieved an average accuracy of $93.12\%$ and an average AUC of $92.97\%$. The feature reduction thus leads to a reduction of less than $2\%$ in average accuracy and AUC. However, it reduces the average test time by $85.25\%$ and the average training time by $91.45\%$ across the twenty-six Android malware detection models. Feature reduction therefore costs a minute amount of effectiveness but yields far more time-efficient malware detection models.
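
The abstract does not state which criterion ranks the features, so the following is only a minimal sketch of the general idea of keeping the top 10% of a binary feature space (724 features reduced to 72). The scoring rule (absolute difference in how often a feature occurs in malware versus benign samples), the synthetic data, and the names `select_top_fraction` and `make_synthetic` are all assumptions made for illustration; they are not the authors' method or data.

```python
import random


def select_top_fraction(X, y, fraction=0.10):
    """Return sorted indices of the top `fraction` of features.

    X: list of rows, each a list of 0/1 feature values.
    y: list of labels (1 = malware, 0 = benign).
    Score (a stand-in, NOT the paper's criterion): absolute difference
    between a feature's frequency in malware and in benign samples.
    """
    n_features = len(X[0])
    n_mal = sum(y)
    n_ben = len(y) - n_mal
    scores = []
    for j in range(n_features):
        mal_freq = sum(row[j] for row, lab in zip(X, y) if lab == 1) / n_mal
        ben_freq = sum(row[j] for row, lab in zip(X, y) if lab == 0) / n_ben
        scores.append(abs(mal_freq - ben_freq))
    k = int(n_features * fraction)  # 724 features -> 72 kept
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def make_synthetic(n_samples=200, n_features=724, n_informative=20, seed=0):
    """Build hypothetical binary data: the first `n_informative` features
    agree with the label 90% of the time, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.randint(0, 1) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] = label if rng.random() < 0.9 else 1 - label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_synthetic()
selected = select_top_fraction(X, y)
print(len(selected))  # 72
```

In this sketch the reduced index list would then be used to slice the feature matrix before training any classifier, which is where the reported savings in training and test time would come from.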

Author information

Authors and Affiliations

Department of CS and IS, BITS Pilani, K K Birla Goa Campus, Goa, India
Hemant Rathore, Ajay Kharat, Rashmi T, Adithya Manickavasakam, Sanjay K. Sahay & Mohit Sewak

Corresponding author

Correspondence to Hemant Rathore.

Editor information

Editors and Affiliations

Harbin Engineering University, Harbin, Heilongjiang, China
Wei Wang
Shanghai Jiao Tong University, Shanghai, China
Jun Wu

About this paper

Cite this paper

Rathore, H., Kharat, A., T, R., Manickavasakam, A., Sahay, S.K., Sewak, M. (2023). MalEfficient10%: A Novel Feature Reduction Approach for Android Malware Detection. In: Wang, W., Wu, J. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 511. Springer, Cham. https://doi.org/10.1007/978-3-031-40467-2_5

DOI: https://doi.org/10.1007/978-3-031-40467-2_5
Published: 30 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40466-5
Online ISBN: 978-3-031-40467-2
eBook Packages: Computer Science (R0)
