Heuristic nonlinear regression strategy for detecting phishing websites

Published: 01 June 2019

Abstract

In this paper, we propose a phishing website detection method that combines a meta-heuristic-based nonlinear regression algorithm with a feature selection approach. To validate the proposed method, we used a dataset of 11,055 phishing and legitimate webpages and selected 20 features to be extracted from these websites. Two feature selection methods, decision tree and wrapper, were used to select the best feature subset; the wrapper method yielded a detection accuracy as high as 96.32%. After feature selection, two meta-heuristic algorithms were implemented to predict and detect fraudulent websites: harmony search (HS), deployed on top of a nonlinear regression technique, and support vector machine (SVM). The nonlinear regression model was used to classify the websites, with its parameters obtained using the HS algorithm. The proposed HS algorithm uses a dynamic pitch adjustment rate when generating new harmonies. The HS-based nonlinear regression achieved accuracy rates of 94.13% and 92.80% for the training and test processes, respectively. The study therefore finds that HS-based nonlinear regression outperforms SVM.
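To make the idea of fitting a nonlinear regression classifier with harmony search concrete, the following is a minimal sketch and not the authors' implementation. Everything specific in it is an assumption made for illustration: the model form (a logistic function of linear and squared terms), the linearly increasing pitch adjustment rate schedule, the parameter bounds and defaults, the synthetic three-feature data with values in {-1, 0, 1}, and all function names (`predict`, `mse`, `accuracy`, `make_toy_data`, `harmony_search`).

```python
import math
import random


def predict(params, x):
    """Hypothetical model: logistic squashing of a bias plus linear and squared terms."""
    n = len(x)
    z = params[0]
    for i, xi in enumerate(x):
        z += params[1 + i] * xi + params[1 + n + i] * xi * xi
    z = max(-60.0, min(60.0, z))  # keep exp() well inside floating-point range
    return 1.0 / (1.0 + math.exp(-z))


def mse(params, X, y):
    """Mean squared error between model outputs and 0/1 labels."""
    return sum((predict(params, x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def accuracy(params, X, y):
    """Fraction of samples classified correctly at a 0.5 threshold."""
    hits = sum((predict(params, x) >= 0.5) == (t == 1) for x, t in zip(X, y))
    return hits / len(y)


def make_toy_data(n=200, seed=1):
    """Synthetic stand-in data (assumption): three features in {-1, 0, 1}."""
    rng = random.Random(seed)
    X = [[rng.choice((-1, 0, 1)) for _ in range(3)] for _ in range(n)]
    y = [1 if x[0] + x[1] - x[2] > 0 else 0 for x in X]
    return X, y


def harmony_search(X, y, dim, hms=20, hmcr=0.9, par_min=0.1, par_max=0.9,
                   bw=0.2, iters=3000, lo=-5.0, hi=5.0, seed=0):
    """Minimise mse() over `dim` parameters with a basic harmony search."""
    rng = random.Random(seed)
    # Harmony memory: hms random candidate parameter vectors and their costs.
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    costs = [mse(h, X, y) for h in memory]
    for t in range(iters):
        # Dynamic pitch adjustment rate: linear ramp (an assumed schedule).
        par = par_min + (par_max - par_min) * t / max(1, iters - 1)
        new = []
        for j in range(dim):
            if rng.random() < hmcr:
                value = memory[rng.randrange(hms)][j]  # memory consideration
                if rng.random() < par:
                    value += rng.uniform(-bw, bw)      # pitch adjustment
            else:
                value = rng.uniform(lo, hi)            # random selection
            new.append(min(hi, max(lo, value)))
        cost = mse(new, X, y)
        worst = max(range(hms), key=costs.__getitem__)
        if cost < costs[worst]:                        # replace the worst harmony
            memory[worst], costs[worst] = new, cost
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]


if __name__ == "__main__":
    X, y = make_toy_data()
    params, cost = harmony_search(X, y, dim=1 + 2 * len(X[0]))
    print("training MSE:", cost, "training accuracy:", accuracy(params, X, y))
```

The sketch evaluates only on its own training data; reproducing the reported training and test figures would require the actual dataset, the selected features, and the authors' regression model, none of which are given in the abstract.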

Published In

Soft Computing - A Fusion of Foundations, Methodologies and Applications, Volume 23, Issue 12
June 2019
662 pages
ISSN: 1432-7643
EISSN: 1433-7479

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

1. Decision tree
2. Feature selection
3. Harmony search
4. Nonlinear regression
5. Phishing
6. SVM
7. Wrapper
