
Advances in Intelligent Systems and Computing 1413

V. Sivakumar Reddy
V. Kamakshi Prasad
Jiacun Wang
K. T. V. Reddy Editors

Soft Computing
and Signal
Processing
Proceedings of 4th ICSCSP 2021
Advances in Intelligent Systems and Computing

Volume 1413

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de
Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen , Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
Indexed by DBLP, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and
Technology Agency (JST).
All books published in the series are submitted for consideration in Web of Science.
For proposals from Asia please contact Aninda Bose (aninda.bose@springer.com).

More information about this series at https://link.springer.com/bookseries/11156


V. Sivakumar Reddy · V. Kamakshi Prasad ·
Jiacun Wang · K. T. V. Reddy
Editors

Soft Computing and Signal Processing
Proceedings of 4th ICSCSP 2021
Editors
V. Sivakumar Reddy
Department of Electronics and Communication Engineering
Malla Reddy College of Engineering and Technology
Hyderabad, Telangana, India

V. Kamakshi Prasad
Department of Computer Science and Engineering
Jawaharlal Nehru Technological University Hyderabad
Hyderabad, Telangana, India

Jiacun Wang
Department of Computer Science and Software Engineering
Monmouth University
West Long Branch, NJ, USA

K. T. V. Reddy
Department of Electronics and Communication Engineering
Sir Visvesvaraya Institute of Technology
Nashik, Maharashtra, India

ISSN 2194-5357 ISSN 2194-5365 (electronic)


Advances in Intelligent Systems and Computing
ISBN 978-981-16-7087-9 ISBN 978-981-16-7088-6 (eBook)
https://doi.org/10.1007/978-981-16-7088-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Conference Committee

Chief Patron

Sri. Ch. Malla Reddy, Hon’ble Minister, Govt. of Telangana, Founder Chairman,
MRGI

Patrons

Sri. Ch. Mahendar Reddy, Secretary, MRGI


Sri. Ch. Bhadra Reddy, President, MRGI

Conference Chair

Dr. V. S. K. Reddy, Director

Convener

Dr. S. Srinivasa Rao, Principal

Publication Chair

Dr. Suresh Chandra Satapathy, Professor, KIIT, Bhubaneswar


Co-Convener

Dr. P. H. V. Sesha Talpa Sai, Dean R&D, MRCET

Organizing Chair

Prof. P. Sanjeeva Reddy, Dean, International Studies

Organizing Secretaries

Dr. T. Venu Gopal, HOD, CSE


Dr. G. Sharada, HOD, IT
Dr. K. Mallikarjuna Lingam, HOD, ECE

Coordinators

Dr. S. Shanthi, Professor, CSE


Dr. M. Jayapal, Associate Professor, CSE
Mr. M. Vazralu, Associate Professor, IT
Dr. K. M. Rayudu, Associate Professor, CSE
Dr. G. S. Naveen Kumar, Associate Professor, ECE
Dr. N. Subash, Associate Professor, ECE
Mrs. Anitha Patibandla, Associate Professor, ECE

Organizing Committee

Prof. K. Kailasa Rao, Dean, Placements


Prof. K. Subhas, Director and Professor, EEE
Dr. M. Sucharitha, Professor, ECE
Dr. V. Chandrasekhar, Professor, CSE, MRCET
Dr. A. Mummoorthy, Professor, IT
Ms. P. Anitha, Associate Professor, ECE
Ms. Renju Panicker, Assistant Professor, ECE
Sri. B. Rajeswar Reddy, Administrative Officer

Web Developer

Mr. K. Sudhakar Reddy, Assistant Professor, IT

Session Chairs

Dr. Hushairi Zen, Professor, ECE, UNIMAS, Malaysia


Dr. S. L. Sabat, Professor, University of Hyderabad
Dr. Mukil Alagirisamy, Professor, Lincoln University College, Malaysia
Dr. Bharat Gupta, Associate Professor, NIT Patna
Dr. M. Ramakrishna Murthy, ANITS, Visakhapatnam
Dr. Jiacun Wang, Monmouth University, USA
Dr. Ramamurthy Garimella, Mahindra University, Hyderabad
Dr. Divya Midhun Chakravarthy, Lincoln University College, Malaysia
Dr. Naga Mallikarjuna Rao Dasari, Federation University, Australia
Dr. G. Sharada, HOD, IT, MRCET
Dr. S. Shanthi, HOD, CSE, MRCET
Dr. K. Mallikarjuna Lingam, HOD, ECE
Dr. N. S. Gowri Ganesh, Professor, IT, MRCET
Dr. R. Roopa Chandrika, Professor, IT, MRCET
Dr. V. Chandrasekar, Professor, CSE, MRCET
Dr. Viswabharathy, Professor, CSE, MRCET

Proceedings Committee

Mr. N. Sivakumar, Assistant Professor, CSE


Mrs. R. Radha, Associate Professor, CSE
Mrs. M. Gayatri, Assistant Professor, CSE
Mrs. Renju Panicker, Assistant Professor, ECE
Ms. D. Asha, Assistant Professor, ECE
Mrs. Neha Thakur, Assistant Professor, ECE

Technical Program Committee

Dr. K. M. Rayudu, Associate Professor, CSE


Mr. M. Sandeep, Associate Professor, CSE
Mr. P. Bikshapathy, Associate Professor, CSE
Mr. M. Sambasivudu, Associate Professor, CSE

Dr. M. Jayapal, Associate Professor, CSE


Mr. K. Srikanth, Associate Professor, CSE
Mr. K. Sudhakar Reddy, Assistant Professor, IT
Mr. P. Praveen, Assistant Professor, IT
Mr. P. V. Naresh, Assistant Professor, IT
Mrs. K. Srilakshmi, Assistant Professor, IT
Mrs. M. Anusha, Assistant Professor, ECE
Mr. T. Srinivas, Assistant Professor, ECE
Mr. M. Arun Kumar, Assistant Professor, ECE
Mr. Ch. Kiran Kumar, Assistant Professor, ECE

Publicity Committee

Ms. D. Radha, Associate Professor, CSE


Mrs. R. Sujatha, Assistant Professor, CSE
Mrs. Pavani, Assistant Professor, CSE
Mr. A. Yogananda, Assistant Professor, IT
Mr. V. Narsingrao, Assistant Professor, IT
Mrs. P. Swetha, Assistant Professor, ECE
Mr. M. Sreedhar Reddy, Assistant Professor, ECE
Mrs. G. Vaidehi, Assistant Professor, ECE
Mr. K. Suresh, Assistant Professor, ECE
Mr. V. Shivraj Kumar, Assistant Professor, ECE
Mr. Mahendar Reddy, Assistant Professor, ECE
Mr. N. Suresh, Assistant Professor, ECE

Registration Committee

Mrs. Gayatri, Associate Professor, CSE


Mrs. Suneetha, Assistant Professor, CSE
Mrs. Shanthi Priya, Assistant Professor, CSE
Mrs. Nirosha, Assistant Professor, CSE
Mr. Ch. Naveen Kumar Reddy, Assistant Professor, CSE
Mr. Mahendar, Assistant Professor, CSE
Mrs. K. Swetha, Assistant Professor, IT
Mrs. T. Shilpa, Assistant Professor, IT
Mrs. N. Saritha, Assistant Professor, ECE
Mrs. S. Aruna Kumari, Assistant Professor, ECE

Hospitality Committee

Mr. T. Satish Kumar, Associate Professor, MBA


Mr. A. Syam Prasad, Associate Professor, CSE
Mr. G. Ravi, Associate Professor, CSE
Mr. M. Vijay Kumar, Assistant Professor, CSE
Mr. M. Venu, Assistant Professor, CSE
Ms. Shruthi Rani Yadav, Assistant Professor, CSE
Mrs. B. Aruna, Associate Professor, IT
Mr. R. Chinna Rao, Assistant Professor, ECE
Mr. K. L. N. Prasad, Assistant Professor, ECE

Certificate Committee

Mr. Manoj Kumar, Assistant Professor, CSE


Mr. Satish, Assistant Professor, CSE
Mr. Siva Ratna Sai, Assistant Professor, CSE
Mr. D. Subbarao, Assistant Professor, IT
Mrs. N. Prameela, Assistant Professor, IT
Mr. M. Ramanjaneyulu, Associate Professor, ECE
Mr. K. D. K. Ajay, Assistant Professor, ECE

Decoration Committee

Mrs. Radha, Associate Professor, CSE


Mrs. Honey Diana, Assistant Professor, CSE
Mrs. G. Roopa, Assistant Professor, IT
Mrs. P. Swetha, Assistant Professor, IT
Mrs. P. Sampurnima, Assistant Professor, IT
Mrs. K. Navya, Assistant Professor, IT
Mrs. N. Nagma, Assistant Professor, ECE
Mrs. N. Swetha, Assistant Professor, ECE
Mrs. S. Deepika, Assistant Professor, ECE

Transportation Committee

Mr. V. Kamal, Associate Professor, CSE


Mr. P. Dileep, Associate Professor, CSE

Mr. G. Ravi, Associate Professor, CSE


Mr. Saleem, Assistant Professor, CSE
Mr. P. Harikrishna, Assistant Professor, IT
Mr. T. Srinidhi, Assistant Professor, IT
Mr. M. Anantha Guptha, Assistant Professor, ECE

International and National Advisory Committee

Dr. Heggere Ranganath, Chair of CS, University of Alabama, Huntsville, USA


Dr. Someswar Kesh, Professor, Department of CISA, University of Central Missouri,
USA
Mr. Alex Wong, Senior Technical Analyst, Diligent Inc., USA
Dr. Bhaskar Kura, Professor, University of New Orleans, USA
Dr. Ch. Narayana Rao, Scientist, Denver, Colorado, USA
Dr. Arun Kulkarni, Professor, University of Texas at Tyler, USA
Dr. Sam Ramanujan, Professor, Department of CIS and IT, University of Central
Missouri, USA
Dr. Richard H. Nader, Associate Vice President, Mississippi State University, USA
Prof. Peter Walsh, Head of the Department, Vancouver Film School, Canada
Dr. Ram Balalachandar, Professor, University of Windsor, Canada
Dr. Asoke K. Nandi, Professor, Department of EEE, University of Liverpool, UK
Dr. Vinod Chandran, Professor, Queensland University of Technology, Australia
Dr. Amiya Bhaumik, Vice Chancellor, Lincoln University College, Malaysia
Prof. Soubarethinasamy, UNIMAS International, Malaysia
Dr. Sinin Hamdan, Professor, UNIMAS
Dr. Hushairi bin Zen, Professor, ECE, UNIMAS
Dr. Bhanu Bhaskara, Professor, Majmaah University, Saudi Arabia
Dr. Narayanan, Director, ISITI, CSE, UNIMAS
Dr. Koteswararao Kondepu, Research Fellow, Scuola Superiore Sant’Anna, Pisa,
Italy
Shri B. H. V. S. Narayana Murthy, Director, RCI, Hyderabad
Prof. P. K. Biswas, Head, Department of E&ECE, IIT Kharagpur
Dr. M. Ramasubba Reddy, Professor, IIT Madras
Prof. N. C. Shiva Prakash, Professor, IISc, Bengaluru
Dr. B. Lakshmi, Professor, Department of ECE, NIT Warangal
Dr. Y. Madhavee Latha, Professor, Department of ECE, MRECW, Hyderabad
Dr. G. Ram Mohana Reddy, Professor and Head, Department of IT, NITK Suratkal,
Mangalore, India
Preface

The International Conference on Soft Computing and Signal Processing (ICSCSP-
2021) was successfully organized by Malla Reddy College of Engineering and
Technology, a UGC Autonomous Institution, during June 18–19, 2021, at Hyderabad.
The objective of this conference was to provide opportunities for researchers,
academicians, and industry professionals to interact, exchange ideas and experience,
and gain expertise in the cutting-edge technologies pertaining to soft computing and
signal processing. Research papers in the above-mentioned technology areas were
received and subjected to a rigorous peer review process with the help of program
committee members and external reviewers. ICSCSP-2021 received a total of 340
papers; each paper was reviewed by more than two reviewers, and finally, 75
papers were accepted for publication in the Springer AISC series.
Our sincere thanks to our Chief Guest Dr. Suresh Chandra Satapathy, Professor
and Dean R&D, KIIT; Dr. Aninda Bose, Senior Editor, Springer Publications, India;
Dr. Amiya Bhaumik, President, LUC, Malaysia; Dr. Jiacun Wang, USA; Prof. Akihiko
Hanafusa, USA; Dr. Nguyen Dang Nam, Vietnam; and Dr. Naeem M. S. Honnoon,
Malaysia, for extending their support and cooperation.
We would like to express our gratitude to all session chairs, viz. Dr. M. Ramakr-
ishna Murthy, ANITS, Visakhapatnam; Dr. Jiacun Wang, Monmouth University,
USA; Dr. Ramamurthy Garimella, Mahindra University, Hyderabad; Dr. Divya
Midhun Chakravarthy, Lincoln University College, Malaysia; Dr. Naga Mallikarjuna
Rao Dasari, Federation University, Australia; Dr. Hushairi Zen, UNIMAS, Malaysia;
Dr. Samrat Lagnajeet Sabat, University of Hyderabad; Dr. Bharat Gupta, NIT Patna;
Dr. Mukil Alagirisamy, Lincoln University College, Malaysia; Dr. G. Sharada, HOD,
IT, MRCET; Dr. T. Venu Gopal, HOD, CSE; Dr. K. Mallikarjuna Lingam, HOD,
ECE; Dr. S. Shanthi, Professor, CSE, MRCET; Dr. N. S. Gowri Ganesh, Professor, IT,
MRCET; Dr. V. Chandrasekar, Professor, CSE, MRCET; Dr. A. M. Viswa Bharathy,
Professor, CSE, MRCET, for extending their support and cooperation.
We are indebted to the program committee members and external reviewers who
have produced critical reviews in a short time. We would like to express our special
gratitude to publication chair Dr. Suresh Chandra Satapathy, KIIT, Bhubaneswar,


for his valuable support and encouragement till the successful conclusion of the
conference.
We express our heartfelt thanks to our Chief Patron Sri. Ch. Malla Reddy,
Founder Chairman, MRGI, Patrons Sri. Ch. Mahendar Reddy, Secretary, MRGI,
Sri. Ch. Bhadra Reddy, President, MRGI, Convener Prof. P. Sanjeeva Reddy, Dean,
International Studies, and Dr. T. Venugopal, Dean, MRCET.
We would also like to thank the Organizing Secretaries Dr. K. Mallikarjuna Lingam, HOD,
ECE, Dr. T. Venu Gopal, HOD, CSE, and Dr. G. Sharada, HOD, IT, for their valuable
contribution. Our thanks also go to all coordinators and the organizing committee as well
as all the other committee members for their contribution to the successful conduct of
the conference.
Last but certainly not least, our special thanks to all the authors without whom
the conference would not have taken place. Their technical contributions have made
our proceedings rich and praiseworthy.

Hyderabad, India V. Sivakumar Reddy


Hyderabad, India V. Kamakshi Prasad
West Long Branch, NJ, USA Jiacun Wang
Nashik, India K. T. V. Reddy
Contents

Data Preprocessing and Finding Optimal Value of K for KNN Model . . . 1


Roopashri Shetty, M. Geetha, Dinesh U. Acharya, and G. Shyamala
Prediction of Cardiac Diseases Using Machine Learning Algorithms . . . 11
J. Suneetha, Husna Tabassum, Pundalik Chavan, and N. R. Deepak
A Comprehensive Approach to Misinformation Analysis
and Detection of Low-Credibility News . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Meghana Mukunda Joshi, Niyathi Srinivasan Kumbale,
Nikhil S. Shastry, Mohammed Omar Khan, and N. Nagarathna
Evaluation of Machine Learning Algorithms
for Electroencephalography-Based Epileptic Seizure State
Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Vibha Patel, Jaishree Tailor, and Amit Ganatra
Lung Disease Detection and Classification from Chest X-Ray
Images Using Adaptive Segmentation and Deep Learning . . . . . . . . . . . . . 49
Shimpy Goyal and Rajiv Singh
A Quantitative Analysis for Breast Cancer Prediction Using
Artificial Neural Network and Support Vector Machine . . . . . . . . . . . . . . . 59
Harnehmat Walia and Prabhpreet Kaur
Heart Disease Prediction Using Deep Learning Algorithm . . . . . . . . . . . . . 83
Gouthami Velakanti, Shivani Jarathi, Malladi Harshini,
Praveen Ankam, and Shankar Vuppu
Tracking Misleading News of COVID-19 Within Social Media . . . . . . . . . 97
Mahboob Massoudi and Rahul Katarya
Energy aware Multi-chain PEGASIS in WSN: A Q-Learning
Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Ranadeep Dey, Parag Kumar, and Guha Thakurta


Textlytic: Automatic Project Report Summarization Using NLP
Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Riya Menon, Namrata Tolani, Gauravi Tolamatti, Akansha Ahuja,
and R. L. Priya
Management of Digital Evidence for Cybercrime
Investigation—A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Chougule Harshwardhan, Dhadiwal Sunny, Lokhande Mehul,
Naikade Rohit, and Rachana Patil
Real-Time Human Pose Detection and Recognition Using
MediaPipe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Amritanshu Kumar Singh, Vedant Arvind Kumbhare, and K. Arthi
Charge the Missing Data with Synthesized Data by Using SN-Sync
Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Yeswanth Surya Srikar Nuchu and Srinivasa Rao Narisetty
Discovery of Popular Languages from GitHub Repository: A Data
Mining Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
K. Jyothi Upadhya, B. Dinesh Rao, and M. Geetha
Performance Analysis of Flower Pollination Algorithms Using
Statistical Methods: An Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Pratosh Bansal and Sameer Bhave
Counterfactual Causal Analysis on Structured Data . . . . . . . . . . . . . . . . . . 187
Swarna Kamal Paul, Tauseef Jamal Firdausi, Saikat Jana,
Arunava Das, and Piyush Nandi
Crime Analysis Using Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Sree Rama Chandra Murthy Akuri, Manikanta Tikkisetty,
Nandini Dimmita, Lokesh Aathukuri, and Shivani Rayapudi
Multi-model Neural Style Transfer (MMNST) for Audio and Image . . . . 205
B. Vishal, K. G. Sriram, and T. Sujithra
Forecasting of COVID-19 Using Supervised Machine Learning
Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Y. Vijay Bhaskar Reddy, Vyshnavi Adusumalli,
Venkata Bharath Krishna Boggavarapu, Mahesh Babu Bale,
and Archana Challa
Feature Extraction from Radiographic Skin Cancer Data Using
LRCS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
V. S. S. P. Raju Gottumukkala, N. Kumaran, and V. Chandra Sekhar
Shared Filtering-Based Advice of Online Group Voting . . . . . . . . . . . . . . . 251
Madhari Kalyan and M. Sandeep

Mining Challenger from Bulk Preprocessing Datasets . . . . . . . . . . . . . . . . . 257


A. Sreelekha and P. Dileep
Prioritized Load Balancer for Minimization of VM and Data
Transfer Cost in Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Sudheer Mangalampalli, Pokkuluri Kiran Sree, K. V. Narayana Rao,
Anuj Rapaka, and Ravi Teja Kocherla
Smart Underground Drainage Management System Using Internet
of Things . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
K. Venkata Murali Mohan, K. M. V. Madan Kumar, Sarangam Kodati,
and G. Ravi
IoT-based System for Health Monitoring of Arrhythmia Patients
Using Machine Learning Classification Techniques . . . . . . . . . . . . . . . . . . . 283
Sarangam Kodati, Kumbala Pradeep Reddy, G. Ravi, and Nara Sreekanth
EHR-Sec: A Blockchain Based Security System for Electronic
Health . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Siddhesh Deore, Ruturaj Bachche, Aditya Bichave, and Rachana Patil
End-to-End Speaker Verification for Short Utterances . . . . . . . . . . . . . . . . 305
S. Ranjana, J. Priya, P. S. Reenu Rita, and B. Bharathi
A Comprehensive Analysis on Multi-class Imbalanced Big Data
Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
R. Madhura Prabha and S. Sasikala
Efficient Recommender System for Kid’s Hobby Using Machine
Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Sonali Sagarmal Lunawat, Abduttayyeb Rampurawala, Sneha Pujari,
Siddhi Thawal, Jui Pangare, Chetana Thorat, and Bhushan Munot
Efficient Route Planning Supports Road Cache . . . . . . . . . . . . . . . . . . . . . . . 337
Siddi Madhusudhan Rao and M. Vijayakamal
Programming Associative Memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
Garimella Ramamurthy, Tata Jagannadha Swamy,
and Yaminidhar Reddy
Novel Associative Memories Based on Spherical Separability . . . . . . . . . . 351
Garimella Ramamurthy and Tata Jagannadha Swamy
An Intelligent Fog-IoT-Based Disease Diagnosis Healthcare System . . . . 359
Chandan Kumar Roy and Ritesh Sadiwala
Pre-processing of Linguistic Divergence in English-Marathi
Language Pair in Machine Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Simran N. Maniyar, Sonali B. Kulkarni, and Pratibha R. Bhise

Futuristic View of Internet of Things and Applications
with Prospective Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
M. Prasad, T. K. Vijay, Morukurthi Sreenivasu,
and Bekele Worku Agajyelew
Identifying and Eliminating the Misbehavior Nodes in the Wireless
Sensor Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
Navaneethan Selvaraj, E. S. Madhan, and A. Kathirvel
Deep Learning Approach for Image-Based Plant Species
Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
E. Venkateswara Reddy, G. S. Naveen Kumar, Baggam Swathi,
and G. Siva Naga Dhipti
Inventory, Storage and Routing Optimization with Homogenous
Fleet in the Secondary Distribution Network Using a Hybrid VRP,
Clustering and MIP Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
Akansha Kumar and Ameya Munagekar
Evaluation and Comparison of Various Static and Dynamic Load
Balancing Strategies Used in Cloud Computing . . . . . . . . . . . . . . . . . . . . . . 429
Homera Durani and Nirav Bhatt
Dielectric Resonator Antenna with Hollow Cylinder for Wide
Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
Gaurav Kumar and Rajveer Singh Yaduvanshi
Recent Techniques in Image Retrieval: A Comprehensive Survey . . . . . . 447
K. D. K. Ajay and V. Malleswara Rao
Medical Image Fusion Based on Energy Attribute and PA-PCNN
in NSST Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
K. Vanitha, D. Satyanarayana, and M. N. Giri Prasad
Electrical Shift and Linear Trend Artifacts Removal from Single
Channel EEG Using SWT-GSTV Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Sayedu Khasim Noorbasha and Gnanou Florence Sudha
Forecasting Hourly Electrical Energy Output of a Power Plant
Using Parametric Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
Ch. V. Raghavendran, G. Naga Satish, Vempati Krishna,
and R. V. S. Lalitha
SOI FinFET-Based 6T SRAM Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
V. Vijayalakshmi and B. Mohan Kumar Naik
Cataract Detection Using Deep Convolutional Neural Networks . . . . . . . . 505
Aida Jones, K. Abisheek, R. Dinesh Kumar, and M. Madesh

Comparative Analysis of Body Biasing Techniques for Digital
Integrated Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
G. Srinivas Reddy, D. Khalandar Basha, U. Somanaidu,
and Rollakanti Raju
Optical Mark Recognition with Facial Recognition System . . . . . . . . . . . . 535
Ronak Shah, Aryak Bodkhe, Sudhanshu Gupta, and Vinayak Gaikwad
Evaluation of Antenna Control System for Tracking Remote
Sensing Satellites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
A. N. Satyanarayana, Bandi Suman, and G. Uma Devi
Face Recognition Using Cascading of HOG and LBP Feature
Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
M. Chandrakala and P. Durga Devi
Design of Wideband Metamaterial and Dielectric
Resonator-Inspired Patch Antenna . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563
Ch. Manohar Kumar, V. A. Sankar Ponnapalli, T. Vinay Simha Reddy,
N. Swathi, and Undrakonda Jyothsna
Basic Framework of Different Steganography Techniques
for Security Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
R. Chinna Rao, P. V. Y. Jayasree, S. Srinivasa Rao,
G. Srinivasa Yeshwanth, K. R. S. Megana, K. Shreya, and K. Suprasen
Call Admission Control for Interactive Multimedia Applications
in 4G Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
Kirti Keshav, Ashish Kumar Pradhan, T. Srinivas,
and Pallapa Venkataram
AI-Based Pro Mode in Smartphone Photography . . . . . . . . . . . . . . . . . . . . . 597
Paras Nagpal, Ashish Chopra, Shruti Agrawal, and Anuj Jhunjhunwala
A ML-Based Model to Quantify Ambient Air Pollutants . . . . . . . . . . . . . . 611
Vijay A. Kanade
Multimodal Biometric System Using Undecimated Dual-Tree
Complex Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621
N. Harivinod and B. H. Shekar
Design of Modified Dual-Coupled Linear Congruential Generator
Method Architecture for Pseudorandom Bit Generation . . . . . . . . . . . . . . 633
Cherukumpalem Heera and Vadthyavath Shankar
Performance Analysis of PAPR and BER in FBMC-OQAM
with Low-complexity Using Modified Fast Convolution . . . . . . . . . . . . . . . 643
D. Rajendra Prasad, S. Tamil, and Bharti Chourasia

Sign Language Recognition Using Convolution Neural Network . . . . . . . 655


Varshitha Sannareddy, Mounika Barlapudi,
Venkata Koti Reddy Koppula, Gali Reddy Vuduthuri,
and Nagarjuna Reddy Seelam
Key-based Obfuscation of Digital Design for Hardware Security . . . . . . . 663
Hina Nazeem and Deepthi Amuru
Internet of Things-based Cardless Banking System
with Fingerprint Authentication Using Raspberry Pi . . . . . . . . . . . . . . . . . . 673
Eliyaz Mahammad, Nagarjuna Malladhi, G. Bhaskar Phani Ram,
and K. Yeshwanth
Cluster Adaptive Stationary Wavelet Diffusion . . . . . . . . . . . . . . . . . . . . . . . 681
Ajay Kumar Mandava and Emma E. Regentova
Low Complexity and High Speed Montgomery Multiplication
Based on FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693
B. Jyothi, M. Sucharitha, and Anitha Patibandla
An Efficient Group Key Establishment for Secure Communication
to Multicast Groups for WSN-IoT Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705
Thirupathi Durgam and Ritesh Sadiwala
Design of Sub-volt High Impedance Wide Bandwidth Current
Mirror for High Performance Analog Circuit . . . . . . . . . . . . . . . . . . . . . . . . 715
P. Anil Kumar, S. Tamil, and N. Raj
Low-Voltage Low-Power Design of Operational Transconductance
Amplifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
Rajesh Durgam, S. Tamil, and Nikhil Raj
Automatic Detection of Cerebral Microbleed Using Deep Bounding
Box Based Watershed Segmentation from Magnetic Resonance
Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
T. Grace Berin and C. Helen Sulochana
New Efficient Tunable Window Function for Designing Finite
Impulse Response Digital Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741
Raj Kumar and R. P. Rishishwar
Brain Tumour Detection Using Convolutional Neural Networks
in MRI Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 751
Dontha Pavani, Kavali Durgalaxmi, Bingi Sai Datta, and D. Nagajyothi
Design of Circular Patch Antenna with Square Slot for Wearable
Ultra-Wide Band Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
S. Rekha, Boga Vaibhav, Guda Rahul Teja, and Pulluri Sathvik

Design of Baugh-Wooley Multiplier Using Full Swing GDI
Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 769
Vamshi Ponugoti, Seetaram Oruganti, Sahithi Poloju,
and Srikanth Bopidi
VLSI Implementation of the Low Power Neuromorphic Spiking
Neural Network with Machine Learning Approach . . . . . . . . . . . . . . . . . . . 781
K. Venkateswara Reddy and N. Balaji
IoT-Based Energy Saving Recommendations by Classification
of Energy Consumption Using Machine Learning Techniques . . . . . . . . . 795
G. Siva Naga Dhipti, Baggam Swathi, E. Venkateswara Reddy,
and G. S. Naveen Kumar

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 809


About the Editors

V. Sivakumar Reddy is Professor at the Department of Electronics and
Communication Engineering, Malla Reddy College of Engineering and Technology, and has
more than 20 years of teaching and research experience. He completed his B.E. in
Electronics and Communication Engineering from S. V. University, his M.Tech. in
Digital Systems at JNT University and his Ph.D. in Electronics and Communication
Engineering at IIT Kharagpur. His areas of research interest include multi-media
signal processing and communication protocols. He has published more than 120
papers in peer-reviewed journals and reputed conferences. He is a member of several
academic bodies, such as IETE, IEEE, ISTE and CSI. He is also a reviewer for several
IEEE journals. He was awarded as “Best Teacher” in three consecutive academic
years with citation and cash award. He is the recipient of “India Jewel Award” for
outstanding contribution to the research in the field of Engineering and Technology.

V. Kamakshi Prasad is Professor in the Department of Computer Science and
Engineering in JNTU Hyderabad; he completed his Ph.D. in speech recognition
at the Indian Institute of Technology Madras and his M.Tech. in Computer Science
and Technology at Andhra University in 1992. He has more than 21 years of teaching
and research experience. His areas of research and teaching interest include speech
recognition and processing, image processing, pattern recognition, ad hoc networks
and computer graphics. He has published several books, chapters and research papers
in peer-reviewed journals and conference proceedings. He is also an editorial board
member of the International Journal of Wireless Networks and Communications and
a member of several academic committees.

Jiacun Wang received a Ph.D. in Computer Engineering from Nanjing University
of Science and Technology (NJUST), China, in 1991. He is currently Professor at the
Computer Science and Software Engineering Department at Monmouth University,
West Long Branch, New Jersey. From 2001 to 2004, he was a member of scientific
staff at Nortel Networks in Richardson, Texas. Prior to joining Nortel, he was a
research associate at the School of Computer Science, Florida International Univer-
sity (FIU) at Miami and Associate Professor at NJUST. He has published numerous

books and research papers and is Associate Editor of several international journals.
He has also served as a program chair, a program co-chair, a special sessions chair
and a program committee member for several international conferences. He is the
secretary of the Organizing and Planning Committee of the IEEE SMC Society and
has been a senior member of IEEE since 2000.

K. T. V. Reddy, an alumnus of IIT Bombay, is presently working as Campus Director
and Principal, Pravara Technical Education Campus, Sir Visvesvaraya Institute of
Technology (SVIT), Nashik. He was formerly Director at PSIT Kanpur. During his
teaching career at Fr C Rodrigues Institute of Technology (FCRIT), Vashi, he
cultivated, promoted and developed the Department of ET to the extent that it started being
considered as one of the best institutes under the University of Mumbai. He published
over 100 papers in the national and international journals and conferences, delivered
over 200 invited talks and organized over 150 conferences/workshops. He is a fellow
member of IETE, a senior member of IEEE and a life member of ISTE, ACM and
CSI. He has organized several international and national conferences/workshops.
Data Preprocessing and Finding Optimal
Value of K for KNN Model

Roopashri Shetty, M. Geetha, Dinesh U. Acharya, and G. Shyamala

Abstract K-nearest neighbor (KNN) is a simple classifier used in the classification
of medical data. The performance of KNN depends on the data used for classification
and the number of neighbors considered (K). Data preprocessing is considered to be
an important step in data mining to improve the quality of the data. Preprocessing
involves data cleaning by removing duplicates and noise, data normalization, feature
selection, etc. Hence, in this paper, preprocessing the data is done by removing the
irrelevant attributes present in the dataset using correlation matrix, and suitable value
of K is chosen for KNN algorithm which helps in improving the performance of KNN
model.

Keywords Accuracy · Classification · KNN · Preprocessing · Feature selection

1 Introduction

Data mining has major applications in the medical field. Medical practitioners have
come up with many algorithms which help in the prediction of diseases. The
K-nearest neighbors (KNN) algorithm is a supervised method of data mining which
is widely used in the classification of diseases [1].

R. Shetty (B) · M. Geetha · D. U. Acharya


Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal
Academy of Higher Education, Manipal, Karnataka 576104, India
e-mail: roopashri.shetty@manipal.edu
M. Geetha
e-mail: geetha.maiya@manipal.edu
D. U. Acharya
e-mail: dinesh.acharya@manipal.edu
G. Shyamala
Department of Obstetrics and Gynaecology, Kasturba Medical College, Manipal Academy of
Higher Education, Manipal, Karnataka 576104, India
e-mail: shyamala.g@manipal.edu


Preprocessing is an important step in data mining. The presence of missing attributes
or attribute values, noise, and duplicate values degrades the quality of the dataset.
Hence, the data must be cleaned before being considered for further processing.
Feature selection is one of the important factors in obtaining accurate classification
results [2]. An appropriate choice of relevant features from the given dataset results
in accurate classification at an acceptable computational cost.
For the classification of given data, KNN is considered a simple method of
classification. KNN classifies a data point by assigning it the most represented
label among its K nearest samples. The classification decision is made by
investigating the labels of its K nearest data points and taking a vote: the class of
the new data point is the class which is most frequent among the K nearest
neighbors. KNN is simple to understand and implement, and the method is
considered effective for large training datasets. An appropriate value for K has a
major impact on the performance of the model.
The major disadvantage of KNN is its high computational cost, due to the
distance calculation from every data instance to all other samples. The value of K
is one of the parameters given as input to KNN, and the accuracy of classification
is sensitive to this value. Hence, designing a method for selecting the value of K is
important.
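As a minimal illustration of this decision rule, consider the following sketch in
Python with hypothetical two-dimensional points (the data, labels, and function
name are invented for the example):

# A minimal sketch of the KNN decision rule with hypothetical data
import math
from collections import Counter

def knn_predict(train_points, labels, query, k=3):
    # Euclidean distance from the query to every training sample
    dists = [math.dist(query, p) for p in train_points]
    # Indices of the K nearest samples
    nearest = sorted(range(len(train_points)), key=lambda i: dists[i])[:k]
    # Majority vote among the labels of the K nearest samples
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

points = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (6.0, 9.0)]
classes = ['benign', 'benign', 'malignant', 'malignant']
print(knn_predict(points, classes, query=(1.2, 1.9), k=3))  # prints 'benign'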
This work preprocesses the tumor dataset to remove all the irrelevant attributes
present in the dataset and selects the relevant features using a correlation matrix. The
preprocessed dataset is then given to the KNN algorithm to find a suitable value of K,
which improves the overall performance of the classifier.

2 Literature Review

The performance of the KNN classifier [3] is mainly dependent on the metric used to
calculate the distance in order to identify the K nearest neighbors of a data point;
generally, the standard Euclidean distance is used. A large store of data is used in that
work so that diagnosis can be based on historical data. It computes the probability
of occurrence of a particular ailment using the KNN algorithm, which increases
the accuracy of the diagnosis. The algorithm can be used to enhance automated
diagnoses, including the diagnosis of multiple diseases with similar symptoms.
Garcia et al. [4] discussed various techniques of preprocessing data like data
reduction, data normalization, data integration, data transformation, handling missing
values, feature selection, and dealing with noisy data.
Jiang et al. [5] summarized various drawbacks of KNN and discussed a method
to overcome that. The improvement of distance function is done by eliminating the
least relevant attributes during the distance calculation between two data points.
Parvin et al. [6] applied weighted KNN on test samples after checking the validity
of all samples in the trained dataset. The validity of the data considers robustness
and stability value of trained samples with respect to all its neighbors.
Song et al. [7] proposed a clustering-based feature selection algorithm named FAST.
FAST divides the features into clusters, and from each cluster the feature that is most
strongly associated with the target class is selected, forming a subset of useful and
independent features. A minimum spanning tree clustering method is used for the
clustering step.
Li et al. [8] discussed various feature selection methods such as stable feature
selection, multi-view feature selection, distributed feature selection, and online
feature selection. The problems with these methods and their applications are also
analyzed and discussed.
Mostafa et al. [9] attempted to improve Parkinson’s diagnosis using a multiple feature
evaluation approach (MFEA) and classification using machine learning algorithms.
MFEA selected the best set of features which helped in improving the performance
of the model.
Park and Kim [10] proposed a feature selection method based on a KNN ensemble
classifier. It finds the significant attributes using an iterative approach. When the
number of features extracted increases compared to the number of observations, the
effectiveness and robustness of the model are increased.
It is understood from the literature review that the KNN algorithm is comparatively
slow, since all the instances have to be reviewed for every new data point, and that its
performance degrades when irrelevant attributes are present in the dataset. Deciding
the optimal value of K helps in improving the performance of the system. This work
therefore concentrates on filtering out and eliminating irrelevant attributes from the
dataset and on finding a suitable value of K for the KNN algorithm.

3 Methodology

The overall methodology of the modified KNN is depicted in Fig. 1. To remove the
irrelevant attributes from the dataset, a correlation matrix is used. The correlation
matrix is a square matrix which gives the correlation between every pair of attributes.
A suitable value of K for the KNN model is obtained by the following two methods:
1. By calculating the misclassification error.
2. By finding the accuracy for different values of K.
After obtaining the value of K, KNN is run with this value and is compared with
the existing KNN model.
The tumor dataset is taken from the UCI repository. The dataset contains 568 records
with 31 attributes, including ID, diagnosis, radius mean, texture mean, perimeter mean,
area mean, smoothness mean, compactness mean, concavity mean, concave points
mean, radius worst, texture worst, perimeter worst, area worst, smoothness worst,
compactness worst, concavity worst, concave points worst, symmetry worst, and
fractal dimension worst. A sample of the dataset is shown in Fig. 2.

Fig. 1 Schematic representation of proposed algorithm

Fig. 2 Sample dataset



3.1 Preprocessing

The records are read from the dataset using the command:

data = pd.read_csv('cancer.csv')

The algorithm is implemented in Python. The dataset is split in the ratio 70:30
for training and testing, respectively. A tenfold cross-validation technique is used
for validating the model.
The distance between the new data point and the points of the existing trained data
samples is calculated using the Euclidean distance, and the K points with the smallest
Euclidean distances are selected. The new data point is assigned to the class to which
the majority of these K points belong.
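A minimal sketch of this setup is given below; scikit-learn is an assumption here,
as the paper only states that the implementation is in Python, and the column
names 'id' and 'diagnosis' are assumed from the attribute list above:

# Sketch of the 70:30 split, tenfold validation, and Euclidean-distance KNN
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

data = pd.read_csv('cancer.csv')
X = data.drop(columns=['id', 'diagnosis'])   # assumed column names
y = data['diagnosis']

# 70:30 split for training and testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Euclidean distance is the default metric of KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=3)

# Tenfold cross-validation on the training portion
scores = cross_val_score(knn, X_train, y_train, cv=10)
print('10-fold accuracy: %.3f' % scores.mean())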

3.2 Feature Selection Method

A feature selection method is implemented which extracts the relevant features by
removing irrelevant attributes. If two features are highly correlated, their significance
in predicting the label of a class is low, as they reveal the same characteristics; one
of them is therefore redundant and can be removed.
correlated_features = set() creates a set named correlated_features which will
contain the names of all the features that are correlated.
correlation_matrix = data.corr() creates a correlation matrix over all the
attributes of the dataset.
The matrix is traversed over all pairs of attributes to select the columns with a
correlation value greater than the threshold of 0.8. These attributes are added to the
correlated_features set, as described by the following code snippet:
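A sketch consistent with that description, reusing the DataFrame data loaded in
Sect. 3.1 (the loop structure is a reconstruction, not necessarily the authors' exact
code):

# Flag one attribute of every pair whose correlation exceeds the threshold
correlated_features = set()
correlation_matrix = data.corr()

for i in range(len(correlation_matrix.columns)):
    for j in range(i):
        if abs(correlation_matrix.iloc[i, j]) > 0.8:
            correlated_features.add(correlation_matrix.columns[i])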

The attributes present in the correlated_features set then have to be removed from
both the training and the test data, as they are irrelevant; this can be done by the
following code snippet:
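A sketch of this removal, assuming the X_train and X_test DataFrames from the
split in Sect. 3.1 (again a reconstruction rather than the authors' exact code):

# Remove the flagged attributes from both training and test data
X_train = X_train.drop(columns=list(correlated_features))
X_test = X_test.drop(columns=list(correlated_features))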

Fig. 3 Features removed using feature selection method

Using the feature selection method, 17 features are removed from the original
dataset and are shown in Fig. 3.

3.3 Choosing Suitable Value of K

The KNN algorithm requires the user to input the value of K. To avoid this manual
input, a method is required which automatically decides the value of K, since a
suitable value of K improves the overall performance of the classifier.
Two strategies are considered to get the suitable value of K.
1. By computing the misclassification error: KNN is executed for different values
of K and their misclassification errors are compared. The model with the
minimum misclassification error is considered. Misclassification error for
different values of K is shown in Fig. 4.

Fig. 4 Misclassification error



Fig. 5 Accuracy plot

2. Finding the accuracy of the model for different values of K: The accuracies for
the models with K value as 2, 3, 4, and 6 are plotted, and the model with the
best accuracy is considered. Accuracy plot for different values of K is shown in
Fig. 5.
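Both strategies can be sketched together, since the misclassification error is one
minus the accuracy; the search range of K and the use of scikit-learn are
assumptions, and X_train and y_train are the training data from Sect. 3.1:

# Sweep K, record cross-validated accuracy, derive error as 1 - accuracy
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

errors = {}
for k in range(1, 11):                # candidate values of K (assumed range)
    knn = KNeighborsClassifier(n_neighbors=k)
    accuracy = cross_val_score(knn, X_train, y_train, cv=10).mean()
    errors[k] = 1 - accuracy          # misclassification error for this K

best_k = min(errors, key=errors.get)  # K with the smallest error
print('Suitable K:', best_k)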

It is observed from the above two strategies that value 3 is suitable for K.

4 Results and Analysis

Results are analyzed when the KNN algorithm is executed in the following two cases:
1. Without preprocessing and with manual input of K (KNN)
2. With the feature selection method for preprocessing and the automatically chosen
value of K (modified KNN)

Fig. 6 Performance of KNN



Fig. 7 Performance of modified KNN

Fig. 8 Confusion matrix for KNN

Fig. 9 Confusion matrix for modified KNN

Figure 6 shows the performance of KNN, and Fig. 7 shows the performance of
modified KNN.
The confusion matrix for KNN is shown in Fig. 8.

Total, n = 171,
Accuracy = (T.P. + T.N.)/Total = (94 + 54)/171 = 0.865
Misclassification rate = (F.P. + F.N.)/Total = 23/171 = 0.135.

The confusion matrix for modified KNN is shown in Fig. 9.

Total, n = 171,
Accuracy = (104 + 49)/171 = 0.895
Misclassification rate = 18/171 = 0.105.
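These quantities follow directly from the confusion-matrix counts; a small sketch
that reproduces the figures reported above for the modified KNN:

# Accuracy and misclassification rate from the confusion-matrix counts
tp, tn, total = 104, 49, 171          # counts reported in Fig. 9
accuracy = (tp + tn) / total          # 0.895
misclassification = 1 - accuracy      # 0.105
print(round(accuracy, 3), round(misclassification, 3))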

From the above analysis, it can be observed that the accuracy of KNN is 86.5%,
and the accuracy of modified KNN is 89.5%.

5 Conclusion and Future Scope

The KNN algorithm is one of the simplest classifiers for classifying medical data,
but the performance of the model depends on the data used and the value of K
considered. Hence, preprocessing the data by removing the irrelevant attributes
present in the dataset and choosing a suitable value for K help in improving the
performance of the KNN model.
The dataset considered here is small, and the work can be extended to large
datasets. Other preprocessing techniques, such as normalization, can be used to
improve the quality of the data.

References

1. Z. Deng et al., Efficient kNN classification algorithm for big data. Neurocomputing 195, 143–
148 (2016)
2. H.K. Chantar, D.W. Corne, Feature subset selection for Arabic document categorization using
BPSO-KNN, in 2011 Third World Congress on Nature and Biologically Inspired Computing
(IEEE, 2011), pp. 546–551
3. H.S. Khamis, K.W. Cheruiyot, S. Kimani, Application of k-nearest neighbor classification in
medical data mining. Int. J. Inf. Commun. Technol. Res. 4(4) (2014)
4. S. Garcia, J. Luengo, F. Herrera, Data Preprocessing in Data Mining (Springer, 2015)
5. L. Jiang et al., Survey of improving k-nearest-neighbor for classification, in Fourth International
Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), vol. 1 (IEEE, 2007),
pp. 679–683
6. H. Parvin, H. Alizadeh, B. Minaei-Bidgoli, MKNN: Modified k-nearest neighbor, in
Proceedings of the World Congress on Engineering and Computer Science, vol. 1 (Citeseer, 2008)
7. Q. Song, J. Ni, G. Wang, A fast clustering-based feature subset selection algorithm for high-
dimensional data. IEEE Trans. Knowl. Data Eng. 25(1), 1–14 (2011)
8. Y. Li, T. Li, H. Liu, Recent advances in feature selection and its applications. Knowl. Inf. Syst.
53(3), 551–577 (2017)
9. S.A. Mostafa et al., Examining multiple feature evaluation and classification methods for
improving the diagnosis of Parkinson’s disease. Cogn. Syst. Res. 54, 90–99 (2019)
10. C.H. Park, S.B. Kim, Sequential random k-nearest neighbor feature selection for high-
dimensional data. Expert Syst. Appl. 42(5), 2336–2342 (2015)
Prediction of Cardiac Diseases Using
Machine Learning Algorithms

J. Suneetha, Husna Tabassum, Pundalik Chavan, and N. R. Deepak

Abstract Coronary illness and heart strokes are among the most common diseases
across the world, and reducing their toll requires predicting heart strokes early.
Many machine learning techniques already exist in the medical field, and machine
learning algorithms can be used to predict coronary illness or heart strokes. In this
paper, various machine learning algorithms are applied on a dataset to predict heart
strokes or coronary illness. The dataset used is the Cleveland dataset, which contains
14 attributes that are medical parameters of the patients; a few attributes are the
results obtained from medical tests. Algorithms like decision tree, logistic regression,
random forest, MLP, and bagging are applied on the dataset to predict heart strokes.
The implementation divides the dataset into training and testing datasets, and the
algorithms are applied to find their accuracies in predicting heart strokes. The
implementation is done using RStudio and WEKA. We also repeated the
implementation after eliminating a few attributes from the dataset. The idea behind
reducing (eliminating) attributes is that some attributes in the dataset are results of
medical tests done on the patients, and medical tests can be costly; if we can get the
same accuracy with reduced attributes, we can conclude that prediction can be done
with a smaller number of medical tests. From the implementation and analysis done,
logistic regression provides the highest accuracy, followed by decision tree.

Keywords Decision tree · Logistic regression · Random forest · MLP · Bagging

1 Introduction

Nowadays, machine learning is everywhere, and every field is adopting it. Especially
in the medical field, machine learning models are used in analyzing, predicting, or
detecting various kinds of diseases. According to surveys made by many health
organization bodies [1–3], 80% of deaths are because of heart strokes and
heart-related diseases. In order to reduce the death rate, prediction of heart strokes
is the solution.

J. Suneetha (B) · H. Tabassum · P. Chavan · N. R. Deepak


Department of CSE, CMRIT, Bangalore, India


This paper explains how machine learning is used in predicting heart strokes or
heart attacks. Heart stroke, or coronary illness, is a major and commonly occurring
disease. It can affect a person of any age between 35 and 75, and the elderly are
affected most; the death rate of heart strokes is very high [4, 5].
In order to reduce the death rate, predicting heart strokes beforehand is the
solution, as prediction is the only way to control the death rate and stop strokes from
occurring. Such prediction can be done using machine learning: machine learning
models and techniques are used for analyzing and predicting. Many machine learning
algorithms exist that can make such predictions, but an effective choice must be
made, i.e., the algorithms that provide better accuracy must be identified. This is the
thought process behind choosing the title “heart disease prediction using different
machine learning algorithms.”
Predictions are made using different algorithms, and an analysis is carried out in
order to choose the best one by considering the accuracy of prediction: the algorithm
which gives the highest accuracy is considered the most effective for predicting.
Machine learning techniques and models help in analyzing the data and making
predictions, and a dataset is needed in order to analyze and predict. In this paper,
we have used the Cleveland dataset from the UCI repository. The Cleveland dataset
contains 14 attributes, of which 13 are medical parameters of the patients and 1
shows the end result, i.e., the heart stroke outcome (1 if a heart stroke occurred and
0 for no stroke). A few attributes are the results obtained from medical tests that were
carried out on the patients. Thus, the Cleveland dataset is well suited for analyzing
heart stroke occurrences [7–9].
Next comes the prediction process. Prediction can be done using various types of
algorithms, and the question is which algorithm gives the most accurate predictions.
To find the best algorithm, all the algorithms are implemented on the dataset and
their accuracies are compared. Thus, the main objective of the paper is to predict
heart strokes and to find an effective, suitable algorithm that provides the highest
prediction accuracy among the algorithms used. After finding the suitable algorithm,
the implementation is continued by eliminating attributes from the dataset.
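The comparison step can be sketched as follows. This is only an illustration using
Python and scikit-learn equivalents of the five algorithms; the paper's own
implementation used RStudio and WEKA, and the file and column names here are
assumptions:

# Illustrative comparison of the five classifiers on the Cleveland data
import pandas as pd
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

data = pd.read_csv('cleveland.csv')                    # assumed file name
X, y = data.drop(columns=['target']), data['target']   # assumed label column

models = {
    'decision tree': DecisionTreeClassifier(),
    'logistic regression': LogisticRegression(max_iter=1000),
    'random forest': RandomForestClassifier(),
    'MLP': MLPClassifier(max_iter=1000),
    'bagging': BaggingClassifier(),
}

# Cross-validate each model and compare accuracies
for name, model in models.items():
    accuracy = cross_val_score(model, X, y, cv=10).mean()
    print('%s: %.3f' % (name, accuracy))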
We have mentioned that the Cleveland dataset contains the results of medical tests,
and medical tests can sometimes be expensive. If we get the same accuracy even after
eliminating a few attributes, we can conclude that the money spent on medical tests
can be reduced. For example, assume that we get 85% accuracy with 14 attributes;
now we eliminate a least significant attribute and still get the same 85% accuracy
[10]. Then we can conclude that a few medical tests can be eliminated or ignored,
which further implies that predictions can be done using a smaller number of medical
tests. The objective here is to obtain high-accuracy results even after eliminating
some attributes.

2 Existing System

In Ref. [1], the authors have provided a summary of the current research work
on predicting cardiovascular diseases. They have provided information regarding the
types of heart diseases, the types of data mining techniques, and the various data
mining tools available for analyzing and predicting the disease. They have also
suggested that more data cleaning and pruning can be done to provide more accuracy
in prediction [1]. The authors of [2] implemented the C4.5 and PCL approaches
using rules over traditional data and proteomic profiling data; in general, these are
rule-based classifiers. The implementation was done on biomedical data taken from
the UCI repository. C4.5, a decision-tree-based single classifier, is the most preferred
method, but it has two issues that affect its accuracy, the single coverage constraint
and the fragmentation problem. This weakness is overcome by PCL, which is superior
to bagging and boosting: PCL uses significant rules followed by decision trees, which
helps in overcoming the issues of C4.5 [2].
Neural systems have proven to be among the most popular and rapidly advancing
parts of machine learning. In [3], a multilayer perceptron is used to predict the heart
disease rate. It is a supervised neural network algorithm with three layers: an input
layer, an output layer, and hidden layers between them. The general data of the
patients are collected, such as age, sex, blood pressure, diabetes, cholesterol, obesity,
and heart rate, gathered from devices and sensors like Fitbit, AliveCor, and
Healthgare. The authors note that the data in the Cleveland dataset are the results of
expensive medical tests, so they consider their own generic parameters and apply
the multilayer perceptron to them [3].
In [4], the authors have used logistic regression, which fits a logistic curve rather
than a straight line. The underlying linear model equation is y = b0 + b1x (of the
form y = mx + c), where the constant c is the intercept and the slope m defines the
steepness of the curve. Logistic regression and decision tree are used to make
predictions [4]. The authors of [5] implemented naive Bayes and hidden naive Bayes
to predict heart strokes. Hidden naive Bayes gives remarkably better performance
than the traditional naive Bayes algorithm, and they used it to provide an accurate
model for cardiovascular disease. With respect to attribute dependence, hidden naive
Bayes yields more accurate classification; it is a structure extension-based approach
and needs more time for training. The proposed approach applies discretization and
an IQR filter to increase the HNB efficiency. With the dependent attributes, they
achieve 100% accuracy [5]. The authors of [6] have proposed the use of data mining
algorithms to identify heart disease with an accuracy of 52.33%. They combined
attributes related to the ECG with the clinical symptoms of the patients to detect the
heart disease.
The algorithms used by this system are the naive Bayes algorithm, the decision list
algorithm, and the KNN algorithm [6]. Extreme learning machine (ELM) techniques
use a feed-forward neural network for classification and regression; the main
advantage of ELM is that it is a fast-learning algorithm that requires no re-iterations.
The dataset used in [7] is the Cleveland dataset with 14 attributes. The prediction
model is designed so that the output falls into the range of groups (0–4): instead of
predicting heart strokes as 0 or 1, it provides a range of health conditions, which is
done in order to increase the accuracy. The accuracy obtained is 80% [7].
The authors of [8] have proposed a scalable model that monitors heart disease using
the SPARK and Cassandra frameworks. This framework relies on a real-time
classification model for the continuous tracking of patients. The system has two
main objectives, streaming processing and visualization; it is not a predictive model
but one for monitoring heart disease continuously [8]. Learning vector quantization
is a neural network algorithm and a nearest-vector neighbour classifier. The authors
of [9] implemented the algorithm for different numbers of epochs and different
numbers of neurons, using the 14-attribute dataset from the UCI repository. The
predictions were based on the accuracies obtained from the implementations: the
performance of the algorithm with different numbers of epochs and with different
numbers of neurons was calculated and compared. The accuracy obtained by
learning vector quantization is 85% [9]. In [10], two classifiers are used, the naive
Bayes classifier and the decision tree. The authors used the WEKA tool for building
the model and predicting heart strokes. WEKA tools apply data mining techniques
and machine learning algorithms and reduce the complexity of writing code. A
comparative analysis of the algorithms was done, concluding that the naive Bayes
classifier is more accurate than the decision tree [10].

3 Proposed Work

This section discusses the flow of the implementation and the methodologies
followed. Before using the data, a few fundamental checks need to be performed,
such as checking whether the dataset contains missing values. If missing values are
found, they need to be filled in using the mean, median, or mode method. In the
Cleveland dataset, there are no missing values. The block diagram in Fig. 1 shows
the outline of the process.
The dataset here represents the Cleveland dataset. Data preprocessing, which
includes cleaning the data, can be done before and after splitting the data. The dataset
is divided into two parts, a training dataset and a testing dataset. Once the data are
split into training and testing, we can start implementing the algorithms on the
training data; training provides the accuracy of predictions. Once training is done,
the algorithm is tested on the testing dataset, and the actual prediction accuracy is
obtained from the testing data. This block diagram represents only the outline of the
process.
Fig. 1 Block diagram

Fig. 2 Flow diagram


Let us get a more detailed understanding using a flow diagram: Fig. 2 represents
the flow diagram of the implementation work.
From the flow diagram, it is clear that the data are split into training and testing
datasets and that the data must be preprocessed. Forward selection and backward
elimination are performed to obtain a subset of attributes, and then the machine
learning algorithms are applied.
Once training is done, the testing part provides the accuracy of the predictions.
The implementation is carried out using the Cleveland dataset. Algorithms such as
decision tree, logistic regression, bagging, random tree, multilayer perceptron, and,
finally, random forest are used to predict heart strokes. All these algorithms are
implemented on the dataset using RStudio and WEKA tools: decision tree, bagging,
and logistic regression are implemented in RStudio, while random forest, random
tree, and multilayer perceptron are implemented using WEKA tools. The following
sequence of steps is followed in the implementation.
Step 1: Data preprocessing; missing values are handled.
Step 2: Divide the dataset into a training set and a testing set. In this implementation,
(60, 40), (70, 30), and (80, 20) are used as the training and testing ratios.
Step 3: Apply each algorithm and predict the accuracy for the 14-attribute dataset.
Step 4: Obtain the accuracies for each model.
Step 5: Compare and analyze the accuracies and provide an efficient model for
predicting heart disease.
Step 6: Eliminate the least significant attributes, repeat the implementation, and
find the accuracies.
Step 7: Analyze both sets of accuracies obtained and provide the results.
Firstly, the missing values of the dataset are handled; missing values can be
replaced by the mean, median, or mode values of the particular attributes. In the
Cleveland dataset, there were no missing values. In the second step, the dataset is
divided into two parts, training and testing data. Generally, the training and testing
ratios should be about 2/3 and 1/3, respectively; here, three different sets of ratios
are taken, namely 60:40, 70:30, and 80:20 for training and testing, respectively. All
the algorithms are implemented on these training and testing datasets.
The machine learning algorithms used are decision tree, random forest, logistic
regression, multilayer perceptron, and bagging. After implementing all the
algorithms on the dataset and obtaining their accuracies, another dataset with
reduced attributes is created by eliminating the two attributes “ca” and “thal.” The
same implementation is then applied to the reduced dataset to find its accuracies.
The reason for reducing the attributes is to check whether the same accuracy as with
the full attribute set can still be obtained. An illustrative sketch of this procedure is
given below.
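The following is a minimal sketch of steps 2–6 in Python with scikit-learn; note that the implementation described in this paper actually used RStudio and WEKA, and the file name "cleveland.csv" and the "target" label column are assumptions about a locally prepared copy of the UCI Cleveland data.

```python
# Illustrative scikit-learn equivalent of steps 2-6 (the study itself used
# RStudio and WEKA). "cleveland.csv" and the "target" column name are
# assumptions about a locally prepared copy of the UCI Cleveland data.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

data = pd.read_csv("cleveland.csv")
X, y = data.drop(columns="target"), data["target"]

models = {
    "Decision tree": DecisionTreeClassifier(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Multilayer perceptron": MLPClassifier(max_iter=1000),
    "Bagging": BaggingClassifier(),
    "Random forest": RandomForestClassifier(),
}

for test_size in (0.40, 0.30, 0.20):  # 60:40, 70:30 and 80:20 splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=1)
    for name, model in models.items():
        acc = accuracy_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
        print(f"{name}: test={test_size:.0%} accuracy={acc:.3f}")

# Step 6: repeat the same loop on the reduced attribute set
X_reduced = X.drop(columns=["ca", "thal"])
```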

3.1 Decision Tree

Decision tree is a supervised machine learning algorithm. It can be used on both
numerical and categorical data, and it finally gives solutions in categorical form, i.e.,
0/1 or TRUE/FALSE. The graphical representation of the decision tree is given in
Fig. 3.

Fig. 3 Decision tree plot



The accuracies are calculated using the confusion matrix, which is generally used
to describe the performance of the algorithm. Once the confusion matrix is obtained,
the accuracy can be calculated. The decision tree algorithm is applied for three
training and testing ratios. With a training ratio of 60 and a testing ratio of 40, the
accuracy obtained is 75.806%; with a training ratio of 70 and a testing ratio of 30,
the accuracy is 82.002%; and finally, with a training ratio of 80 and testing of 20,
the accuracy is 77.409%. For the reduced attribute dataset, the accuracies are as
follows: for a ratio of 60 training and 40 testing, the accuracy is 69%; for 70 and 30
training and testing, the accuracy is 77%; and for the 80 and 20 training and testing
ratio, the accuracy is 74%.

3.2 Logistic Regression

Logistic regression is a probability model. Unlike linear regression, logistic
regression helps in predicting nonlinear (categorical) outcomes: rather than using
the linear output y = mx + c directly (where m is the slope, x is the input, and c is
the intercept), it passes this linear score through the logistic (sigmoid) function to
obtain a probability. The confusion matrix for logistic regression is given in Table 1.
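For reference, the standard sigmoid form of the logistic hypothesis (a textbook formulation, not spelled out in the original) is:

h(x) = \sigma(mx + c) = \frac{1}{1 + e^{-(mx + c)}}

The output h(x) is interpreted as the probability of a heart stroke, and class label 1 is predicted when h(x) ≥ 0.5.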
For the training and testing ratio of 60 and 40, respectively, logistic regression
is 85% accurate. Table 1 is the confusion matrix for the 70 training and 30 testing
data ratio, which yields the highest accuracy rate of 88.764%. Finally, for the
training and testing ratio of 80 and 20, respectively, the accuracy obtained is 84%.
For the reduced attribute dataset, the accuracy for 60 training and 40 testing is 79.03%.

Fig. 4 Neural network—MLP



Table 1 Logistic regression confusion matrix

Predicted    Actual 0    Actual 1
0            49          7
1            3           30

The accuracy for 70 training and 30 testing is 78.91%, and the accuracy for 80
training and 20 testing is 77.04%. These are the accuracies for the reduced attribute
dataset.

3.3 Multilayer Perceptron

Multilayer perceptron is a neural network technique whose structure can be divided
into an input layer, hidden layers, and an output layer. The input layer holds the data
instances, the output layer produces the results, and the hidden layers perform the
computations. In the multilayer perceptron, the backpropagation technique is used
for training. Figure 4 shows the neural network representation of the dataset.
The accuracy of the multilayer perceptron for the training and testing ratio of 60
and 40 is 51%, the accuracy for 70 and 30 training and testing is 49%, and the
accuracy for 80 training and 20 testing is 62%. These accuracies are very low when
compared to decision tree and logistic regression. For the reduced attribute dataset,
the accuracy of the multilayer perceptron for the training and testing ratio of 60 and
40 is 52%, the accuracy for 70 training data and 30 testing data is 44%, and the
accuracy for 80 training and 20 testing is 34%. Again, the accuracies for the
multilayer perceptron are very low when compared to decision tree and logistic
regression.

4 Results and Discussions

Table 2 provides the results, i.e., the accuracies obtained by the different
algorithms for different training and testing ratios. From Table 2, logistic regression
has the highest accuracy with 88.76%, followed by decision tree; the lowest accuracy
was given by multilayer perceptron. The main objective of this paper is to predict
heart strokes, and logistic regression provides the highest accuracy, while random
forest and bagging provide similar accuracies of around 70–75%.
Table 2 shows that logistic regression has provided better accuracy results than
the remaining algorithms, giving roughly 85% and above across the three training
and testing splits. The accuracy provided by logistic regression for predicting heart
strokes is 88.76%, followed by decision tree with 82.00%, and the lowest accuracy
was given by multilayer perceptron. Bagging and random forest have given
accuracies between 70 and 75%.
Thus, by comparing the accuracy percentages, we can conclude that logistic
regression has given the highest accuracy in prediction. Apart from comparing the
algorithms by their resulting accuracies, one more comparative study has been done,
i.e., an analysis of the accuracies of the reduced attribute dataset against those of the
full attribute dataset. Two attributes were eliminated and the same algorithms were
implemented, with the expectation of obtaining accuracies similar or close to those
obtained before. The details of the accuracies are given in Table 3.
By comparing the accuracies obtained from the complete dataset and the reduced
dataset, the accuracies obtained with the complete dataset are higher than those with
the reduced attribute dataset; looking at Table 3, logistic regression again provides the highest

Table 2 Accuracies in predicting heart strokes

Train and test ratios     60:40    70:30     80:20
Decision tree             75.80    82.00     77.40
Logistic regression       85.02    88.764    84.04
Multilayer perceptron     51.18    49.42     62.65
Bagging                   71.4     70.37     72.71
Random forest             67.39    71.38     74.95

Table 3 Reduced attributes accuracies in predicting heart strokes

Train and test ratios     60:40    70:30    80:20
Decision tree             69       77       74
Logistic regression       79.03    78.31    77.04
Multilayer perceptron     52.35    44.4     39.44
Bagging                   55.65    58.49    60.27
Random forest             67.39    71.38    74.95

accuracy. By this, we can say that among the various algorithms used, logistic
regression has provided the greatest accuracies.

5 Conclusion

From the experiments and the analysis done, the prediction of heart strokes can
be done effectively using logistic regression, which has provided the highest
accuracy of 88.76%. This conclusion is also supported by the analysis of the results
obtained from the implementations on the reduced attribute dataset, which likewise
show that logistic regression provides the highest accuracy. Thus, this analysis
supports the second objective, which is to provide an effective algorithm suited for
predicting heart strokes.

References

1. M. Learning, Heart disease diagnosis and prediction using machine learning and data mining
techniques: a review. Adv. Comput. Sci. Technol. (2017)
2. J. Li, L. Wong, Using rules to analyse bio-medical data: a comparison between C4.5 and PCL
(Institute for Infocomm Research, Singapore, 2005)
3. A. Gavhane, G. Kokkula, I. Pandya, K. Devadkar, Prediction of heart disease using machine
learning, in Proceedings of the 2nd International Conference on Electronics, Communication
and Aerospace Technology (ICECA 2018)
4. M.P. Kiran, N.R. Deepak, Crop prediction based on influencing parameters for different states
in India—the data mining approach, in 2021 5th International Conference on Intelligent
Computing and Control Systems (ICICCS) (2021), pp. 1785–1791. https://doi.org/10.1109/
ICICCS51141.2021.9432247
5. K. D’cruz, C. Kumar, A.M. Kumar, M. Gawali, A. Shivashankar, in Prediction of Heart Disease
Using Machine Learning Techniques. CIS 490 Machine Learning University of Massachusetts
6. M.A. Jabbar, S. Samreen, Heart disease prediction system based on hidden Naïve Bayes classi-
fier, in International Conference on Circuits, Controls, Communications and Computing (Oct,
2016)
7. A. Rajkumar, G.S. Reena, Diagnosis of heart disease using data mining algorithm. Glob. J.
Comp. Sci. Technol. 10, 38–43 (2010)
8. S. Ismaeel, A. Miri, D. Chourishi, Using the extreme learning machine (ELM) technique for
heart disease diagnosis, in IEEE Canada International Humanitarian Technology Conference
2015 (May, 2015)
9. N. Thanuja, N.R. Deepak, A convenient machine learning model for cyber security, in 2021 5th
International Conference on Computing Methodologies and Communication (ICCMC) (2021),
pp. 284–290. https://doi.org/10.1109/ICCMC51019.2021.9418051
10. A. Ed-Daoudy, K. Maalmi, Real-time machine learning for early detection of heart disease
using big data approach, in International Conference on Wireless Technologies, Embedded
and Intelligent Systems (April, 2019)
A Comprehensive Approach
to Misinformation Analysis
and Detection of Low-Credibility News

Meghana Mukunda Joshi, Niyathi Srinivasan Kumbale, Nikhil S. Shastry,
Mohammed Omar Khan, and N. Nagarathna

Abstract Misinformation is information that is inaccurate and is usually circulated
online with the intent to deceive. The spread of misinformation has escalated with
the development of technology, with millions of bots spreading false news on several
social media platforms. This has slowly become an issue that needs to be battled,
calling for a software system that relies on linguistic, context-based, user-profile-
based, and social features to detect and analyze fake news. The objective of this paper
is to put forth a review of various literature available on approaches to fake news
detection and delineate the aspects of the implemented solution. This approach aims
to detect the bots that spread false news as well as track and trace fake information
in the form of text. The solution involves text analysis to identify the characteristics
of fake news and employs Natural Language Processing and Machine Learning
techniques for the same.

Keywords Misinformation · Fake news detection · Neural fake news · Twitter
bots · Natural language processing · Machine learning

1 Introduction

Social media has become the source of news for many Internet users today. It is easy
to access, cheap and always available. News spreads particularly fast on social media
as it comprises a large age group who are active regularly. One of the main problems
is that information on social media is not investigated or cross-verified before being
posted to the public, which leads to unsubstantiated rumours spreading like wildfire. Many
people are susceptible to perceiving news on social media as authentic and reliable.
The more a person is exposed to a certain article or news, especially from reliable
sources, the more easily they are persuaded by it. Bots play a pivotal role in the
spread of misinformation on the Internet. They can post, tag and comment at very
high frequencies allowing this fake news to spread with extensive exposure. Another

M. M. Joshi · N. S. Kumbale · N. S. Shastry (B) · M. O. Khan · N. Nagarathna
BMS College of Engineering, Bull Temple Rd, Basavanagudi, Bengaluru, Karnataka 560019,
India


reason bots are able to spread news is because of their capability to search and retrieve
information that is invalidated. Bots tend to post very regularly under hot topics
using trending hashtags. Once the bots have introduced the fake information on the
Internet, it is then the people who begin reposting it, thus giving the information more
exposure. Fake news tends to be more controversial and eye-catching which helps
create the turmoil needed for its spread. Detection of fake news is a field of study that
is still in its rudimentary form due to the several limitations it faces. One of the reasons
detection of fake news is difficult is the pace at which fake news spreads. Before it
is even classified as fake news, the damage would probably be done. The design of
alleviation and intervention techniques for misinformation has received less attention
in social media research, mainly due to the obstacle of designing applicable user
behaviour models [1]. Detection of bots is also difficult as it requires the establishment
of certain user behavioural characteristics which allow pristine distinction between a
regular user and a bot. Bots do have certain characteristics that make them stand out
when compared to a regular user. A regular user would spend time setting up their
social media profile, while a bot would generally have only the most basic information
filled out. Bots tend to post at very high frequencies compared to the average interval
between posts of a regular user. Our model aims to deter the genesis of fraudulent
news articles or tweets spread via social neural bots. Early detection of such bots will
enable us to avert the spread of such unverified information, reducing or possibly
eliminating the negative impacts triggered by it.
Our solution allows social media users to be more aware of the information they
read online and its authenticity. Our solution consists of three main components—a
Bot Detection Model, Tweet Classifier and News Article Classifier which employ
various machine learning techniques, like XGBoost Classifier, Passive Aggressive
Classifier, etc. Our models are ultimately integrated into a web application using
Flask.

2 Review of Related Work

2.1 Overview of Misinformation Analysis Techniques

In today’s day and age, a method to analyse and detect the spread of fake news is
crucial. In [1], there is deliberation about the use of unsupervised machine learning
techniques and methods to define user behavioural categories over behaviour dimen-
sions. However, supervised machine learning models can instead be trained on
already labelled datasets, which is what our solution aims to achieve. Further, [2] presents
a detailed review to detect false and misleading news on social media, including
existing algorithms from a data mining perspective, fake news characterizations on
psychology, social and psychological theories and representative datasets. Further-
more, in [3], during analysis, it is observed that most of the fake information found
on social media was generated by bots. These results show that suppressing social
bots could be an effective technique to combat the spread of misinformation on the
Internet. [4] goes on to discuss that tweets attract instantaneous attention from the
public who express undetermined attitudes towards a rumoured tweet. The use of
machine learning models in terms of user-profile and behavioural attributes plays a
pivotal role in verifying authentic news, which is why we will be implementing these
techniques in our solution.

2.2 Methods for Neural Bot Detection

Although social media platforms like Twitter and Facebook are praised for their
potential to convey essential information, their power is widely misused to influ-
ence people for several reasons. Twitter bots are considered popular misinformation
spreaders. Many methods for detecting these neural bots have been suggested, all of
which process vast volumes of social media posts and make use of network struc-
ture, temporal dynamics and sentiment analysis. Writers of [5] address an approach
to detecting Twitter bots using classifiers that are trained to differentiate between
real and fake accounts. They aim to identify features that are easy to extract while
maintaining accuracy and focusing on language-agnostic features. Chavoshi et al.
[6] developed a correlation finder to identify correlated user accounts on social
media platforms like Twitter. The observations concluded that if the users are highly
synchronous in nature, then they are most likely bots. A deep bot detection model is
proposed in [7] to learn a large representation of social media users and then detect
social bots by modelling social activity and content information jointly. Paper [8]
proposes a behaviour enhanced deep model (BeDM) for bot detection. Using the
deep learning approach, BeDM fuses content information and behaviour informa-
tion. The authors of [9] propose a deep neural network based on the contextual long
short-term memory (LSTM) architecture that detects bots at the tweet level using
both content and metadata. They demonstrate that their architecture can achieve high
classification accuracy (AUC > 96%). Simple user-profile-based features like default
profile, geo enabled, followers count are features that the writers of [5] have used in
their study. Similarly, other features such as content-based features can be extracted
for further analysis, which we have implemented in our bot detection model.

2.3 Natural Language Processing and Text Analysis Based Approaches

Fake news articles are generated using Natural Language Processing (NLP) tech-
niques. This is called “neural fake news”. Since these methods are being used to
generate fake news, they can also be used to detect it and study the characteristics as
well. The research proposed in [10] suggests the use of different machine learning
algorithms combined with strictly stylometric features, categories of emojis, and
lexical features related to the fake news headlines vocabulary. This model achieves an
accuracy of 59.50% using the Random Forest algorithm. In addition, [11] proposes an
SVM-based algorithm using the work previously done on satirical detection, enriched
with five predictive characteristics (Absurdity, Satire, Syntax, Negative Effect and
Punctuation) and tested their combinations on 360 newspapers. The authors of [12]
demonstrate how the fine-tuned Bidirectional Encoder Representations from Trans-
formers (BERT) model is robust enough to perform significantly well on news arti-
cles’ downstream classification even with minimal text pre-processing. In [13], a
model is built to compare the language of real news with satire, hoaxes, and propa-
ganda to find the linguistic characteristics of the untrustworthy text. They present a
case study based on PolitiFact.com using their factuality judgments (on a 6-point)
to probe automatic political fact-checking feasibility. Furthermore, the writers of
[14] understand that to build a promising framework for fake news detection, a
combination of source and fact verification and NLP analysis had to be used. Hence,
they propose a hybrid framework based on the previous work in automating inci-
dent classification. Similarly, the authors of [15, 16] discuss a practical and effective
method for classification of documents using Least Square Support Vector Machines
with Cross-Domain Text Categorization Techniques. Through the above studies, it is
evident that NLP models are essential and lead to highly accurate detection models.

2.4 Large Language Models

Large Language Modelling is an NLP modelling approach that has a wide variety of
functionalities, a popular use case being the one where models learn how to predict
missing words or the next few words of a sentence. Zellers et al. [17] propose the use
of three large language models—Grover (AllenNLP), BERT and GPT-2 Detector,
which are used to predict if a chunk of text is written by a neural bot, which employs
the previously mentioned models. The studies presented in these papers indicate
the high precision and accuracy associated with using Grover for detecting Neural
Fake News. Along the same lines, [18] delineates a comparison between the GPT-2
and Grover Large Language Model. Their studies reveal that the GROVER-based
discriminator is a refined version of GROVER, which comes in three sizes, i.e.
GROVER-Base, GROVER-Large and GROVER-Small. Their findings show that the
GROVER model is superior to the GPT-2 model since it has a larger dataset and
since it can detect neural fake news written by various large language models.

2.5 Fake News Classifiers

In [19], a benchmark study is conducted in order to evaluate the performance of
different approaches for detecting unreliable news on three different datasets, namely

Liar, Fake or Real News and Combined Corpus. The revelation made by the authors
was that when sentiment and lexical features were being used, SVM and Logistic
Regression models operated the best compared to the other traditional machine
learning models. The use of natural language processing, text analysis, web crawling
and machine learning models together further increases the accuracy of our solution
in detecting the authenticity of a given piece of news.
The studies that we have come across rely specifically on a single method,
however, to improve accuracy, we are proposing a hybrid and holistic approach,
that relies on several attributes and works as a combination of the aforementioned
techniques.

3 Proposed System Architecture

In our system, we have come up with a novel architecture that implements three
different solutions to the same classification problem, i.e. verifying the
authenticity of information on the Internet. Figure 1 shows the various components
employed in our solution.
This solution encompasses a web application wherein a user has an option to check
if a given twitter account is a bot or not, and an option to check the authenticity of a
news article. There are three major components in the proposed system. Firstly, the
Twitter Bot Detection Machine Learning Model is used to classify a Twitter account
as a bot or not. This model will take user-profile features as input and classify accord-
ingly. The second component is the Tweet Classification Model, which uses features

Fig. 1 Proposed framework



associated with tweets like tweet length, number of retweets, number of hashtags etc.,
to classify a given tweet as generated by a bot or not. The aforementioned models
are aimed at understanding the source of the tweet and verifying the originality of
the Twitter account. The third component is a fake news article classifier which is
trained using scraped data from fake and authentic news sources and RSS Feeds.
This Machine Learning Model classifies a given article as fake or real depending on
the accuracy associated with its authenticity, as shown in Fig. 1.
The use of these three models together makes for a holistic solution dependent on
linguistic, profile-based and context-based features, which improve the accuracy of
our solution. This makes our solution reliable and robust.

4 Implementation and Methodology

In our initial analysis, it was evident that using simple features to classify information
is not sufficient, and a combination of various features is necessary. Every machine
learning application requires extensive data and feature engineering and analysis.
Feature importance analysis refers to techniques that assign a score to input features
based on how useful they are at predicting a target variable. This helps understand
the data and models better. Our methodology involves rigorous data analysis and
feature engineering to improve accuracy and precision.
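As a minimal sketch of such feature importance scoring (illustrative only, on synthetic data rather than the project's datasets), permutation importance measures how much shuffling each feature degrades a fitted model:

```python
# Minimal sketch of feature-importance analysis on synthetic data:
# permutation importance scores each feature by the accuracy drop
# observed when its values are randomly shuffled.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: importance {result.importances_mean[i]:.3f}")
```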

4.1 Twitter Bot Detection Model

The bot detection feature of the solution has the following components:
Twitter API: Twitter API is used to extract user-profile features and other impor-
tant details from Twitter by simply providing the required username. The infor-
mation collected is then converted into a format that can be easily fed to the
trained models. The Twitter API script serves the main purpose of returning this
information which serves as the input to the model.
Cresci 2017 Dataset: The Dataset being used to train this model is the cresci 2017
dataset which includes users and tweets information for genuine, traditional, and
social spambot Twitter accounts.
XGBoost: The XGBoost algorithm is an implementation of the gradient boosted
decision trees which yields high speed, high accuracy and it dominates datasets
on classification and regression problems. XGBoost is essentially a decision tree-
based machine learning ensemble that is used to solve regression and classification
problems.
Feature Extraction: A Principal Component Analysis was used to determine the
features which contribute significantly to the output of prediction, these features
are shown in Table 1.

Table 1 Features extracted for bot detection model

Feature                          Description
Name length                      Length of name
Screen name length               Length of screen name
Screen name digits               Digits present in screen name
Levenshtein distance             String metric for difference between sequences
Default profile                  Whether the profile follows default settings
Friends count                    Number of friends
Geo enabled                      Whether location is enabled or not
Statuses count                   Number of tweets that can be retrieved
Followers count                  Number of followers
Profile uses background image    Whether profile picture uses a background image
Listed count                     Number of priority tweets
Table 2 Features extracted for tweet classification model

Feature               Description
Group                 Whether the tweet is a spam tweet (labelled 0) or genuine (labelled 1)
Length                Length of tweet
Number of hashtags    Number of hashtags present in the tweet
Number of URLs        Number of URLs present in the tweet
Number of mentions    Number of mentions present in the tweet
Number of words       Number of words present in the tweet
Special characters    Number of special characters present in the tweet
4.2 Tweet Classification Model

The tweet classification model also makes use of the cresci 2017 dataset from which
the features shown in Table 2 are extracted. Similar to the bot detection model, the
tweet classification model yields the highest accuracy with the XGBoost algorithm
in comparison to logistic regression and other classification algorithms.

4.3 Fake News Article Classifier

The news article classifier consists of the following components:



Dataset and Web Scraper: To train the machine learning model, the data is a
combination of data from the Fake and Real News dataset on Kaggle and the
content from scraping 10 news article sources like CNN, NDTV, InfoWars, etc.
The Python packages feedparser and newspaper are used to scrape articles from
news sites and RSS feeds.
Passive Aggressive Classifier: The passive aggressive classifier is an online-
learning based machine learning algorithm that is similar to perceptron models.
Since the data is continuously being updated and scraped from sites on a regular
basis, this model helps to deal with the large amounts of incoming data. This
model yields the highest accuracy of 92% in comparison to other classifiers like
SVM, Naïve Bayes, etc.
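A minimal sketch of this pipeline, assuming the scraped and Kaggle articles have been merged into a CSV with "text" and "label" columns (an assumed layout, not the exact files used here):

```python
# Sketch of the article classifier: TF-IDF features with scikit-learn's
# PassiveAggressiveClassifier. "articles.csv" with "text"/"label" columns
# is an assumed layout for the merged scraped/Kaggle data.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("articles.csv")
X_tr, X_te, y_tr, y_te = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=7)

vec = TfidfVectorizer(stop_words="english", max_df=0.7)
clf = PassiveAggressiveClassifier(max_iter=50)
clf.fit(vec.fit_transform(X_tr), y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(vec.transform(X_te))))

# Being an online learner, the classifier can absorb newly scraped
# articles incrementally:
# clf.partial_fit(vec.transform(new_texts), new_labels)
```

The online (incremental) nature of the passive aggressive classifier is what makes it a good fit for the continuously growing scraped corpus described above.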

4.4 Web Application

For the web interface, a server is hosted through Flask on which the project is run.
Flask is a popular Python web framework used for developing web applications. The
machine learning models are integrated with the UI for a seamless user experience.
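A minimal sketch of this integration follows; the route, template and form field names are illustrative assumptions, and clf/vec stand for the fitted classifier and vectorizer from the sketch above:

```python
# Illustrative Flask route wiring the trained article classifier into the
# web interface; names of the route, template and form field are assumed.
from flask import Flask, request, render_template

app = Flask(__name__)

@app.route("/check-article", methods=["POST"])
def check_article():
    text = request.form["article_text"]              # assumed form field
    verdict = clf.predict(vec.transform([text]))[0]  # fitted objects from above
    return render_template("result.html", verdict=verdict)

if __name__ == "__main__":
    app.run(debug=True)
```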
After extensive analysis, implementation and integration of all the models
discussed, a working model was developed which is capable of producing highly
accurate results.

5 Results

The graph for average article sentiment in Fig. 2 shows the sentiment analysis
results, in which we can see that there is a difference between the sentiments
expressed by fake and real news. The sentiment in fake news articles sways between
positive and negative, implying that it is intended to influence the perceptions of
people. In Fig. 3, we can see that the sentiment in the headlines of fake news articles
has more variance, as most people usually just read the headlines rather than the
whole news article itself. We can also see that the number of sentimentally neutral
articles is higher in real news than in fake news.
For the Twitter Bot Detection model, we implemented the XGBoost Machine
Learning algorithm, which gave us an accuracy of 95.14%. The accuracy of the
Tweet Classification Model is 75.92% using the XGBoost Classifier. The Passive
Aggressive classifier yields the highest accuracy for fake news classification compared
to the other models such as Linear Regression, Support Vector Machines, Naïve
Bayes and Random Forest, which yielded an average accuracy of 80%. Through
this application, we can understand how various models can be tuned to higher
accuracies, and with further visualizations and graphical representations, we can
better understand the nuances behind the spread of misinformation.

Fig. 2 Avg. article sentiment

Fig. 3 Avg. headline sentiment



6 Conclusion

The spread of misinformation has become a ubiquitous concern that has increased
the demand for smart detection systems to identify and analyse fake news and its
sources. With easy access to social media, the need for such systems is much higher
than it was a decade ago. Low-credibility news detection is a field of study still in
its primitive stage due to several limitations. Our extensive research study shows
that the authenticity of news articles can be tracked, traced, and validated against a
set of reliable sources by combining Artificial Intelligence, Machine Learning and
Data Mining techniques. The identification of fake news and Twitter bots depends
significantly on Natural Language Processing. Supervised machine learning models
can be used to classify users as bots along with the Large Language Model. Since
the classifiers depend on the training data, one possible challenge would be that
different classifiers would be required for articles of different lengths. Our solution
relies on linguistic, context-based, user-profile-based and social features to detect and
analyse fake news and misinformation, making it a holistic approach to detecting fake
news. Our solution shows that by employing extensive text analysis, natural language
processing and machine learning techniques, the spread of misinformation can be
detected early and stopped before the damage is done.

References

1. Z. Rajabi, A. Shehu, H. Purohit, User behavior modelling for fake information mitigation on
social web, in Social, Cultural, and Behavioral Modeling, ed. by R. Thomson, H. Bisgin, C.
Dancy, A. Hyder (SBP-BRiMS, 2019)
2. K. Shu, A. Sliva, S. Wang, J. Tang, H. Liu, Fake news detection on social media: a data mining
perspective. SIGKDD Explor. Newsl. 19(1), 22–36 (2017)
3. C. Shao, G.L. Ciampaglia, O. Varol et al., The spread of low-credibility content by social bots.
Nat. Commun. 9, 4787 (2018)
4. L. Tian, X. Zhang, Y. Wang, H. Liu, Early detection of rumours on twitter via stance transfer
learning, in Advances in Information Retrieval, ed. J. Jose et al. (ECIR 2020, Lecture Notes in
Computer Science, vol. 12035 (Springer, Cham)
5. J. Knauth, Language-agnostic twitter-bot detection, in Proceedings of the International
Conference on Recent Advances in Natural Language Processing (RANLP 2019)
6. N. Chavoshi, H. Hamooni, A. Mueen, Debot: twitter bot detection via warped correlation, in
Icdm (2016)
7. C. Cai, L. Li, D. Zeng, Detecting social bots by jointly modeling deep behavior and content
information, in Proceedings of the 2017 ACM on Conference on Information and Knowledge
Management (2017)
8. C. Cai, L. Li, D. Zengi, Behavior enhanced deep bot detection in social media, in 2017 IEEE
International Conference on Intelligence and Security Informatics (ISI)
9. S. Kudugunta, E. Ferrara, Deep neural networks for bot detection. Inform. Sci. 67, 312–322
(2018)
10. R. Manna, A. Pascucci, J. Monti, Profiling fake news spreaders through stylometry and lexical
features. UniOR NLP @PAN2020 Notebook for PAN at CLEF 2020
11. V.L. Rubin, N.J. Conroy, Y. Chen, S. Cornwell, Fake news or truth? using satirical cues to detect
potentially misleading news (Language and Information Technology Research Lab (LIT.RL),

Faculty of Information and Media Studies, University of Western Ontario, London, Ontario,
Canada)
12. A. Aggarwal, A. Chauhan, D. Kumar, M. Mittal, S. Verma, Classification of fake news by
fine-tuning deep bidirectional transformers based language model, 163973
13. H. Rashkin, E. Choi, J. Jang, S. Volkova, Y. Choi, Truth of varying shades: analyzing language
in fake news and political fact-checking (2017)
14. M.D. Ibrishimova, K. Li, A machine learning approach to fake news detection using knowledge
verification and natural language processing, in INCoS
15. M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy, Text document classification based on a
least square support vector machines with singular value decomposition. Int. J. Comput Appl.
27(7), 21–26 (2011)
16. M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy, S.C. Sapathy, A survey of cross-domain
text categorization techniques, in International Conference on Recent Advances in Information
Technology RAIT-2012, ISM-Dhanabad, IEEE Xplorer Proceedings (2012). 978-1-4577-0697-
4/12
17. R. Zellers, A. Holtzman, H. Rashkin, Y. Bisk, A. Farhadi, F. Roesner, Y. Choi, Defending
against neural fake news (2019). arXiv:1905.12616
18. W. Zhong, D. Tang, Z. Xu, R. Wang, N. Duan, M. Zhou, J. Wang, J. Yin, Neural deep fake
detection with factual structure of text (2020). arXiv:2010.07475
19. J.Y. Khan, Md.T.I. Khondaker, A. Iqbal, S. Afroz, A benchmark study on machine learning
methods for fake news detection (2019)
Evaluation of Machine Learning
Algorithms
for Electroencephalography-Based
Epileptic Seizure State Recognition

Vibha Patel, Jaishree Tailor, and Amit Ganatra

Abstract Epileptic seizures are caused by abnormal brain activities in which person
with epilepsy exhibits unusual behaviour, sensations and sometimes loss of aware-
ness. Recognition of seizure states could aid in predicting epileptic seizures and
better treatment. Electroencephalogram (EEG) is a generally used technique to record
the electrical activity of the brain. EEG can be used to predict epileptic seizures by
identifying the preictal state of the EEG data signal. The work presented here focuses
on comparing the performance of traditional machine learning algorithms with using
and without using feature extraction methods for recognizing the state of seizure.
Standard traditional machine learning algorithms, such as k-nearest neighbour, deci-
sion tree, Gaussian naive Bayes, multilayer perceptron, quadratic discriminant anal-
ysis, random forest and support vector machine have been used for the classification
of epileptic seizure states. Various performance evaluation parameters used for the
comparative analysis are: accuracy, sensitivity, specificity, precision, false positive
rate, F1-score, S1-score and area under ROC curve. The standard dataset of Bonn
University has been used to perform the experimentation. The work proves that
feature extraction approaches improve the performance of machine learning classi-
fiers in EEG-based epileptic seizure state recognition problems. Random forest and
Gaussian naïve Bayes outperform all other classifiers considering binary and ternary
classification approaches.

V. Patel (B)
Department of Computer Engineering, C. G. Patel Institute of Technology, UTU, Bardoli, Gujarat,
India
e-mail: vibha.patel@utu.ac.in
J. Tailor
Shrimad Rajchandra Institute of Management and Computer Application (SRIMCA), UTU,
Bardoli, Gujarat, India
e-mail: jaishree.tailor@utu.ac.in
A. Ganatra
Department of Computer Engineering, DEPSTAR, Charotar University of Science and
Technology(CHARUSAT), Anand, Gujarat, India
e-mail: amitganatra.ce@charusat.ac.in


Keywords Machine learning · Deep learning · EEG · Epilepsy · Seizure
detection · Seizure prediction

1 Introduction

Epileptic seizure state recognition problem focuses on identifying the state of the
seizures based on the electroencephalography (EEG) data of the person with epilepsy.
EEG data can be collected by either invasive method, called iEEG (intracranial EEG)
or non-invasive method, called sEEG (scalp EEG). There are four states of seizure in
EEG: interictal, preictal, ictal and postictal [1–4]. Recognition of these states leads
to the applications called epileptic seizure detection and epileptic seizure prediction.
The difference between these two approaches is subtle, and many overlapping
literatures have been reported analysing both seizure detection and seizure
prediction algorithms. The difference between epileptic seizure detection and
prediction is described as follows:
Seizure Detection: It is the problem of detecting the ‘ictal’ state amongst all, which
is used by practitioners to identify the presence of seizure in the recorded EEG
signals. This task is generally carried out manually by experts, and it is prone to
human errors. A. Sharmila et al. have used DWT-based feature extraction method
for seizure detection using linear and nonlinear classifiers [5]. Ali Shoeb et al. have
used support vector machine to detect onset of epileptic seizures on EEG recordings
in a patient-specific approach [6]. Xiashuang Wang et al. have used novel random
forest model combined with grid search optimization for epileptic EEG detection [7].
Cristian Donos et al. have used random forest classifier to provide a seizure detection
algorithm that can be used for an implantable closed-loop stimulation device [8]. Md
Mursalin et al. have presented a novel analysis method for detecting epileptic seizure
from EEG signal using improved correlation-based feature selection method with
random forest classifier [9]. Mohammad Khubeb Siddiqui have presented a review
of machine learning classifiers for epileptic seizure detection [10].
Seizure Prediction: The problem of seizure prediction can be simplified as detecting
the ‘preictal’ state amongst all. It is clinically proven that there are early signs of
seizures before it actually occurs, which can be differentiated in EEG signals as
preictal state. If the preictal state is effectively identified before significant amount
of time, it can act as an alarm for the person with epilepsy or its caregivers. Han-Tai
Shiao et al. have used SVM-based seizure prediction system that achieves robust
prediction of preictal and interictal iEEG segments from dogs with epilepsy [11].
Theoden Netoff et al. have proposed a patient-specific classification algorithm based
on support vector machine to distinguish preictal and interictal features extracted
from EEG recordings [12]. Yanli Yang et al. have proposed support vector machine-
based classifier that used permutation entropy for epileptic seizure prediction [13].
Piotr W. Mirowski have compared L1-regularized logistic regression, convolutional
networks and support vector machine for epileptic seizure prediction from iEEG [14].

Khansa Rasheed et al. have presented a vast review on machine learning approaches
for predicting epileptic seizures using EEG signals [15].

1.1 Need of the Research

Though an ample amount of work has been done in the field of epileptic seizure
state detection and prediction, there is no clinical applicability to date. This is
because of the requirement of high sensitivity with a very low false positive rate on
highly imbalanced and noisy data. Also, in this era of high computing power and
cloud-based resource availability, research is more inclined towards deep
learning-based approaches. This work contributes to the direction of evaluating the
performance of classical machine learning approaches. Experimentation was
conducted to test the following hypothesis: traditional machine learning algorithms
work best when, first, the dataset is low scale and not noisy; second, balanced
classes have been considered; and third, appropriate feature extraction methods
have been used before the classification phase.

2 Machine Learning for Epileptic Seizure State Recognition

The following machine learning algorithms have frequently been used in the literature for
the task of epileptic seizure state recognition: support vector machine, random forest,
Gaussian naïve Bayes, multilayer perceptron, k-nearest neighbour, quadratic discrim-
inant analysis and logistic regression. Fernandez-Delgado et al. [16] have evaluated
179 classifiers on 121 datasets, which represents UCI database and other real-time
problems. They derived that random forest as the best performer followed by SVM
with Gaussian and polynomial kernels, neural networks and boosting ensembles are
better amongst all. This work evaluates the performance of eight machine learning
algorithms: k-nearest neighbour (KNN), decision tree classifier (DTC), Gaussian
naïve Bayes (GNB), multi-layer perceptron (MLP), quadratic discriminant anal-
ysis (QDA), support vector machine with Gaussian kernel (SVM-G), support vector
machine with polynomial kernel (SVM-P) and random forest (RF). These algorithms
were tested with various parameter values to record the best results on the specified
dataset.

3 Feature Extraction

EEG feature extraction plays a very important role [17] in the performance of
classification for seizure state recognition. Feature extraction methods can be
classified into four broad categories [18]: (1) time domain, (2) frequency domain, (3)

time–frequency domain and (4) nonlinear methods. Amjed S. Al-Fahoum et al. [19]
have mentioned in their study that each method has its own pros and cons, which
make it suitable for specific types of applications. Frequency domain methods may
not provide quality performance for some EEG signals, whereas time–frequency
methods may not provide detailed information on EEG analysis as much as frequency
domain methods. Previous work on automated detection of normal and epileptic
classes uses the following features: nonlinear pre-processing filter, entropy measures,
time and frequency domain features, wavelet transform-based features, FFT-based
features, relative wavelet energy, genetic programming-based features and cross-
correlation and PSD [18]. Amongst these, time and frequency domain features, FFT-
based features and wavelet transform-based features are repeatedly used by different
researchers.
The process of selecting features for EEG analysis is arbitrary to a large extent
since the researcher often must guess the importance of features for every single task.
This comes with the risk of using less useful and redundant features while ignoring the
most important features [20]. Deep learning comes with the advantage of automated
feature extraction and selection. It is completely up to the model training process; i.e.
no handcrafted features are required to build the model. However, this comes with
the requirement of high computing power and larger training data. The following
section describes description of handcrafted features used in this work for evaluation
of the performance of machine learning algorithms.

3.1 Description of Handcrafted Features

A total of five features were extracted from the EEG signals: detrended fluctuation
analysis (DFA), Petrosian fractal dimension (PFD), Higuchi fractal dimension (HFD),
singular value decomposition entropy (SVD entropy) and Fisher information.
Detrended Fluctuation Analysis (DFA): It is a popular method to analyse long-
range temporal correlations in time series of many different research areas but
in particular also for electrophysiological recordings [21]. DFA provides unique
insights into the functional organization of neuronal systems [19]. DFA is a simple
mathematical method but very efficient to investigate the power law of long-term
correlations of non-stationary time series. The process to compute DFA has been
adapted from [22].
Petrosian Fractal Dimension (PFD): For a time series, PFD is defined as follows
[22],

PFD = \frac{\log_{10} N}{\log_{10} N + \log_{10}\bigl(N/(N + 0.4 N_{\delta})\bigr)}   (1)

where N is the series length and N δ is the number of sign changes in the signal
derivative.
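Since Eq. (1) fully specifies the feature, a direct NumPy implementation can serve as a sketch, counting N_δ as the number of sign changes in the first difference of the signal:

```python
# Direct NumPy implementation of the Petrosian fractal dimension, Eq. (1).
import numpy as np

def petrosian_fd(signal):
    x = np.asarray(signal, dtype=float)
    n = len(x)
    diff = np.diff(x)
    # N_delta: number of sign changes in the signal derivative
    n_delta = np.count_nonzero(np.diff(np.sign(diff)) != 0)
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))
```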

Higuchi Fractal Dimension (HFD): Higuchi’s algorithm calculates the fractal dimension
of a time series directly in the time domain [23]. HFD is also widely used for
the research of Alzheimer’s disease and detection of depression [24].
Singular Value Decomposition Entropy (SVD Entropy): EEG is prone to noise
due to environment variables while recording. Noisy signal’s singular values are
significantly smaller than noise-free signals. A low SVD entropy value spatially
and temporally indicates patterns of epileptic seizure in EEG signals [25]. SVD
entropy calculates singular value decomposition (SVD) to compute the complexity
of non-stationary bio-signals [26].
Fisher Information: Fisher information is a way of measuring the amount of infor-
mation that an observable random variable X carries about an unknown parameter θ
of a distribution that models X [27].

4 Dataset Description

The standard dataset of Bonn University [28] has been used for the experimentation.
This is the commonly used dataset for the task of epileptic seizure state recognition.
It consists of five different categories (denoted A–E) of EEG recordings. Sets A and
B contain scalp EEG recordings of healthy volunteers, whereas sets C and D contain
intracranial recordings of persons with epilepsy in seizure-free intervals. Set E
contains the intracranial EEG recordings of persons with epilepsy during the seizure
period. Each set contains 100 single-channel EEG segments of 23.6 s duration, and
each segment consists of 4097 samples (173.61 Hz sampling rate).
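As a sketch, one set can be loaded as follows, assuming the common distribution format of 100 plain-text files per set, each holding a single column of 4097 samples (the folder layout is an assumption):

```python
# Loading one Bonn set into a (100, 4097) array; paths are assumptions.
import glob
import numpy as np

def load_set(folder):
    files = sorted(glob.glob(f"{folder}/*.txt"))
    return np.stack([np.loadtxt(f) for f in files])

set_a = load_set("bonn/A")  # healthy volunteers, scalp EEG
set_e = load_set("bonn/E")  # persons with epilepsy, seizure period
```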

5 System Design

There are two types of model being tested for this work, classification algorithms with
the use of handcrafted features and classification algorithms without the use of hand-
crafted features. Figure 1 shows the training phase, one with extracting handcrafted
features and another without feature extraction methods. Figure 2 shows the predic-
tion phase of the models. The following sections further elaborate experimentation
details and performance evaluation parameters consideration.

5.1 Experimentation

The task of epileptic seizure state recognition can be defined as the classification of
seizure states into normal, ictal or interictal for the dataset under consideration.
Four scenarios have been considered for justifying the seizure state recognition: (i)

Fig. 1 Training phase

Fig. 2 Prediction phase

normal state versus ictal state (NS vs. IS), (ii) normal state versus interictal state (NS
vs. IIS), (iii) interictal state versus ictal state (IIS vs. IS) and (iv) normal state versus
interictal state versus ictal state (NS vs. IIS vs. IS). Table 1 shows details of the same.
Total 19 experimentations have been carried out for each of the eight algorithms.
To remove the overfitting in traditional machine learning algorithms, k-fold cross-
validation was followed. Parameters were manually tuned for best performance,
including the maximum depth in the decision tree classifier, the value of k for
k-nearest neighbour and the number of layers in the multilayer perceptron.
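A minimal sketch of one such experiment (A versus E, i.e. NS versus IS) with stratified k-fold cross-validation is shown below; extract_features() is a placeholder for the handcrafted feature extraction of Sect. 3.1, and set_a/set_e are the arrays from the loading sketch above:

```python
# Sketch of the cross-validated evaluation loop for the A-versus-E case;
# extract_features() stands for the five handcrafted features of Sect. 3.1.
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X = np.vstack([np.apply_along_axis(extract_features, 1, set_a),
               np.apply_along_axis(extract_features, 1, set_e)])
y = np.array([0] * len(set_a) + [1] * len(set_e))  # 0 = NS, 1 = IS

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in [("RF", RandomForestClassifier()),
                    ("SVM-G", SVC(kernel="rbf")),
                    ("SVM-P", SVC(kernel="poly"))]:
    scores = cross_val_score(model, X, y, cv=cv)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```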

5.2 Performance Evaluation Parameters

To measure the performance of classification algorithms for epileptic seizure state
recognition, standard parameters [29–31] have been considered, which are as follows:
accuracy, sensitivity, specificity, precision, false positive rate, recall, F1-score, S1-
score and area under ROC curve (AUC). These parameters have been obtained using
confusion matrix using the following mathematical calculations:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN} = \frac{TP + TN}{P + N}   (2)

Sensitivity = \frac{TP}{TP + FN} = \frac{TP}{P}   (3)

Table 1 Experimentation details

S. No.    Datasets                Classification type    Class types                  Dataset size (no. of files)
1         A versus E              Binary                 NS versus IS                 200
2         B versus E              Binary                 NS versus IS                 200
3         C versus E              Binary                 IIS versus IS                200
4         D versus E              Binary                 IIS versus IS                200
5         AB versus E             Binary                 NS versus IS                 300
6         AC versus E             Binary                 NS versus IIS versus IS      300
7         AD versus E             Binary                 NS versus IIS versus IS      300
8         BC versus E             Binary                 NS versus IIS versus IS      300
9         BD versus E             Binary                 NS versus IIS versus IS      300
10        ABC versus E            Binary                 NS versus IIS versus IS      400
11        ABD versus E            Binary                 NS versus IIS versus IS      400
12        ACD versus E            Binary                 NS versus IIS versus IS      400
13        BCD versus E            Binary                 NS versus IIS versus IS      400
14        ABCD versus E           Binary                 NS versus IIS versus IS      500
15        A versus C versus E     Ternary                NS versus IIS versus IS      300
16        A versus D versus E     Ternary                NS versus IIS versus IS      300
17        B versus C versus E     Ternary                NS versus IIS versus IS      300
18        B versus D versus E     Ternary                NS versus IIS versus IS      300
19        AB versus CD versus E   Ternary                NS versus IIS versus IS      500

Specificity (SP) = TN / (TN + FP) = TN / N    (4)

Precision = TP / (TP + FP)    (5)

Table 2 Best and worst performance values of evaluation parameters

Parameter | Best value | Worst value
Accuracy | 1.00 | 0.00
Sensitivity | 1.00 | 0.00
Specificity | 1.00 | 0.00
Precision | 1.00 | 0.00
Recall | 1.00 | 0.00
False positive rate | 0.00 | 1.00
F1-score | 1.00 | 0.00
S1-score | 1.00 | 0.00
AUC | 1.00 | 0.00

Recall = TP / (TP + FN)    (6)

False Positive Rate = FP / (TN + FP) = 1 − SP    (7)

F1 = 2 × (Precision × Recall) / (Precision + Recall)    (8)

S1 = 2 × (Sensitivity × Specificity) / (Sensitivity + Specificity)    (9)

These equations have been derived from the confusion matrix of the classification
where TP is true positive, TN is true negative, FP is false positive, and FN is false
negative. TP and TN indicate the correct numbers of positive and negative predictions,
respectively. FP and FN indicate the number of incorrect predictions for negative and
positive cases, respectively. It is important to note that sensitivity is also known as
recall or true positive rate (TPR), and specificity is called true negative rate (TNR).
Precision is also called positive predictive value (PPV). F1-score is a harmonic mean
of precision and recall, whereas S1-score is the harmonic mean of sensitivity and
specificity. The area under the curve (AUC) is used to quantify the area covered by the ROC curve. An ideal classifier exhibits an AUC of 1.0, which is rarely achievable in practice. However, an AUC in the range from 0.6 to 0.9 is considered the performance of a good classifier. A random classifier would exhibit an AUC score of 0.5. Table 2 shows
the best and worst performance values for various parameters considered here for
comparing the classifiers.
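As a worked illustration, the following sketch computes the parameters of Eqs. (2)–(9) from raw confusion-matrix counts; the counts used here are invented for demonstration only.

```python
def evaluate(tp, tn, fp, fn):
    """Derive the standard metrics of Eqs. (2)-(9) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)              # also recall / true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    precision = tp / (tp + fp)                # positive predictive value
    fpr = 1.0 - specificity                   # false positive rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    s1 = 2 * sensitivity * specificity / (sensitivity + specificity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision,
            "fpr": fpr, "f1": f1, "s1": s1}

# Invented counts for demonstration
print(evaluate(tp=95, tn=88, fp=12, fn=5))
```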

6 Results

Table 3 and Fig. 3 represent the performance comparison of the classification algorithms under consideration for binary classification without feature extraction, whereas Table 4 and Fig. 4 represent the performance comparison of the classification algorithms under consideration for binary classification with feature extraction. The results clearly indicate that feature extraction improves the performance of machine learning algorithms. For binary classification, random forest, Gaussian naïve Bayes, and support vector machine with a Gaussian kernel function achieve the best results in terms of all evaluation parameters. Notably, the decision tree classifier gives results comparable to Gaussian naïve Bayes and random forest when feature extraction approaches have been applied.

Table 3 Performance comparison of binary classification without feature extraction


Parameters KNN DT GNB MLP QDA SVM-G SVM-P RF
Accuracy 0.81 0.87 0.96 0.73 0.51 0.94 0.70 0.96
Sensitivity 1.00 0.94 0.98 0.75 0.33 0.99 1.00 0.99
Specificity 0.45 0.73 0.94 0.68 0.86 0.84 0.14 0.88
Precision 0.77 0.86 0.97 0.81 0.81 0.91 0.68 0.91
FPR 0.55 0.27 0.06 0.32 0.14 0.26 0.86 0.12
F1-Score 0.87 0.90 0.97 0.78 0.47 0.95 0.81 0.95
S1-Score 0.62 0.82 0.96 0.71 0.47 0.91 0.25 0.93
ROC 0.73 0.84 0.97 0.68 0.60 0.99 0.75 0.99

Fig. 3 Performance visualization of binary classification without feature extraction



Table 4 Performance comparison of binary classification with feature extraction


Parameters KNN DT GNB MLP QDA SVM-G SVM-P RF
Accuracy 0.89 1.00 1.00 0.88 0.88 0.88 0.88 1.00
Sensitivity 0.88 1.00 0.99 0.83 0.85 0.82 0.83 1.00
Specificity 0.86 1.00 1.00 0.84 0.81 0.85 0.85 1.00
Precision 0.89 1.00 1.00 0.90 0.91 0.93 0.91 1.00
FPR 0.14 0.00 0.01 0.16 0.19 0.15 0.15 0.00
F1-Score 0.88 1.00 0.99 0.86 0.88 0.87 0.87 1.00
S1-Score 0.86 1.00 1.00 0.81 0.78 0.80 0.83 1.00
ROC 0.87 0.99 0.98 0.88 0.87 0.87 0.88 0.99

Fig. 4 Performance visualization of binary classification with feature extraction

Table 5 Performance comparison of ternary classification without feature extraction


Parameters KNN DT GNB MLP QDA SVM-G SVM-P RF
Accuracy 0.60 0.71 0.80 0.77 0.61 0.74 0.58 0.86
Sensitivity 0.39 0.57 0.71 0.65 0.43 0.62 0.36 0.79
Specificity 0.69 0.78 0.85 0.82 0.71 0.80 0.68 0.89
Precision 0.61 0.61 0.71 0.65 0.45 0.64 0.64 0.80
FPR 0.31 0.22 0.15 0.17 0.29 0.20 0.32 0.11
F1-Score 0.48 0.59 0.71 0.65 0.44 0.63 0.46 0.79
S1-Score 0.23 0.65 0.74 0.72 0.44 0.67 0.27 0.83
ROC 0.74 0.70 0.85 0.77 0.59 0.87 0.55 0.94

Table 5 and Fig. 5 represent the performance comparison of the classification algorithms under consideration for ternary classification without feature extraction, whereas Table 6 and Fig. 6 represent the performance comparison of the classification algorithms under consideration for ternary classification with feature extraction. These results of ternary classification also indicate that the feature extraction approaches improve the performance of classification algorithms. For ternary classification, random forest, Gaussian naïve Bayes, and decision tree perform better than all other algorithms.

Fig. 5 Performance visualization of ternary classification without feature extraction

Table 6 Performance comparison of ternary classification with feature extraction


Parameters KNN DT GNB MLP QDA SVM-G SVM-P RF
Accuracy 0.83 0.99 0.97 0.78 0.87 0.76 0.81 0.99
Sensitivity 0.74 0.99 0.96 0.67 0.80 0.64 0.71 0.99
Specificity 0.87 0.99 0.97 0.83 0.90 0.82 0.85 1.00
Precision 0.75 0.99 0.96 0.68 0.86 0.67 0.72 0.99
FPR 0.13 0.01 0.03 0.17 0.10 0.18 0.15 0.00
F1-Score 0.74 0.99 0.96 0.67 0.83 0.65 0.71 0.99
S1-Score 0.80 0.99 0.96 0.74 0.81 0.70 0.76 0.95
ROC 0.66 0.75 0.75 0.53 0.65 0.68 0.68 0.76

Fig. 6 Performance visualization of ternary classification with feature extraction

7 Conclusion and Future Work

It can be summarized from the performance of the various classifiers that the Gaussian naïve Bayes and random forest algorithms outperform the other traditional machine learning algorithms: k-nearest neighbour, decision tree classifier, multilayer perceptron, quadratic discriminant analysis, and support vector machines. Also, the decision tree algorithm improves drastically when task-specific feature extraction approaches are applied to the input dataset. This proves the stated hypothesis that traditional machine learning algorithms work best when: first, the dataset is small in scale and not noisy; second, balanced classes are considered; and third, appropriate feature extraction methods are used before the classification phase. This work also shows that traditional machine learning algorithms are more efficient for problems satisfying the aforesaid considerations. Applying larger datasets with higher dimensions and more noise is future work that can be used to test the limits of this conclusion.

References

1. P. Bashivan, I. Rish, M. Yeasin, N. Codella, Learning representations from EEG with deep
recurrent-convolutional neural networks (2015)
2. X. Wei, L. Zhou, Z. Chen, L. Zhang, Y. Zhou, Automatic seizure detection using three-
dimensional CNN based on multi-channel EEG. BMC Med. Inform. Decis. Mak. 18 (2018).
https://doi.org/10.1186/s12911-018-0693-8

3. S.M. Usman, M. Usman, S. Fong, Epileptic seizures prediction using machine learning
methods. Comput. Math. Methods Med. 2017 (2017). https://doi.org/10.1155/2017/9074759
4. B. Świderski, S. Osowski, A. Cichocki, A. Rysz, Epileptic seizure prediction using Lyapunov
exponents and support vector machine. Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007). https://
doi.org/10.1007/978-3-540-71629-7_42
5. A. Sharmila, P. Geethanjali, DWT based detection of epileptic seizure from EEG signals using
Naive Bayes and k-NN classifiers. IEEE Access 4, 7716–7727 (2016). https://doi.org/10.1109/
ACCESS.2016.2585661
6. A. Shoeb, J. Guttag, Application of machine learning to epileptic seizure detection, in ICML
2010—Proceedings, 27th International Conference on Machine Learning (2010)
7. X. Wang, G. Gong, N. Li, S. Qiu, Detection analysis of epileptic EEG using a novel random
forest model combined with grid search optimization. Front. Hum. Neurosci. 13 (2019). https://
doi.org/10.3389/fnhum.2019.00052
8. C. Donos, M. Dümpelmann, A. Schulze-Bonhage, Early seizure detection algorithm based on
intracranial EEG and random forest classification. Int. J. Neural Syst. 25 (2015). https://doi.
org/10.1142/S0129065715500239
9. M. Mursalin, Y. Zhang, Y. Chen, N.V. Chawla, Automated epileptic seizure detection using
improved correlation-based feature selection with random forest classifier. Neurocomputing
241 (2017). https://doi.org/10.1016/j.neucom.2017.02.053
10. M.K. Siddiqui, R. Morales-Menendez, X. Huang, N. Hussain, A review of epileptic seizure
detection using machine learning classifiers. Brain Inf. (2020). https://doi.org/10.1186/s40708-
020-00105-1
11. H.T. Shiao, V. Cherkassky, J. Lee, B. Veber, E.E. Patterson, B.H. Brinkmann, G.A. Worrell,
SVM-based system for prediction of epileptic seizures from iEEG signal. IEEE Trans. Biomed.
Eng. 64 (2017). https://doi.org/10.1109/TBME.2016.2586475
12. Y. Park, L. Luo, K.K. Parhi, T. Netoff, Seizure prediction with spectral power of EEG using
cost-sensitive support vector machines. Epilepsia 52 (2011). https://doi.org/10.1111/j.1528-
1167.2011.03138.x
13. Y. Yang, M. Zhou, Y. Niu, C. Li, R. Cao, B. Wang, P. Yan, Y. Ma, J. Xiang, Epileptic seizure
prediction based on permutation entropy. Front. Comput. Neurosci. 12 (2018). https://doi.org/
10.3389/fncom.2018.00055
14. P.W. Mirowski, Y. LeCun, D. Madhavan, R. Kuzniecky, Comparing SVM and convolutional
networks for epileptic seizure prediction from intracranial EEG, in Proceedings of the 2008
IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008 (2008). https://doi.
org/10.1109/MLSP.2008.4685487
15. K. Rasheed, A. Qayyum, J. Qadir, S. Sivathamboo, P. Kawn, L. Kuhlmann, T. O’Brien, A.
Razi, Machine learning for predicting epileptic seizures using EEG signals: a review. IEEE
Rev. Biomed. Eng. (2020). https://doi.org/10.1109/RBME.2020.3008792
16. M. Fernández-Delgado, E. Cernadas, S. Barro, D. Amorim, Do we need hundreds of classifiers
to solve real world classification problems? J. Mach. Learn. Res. 15, 3133–3181 (2014). https://
doi.org/10.1117/1.JRS.11.015020
17. M.A. Rahman, W. Ma, D. Tran, J. Campbell, A comprehensive survey of the feature extraction
methods in the EEG research. Lecture Notes in Computer Science (including subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 7440 LNCS, pp. 274–283
(2012). https://doi.org/10.1007/978-3-642-33065-0_29.
18. U.R. Acharya, S. Vinitha Sree, G. Swapna, R.J. Martis, J.S. Suri, Automated EEG analysis of
epilepsy: a review. Knowledge-Based Syst. 45, 147–165 (2013). https://doi.org/10.1016/j.kno
sys.2013.02.014
19. A.S. Al-Fahoum, A.A. Al-Fraihat, Methods of EEG signal features extraction using linear
analysis in frequency and time-frequency domains. ISRN Neurosci. 2014, 1–7 (2014). https://
doi.org/10.1155/2014/730218
20. M.A. Mazurowski, M. Buda, A. Saha, M.R. Bashir, Deep learning in radiology: an overview of
the concepts and a survey of the state of the art with focus on MRI. J. Magn. Reson. Imaging.
49, 939–954 (2019). https://doi.org/10.1002/jmri.26534

21. G. Nolte, M. Aburidi, A.K. Engel, Robust calculation of slopes in detrended fluctuation analysis
and its application to envelopes of human alpha rhythms. Sci. Rep. 9, 1–16 (2019). https://doi.
org/10.1038/s41598-019-42732-7
22. F.S. Bao, X. Liu, C. Zhang, PyEEG: an open source python module for EEG/MEG feature
extraction. Comput. Intell. Neurosci. 2011 (2011). https://doi.org/10.1155/2011/406391
23. T.Q.D. Khoa, V.Q. Ha, V. Van Toi, Higuchi fractal properties of onset epilepsy elec-
troencephalogram. Comput. Math. Methods Med. 2012 (2012). https://doi.org/10.1155/2012/
461426
24. M. Čukić, M. Stokić, S. Simić, D. Pokrajac, The successful discrimination of depression from
EEG could be attributed to proper feature extraction and not to a particular classification
method. Cogn. Neurodyn. 14 (2020). https://doi.org/10.1007/s11571-020-09581-x
25. P. Boonyakitanont, A. Lek-uthai, K. Chomtho, J. Songsiri, A review of feature extraction and
performance evaluation in epileptic seizure detection using EEG. Biomed. Signal Process.
Control (2020). https://doi.org/10.1016/j.bspc.2019.101702
26. Y. Zhang, S. Yang, Y. Liu, Y. Zhang, B. Han, F. Zhou, Integration of 24 feature types to accurately
detect and predict seizures using scalp EEG signals. Sensors (Switzerland) 18 (2018). https://
doi.org/10.3390/s18051372
27. R.A. Fisher, Theory of statistical estimation. Math. Proc. Cambridge Philos. Soc. 22 (1925).
https://doi.org/10.1017/S0305004100009580
28. R.G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, C.E. Elger, Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys. Rev. E Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top. 64, 8 (2001). https://doi.org/10.1103/PhysRevE.64.061907
29. D.M.W. Powers, Evaluation: from precision, recall and f-measure to ROC, informedness, markedness & correlation. J. Mach. Learn. Technol. 2 (2011)
30. N. Moghim, D.W. Corne, Predicting epileptic seizures in advance. PLoS One 9 (2014). https://doi.org/10.1371/journal.pone.0099334
31. U.R. Acharya, S.L. Oh, Y. Hagiwara, J.H. Tan, H. Adeli, Deep convolutional neural network
for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med.
100, 270–278 (2018). https://doi.org/10.1016/j.compbiomed.2017.09.017
Lung Disease Detection
and Classification from Chest X-Ray
Images Using Adaptive Segmentation
and Deep Learning

Shimpy Goyal and Rajiv Singh

Abstract Detection and classification of lung diseases using recent methodologies have become an important research problem for smart computer-aided diagnosis (CAD) tools. The emergence of deep learning brings automation across the
different domains to address the concerns related to manual techniques. The chest
X-ray image remains one of the effective tools for lung disease detection such as pneu-
monia. This paper presents a framework for pneumonia disease detection and classifi-
cation from the raw X-ray images. The proposed framework consists of image prepro-
cessing, adaptive segmentation, features extraction, and automatic disease detection.
Raw X-ray images are preprocessed by applying the lightweight and effective filtering
algorithm. The region of interest from the preprocessed image has been located
by using the adaptive segmentation algorithm. We propose a dynamic threshold
mechanism followed by morphological operations for adaptive segmentation. The
hybrid feature vector has been implemented using visual, texture, shape, and inten-
sity features. For disease detection and classification, the hybrid features are normal-
ized using robust normalization, and then an automatic deep learning classifier model, recurrent neural network (RNN) with long short-term memory (LSTM), has been designed.
The simulation results show that the proposed model outperformed state-of-the-art
similar methods.

Keywords Chest X-ray image · Lung disease · Pneumonia · Computer vision ·


Deep learning · Segmentation

1 Introduction

Lung diseases such as lung cancer, pneumonia, and the recent novel COVID-19 [1–3] have become a major threat to human beings. Compared to lung cancer, pneumonia is caused by various factors including COVID-19, which leads to a significant mortality rate due to its infectious behavior. The detection of lung cancer is performed

S. Goyal · R. Singh (B)


Department of Computer Science, Banasthali Vidyapith, Banasthali, Rajasthan 304022, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_5

using techniques like X-ray, magnetic resonance imaging (MRI), computed tomog-
raphy (CT), and isotope. Among these, CT and X-ray chest imaging techniques are
frequently used for the detection of various lung diseases. The X-ray technique is cost-effective, with outcomes similar to those of a CT scan. Hence, many doctors recommend the chest X-ray for the analysis of lung diseases like pneumonia [4, 5]. The recent progress in using the Internet of Things (IoT) for smart health care systems requires a smart disease monitoring system as well [6–11]. The goal of
this paper is to propose a framework of lung disease detection and classification for
accurate health assessment using the effective methods of computer vision and soft
computing. The objectives of this framework are to enhance the accuracy of pneu-
monia disease detection and to minimize the detection time. These objectives of
robust and reliable lung disease detection and classification motivate the framework
proposed in this paper. The rest of the paper is organized as follows: Sect. 2 presents the
study of related works. Section 3 illustrates the proposed model. Section 4 demon-
strates the simulation results. Section 5 concludes the proposed work and summarizes
major findings.

2 Related Works

In [12], the authors proposed a CNN model for classification of pneumonia based on VGG-19, decision tree, and Inception V2 over CT scan images
and X-ray images. An automatic framework of coronavirus disease detection and
classification was proposed [13]. The investigation-based approach was proposed
where they designed fine-tuning of pre-trained CNNs for the COVID-19 disease
classification using chest X-ray images [14]. A model called COVIDDetectioNet
[15] has been designed for the classification of COVID-19 from chest X-ray images.
In the similar way, a COVID-19-based work includes the review of various CNN
models [16] that were used for the classification of lung conditions like COVID-19,
viral pneumonia, and normal using chest scans. Another framework for detection
and classification of COVID-19 disease into bacterial pneumonia, viral pneumonia,
and the normal class was proposed [17]. Ensemble deep transfer learning systems
have been proposed for COVID-19 disease detection using chest X-ray images [18].
Another CNN-based pneumonia disease detection system was introduced in [19].
They trained on X-ray images of normal and abnormal conditions and built a model to detect the presence of pneumonia. In [20], a new methodology using the
weighted soft computing method was proposed using weighted anticipations from
the conventional deep learning systems like DenseNet121, MobileNetV3, Xception,
and ResNet. From the above literature, we notice that detecting lung diseases like pneumonia and COVID-19-related pneumonia is still a challenging research problem because of the unavailability of sufficient and effective techniques [21, 22]. The proposed approach of lung disease detection and classification using computer vision and automated soft computing techniques differs from the above methods.

This paper proposes the fusion RNN-LSTM model (FRNN-LSTM) for efficient and robust lung disease detection and classification from the input X-ray images.

3 Proposed Methodology

This section presents the methodology and design of the proposed framework for
lung disease detection and classification using chest X-ray images. As shown in
Fig. 1, the input chest X-ray image is first preprocessed for quality enhancement
by removing the noise and improving the low-contrast regions. To remove noise
and low-contrast regions, preprocessing plays a very significant role. The next step
is adaptive segmentation which aims to localize the region of interest according to
image structure. From the segmented image, four types of features are extracted and

Fig. 1 Proposed approach for lung disease detection and classification



represented by F1 , F2 , F3 , and F4 . These features are further fused and normalized to


build the hybrid feature vector. For disease detection and classification, this hybrid feature vector is fed to the RNN-LSTM network. This network consists of layers such as the sequential input layer, LSTM layer, fully connected layer, and softmax layer. Finally, the class probabilities are computed by the LSTM for each training class automatically, and the final detection outcome is obtained, as shown in Fig. 1.

3.1 Image Preprocessing

The input chest X-ray image I has been preprocessed in the proposed model FRNN-LSTM by applying intensity value adjustment, median filtering, and histogram
equalization. The first operation focused on adjusting the image intensity values
for low-contrast X-ray images. This technique is mainly used to enhance the contrast
as:

I1 = imadjust(I)    (1)

where I1 is the outcome of the contrast enhancement step using the function imadjust.


Median filtering is applied next to remove noise in the contrast-enhanced image. Adjusting the image intensity values may introduce noise, and the X-ray scan itself may also introduce noise into the image. We have used 2D median filtering in this proposed model, which moves pixel by pixel through the given image, replacing every value with the median value of the neighboring pixels. The neighborhood pattern is decided by the size of the window; a 3 × 3 neighborhood is used in this work. The 2D median filter is applied on I1 as:

I2(i, j) = median{ I1(i, j) | (i, j) ∈ w }    (2)

where I2 is the outcome of median filtering and w is the window size.
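A minimal sketch of this preprocessing stage follows, using scikit-image and SciPy analogues of MATLAB's imadjust and medfilt2; the random array stands in for a grayscale chest X-ray, and the percentile limits are an assumption.

```python
import numpy as np
from skimage import exposure
from scipy.ndimage import median_filter

I = np.random.rand(256, 256)  # stand-in for a grayscale chest X-ray

# Eq. (1): stretch intensities between the 2nd and 98th percentiles,
# an imadjust-style contrast adjustment for low-contrast regions
p2, p98 = np.percentile(I, (2, 98))
I1 = exposure.rescale_intensity(I, in_range=(p2, p98))

# Eq. (2): replace each pixel with the median of its 3x3 neighborhood
I2 = median_filter(I1, size=3)
```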

3.2 Adaptive Segmentation

The proposed adaptive segmentation method focuses on addressing problems such as over-segmentation, inaccurate extraction, poor adaptability, and high computation time. The segmentation algorithm has been designed using a region growing approach and morphological operations; it is shown in Table 1 and aims at accurate ROI extraction with minimum computation time. The segmentation starts with edge detection, followed by dividing the image into N grids. For each grid, a dynamic thresholding approach is applied to perform the segmentation. Once all grids are segmented, they are placed back into the original image. Post-processing is then applied using morphological operations to deliver accurate and original intensity information in the ROI rather than binary data. The dynamic threshold value is computed from the mean image value and then a second-level mean; a minimal sketch of this procedure follows Table 1.

Table 1 Segmentation algorithm
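Below is a minimal sketch of the grid-wise dynamic thresholding idea; the grid count, the two-level mean rule, and the structuring-element size are illustrative assumptions, not the exact algorithm of Table 1.

```python
import numpy as np
from skimage import morphology

def adaptive_segment(img, n_grid=4):
    """Grid-wise dynamic thresholding with morphological post-processing."""
    mask = np.zeros_like(img, dtype=bool)
    h, w = img.shape
    gh, gw = h // n_grid, w // n_grid
    for r in range(n_grid):
        for c in range(n_grid):
            cell = img[r * gh:(r + 1) * gh, c * gw:(c + 1) * gw]
            m1 = cell.mean()                        # first-level mean of the grid
            above = cell[cell > m1]
            t = above.mean() if above.size else m1  # second-level mean as threshold
            mask[r * gh:(r + 1) * gh, c * gw:(c + 1) * gw] = cell > t
    # Morphological opening/closing to clean the assembled binary mask
    mask = morphology.binary_opening(mask, morphology.disk(2))
    mask = morphology.binary_closing(mask, morphology.disk(2))
    # Keep the original intensities inside the ROI rather than binary data
    return np.where(mask, img, 0.0)

img = np.random.rand(256, 256)  # stand-in for a preprocessed X-ray
roi = adaptive_segment(img)
```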

3.3 Feature Extraction, Fusion, and Normalization

In this work, we calculate four feature sets from the segmented images, namely visual, texture, intensity, and geometric invariant features, stored in the vectors F1, F2, F3, and F4. The visual features are extracted by the histogram of oriented gradients (HOG) descriptor. The texture features are extracted using the gray-level co-occurrence matrix (GLCM) with four offsets. Furthermore, eight geometric invariant features are extracted from the segmented image. In total, 36 features are extracted from each segmented image, comprising 4 HOG features, 8 intensity features, 8 geometric invariant features, and 16 texture features. In the fusion phase, the 4 visual features (F1), 16 texture features (F2), 8 intensity features (F3), and 8 geometric moment features (F4) are concatenated to form one feature vector F as:

F = {F1, F2, F3, F4}    (3)

F = {F1 , F2 , F3 , F4 } (3)

The fused feature vector contains different kinds of features extracted from the ROI image, which leads to significant variations among them. Features with a higher range play a decisive role in the training process of machine learning algorithms. Therefore, feature normalization is required to enhance speed and accuracy. Normalization is used to bound the features within a fixed range, such as 0 to 1. We applied the min–max and robust normalization methods as represented in Eqs. (4) and (5), respectively.

F_min_max = (F − min(F)) / (max(F) − min(F))    (4)

F_robust = −sign(F) × log10 |F|    (5)
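The fusion and normalization steps of Eqs. (3)–(5) reduce to a few array operations, as in the sketch below; the random sub-vectors stand in for the extracted features, and the epsilon guard in Eq. (5) is an added assumption to avoid log10(0).

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the 4 HOG, 16 texture, 8 intensity, and 8 geometric features
F1, F2, F3, F4 = rng.random(4), rng.random(16), rng.random(8), rng.random(8)

F = np.concatenate([F1, F2, F3, F4])                  # Eq. (3): fused 36-d vector
F_minmax = (F - F.min()) / (F.max() - F.min())        # Eq. (4): min-max scaling
# Eq. (5): log-magnitude "robust" normalization with a small epsilon guard
F_robust = -np.sign(F) * np.log10(np.abs(F) + 1e-12)
```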

3.4 Classification Using RNN-LSTM

For the classification purpose, we have designed a hybrid deep learning model for early prediction of lung diseases using RNN and LSTM. The RNN-LSTM model is designed to overcome the vanishing gradient problem and to improve performance compared to other soft computing methods.
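The paper's model was built in MATLAB; the following is a rough Keras analogue of the described stack (sequence input, LSTM with 100 hidden units, fully connected layer, and softmax) for the three lung-condition classes. Treating the 36-dimensional fused feature vector as a length-36 sequence, and the toy training data, are assumptions for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(36, 1)),          # hybrid feature vector as a sequence
    layers.LSTM(100),                    # 100 hidden units, as in the paper
    layers.Dense(3, activation="softmax"),  # normal / COVID-19 / viral pneumonia
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy data; the paper trained for 70 epochs with minimum batch size 27
X = np.random.rand(32, 36, 1)
y = np.random.randint(0, 3, 32)
model.fit(X, y, epochs=2, batch_size=27, verbose=0)
```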

4 Results and Discussions

To validate the effectiveness of the proposed lung disease detection and classifi-
cation model, we have implemented the model FRNN-LSTM in MATLAB along
with other soft computing techniques such as support vector machine (SVM), arti-
ficial neural network (ANN), K-nearest neighbor (KNN), and ensemble classifiers.
The performances of using ANN, SVM, KNN, and ensemble classifier have been
compared with the RNN-LSTM technique by dividing the entire dataset into 70%
training and 30% testing ratio. We have collected the lung disease dataset from the
well-known public research repository [23]. The dataset called COVID-19 Radiog-
raphy Database (C19RD) has a collection of chest X-ray images at Qatar University.
The C19RD consists of 2905 samples in three classes like normal chest (1341),
COVID-19 pneumonia (219), and viral pneumonia (1345).

This section presents the comparative results using the aforementioned dataset along with soft computing techniques such as SVM, ANN, KNN, ensemble methods, and RNN-LSTM with varying feature normalization methods. Figures 2, 3, and 4 demonstrate the results of accuracy, precision, and recall, respectively. The results show the outcomes of the soft computing and feature normalization methods using the same computer vision steps of image enhancement and ROI extraction. Among all the feature normalization techniques, the proposed feature normalization method improved the accuracy, precision, and recall metrics compared to the raw features and the min–max normalization technique. The raw features without any normalization led to poor classification performance. The soft computing methods are also investigated in the above results using the C19RD dataset. The RNN-LSTM-based FRNN-LSTM deep learning model has shown enhanced lung disease classification performance compared to the other methods.

Fig. 2 Accuracy analysis for a Features normalization analysis and b Soft computing techniques analysis

Fig. 3 Precision analysis for a Features normalization analysis and b Soft computing techniques analysis

Fig. 4 Recall analysis for a Features normalization analysis and b Soft computing techniques analysis

Using the deep learning model RNN-LSTM, detection accuracy and F1-score performance have been enhanced by approximately 4% compared to the second-best soft computing technique (Figs. 2, 3 and 4).
We have implemented and evaluated the existing methods such as COVIDetec-
tioNet [15], CNN using ResNet23 (CNN-RN) [16], Se-ResNeXt-50 [17], CNN using
ensemble approach (CNN-E) [18]. All techniques were implemented on an Intel I5
processor with 4 GB RAM. The comparative analysis has been presented using two
metrics, average detection accuracy and average processing time. Table 2 demon-
strates the comparative study of the proposed FRNN-LSTM model with the existing
methods. All these methods have been implemented with common hyperparameters such as the number of epochs (70), minimum batch size (27), gradient threshold (1), and execution environment (CPU). The number of hidden units has been set to 100 in the proposed RNN-LSTM model. Under this hyperparameter setting, we obtained the best classification performances. We have selected these methods as they are closely related to the proposed model of lung disease detection using chest X-ray image datasets. Additionally, these methods claimed significant results for this domain, as summarized in Table 2.

Table 2 Comparative analysis of the proposed FRNN-LSTM method

Methods | Detection accuracy (%) | Training and detection time (seconds)
COVIDetectioNet | 91.34 | 1879
CNN-RN | 92.67 | 2873
ResNeXt-50 | 91.58 | 2762
CNN-E | 92.54 | 2489
FRNN-LSTM | 95.04 | 1289

5 Conclusions

This paper presented a framework for lung disease detection and classification from chest X-ray images. We focused on pneumonia, which can be caused by either COVID-19 or bacterial/viral infections. The model has been designed using preprocessing, adaptive segmentation, hybrid feature extraction and normalization, and automatic classification. The design of each phase has been elaborated in this paper, with the core goals of improving recognition accuracy and minimizing processing time. The simulation results reveal that the proposed method outperforms the existing techniques and is able to predict lung disease with better accuracy in less time.

References

1. D.S. Smith, E.A. Richey, W.L. Brunetto, A Symptom-based rule for diagnosis of COVID-19.
SN Compr. Clin. Med. 2, 1947–1954 (2020). https://doi.org/10.1007/s42399-020-00603-7
2. E. Elibol, Otolaryngological symptoms in COVID-19. Eur. Arch. Otorhinolaryngol. (2020).
https://doi.org/10.1007/s00405-020-06319-7
3. E. Salepci, B. Turk, S.N. Ozcan et al., Symptomatology of COVID-19 from the otorhino-
laryngology perspective: a survey of 223 SARS-CoV-2 RNA-positive patients. Eur. Arch.
Otorhinolaryngol. (2020). https://doi.org/10.1007/s00405-020-06284-1
4. A. Khatri, R. Jain, H. Vashista, N. Mittal, P. Ranjan, R. Janardhanan, Pneumonia identification
in chest X-ray images using EMD, in Trends in Communication, Cloud, and Big Data, ed. by
H. Sarma, B. Bhuyan, S. Borah, N. Dutta. Lecture Notes in Networks and Systems, vol. 99
(Springer, Singapore, 2020). https://doi.org/10.1007/978-981-15-1624-5_9
5. L.A. Rousan, E. Elobeid, M. Karrar et al., Chest x-ray findings and temporal lung changes
in patients with COVID-19 pneumonia. BMC Pulm. Med. 20, 245 (2020). https://doi.org/10.
1186/s12890-020-01286-5
6. H.B. Mahajan, A. Badarla, A.A. Junnarkar, CL-IoT: cross-layer Internet of Things protocol for
intelligent manufacturing of smart farming. J. Ambient Intell. Human Comput. (2020). https://
doi.org/10.1007/s12652-020-02502-0
7. R. Patel, N. Sinha, K. Raj, D. Prasad, V. Nath, Smart healthcare system using IoT. in Nanoelec-
tronics, Circuits and Communication Systems, ed. by V. Nath, J. Mandal. NCCS 2018. Lecture
Notes in Electrical Engineering, vol. 642 (Springer, Singapore, 2020). https://doi.org/10.1007/
978-981-15-2854-5_15.
8. H.B. Mahajan, A. Badarla, Application of Internet of Things for smart precision farming:
solutions and challenges. Int. J. Adv. Sci. Technol. Dec. 2018, 37–45 (2018)
9. M.M. Islam, A. Rahaman, M.R. Islam, Development of smart healthcare monitoring system
in IoT environment. SN Comput. Sci. 1, 185 (2020). https://doi.org/10.1007/s42979-020-001
95-Y
10. H.B. Mahajan, A. Badarla, Experimental analysis of recent clustering algorithms for wireless
sensor network: application of iot based smart precision farming. J. Adv. Res. Dyn. Control
Syst. 11(9). https://doi.org/10.5373/JARDCS/V11I9/20193162
11. H.B. Mahajan, A. Badarla, Detecting HTTP vulnerabilities in IoT-based precision farming
connected with cloud environment using artificial intelligence. Int. J. Adv. Sci. Technol. 29(3),
214–226 (2020)
12. D. Dansana, R. Kumar, A. Bhattacharjee et al., Early diagnosis of COVID-19-affected patients
based on X-ray and computed tomography images using deep learning algorithm. Soft Comput.
(2020). https://doi.org/10.1007/s00500-020-05275-y

13. I.D. Apostolopoulos, T.A. Mpesiana, Covid-19: automatic detection from X-ray images
utilizing transfer learning with convolutional neural networks. Phys. Eng. Sci. Med. 43,
635–640 (2020). https://doi.org/10.1007/s13246-020-00865-4
14. T.D. Pham, Classification of COVID-19 chest X-rays with deep learning: new models or fine
tuning? Health Inf. Sci. Syst. 9, 2 (2021). https://doi.org/10.1007/s13755-020-00135-3
15. M. Turkoglu, COVIDetectioNet: COVID-19 diagnosis system based on X-ray images using
features selected from pre-learned deep features ensemble. Appl. Intell. (2020). https://doi.org/
10.1007/s10489-020-01888-w
16. C. Butt, J. Gill, D. Chun, B.A. Babu, Deep learning system to screen coronavirus disease 2019
pneumonia. Appl. Intell. (2020). https://doi.org/10.1007/s10489-020-01714-3
17. S. Hira, A. Bai, S. Hira, An automatic approach based on CNN architecture to detect Covid-19
disease from chest X-ray images. Appl. Intell. (2020). https://doi.org/10.1007/s10489-020-020
10-w
18. N. Gianchandani, A. Jaiswal, D. Singh et al., Rapid COVID-19 diagnosis using ensemble deep
transfer learning models from chest radiographic images. J. Ambient Intell. Human Comput.
(2020). https://doi.org/10.1007/s12652-020-02669-6
19. M. Nath, C. Choudhury, Automatic detection of pneumonia from chest X-rays using deep
learning, in Machine Learning, Image Processing, Network Security and Data Sciences, ed. by
A. Bhattacharjee, S. Borgohain, B. Soni, G. Verma, X.Z. Gao. MIND 2020. Communications
in Computer and Information Science, vol. 1240 (Springer, Singapore, 2020). https://doi.org/
10.1007/978-981-15-6315-7_14
20. M.F. Hashmi, S. Katiyar, A.G. Keskar, N.D. Bokde, Z.W. Geem, Efficient pneumonia detection
in chest Xray images using deep transfer learning. Diagnostics 10(6), 417 (2020). https://doi.
org/10.3390/diagnostics10060417
21. https://www.kaggle.com/
22. G. Himabindu, M. Ramakrishna Murty, et al., Classification of kidney lesions using bee swarm
optimization. Int. J. Eng. Technol. 7(2.33), 1046–1052 (2018)
23. G. Himabindu, M. Ramakrishna Murty, et al., Extraction of texture features and classification
of renal masses from kidney images. Int. J. Eng. Technol. 7(2.33), 1057–1063 (2018)
A Quantitative Analysis for Breast
Cancer Prediction Using Artificial Neural
Network and Support Vector Machine

Harnehmat Walia and Prabhpreet Kaur

Abstract Medical data is increasing rapidly day by day. The number of patients with different diseases is rising, and it is difficult for radiologists to analyze the data and to detect and predict disease with accurate results. It is important to achieve better performance and classify the disease using different methodologies, as the database is large. Hence, a review of different state-of-the-art techniques using machine learning and deep learning algorithms is included. The literature covers the classification of breast cancer and other medical images. Diagnosis and prediction are done using training (80%) and testing (20%) samples on a benchmark dataset. Artificial neural network and support vector machine are compared using the parameters accuracy, precision, and recall. Experimental results show that SVM performs better than ANN.

Keywords Medical images · Machine learning · Deep learning · Computer-aided


diagnostic (CAD) · ANN · SVM · Accuracy · Precision · Recall

1 Introduction

An enormous amount of data is piling up every day due to advancements in technology. This poses a challenge for researchers to gather and analyze this data for further use. This huge amount of data also consists of medical images, which are of great use. The medical images available to researchers or health care workers may not be clear enough for diagnosis and detection. Medical images like CT, ultrasound, PET, MRI, and X-ray may contain different types of noise such as additive white Gaussian noise, speckle noise, and salt and pepper noise. These noises affect the quality of the image, which
detection and diagnosis of disease which further reduces the burden on health care

H. Walia (B) · P. Kaur


Department of Computer Engineering and Technology, Guru Nanak Dev University, Amritsar,
Punjab, India
P. Kaur
e-mail: prabhpreet.cst@gndu.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_6

worker. Preprocessing is an important step before the classification of the medical
images. It helps in getting better quality images which further helps in accurate
diagnosis. Many state-of-the-art denoising methods have been introduced over the
years in order to enhance the quality of medical images. ML and DL techniques are
gaining an important role in image denoising.

2 Computer-Aided Diagnostics (CAD)

The medical images captured by CT, ultrasound, MRI use radiations that are harmful
for human body, and in order to take clear images, the body has to be exposed to
stronger radiations for longer period of time. CAD system improves the quality of
image taken by CT, MRI and, hence, helps in diagnosis of diseases from the images
correctly. CAD is used in the diagnosis of lung cancer, breast lesion in mammog-
raphy, diabetic retinopathy, brain tumor, etc. Computer-aided diagnostic (CAD) helps
radiologists in the US-based detection of disease and, hence, helps in lowering the
efforts and reliance on the operator [1]. CAD also helps in improving the sensitivity
and specificity of ailment diagnostic. It helps in the detection and diagnosis of lesion.
In the detection phase, the lesion is separated from the normal tissue, and in diagnosis
phase, the lesion is assessed to give the diagnosis. In the CAD system, the feature
extraction is designed by the human which affects the results efficiency. It is the
diagnosis made by the physician taking into account the output of the computer as
the second opinion [2].
Step 1: First, the data is collected from hospitals and laboratories; this data could
contain all kinds of information such as medical images as well as the personal data
of the patients. The required medical images are acquired and then preprocessed.
Step 2: In the preprocessing phase, the image quality is enhanced. The images
are removed of background artifacts; images are filtered using different filters
for removal of different kinds of noises. The various noises affecting images are
discussed below.
Step 3: After the preprocessing, the images are segmented. In the segmenta-
tion process, the medical image is divided into different segments, and only the
useful part is used for further processing. Features are extracted and selected using
methods like GLCM.
Step 4: Then, the ROI is classified using various algorithms like support vector
machine, K-nearest neighbor, convolutional neural network. Figure 1 shows the
block diagram of CAD.
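Step 3's texture descriptors can be illustrated concretely. Below is a minimal sketch assuming scikit-image's GLCM utilities; the single pixel distance, four angles, and random test patch are illustrative choices rather than a prescribed configuration.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

patch = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in ROI patch

# Gray-level co-occurrence matrix at distance 1 and four angles
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

# Common GLCM texture properties used as classification features
features = {prop: graycoprops(glcm, prop).ravel()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
```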

Fig. 1 General CAD block diagram

3 Medical Images

Due to increase in the technology, there are various ways to procure images of
human anatomy for the diagnosis of disease. The images modalities include ultra-
sound (US), magnetic resonance imaging (MRI), computed tomography (CT), X-ray,
positron emission tomography (PET). These images can assist in evaluating disease
in different situation. These medical images contain noises which are to be removed
for accurate diagnosis. The different noises affecting the various medical images are
discussed.

3.1 Ultrasound (US)

Ultrasound is a medical imaging method in which high-frequency sound waves are


used to provide a visual image of the inside of the body [3]. It is a non-invasive
technique; that is, there is no use of needles or syringes and, hence, is painless.
It is cost-effective as well [4]. Ultrasound imaging is used to examine the fetal
growth during pregnancy; it also helps inspect the abdomen, breast, heart, etc. for
any abnormality. Despite its advantages, US has drawbacks: it contains speckle noise, which makes the US image grainy [5] and affects the diagnosis process. Speckle noise was once thought to contain important information, but it was later found to be caused by backscattering, which degrades the quality of the image. It provides a smaller field of view [6]. Fetal ultrasound has been a major
topic of research as fetal image analysis is crucial as it is used to examine various
biological parameters such as head circumference (HC), abdominal circumference
(AC), femur in an unborn. Also, it helps to estimate gestational age, weight, size and
detect any possible abnormalities [7].

3.2 Computed Tomography (CT) Images

Computed tomography is also known as X-ray CT. CT images are used to provide information regarding the hard tissues of the human body [8]. It uses a combination of X-ray projections taken from various angles, which are processed by a computer to produce a tomographic image. CT is used in lesion and tumor detection. It is also a non-invasive technique and, hence, is painless [5]. CT imaging uses ionizing radiation, which accumulates in the body and has harmful effects when the body is exposed to it for a longer period of time. It is recommended to use lower-intensity radiation so that less harm is caused. The image obtained may not be clear but can be processed using various methods and used for diagnosis.

3.3 Magnetic Resonance Images (MRI)

MRI is used to capture images of those areas of the body that other imaging modalities cannot see appropriately. It makes use of an oscillating magnetic field and radio waves. These cause the hydrogen atoms present in the water molecules of the tissue to get excited and align in a straight line. Radio waves are sent to deflect the atoms. After the removal of the magnetic field, the hydrogen atoms release the absorbed electromagnetic energy in the form of radio waves [5]. MRI provides images of the brain, fetal movements, etc. It is a non-invasive technique and, hence, is painless. Magnetic resonance images are used for brain imaging for the detection of tumors. These images are non-radioactive and non-aggressive in nature [9]. These images contain Rician noise.

3.4 X-Rays

X-rays are a form of electromagnetic waves. These radiations can penetrate the human skin and provide images of the bones beneath it. The image is formed by the differential absorption of the X-ray beam. These radiations are harmful to the body, so exposure to them should be kept to a minimum. These images help in the detection of bone fractures, bone dislocations, etc. X-ray images are affected by Poisson noise [10].

4 Noises in Medical Images

Medical imaging devices like MRI, CT, and US generate a huge number of images, and these images may contain noise due to errors in the surroundings while capturing or transferring the images. The various noises present in medical images are speckle noise, Gaussian noise, salt and pepper noise, etc. These types
images are speckle noise, Gaussian noise, salt and pepper noises, etc. These types
of noises degrade the original quality of the picture and affect the diagnosis process,
so there is a need to preprocess these images before performing any diagnosis.

4.1 Speckle Noise

Speckle noise occurs in an image formed due to scattering. There are a number of elementary scatterers inside each resolution cell that reflect the incident wave. The backscattered waves undergo constructive and destructive interference randomly, which forms a granular pattern called "speckle" and reduces visual image quality. It occurs in US and MR images and is caused during image transfer or by other internal or external factors such as air gaps and the beam-forming process. Speckle noise results in random variations of the return signal, which raise the gray level in the images. Speckle noise can be removed using the Lee filter or the Kuan filter [11].

4.2 Gaussian Noise

It is a statistical noise that has a probability density function equal to the normal distribution. Gaussian noise is mostly found in CT scans. It is removed by smoothing the image pixels using filters like wavelets, but wavelets tend to destroy the texture of the image in order to get rid of the noise [12]. Gaussian white noise removal and deblurring are done by the Wiener filter. Wiener filtering finds the minimum mean square error between the estimated random process and the desired process, that is, the minimum difference between the original signal and the filtered signal [13].

4.3 Salt and Pepper Noises

Salt and pepper noise shows up as white (salt) and black (pepper) spots on images. This noise generally has bright pixels in the dark portions and dark pixels in the bright portions, appearing as white and black dots [12]. It results in a low-quality image. It is an impulsive noise and, hence, is removed using a median filter, which uses a nonlinear technique.

4.4 Poisson Noise

Poisson noise is present in X-ray and nuclear imaging like PET. This noise is caused by the random behavior of photons. It is a so-called quantum noise. The Poisson noise model assumes that each pixel in the image is drawn from a Poisson distribution [14]. The different noises affecting medical images have now been discussed. It is important to remove these noises in order to obtain a clear, noise-free image, which enables easy detection and diagnosis of disease by health care workers or systems like the CAD discussed above. The various image filtering techniques are therefore discussed; a brief denoising sketch follows.
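As a small illustration of the filter pairings named above, the sketch below applies a median filter to salt-and-pepper corruption and a Wiener filter to Gaussian corruption; the test image and noise levels are assumptions.

```python
import numpy as np
from scipy.signal import wiener
from scipy.ndimage import median_filter

img = np.random.rand(128, 128)  # stand-in for a clean medical image

# Salt-and-pepper corruption, then nonlinear median filtering
sp = img.copy()
coords = np.random.rand(*img.shape)
sp[coords < 0.02] = 0.0   # pepper: dark pixels
sp[coords > 0.98] = 1.0   # salt: bright pixels
sp_denoised = median_filter(sp, size=3)

# Additive Gaussian corruption, then Wiener (minimum-MSE) filtering
gauss = img + np.random.normal(0, 0.05, img.shape)
gauss_denoised = wiener(gauss, mysize=3)
```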

5 Machine Learning Algorithms

Machine learning algorithms are used for feature extraction and for classifying the images after preprocessing. The classification algorithms are discussed below.

5.1 Support Vector Machine (SVM)

Support vector machine is a supervised learning algorithm. It finds the hyperplane between two different classes in a high-dimensional feature space, which can be used for classification. The hyperplane is found using support vectors and margins. It is a method used for the classification of both linear and nonlinear data. There is no problem of over-fitting in SVM. SVM is applied in a number of areas like speaker identification and recognition of texts [15].

5.2 Artificial Neural Network (ANN)

ANN is a self-learning approach. The basic unit of this network is a neuron, which is based on the biological neural networks present in humans. The artificial neural network architecture consists of an input layer, n hidden layers, and an output layer. Pattern recognition is performed by learning from examples in ANN. It is used for sequence and pattern recognition, medical diagnosis, etc. ANN is used for both textual data and images, and it has high accuracy for text data classification [16–18]. These algorithms have limitations, as provided in Table 1, so hybrid algorithms are required for classification and image denoising; a brief comparison sketch is given below.
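A minimal sketch of the comparison described in the abstract is given below: SVM and an ANN (multilayer perceptron) trained on an 80/20 train/test split of a benchmark breast cancer dataset and scored with accuracy, precision, and recall. scikit-learn's Wisconsin breast cancer data and the hyperparameters are stand-ins for the benchmark dataset actually used.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# 80% training / 20% testing split, as described in the abstract
X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in [("SVM", SVC(kernel="rbf", C=1.0)),
                  ("ANN", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500))]:
    yp = clf.fit(Xtr, ytr).predict(Xte)
    print(name,
          accuracy_score(yte, yp),
          precision_score(yte, yp),
          recall_score(yte, yp))
```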
Table 1 Medical image and machine learning

Wang, S., Summers, R.M., "Machine learning and radiology" [21], Elsevier, 2012
  Application: Text analysis; computer-aided detection and diagnosis; brain function from fMRI; content-based image retrieval for MRI
  Dataset: Varies application-wise
  Parameters: Accuracy, cost, propagating skills
  Techniques: Cluster analysis, support vector machine, Naïve Bayes, artificial neural networks, linear models, ensemble learning, reinforcement learning
  Advantages: "Cost reduction, disseminating expertise, improves accuracy"
  Drawbacks: "The role of statistical machine learning approaches is not defined"

Moon, W.K., Lo, C.M., Cho, N., Chang, J.M., Huang, C.S., Chen, J., Chang, R.F., "Computer-aided diagnosis of breast masses using quantified BI-RADS findings" [22], Science Direct, 2013
  Application: CAD for the US of breast
  Dataset: 78 malignant and 166 benign, total 244
  Parameters: Accuracy, NPV, PPV, specificity, pAUC
  Techniques: Chi-square test, CAD with BI-RADS
  Advantages: "The partial AUC is higher (0.90 vs. 0.76, P < 0.05) than the conventional CAD system at sensitivity above 90%"
  Drawbacks: "No diversity in the images of malignant tumors used even before classification"

Shan, J., Alam, S.K., Garra, B., Zhang, Y., Ahmed, T., "Computer-aided diagnosis for breast ultrasound using computerized BI-RADS features and machine learning methods" [23], Science Direct, 2015
  Application: CAD for breast US
  Dataset: 150 malignant and 133 benign, total 283
  Parameters: Margin, echo pattern, posterior feature, orientation, shape
  Techniques: Support vector machine, random forest, Student's t-test, artificial neural network, decision tree
  Advantages: "SVM has the best ROC performance; the performance of the clustered classifier is better in tumor classification"
  Drawbacks: "Classifier hybridization was overlooked"

Ravishankar, H., Prabhu, V.M., Vaidya, V., Singhal, N., "Hybrid approach for automatic segmentation of fetal abdomen from ultrasound images using deep learning" [24], IEEE Conference, 2016
  Application: Fetal US image segmentation
  Dataset: 70 images
  Parameters: Support vector machine, histogram of oriented gradients, random forest, local binary pattern, and gray-level co-occurrence matrix
  Techniques: Convolutional neural network and gradient boosting machine
  Advantages: "The gestational age difference is 78% for GBM and 75% for CNN, and mean DSC for the hybrid approach is increased to 0.9, which means 5% improvement over GBM and 6% over CNN; HOG outperforms Haar by 4%"
  Drawbacks: "Assessment of parameters is not clarified correctly"

Geras, K.J., Wolfson, S., Shen, Y., Kim, S.G., Moy, L., Cho, K., "High-resolution breast cancer screening with multiview deep convolutional neural network" [25], IEEE, 2017
  Application: Screening for breast cancer
  Dataset: 886 thousand images
  Parameters: AUC, accuracy
  Techniques: Multiview deep convolutional neural network, BI-RADS
  Advantages: "Performance increases with an increase in data size"
  Drawbacks: "Limited dataset and computational resources"

Yap, M.H., Pons, G., Martí, J., Ganau, S., Sentís, M., Zwiggelaar, R., Davison, A.K., Martí, R., "Automated breast ultrasound lesions detection using convolutional neural networks" [26], IEEE Journal, 2018
  Application: Breast US lesion detection
  Dataset: Dataset A (60 malignant and 246 benign), total 306; Dataset B (53 malignant and 110 benign), total 163
  Parameters: True positive fraction, false positives/image, and F-measure
  Techniques: Patch-based LeNet, U-Net, transfer learning, and FCN-AlexNet
  Advantages: "FCN-AlexNet gives better performance, and deep learning algorithms can adapt to a specific dataset"
  Drawbacks: "Training process and negative images required"

Huang, Q., Zhang, F., Li, X., "Machine learning in ultrasound computer-aided diagnostic systems" [27], Hindawi, 2018
  Application: Breast lesion diagnosis, liver lesion, fetal ultrasound standard plane detection, thyroid nodule diagnosis
  Dataset: Varies application-wise
  Parameters: Accuracy, sensitivity, specificity
  Techniques: CNN, LSTM
  Advantages: "Feature extraction is done automatically, the scope of error is reduced, faster image processing"
  Drawbacks: "Huge differences in size and modality of dataset employed by different methods"

Liu, S., Wang, Y., Yang, X., Lei, B., Liu, L., Li, S.X., Ni, D., Wang, T., "Deep learning in medical ultrasound analysis" [28], Elsevier, 2019
  Application: Classification of tumors, diagnosis of thyroid nodules
  Dataset: Varies application-wise
  Parameters: Accuracy
  Techniques: CNN
  Advantages: "Improved performance"
  Drawbacks: "Limited public datasets are available, issues regarding transfer learning"

6 Literature Survey

Wasule and Sonar [9] proposed the GLCM technique for extracting texture features and the SVM and KNN algorithms for classification of brain MRI. The method shows 96% accuracy for SVM and 86% accuracy for KNN. SVM performs better than KNN, and SVM performance improves with a larger training set.
Kalyan et al. [19] used feature extraction methods for classification of abnormalities using ANN; based on the performance evaluation, it is difficult to decide between GLRLM and the combined features.
Kohil and Sudharson [20] proposed a pre-trained residual neural network (RLN) for despeckling ultrasound images before diagnosis using the computer-aided diagnostic (CAD) system. The proposed method gives better PSNR and SSIM at all noise levels. The performance of the RLN method is better in terms of the naturalness image quality evaluator (NIQE) compared to other methods.
Table 1 reviews medical imaging using ML and DL algorithms. It shows the advantages and disadvantages of using ML and DL algorithms for classification purposes. The table specifies the dataset being used and the change in parameters like accuracy, SNR, and sensitivity on applying the machine learning algorithms. After reviewing the effect of ML and DL algorithms, Table 2 further reviews denoising the medical images first, using different filtering methods like the Lee and median filters or hybrid filtering techniques, and then applying ML and DL algorithms for image classification. The table shows that denoising the images before classification improves the accuracy of diagnosis.

7 Gaps in Literature

Medical imaging is an important procedure performed for diagnosing various diseases like tumors and lesions. After reviewing the papers related to denoising of medical images using machine learning algorithms, we can infer that medical images are not clear; they contain different noises, so they are subjected to various image processing algorithms for better analysis and other conventional diagnostic activities like classification. Past studies have used traditional models as classifiers for medical image classification. The issue with these is that they are susceptible to outliers and speckle noise. This can be improved by denoising images and then using deep learning algorithms for classification.
• Recent trends show an increase in deep learning-based methods like CNN for image analysis. However, these algorithms require a large sample size and have a high computational cost.
• Accuracy of diagnosis increases for large training sets.
• Another drawback in the existing literature is that 2D images have been used in
most of the studies. It would not be wrong to emphasize that 3D images play
a very important role in the field of medical imaging, especially when used in
combination with deep learning. This can increase the accuracy of diagnosis and
other medical procedures.
• Hybridization of deep neural network algorithms shows promise: the DWAN model
shows improvement in SNR, and CNN DAE models require small training sets for
denoising.

Table 2 Medical image denoising using machine learning and deep learning

Author: Ali, S.A., Vathsal, S., Kishore, K.L.
Title: "An Efficient Denoising Technique for CT Images Using Window-Based Multi-Wavelet Transformation and Thresholding" [29]
Publication: European Journal of Scientific Research; Year: 2010
Attributes: PSNR, removal of white Gaussian noise
Dataset: CT images (256 × 256)
State-of-the-art methods: Window-based multi-wavelet transformation and thresholding
Benefits: "Removal of AWGN provides better clinical diagnosis"

Author: Ragesh, N.K., Anil, A.R., Rajesh, R.
Title: "Digital Image Denoising in Medical Ultrasound Images: A Survey" [30]
Publication: ResearchGate; Year: 2011
Attributes: MSE, PSNR, RMSE, SNR
Dataset: US images
State-of-the-art methods: Artificial neural network, fuzzy logic, genetic algorithm
Benefits: "Speckle noise reduction in US images"

Author: Malik, M., Ahsan, F., Mohsin, S.
Title: "Adaptive Image Denoising Using Cuckoo Algorithm" [31]
Publication: Springer; Year: 2014
Attributes: Peak signal-to-noise ratio, image quality index, VIF
Dataset: Lena, pirate, mandrill (512 × 512) images, 3 images in total
State-of-the-art methods: Cuckoo search algorithm
Benefits: "Cuckoo removes all kinds of noise; 5–28% increase in PSNR for various noises"

Author: Gondara, L.
Title: "Medical Image Denoising Using Convolutional Denoising Autoencoders" [32]
Publication: IEEE conference; Year: 2016
Attributes: SSIM
Dataset: Mini-MIAS for mammograms (322 images) and a dental radiography database, DX (400 images)
State-of-the-art methods: Convolutional autoencoders, CNN DAE, NL
Benefits: "Small training set sufficient for denoising; CNN DAE outperforms NL"

Author: Hepsag, P.U., Ozel, S.A., Yazıcı, A.
Title: "Using Deep Learning for Mammography Classification" [33]
Publication: IEEE conference; Year: 2017
Attributes: Accuracy, precision, recall, and F-measure
Dataset: Mini-MIAS and BCDR
State-of-the-art methods: Convolutional neural network
Benefits: "Increased classification accuracy from 65 to 85% for BCDR"

Author: Diwakar, M., Singh, P.
Title: "CT Image Denoising Using Multivariate Model and Its Method Noise Thresholding in Non-subsampled Shearlet Domain" [8]
Publication: Elsevier; Year: 2020
Attributes: PSNR, SSIM, ED, DIV
Dataset: 87 CT images (512 × 512)
State-of-the-art methods: Non-subsampled shearlet transform (NSST)
Benefits: "Increased noise reduction in CT images than other approaches"

Author: Gnanaselvi, J.A., Kalavathy, G.M.
Title: "Detecting Disorders in Retinal Images Using Machine Learning Techniques" [34]
Publication: Springer; Year: 2020
Attributes: Accuracy, sensitivity, specificity
Dataset: Retinal fundus images, STARE dataset
State-of-the-art methods: CNN, modified expectation maximization (MEM), PCA, curvelet transform-based normalized graph cut, KNN
Benefits: "CNN achieves 97% accuracy"

Author: Meena, B., Bhavana, D., Avinash, K.M.M., Anuhya, P., Teja, M.S., Kumar, K.B.
Title: "Brain Tumor Detection for MR Images Using Machine Learning Algorithm" [35]
Publication: Journal of Information and Computational Science; Year: 2020
Attributes: Accuracy
Dataset: Brain MR images
State-of-the-art methods: Gray-level co-occurrence matrix (GLCM) for feature extraction, AdaBoost classification
Benefits: "Easy detection of brain tumor location due to increased image quality; accuracy achieved is 89.90%"

Author: Zhou, L., Schaefferkoetter, J.D., Tham, I.W.K., Huang, G., Yan, J.
Title: "Supervised Learning with CycleGAN for Low-Dose FDG PET Image Denoising" [36]
Publication: Elsevier; Year: 2020
Attributes: PSNR, SSIM, normalized root-mean-square error (NRMSE)
Dataset: PET images of 18 patients
State-of-the-art methods: CycleWGANs compared with RED-CNN, 3D-cGAN, NLM, and BM3D
Benefits: "RED-CNN outperforms other approaches"

Author: Xie, D., Li, Y., Yang, H., Bai, L., Wang, T., Zhou, F., Zhang, L.
Title: "Denoising Arterial Spin Labeling Perfusion MRI with Deep Learning" [37]
Publication: Elsevier; Year: 2020
Attributes: PSNR, SSIM, MAE, and CCC
Dataset: 110 females and 170 males, 280 images in total
State-of-the-art methods: CNN (DL for ASL), DWAN model
Benefits: "Improvement in SNR, reduction in acquisition time while maintaining the CBF quality"

Author: Basha, C.Z., Likhitha, A., Alekhya, P., Aparna, V.
Title: "Computerised Classification of MRI Images Using Machine Learning Algorithms" [38]
Publication: IEEE; Year: 2020
Attributes: Accuracy, sensitivity, specificity
Dataset: 300 images from Apollo org, Hyderabad
State-of-the-art methods: Probabilistic neural network, SVM, KNN, and fuzzy C-means
Benefits: "Quick identification of MRI images"

Author: Rehman, Z.U., Zia, M.S., Bojja, G.R., Yaqub, M., Jinchao, F., Arshid, K.
Title: "Texture Based Localization of a Brain Tumor from MR-Images by Using a Machine Learning Approach" [39]
Publication: Elsevier; Year: 2020
Attributes: Sensitivity, specificity, accuracy, precision
Dataset: BraTS 2012 dataset (multi-modal brain images)
State-of-the-art methods: Random forest, AdaBoost, SVM
Benefits: "Reduction in the computational cost of image segmentation, improvement in FP using denoising"

8 Case Study

Support vector machine and artificial neural network state-of-the-art techniques are
used for breast cancer prediction. Figure 2 shows the proposed workflow using
ANN, which is compared with SVM in terms of accuracy.

Fig. 2 Proposed flowchart of breast cancer prediction using ANN

8.1 Dataset Description

The dataset used is a benchmark dataset taken from Kaggle (https://www.kaggle.com/uciml/breast-cancerwisconsin-data). It contains 569 cases, and 33 features
are provided. The cases are either benign or malignant: out of the 569 cases, 357
are benign and 212 are malignant. The attributes in the dataset used
for the classification process are: "id," "diagnosis," "radius_mean," "texture_mean,"
"perimeter_mean," "area_mean," "smoothness_mean," "compactness_mean," "con-
cavity_mean," "concave points_mean," "symmetry_mean," "fractal_dimension_mean,"
"radius_se," "texture_se," "perimeter_se," "area_se," "smoothness_se,"
"compactness_se," "concavity_se," "concave points_se," "symmetry_se,"
"fractal_dimension_se," "radius_worst," "texture_worst," "perimeter_worst,"
"area_worst," "smoothness_worst," "compactness_worst," "concavity_worst,"
"concave points_worst," "symmetry_worst," "fractal_dimension_worst."
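For illustration, the dataset can be loaded and inspected with a few lines of Python. This is a minimal sketch rather than the authors' code, and the local filename "data.csv" is an assumption:

    import pandas as pd

    # Minimal sketch (not the authors' code); "data.csv" is an assumed local
    # copy of the Kaggle Wisconsin breast cancer dataset cited above.
    df = pd.read_csv("data.csv")
    print(df.shape)                        # expected (569, 33)
    print(df["diagnosis"].value_counts())  # expected B: 357, M: 212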

8.2 Data Preprocessing

In the data preprocessing phase, the data is checked for any null values and unnamed
columns. Any null values found are either replaced or removed from the dataset. The
attributes that are not required for processing, i.e., the features that do not affect
the disease prediction, are dropped. In the diagnosis column, B is replaced with 0
and M with 1 for processing purposes.
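A minimal pandas sketch of these preprocessing steps follows; the exact columns dropped ("id" and the empty "Unnamed: 32" column commonly present in this Kaggle export) are assumptions for illustration:

    import pandas as pd

    df = pd.read_csv("data.csv")  # assumed local copy of the Kaggle dataset
    # Drop attributes that do not affect the disease prediction.
    df = df.drop(columns=["id", "Unnamed: 32"], errors="ignore")
    df = df.dropna()  # remove any remaining null values
    # Replace B with 0 and M with 1 in the diagnosis column.
    df["diagnosis"] = df["diagnosis"].map({"B": 0, "M": 1})
    X = df.drop(columns=["diagnosis"])  # input features
    y = df["diagnosis"]                 # target labels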

8.3 Training and Testing Phase—SVM

While training the SVM, the data is divided into training and testing sets; a 70–30
split has been taken for processing purposes. A linear kernel is used, and the random
state is defined as 0. ANN: While training the ANN, the data is divided into training
and testing sets; 80% of the data is used for training (tracking the epoch-wise loss),
and the remaining 20% of the data is used for the testing purpose. Thirty-one features
have been taken for prediction. There are two hidden layers used in the processing.
Both hidden layers use ReLU as the activation function, and the output layer uses
the sigmoid activation function. Figure 3 shows the ANN diagram.
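The two classifiers described above can be sketched as follows. This is a hedged sketch, not the authors' exact code: the splits, kernel, and activations follow the text, while the hidden-layer sizes, epoch count, and batch size are assumptions, and X and y are taken from the preprocessing sketch:

    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from keras.models import Sequential
    from keras.layers import Dense

    # SVM: 70-30 split, linear kernel, random state 0 (as described above).
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    svm = SVC(kernel="linear", random_state=0).fit(X_tr, y_tr)
    print("SVM accuracy:", svm.score(X_te, y_te))

    # ANN: 80-20 split, two ReLU hidden layers, one sigmoid output neuron.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    ann = Sequential([
        Dense(16, input_dim=X.shape[1], activation="relu"),  # layer sizes assumed
        Dense(8, activation="relu"),
        Dense(1, activation="sigmoid"),
    ])
    ann.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    ann.fit(X_tr, y_tr, epochs=100, batch_size=10, verbose=0)  # settings assumed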

Fig. 3 ANN

8.4 Experimental Setup and Implementation

An Intel Core i7 with 16 GB RAM is used for processing the data. The system
predicts breast cancer as benign or malignant using the ANN (Fig. 3) and
SVM algorithms. Google Colaboratory is used for programming in Python. Google
Colaboratory allows us to write Python programs with no configuration required on
the computer and allows easy sharing of code. It is popular among research scholars
and students for running machine learning algorithms.

9 Results and Discussion

The performance of the SVM and ANN is determined in terms of accuracy. The
accuracy is the ratio of the number of samples classified correctly to the total number of
samples.

Acc = (TP + TN)/(TP + TN + FP + FN) (1)

For diagnostic test estimation, the ROC curve is used; it relates the true positive
rate (the sensitivity) to the false positive rate (1 − specificity). The area under
the curve (AUC) is 0.994 for ANN and 0.987 for SVM. For ANN, the training
dataset gives an accuracy of 95.82% with a loss of 8.21%, whereas for the testing
set of ANN, the accuracy is 94.73% and the loss 15.01%. The precision and recall
scores are 0.939 and 0.952. The accuracy for the SVM algorithm is 95.98%; its
precision and recall scores are 0.872 and 0.976, respectively. The performance of
the respective algorithms is shown in Table 3, along with the confusion matrices and
ROC curves in Figs. 4 and 5. SVM performs better than ANN for breast cancer
prediction; the accuracy in SVM is achieved with a kernel-based transform.

Table 3 Performance of SVM and ANN

Model   Accuracy (%)   Loss (%)   Precision   Recall
ANN     93.85           9.17      0.939       0.952
SVM     95.98          14.22      0.872       0.976
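The reported metrics can be reproduced with scikit-learn. A minimal sketch (not the authors' code), assuming y_te holds the true test labels, y_pred the predicted labels, and y_score the predicted probabilities of the positive class:

    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 roc_auc_score, confusion_matrix)

    print("Accuracy :", accuracy_score(y_te, y_pred))   # (TP+TN)/(TP+TN+FP+FN), Eq. (1)
    print("Precision:", precision_score(y_te, y_pred))  # TP/(TP+FP)
    print("Recall   :", recall_score(y_te, y_pred))     # TP/(TP+FN), the sensitivity
    print("AUC      :", roc_auc_score(y_te, y_score))   # area under the ROC curve
    print(confusion_matrix(y_te, y_pred))               # heat-mapped in Figs. 4 and 5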

Fig. 4 a Heat map of confusion matrix for ANN and b ROC

Fig. 5 a Heat map of confusion matrix for SVM and b ROC SVM

10 Conclusion

The machine learning and deep learning methods, along with different classification
algorithms, have been reviewed thoroughly for medical images. The training and testing
samples of the benchmark UCI breast cancer dataset are classified using the ANN and
SVM methods. The evaluation result of SVM in terms of accuracy is 95.98%, and its
recall rate is also improved as compared to ANN. The qualitative prediction using the
ROC curve gives an AUC of 0.987 for SVM. A hybrid approach using transfer learning
and RNN may prove better as a future scope of this prediction approach.

References

1. R.J. Ramteke, K. Monali, Automatic medical image classification and abnormality detection
using K-nearest neighbour. Int. J. Adv. Comp. Res. 2(6), 190–196 (2012)
2. S. Liu, Y. Wang, X. Yang, B. Lei, L. Liu, S.X. Li, D. Ni, T. Wang, Deep learning in medical
ultrasound analysis—a review, pp. 261–275 (2018)
3. T. Wang, Y. Lei, Y. Fu, W.J Curran, T. Liu, X. Yang, Machine Learning in Quantitative PET
Imaging (2020)
4. C. Bowles, B. Kainz, Machine learning for the automatic extraction and classification of foetal
features in-utero (2014)
5. S.V.M.M. Sagheer, S.N. George, A review on medical image denoising algorithms. Biomed.
Signal Process. Control, 1746–8094 (2020)
6. S. Kollem, K.R.L. Reddy, D.S. Rao, A review of image denoising and segmentation methods
based on medical images. Int. J. ML Comp., 288–295 (2019)
7. P. Kaur, G. Singh, P. Kaur, An intelligent validation system for diagnostic and prognosis of
ultrasound fetal growth analysis using Neuro-Fuzzy based on genetic algorithm. Egypt. Info.
J., 1110–8665 (2018)
8. M. Diwakar, P. Singh, CT image denoising using multivariate model and its method noise
thresholding in non-subsampled shearlet domain. Biomed Signal Process. Control (2020)
9. V. Wasule, P. Sonar, Classification of brain MRI using SVM and KNN classifier, in 3rd IEEE
International Conference on Sensing, Signal Processing and Security (2017)
10. D.N.H. Thanh, V.B.S. Prasath, L.M. Hieu, A review on CT and X-Ray images denoising
methods. Informatica 43, 151–159 (2019)
11. C.S. Bedi, H. Goyal, Qualitative and quantitative evaluation of image denoising techniques.
Int. J. Comp. App. 8(14), 31–34 (2010)
12. N. Kumar, M. Nachamai, Noise removal and filtering techniques used in medical images.
Oriental J. Comp. Sci. Tech. 10(1), 103–113 (2017)
13. M. Chowdhury, J. Gao, R. Islam, Fuzzy Logic based filtering for image de-noising, in IEEE
Conference on Fuzzy Systems, pp. 2372–2376 (2016)
14. P. Subbuthai, K. Kavithabharathi, S. Muruganand, Reduction of types of noises in dental images.
Int. J. App. Tech. Res., 436–442 (2013)
15. J. Han, M. Kamber, J. Pie, Datamining concepts and techniques, 3rd edn. (Elsevier, 2016)
16. B.F. Erickson, P. Korfiatis, Z. Akkus, T.L. Kline, Machine learning for medical imaging.
Radiographics RSNA, 505–515 (2017)
17. M.I. Razzak, S. Naz, A. Zaib, Deep learning for medical image processing: overview, challenges
and future
18. D. Levy, A. Jain, Breast mass classification from mammograms using deep Convolutional
Neural Networks (2016)

19. K. Kalyan, B. Jakhia, R.D. Lele, M. Joshi, A. Chowdhary, Artificial neural network application
in the diagnosis of disease conditions with liver ultrasound images. Adv. Bioinf. (2014)
20. P. Kokil, S. Sudharson, Despeckling of clinical ultrasound images using deep residual learning.
Comp. Methods Prog. Biomed. 194 (2020)
21. S. Wang, R.M. Summers, Machine learning and radiology. J. Med. Image Anal. 16, 933–951
(2012)
22. W.K. Moon, C.M. Lo, N. Cho, J.M. Chang, C.S. Huang, J.H. Chen, R.F. Chang, Computer-
aided diagnosis of breast masses using quantified BI-RADS findings. Comput. Meth. Prog.
Biomed. 111(1), 84–92 (2013)
23. J. Shan, S.K. Alam, B. Garra, Y. Zhang, T. Ahmed, Computer-aided diagnosis for breast
ultrasound using computerized BI-RADS features and machine learning methods. Ultrasound
Med. Bio. 42(4), 980–988 (2015)
24. H. Ravishankar, S. Prabhu, V. Vadiya, N. Singhal, Hybrid approach for automatic segmentation
of fetal abdomen from ultrasound images using deep learning, in IEEE Conference, pp. 779–782
(2016)
25. K.J. Geras, S. Wolfson, Y. Shen, N. Wu, S.G. Kim, E. Kim, L. Heacock, U. Parikh, L. Moy,
K. Cho, High resolution breast cancer screening with multi-view deep convolutional neural
networks, vol. 3 (2018)
26. M.H. Yap, G. Pons, J. Marti, S. Ganau, M. Sentis, R. Zwiggelaar, A.K. Davison, R. Marti,
Automated breast ultrasound lesions detection using convolutional neural networks. J. Biomed.
Health Inf. 22(4) (2018)
27. L.J. Brattain, B.A. Telfer, M. Dhyani, J.R. Grajo, A.E. Samir, Machine learning for medical
ultrasound: status, methods, and future opportunities. Abdomen Radio 43(4), 786–799 (2018)
28. Q. Huang, F. Zhang, X. Li, Machine learning in ultrasound computer-aided diagnostic systems.
Biomed. Res. Int. (2018)
29. S.A. Ali, S. Vathsal, K.L. Kishore, An efficient denoising technique for CT images using
window based multiwavelet transformation and thresholding. Eur. J. Sci Res. 48(2), 315–325
(2010)
30. N.K. Ragesh, A.R. Anil, R. Rajesh, Digital image denoising in medical ultrasound images: a
survey, in ICGST AIML-11 Conference, Dubai, UAE, pp. 67–73 (2011)
31. M. Malik, F. Ahsan, S. Mohsin, Adaptive image denoising using cuckoo algorithm. Soft.
Comput. 20(3), 925–938 (2014)
32. L. Gondara, Medical image denoising using convolutional denoising autoencoders, in IEEE
16th International Conference on Data Mining Workshops, pp. 241–246 (2017)
33. P.U. Hepsag, S.A. Ozel, A. Yazıcı, Using deep learning for mammography classification, in
2nd International Conference on Computer Science and Engineering; UBMK 2017, pp. 418–
423 (2017)
34. A. Gnanaselvi, G.M. Kalavathy, Detecting disorders in retinal images using machine learning
techniques. J. Ambient Intell. Human Comput. (2020)
35. B. Meena, D. Bhavana, K.M.M. Avinash, P. Anuhya, M.S. Teja, K.B. Kumar, Brain Tumor
detection for MR Images using machine learning algorithm. J. Inf. Comput. Sci. 10(7)
36. L. Zhou, J.D. Schaefferkoetter, I.W.K. Tham, G. Huang, J. Yan, Supervised learning with
CycleGAN for low-dose FDG PET image denoising. Med. Image Anal. 65
37. D. Xie, Y. Li, H. Yang, L. Bai, T. Wang, F. Zhou, L. Zhang, Denoising arterial spin labeling
perfusion MRI with deep learning
38. C.Z. Basha, A. Likhitha, P. Alekhya, V. Aparna, Computerised classification of MRI
images using machine learning algorithms, in Conference on Electronics and Sustainable
Communication System (2020)
39. Z.U. Rehman, Z.U. Rehman, M.S. Zia, G.R. Bojja, M. Yaqub, F. Jinchao, K. Arshid, Texture
based localization of a brain tumor from MR-images by using a machine learning approach.
Medical Hypothesis (2020)
Heart Disease Prediction Using Deep
Learning Algorithm

Gouthami Velakanti, Shivani Jarathi, Malladi Harshini, Praveen Ankam, and Shankar Vuppu

Computer Science and Engineering, Kakatiya Institute of Technology and Science Warangal, Warangal, India

Abstract Heart diseases, also known as cardiovascular diseases (CVDs), have been
the major cause of death over the last few decades and have risen to become the
most dangerous disease not only in India but throughout the world.
world. Heart disease may refer to several conditions that affect your heart. Given
the number of variables in your body that can possibly contribute to heart disease,
it is one of the most difficult diseases to foresee. Detecting and predicting it is a
daunting job for physicians and researchers alike. As a result, a reliable, effective,
and practical way to diagnose such life-threatening diseases and provide proper
medication is needed. In this project, we will try to solve this problem using different algorithms
with the Cleveland dataset. Our project will be helpful and will make an easy way
for predicting the occurrence of heart disease.

Keywords Deep learning · Heart disease · Recurrent neural network

1 Introduction

This section covers how we can predict heart disease using a deep learning model
and presents the test results obtained on a heart disease diagnosis dataset by implementing
a deep learning algorithm. An important characteristic of deep learning is that it can
automatically extract features, making learning simple and easy. Both unsupervised
and supervised learning problems can be solved with the help of deep learning
algorithms. Alongside deep learning, many techniques such as random forests, logistic
regression, and SVM with hyperparameter tuning and feature selection have been used
to predict heart disease.
Deep learning is an effective parametric classifier under supervised settings for
predicting heart disease. A deep learning neural network model is an extensive multi-
layer perceptron, meaning a greater number of hidden layers with linear and
nonlinear transfer functions, regularization, and an activation function for binary
classification such as sigmoid. In this training process, all the output patterns are checked
with the target variables to determine errors, then the error will be adjusted by the
algorithm. The training process will go on until the mistakes are minimized, and the
whole epochs are utilized.
After the training of the model is done, then, the weights are fed to the considered
model for prediction. The model can then predict and diagnose heart disease for the
list of new patients during their tests.
The other problem the deep learning model faces is over-fitting, which is a
common problem in the field of deep learning. The deep neural network classifi-
cation model is better in terms of training dataset compared to the testing dataset. To
solve the over-fitting issue, the model uses the regularization algorithm to decrease
the complexity of the model while keeping the same number of parameters.
The dropout layer is an effective regularization technique that can be used to prevent
over-fitting in the deep learning model. It is applied in each training iteration of every
epoch, randomly eliminating neural network units (nodes) and their connections in
the model; by using the dropout layer, we can thus prevent over-fitting in the deep
neural network model.
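For illustration, a dropout layer is inserted between hidden layers in Keras as follows. This generic sketch is not the exact model used later in the paper, and the dropout rate of 0.2 is an assumption:

    from keras.models import Sequential
    from keras.layers import Dense, Dropout

    model = Sequential([
        Dense(16, input_dim=13, activation="relu"),
        Dropout(0.2),  # randomly disables 20% of the units each update (rate assumed)
        Dense(8, activation="relu"),
        Dense(1, activation="sigmoid"),
    ])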

2 Literature Survey

In [1], the authors used support vector machine (SVM), logistic regression, naive
Bayes algorithm to predict the heart disease in percentages. This system will have a
Web site where the users can register themselves to get the report of the risk of the
heart disease in terms of predictive analysis. The system’s database has users table,
medical history table. The authors used 75% entries in the dataset for training and
25% entries for testing the accuracy of the algorithm.
In [2], using ID3, naive Bayes, K-means algorithms, the authors predicted the heart
disease. From the experiment, they got more accuracy for naive Bayes algorithm if
the input data are cleaned and well-maintained; even though ID3 can clean itself, it
cannot give accurate results every time. They treated the naive Bayes variables as
independent and suggested that a combination of algorithms such as naive Bayes and
K-means can be used to improve accuracy.
In [3], different data mining as well as machine learning techniques is used
which include artificial neural networks (ANN), decision tree, fuzzy logic, K-nearest
neighbor (KNN), naive Bayes, and support vector machine (SVM). This paper just
gives awareness and overview of the existing work. They introduced 26 different
papers published on heart disease prediction. The authors have also stated that many
feasible enhancements could be explored to improve the accuracy of prediction
system.
In [4], heart disease dataset was collected from UCI repository. The authors
mentioned different types of heart-related diseases that we are prone to. Different
algorithms were used for classification which include logistic regression, K-nearest
neighbor (KNN), support vector machine (SVM), naive Bayes, hyperparameter opti-
mization (Talos), and random forest, and accuracy was compared. Talos got the
highest accuracy of 90.78%. It also stated that using deep learning models increases
accuracy of prediction.
In [5], the authors used ANN model. First, the authors divided the dataset into
training and test data. Next, they have created the model with 13 nodes and 4 hidden
layers. They did training for 100 epochs with batch size 10 and got the accuracy as
85%.
In [6], the authors used dataset from UCI machine learning repository. They used
four algorithms, that is, naïve Bayes’ classifier, decision tree, K-nearest neighbor,
random forest algorithm to create a model with the maximum possible accuracy.
The authors have compared all the four algorithms and got more accuracy for KNN
algorithm.
In [7], the authors used background methods such as logistic regression, naïve
Bayes, SVM, KNN for comparison and produced accuracy for all the methods.
In [8], the authors used proposed classifiers optimized by FCBF, PSO, and ACO
against other classification models used for health diseases classification and define
the most efficient one. They found the best one as the KNN algorithm.
In [9], the authors used dataset from Kaggle Web site and used logistic regression,
Gaussian naïve Bayes, KNN, artificial neural network. They worked with a stack of
11 ML algorithms, and their results were compared with their proposed model.

3 Proposed System

3.1 Objectives

• To build the project which makes easy, simple, and time-saving method for people
to predict the occurrence of heart-related diseases just by providing a set of features
or the details of the patient.
• To avoid human involvement.
• To show efficiency of the models and suggest the best one.

The implementation of the project is done in Python, using two different models: a
categorical model and a binary model.

4 Methodology

The recurrent neural network (RNN) is an algorithm inspired by the structure
of the human brain. These neural networks take the data, train themselves, and
then predict the output for a new set of data. A recurrent neural network is made up
of neurons arranged in layers, which are the main working units of the network.
Firstly, the input layer is present, which takes the input; with the given input, the
output layer will predict the final output, and in between, the hidden layers are
present which perform the important computation of the algorithm. These are the
networks where the output from preceding step is fed as the input to the present
step such that when the network wants to find the following word of a sentence, the
preceding words are needed, and therefore, it is required to remember the earlier
words. So, recurrent neural network has come into occurrence, which cleared up this
problem with the help of the hidden layers which are present in between the input
and output layers. So, the most dominant feature of recurrent neural network is the
hidden layers because it memorizes information about the data. In Fig. 1, starting
with the input layer, the number of hidden layers used are two and an output layer
as well. In the hidden layer one and two, the no. of neurons mapped is 16 and 8,
respectively.
We take the attributes such as age, sex (male or female), chest pain, resting
blood pressure, cholesterol, fasting blood sugar, electrocardiographic measurement,
maximum heart rate, exercise induced angina, ST depression induced by exercise
relative to rest, peak exercise ST segment, major vessels, thalassemia and are fed as
inputs in the input layer. Through channels, neurons from one layer are connected
to the neurons in the following layer. Each channel is assigned a numerical value
known as a weight. Next, the inputs in the input layer are multiplied with the
corresponding weights, and the resulting sum is sent as input to the neurons present
in the hidden layer.
All the neurons in the next layer are bound to a value called bias, which is
added to the preceding weighted input sum. This final value is then passed
through an activation function, whose result decides whether the neuron gets
activated and passes its output on to the next layer. The activated neurons transfer
the data to the neurons of the next layer through channels. In this manner, data are
propagated through the network, which is called forward propagation.

Fig. 1 Structure of RNN
In the output layer, the neuron with the highest probability value determines the
predicted output of the neural network. The values are basically probability. For
example, when we give all the inputs of a single person which are known to us and
get a wrong value as the prediction, it means we must train the model more by
adjusting the weights. The final forecasted output is then compared against the real
output to determine the error in the prediction. To know how much the error will
change if we adjust a weight, we calculate the derivative (the slope of each tiny step)
of every step from the output back to that weight and multiply them all together,
as depicted in Fig. 2. Based on these calculations, the weights are adjusted. This is
called back propagation.

Fig. 2 Back propagation
This cycle of forward and backward propagation is continuously repeated with
multiple inputs. This is continued until the proper weights are allocated such that the
network can predict the output correctly in most of the cases.
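For illustration, the forward and backward passes described above can be sketched in a few lines of NumPy. This is a generic single-example sketch with made-up weights and learning rate, not the implementation used in this project:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.5, 1.0])                                  # toy inputs (assumed)
    W1, b1 = np.array([[0.1, 0.4], [0.2, 0.3]]), np.zeros(2)  # input -> hidden
    W2, b2 = np.array([0.3, 0.6]), 0.0                        # hidden -> output
    t, lr = 1.0, 0.5                                          # target label, learning rate

    for _ in range(100):
        h = sigmoid(W1 @ x + b1)                  # forward propagation (hidden layer)
        y = sigmoid(W2 @ h + b2)                  # forward propagation (output layer)
        d_y = (y - t) * y * (1 - y)               # back propagation: output slope
        d_h = d_y * W2 * h * (1 - h)              # chain rule: multiply local derivatives
        W2 -= lr * d_y * h; b2 -= lr * d_y        # adjust weights from the error
        W1 -= lr * np.outer(d_h, x); b1 -= lr * d_h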

Fig. 3 Procedural flowchart



5 Implementation

• In Fig. 3, the first part of the project deals with the collection of dataset. The
dataset was collected from Cleveland patients (from Kaggle Web site) [10] using
read_csv method in Pandas’ library.
• Data preprocessing and data cleaning are done to remove null values present in
the data. Data cleaning is a process of checking for null, missing, or not-a-number
values and updating or dropping them according to our requirements; any missing
values found in the dataset are handled accordingly. The last step of cleaning
divides the dataset into input and output variables.
• To understand the dataset clearly, data visualization must be done. Data visual-
ization is a method which gives a clear idea or an overview of the data using
pictorial representations. For the visualization part, libraries like matplotlib and
Seaborn are required. In this project, histogram, heat map, and cross-tab plots are
generated.
– Histogram: It plots a graph for each of the variables present in the dataset.
Since the “heart” dataset collected from Cleveland city patients (from Kaggle
Web site) [10] contains 14 variables, a total of 14 histograms are plotted. Each
graph reveals some information regarding the variable.
– Heat Map: It is a 2D graphical representation of data where individual
values contained in matrix are represented as colors. The intensity reveals
the correlation between two values.
– Cross-tab: It is just like a bar graph which computes relation between two or
more factors of the data against frequency.
• Next part deals with importing certain required modules and packages, named:
Sys (which is a Python module), Pandas (it provides different data structures and
operations to import and analyze data easily), NumPy (for dealing with arrays),
Sklearn (statistical modeling of data), Keras (it is a deep learning model used for
neural network-related projects), matplotlib, pyplot, scatterplot_matrix, Seaborn
(all these help in data visualization by providing an OO API or a grid or an
interface or functions for plotting).
• Classification of dataset to training as well as testing data.
• Execution of RNN Algorithm: It begins with importing some required packages:
sequential (from keras.models), dense and dropout (from keras.layers), Adam
(from keras.optimizers), and the regularizers module from Keras. We have imple-
mented it using two different models: a categorical model and a binary model. In
both models, the input dimension has 13 neurons and there are two hidden layers:
the 13 input neurons are connected to 16 neurons in the first hidden layer, which
in turn are connected to 8 neurons in the succeeding hidden layer. Both hidden
layers use kernel_initializer "normal," the kernel regularizer l2, and the "ReLU"
activation function in both models. The difference comes in the output layer, as
explained below.

Fig. 4 Structure of categorical model

In the categorical model depicted in Fig. 4, the output variables are converted
to categorical labels. The output layer has two neurons applied with activation
function: “softmax.”
In the binary model depicted in Fig. 5, the output variables are the integers 0
and 1, representing "no heart disease" and "heart disease," respectively. In this
model, the output layer consists of only one neuron applied with the activation
function "sigmoid."

Fig. 5 Structure of binary model
• The final procedure is the compilation of both models and fitting them, as
well as checking the model loss and accuracy and displaying the results. The
compilation is done using the Adam optimizer, and graphs are plotted for the
accuracy metric; a minimal sketch of both models is given below.
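The following sketch builds the two models; the layer sizes, initializer, regularizer, and activations are taken from the description above, while the l2 factor and optimizer settings are assumptions:

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.regularizers import l2
    from keras.optimizers import Adam

    def build_model(binary=True):
        model = Sequential()
        # 13 input features -> 16 -> 8 neurons, ReLU, normal init, l2 regularizer
        model.add(Dense(16, input_dim=13, activation="relu",
                        kernel_initializer="normal", kernel_regularizer=l2(0.001)))
        model.add(Dense(8, activation="relu",
                        kernel_initializer="normal", kernel_regularizer=l2(0.001)))
        if binary:   # binary model: one sigmoid output neuron
            model.add(Dense(1, activation="sigmoid"))
            loss = "binary_crossentropy"
        else:        # categorical model: two softmax output neurons
            model.add(Dense(2, activation="softmax"))
            loss = "categorical_crossentropy"
        model.compile(optimizer=Adam(), loss=loss, metrics=["accuracy"])
        return model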

6 Experimental Analysis

• In Figs. 6 and 7, if test size = 0.1, we get the accuracy as 74% approximately for
both the models.
• In Figs. 8 and 9, if test size = 0.3, then we get the accuracy as 79% and 76%
approximately for categorical and binary models, respectively.
• In Figs. 10 and 11, if test size = 0.4, then we get the accuracy as 80% and 82%
approximately for categorical and binary models, respectively.
• In Figs. 12 and 13, if we add a hidden layer with 12 neurons, then we get the
accuracy as 82% approximately for both the models.

However, in this observation, when we add an extra hidden layer, the processing time
is higher and the output is generated more slowly.

Fig. 6 Result 1 of categorical model



• If we use l1 regularizer for the hidden layers of neurons 17, 12, 9, 7, 2 and 16, 12,
9, 7, 1 for categorical and binary models, respectively, then we get the results as
around 54% with “undefined metric warning” error.
• In Figs. 14 and 15, if we use l1 regularizer for hidden layers of neurons 17, 9, 2
and 16, 8, 1 for categorical and binary models, respectively, then we get the same
accuracy of 82% approximately.
Table: Accuracy (%) for categorical and binary models for different parameters is as
follows:

Experiment parameter             Categorical model   Binary model
Test size = 0.1                  74                  74
Test size = 0.3                  79                  75
Test size = 0.4                  80                  81
Hidden layers = 16, 12, 8, 2     81                  81
Regularizer = l1                 81                  81
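The test-size experiments above can be reproduced with a short loop, reusing the build_model sketch from Sect. 5. This is a hedged sketch assuming X and y hold the Cleveland features and labels, with the training settings assumed as before:

    from sklearn.model_selection import train_test_split

    for ts in (0.1, 0.3, 0.4):  # the three test sizes reported above
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=ts, random_state=0)
        model = build_model(binary=True)
        model.fit(X_tr, y_tr, epochs=100, batch_size=10, verbose=0)  # settings assumed
        _, acc = model.evaluate(X_te, y_te, verbose=0)
        print(f"test size {ts}: accuracy {acc:.2%}")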

Fig. 7 Result 1 of binary model

Fig. 8 Result 2 of categorical model



Fig. 9 Result 2 of binary model

Fig. 10 Result 3 of categorical model

Fig. 11 Result 3 of binary model

Fig. 12 Result 4 of categorical model



Fig. 13 Result 4 of binary model

7 Results

Results have been visualized using the model accuracy and model loss graphs. The
model accuracy graph plots the accuracy against the number of iterations (epoch) as
depicted in Figs. 16 and 19, and the model loss graph plots the loss of our model
against the number of iterations (epoch) as shown in Figs. 17 and 20. The performance
of the model can also be assessed using metrics such as precision, recall rate,
F1-score, and support, as depicted in Figs. 18 and 21.

Fig. 14 Result 5 of categorical model

Fig. 15 Result 5 of binary model



Fig. 16 Graph for model accuracy of training and testing data as no. of iterations increase, categorical model

Fig. 17 Graph for model loss of training and testing data as no. of iterations increase, categorical model

Fig. 18 Final accuracy measure result of categorical model

Fig. 19 Graph for model accuracy of training and testing data as no. of iterations increase, binary model

Fig. 20 Graph for model loss of training and testing data as no. of iterations increase, binary model

Fig. 21 Final accuracy measure result of binary model

8 Conclusion and Future Scope

From the above project, we know that the categorical model has higher accuracy, i.e.,
85%. This project can be further extended by creating a Web interface or an application
for better and more organized results. Moreover, our dataset has only a few instances,
so we may improve the accuracy and learning of this project by increasing the dataset
size.

Limitations
The size of the dataset is small, and a dataset with a greater number of instances
should be tried. Although the deep learning algorithm used improves the accuracy of
the project, due to the pandemic, many instances could not be collected for the
dataset.

References

1. A. Jagtap, P. Malewadkar, O. Baswat, H. Rambade, Heart disease prediction using machine
learning. Int. J. Res. Eng. Sci. Manage. 2(2) (2019). www.ijresm.com, ISSN (Online) 2581-5792
2. N. Rajesh, T. Maneesha, S. Hafeez, H. Krishna, Prediction of heart disease using machine
learning algorithms. Int. J. Eng. Technol. 7(2.32), 363–366 (2018)
3. M. Marimuthu, M. Abinaya, K.S. Hariesh, K. Madhankumar, V. Pavithra, A review on heart
disease prediction using machine learning and data analytics approach. Int. J. Comput. Appl.
(0975-8887) 181(18) (2018)
4. S. Sharma, M. Parmar, Heart diseases prediction using deep learning neural network model.
Int. J. Innov. Technol. Explor. Eng. (IJITEE) 9(3) (2020). ISSN 2278-3075
5. S.N. Pasha, D. Ramesh, S. Mohmmad, A. Harshavardhan, Shabana, Cardiovascular disease
prediction using deep learning techniques. IOP Conf. Ser.: Mater. Sci. Eng. 981, 022006 (2020)
6. D. Shah, S. Patel, S.K. Bharti, Heart disease prediction using Machine learning techniques. SN
Comput. Sci., 342–345 (2020)
7. T.K. Sajja, H.K. Kalluri, A deep learning method for prediction of cardiovascular disease using
convolutional neutral network. Int. Inf. Eng. Technol. Assoc. 34(5), 601–606 (2020)
8. Y. Khourdifi, M. Bahaj, Heart disease prediction and classification using Machine learning
algorithms optimized by particle swarm optimization and ant colony optimization. Int. J. Intell.
Eng. Syst. 12(1), 242–252
9. G. Jignesh Chowdary, G. Suganya, M. Premalatha, Effective prediction of cardiovascular
disease using cluster of machine learning algorithms. J. Critical Rev. 7(18), 2192–2201 (2020).
ISSN-2394–5125
10. https://www.kaggle.com/ronitf/heart-disease-uci
Tracking Misleading News of COVID-19
Within Social Media

Mahboob Massoudi and Rahul Katarya

Delhi Technological University, New Delhi, India

Abstract When the pandemic struck and governments announced lockdowns,
individuals locked themselves in their residences and turned to social media to stay
updated on COVID-19 news and to pass the time. As a consequence, dealing with
fake news about COVID-19 posed a significant challenge for the public. So, the
World Health Organization (WHO) has asked that its formally approved information
and reports be portrayed as top results in any COVID-19-related search on Google,
YouTube, Facebook, LinkedIn, Microsoft, Yahoo, and Twitter. In this study, we
conducted a thorough investigation to assess and select appropriate solutions,
knowledge, and skills for current issues, as well as tools for tracking fake news within
social media. The search for this work started with a manual search of scientific
publications and papers concerning fake news related to COVID-19. During the
pandemic, the majority of hospitals utilized artificial intelligence techniques to
diagnose virus-infected patients, and most governments and businesses used robots to
limit their employees' and customers' exposure to the virus, distribute sanitizers, and
advise the public to "stay safe, stay home."

Keywords COVID-19 · Artificial intelligence · Fake news · Robots · Social


media · WHO

1 Introduction

A cluster of severe pneumonia cases was discovered in Wuhan City, China, and
was reported to the World Health Organization (WHO) at the end of December 2019.
Acute pneumonia was originally diagnosed in these patients. Some of them worked
in Wuhan's fish market, where they developed fever, sore throat, fatigue, and, in
more extreme cases, shortness of breath. However, contrary to what was first thought,
these signs were not of acute pneumonia. With the growing number of cases, in early
January 2020, China notified the WHO of the circumstances and the unknown
cause of the disease [1]. As a result, the WHO has designated the virus as
severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the disease as
coronavirus disease "COVID-19." As a result, COVID-19 is a worldwide health
obstacle that necessitates utmost care, person-specific cleanliness, and sanitation of
all locations. These procedures help to prevent mutations and keep the virus under
control and contained [2].
The world has now become a small village as a result of transportation services
and social networks. It is now convenient to move passengers from one location to
another, which accelerated the spread of COVID-19 and turned it into a global
epidemic. As far as the online media networks are involved, they play a significant
role not just in publishing false COVID-19-related news, but also in all aspects of our
daily lives, and possibly even in the globe's numerous problems and conflicts. It is
dangerous to spread and circulate false information about this virus and its effects,
creating a sense of fear and anxiety among the people [3].
As a result, the WHO has released a slew of COVID-19 and "infodemic"-related
data, instructions, and alerts. An "infodemic" is the viral spread of false information.
It is difficult to check the authenticity, validity, and accuracy of the information
provided, particularly when it concerns a terrifying viral infection that threatens
mankind [4]. In this manner, the WHO has requested influential search engines as
well as many other social digital networks, such as LinkedIn, YouTube, Microsoft,
Google, Yahoo, Facebook, and Twitter, to show its formally approved information
and reports as top searches in any COVID-19-related search [5]. This WHO request
indicates that extreme caution must be exercised when picking sources of
information. Instead of what is promoted on social media, we should rely on credible
and valid communication channels including the WHO, scientific research, and
non-governmental organizations (NGOs).
The rest of the paper is laid out as follows: the related work can be found in Sect. 2;
Sect. 3 presents the research questions and the search process; Sect. 4 gives the
discussion; and Sect. 5 expresses the conclusion and future work.

2 Related Works

Experts have defined fake news in a variety of ways, but they all seem to agree on
the same definition: fake news is intentionally false information that is
disseminated in order to deceive and mislead people into believing untruths or
unverifiable facts. From this viewpoint, fake news is information that looks like a
legitimate news article but contains incorrect information [6, 7]. At the same time,
the spread of fake news on digital media networks is one of the massive challenges
facing the world today, with the majority of users relying on such networks solely to
obtain news. Thus, social media can indeed be thought of as an amplifier for both real
and fake news. For instance, Facebook, one of the most influential social media
networks, had around 1.5 billion registered members in 2019, with 62 percent of them
using it to keep up with news [8]. Therefore, the majority of false information
identification systems employ machine learning to assist users in determining whether
or not the data they are accessing is false. This classification is obtained by comparing
the provided data to certain preexisting datasets containing multiple false and true
items [9]. Furthermore, before being used to develop a misleading information
prediction model employing machine learning methods, all training data need to go
through the following stages: data preparation and preprocessing, feature selection and
extraction, and model selection and establishment. These common steps make dealing
with the massive amounts of data required to build prediction models much easier [10].
Many automated systems for detecting misleading information have been proposed
so far. The authors of [11] conducted a comprehensive investigation into the
identification of misinformation on social media sites. Zhang and Ghorbani [12]
presented a thorough review of the current findings in the field of false information;
they also described the impact of online false information, presented cutting-edge
identification techniques, and addressed the most frequently utilized datasets for
building fake news classification models. Asaad and Erascu [13] described a solution
that integrates several machine learning techniques for text classification to determine
news credibility; they analyzed the performance of their system on a dataset of
fake and true news incorporating multinomial naive Bayes as well as support vector
machine classification techniques.
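For illustration, the stages above (preprocessing, feature extraction, and model building) map onto a short scikit-learn pipeline. This generic sketch is not the system of [13]; the file news.csv and its "text" and "fake" columns are assumptions:

    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    df = pd.read_csv("news.csv")  # assumed dataset: "text" column, 0/1 "fake" label
    X_tr, X_te, y_tr, y_te = train_test_split(df["text"], df["fake"], test_size=0.3)
    # Feature extraction (TF-IDF) followed by a multinomial naive Bayes classifier.
    clf = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
    clf.fit(X_tr, y_tr)
    print("accuracy:", clf.score(X_te, y_te))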
The authors of [3] assessed their proposed classifier on 3,047,255 COVID-19-related
Twitter posts; among the 10 machine learning algorithms considered, the decision tree,
neural network, and logistic regression classifiers provided the highest performance
results. Ibrishimova and Li [14] looked at different descriptions of fake news and came
up with one based on complete factual accuracy; they also presented a fake news
detection system that integrates both manual and computerized content authentication,
as well as stylistic features. Moreover, the authors of [15] presented a framework for
collecting, detecting, and visualizing fake news. They used fact-checking webpages to
gather fake and real news articles, created a variety of fake news detection systems
using news and social context features, and ultimately demonstrated a conceptual
platform for the discovered false news data. Wang [16] announced the launch of a new
dataset for fake news detection; this data collection includes 12,800 individually
labeled short statements from the POLITIFACT website in a variety of contexts. This
dataset was notable for being the first large dataset dedicated to detecting fake news,
and it is also bigger than previous fake news datasets that have been made public.
Finally, in the context of the COVID-19 outbreak in Morocco, the authors of [17]
propose a system that uses a machine learning technique to analyze feedback in
Facebook comments in order to identify false propaganda, along with an aggregation
framework to detect and investigate fake stories.

3 Research Questions

The task of our comprehensive study is to assess and pick the appropriate solutions,
knowledge, and skills for current problems, and to examine the use of tools for
monitoring fake news within social digital media networks. Consequently, we have
formulated a few questions for which we want to figure out the best answers from the
primary studies.
1. Which method, machine learning or deep learning, is better for detecting fake
news?
2. What are the elements that contributed to the dissemination of fake news during
the COVID-19?
3. How could artificial intelligence help to stifle the spread of COVID-19 and
protect us from false information during the pandemic?
4. In the fight against COVID-19, how could robots assist frontline workers?

3.1 Search Process

The search for this work began with a manual search of scientific publications and
systematic review papers on fake news during COVID-19. In addition, this research
study has included articles from ScienceDirect, IEEE, Springer Link, ACM, and
other related journals.

3.2 Results

We looked over all of the research questions and came up with the following solutions.

1. Which method, machine learning or deep learning, is better for detecting fake
news?

Detecting false news is becoming one of the artificial intelligence scientists' most
important tasks. The machine learning and deep learning approaches are the
two primary techniques for detection. The authors of [18] used a variety of deep
learning and machine learning algorithms in their proposed model, including logistic
regression, support vector machine, naive Bayes, recurrent neural network (RNN),
and long short-term memory (LSTM), with the best result coming from the support
vector machine at an accuracy of 89.34%. Accordingly, the machine learning
techniques achieved the best result in their model when comparing deep learning and
machine learning techniques.
The authors of [19] compared different deep learning and machine learning
techniques in their proposed system, and the highest performance was obtained by the
recurrent neural network and a hybrid convolutional neural network model; so, in their
proposed models, the deep learning algorithms outperformed the machine learning
algorithms. Hence, deep learning algorithms are more stable and accurate than
traditional methods, and they have demonstrated their effectiveness in a variety of
applications, including false information, spam, rumor, fake news, and disinformation
detection. Deep learning techniques are also extremely flexible and can therefore be
easily adapted to a new challenge [20, 21]. In a nutshell, both machine learning and
deep learning have displayed superior performance in various models so far [22], but
the majority of researchers prefer deep learning techniques for identifying fake news.

2. What are the elements that contributed to the dissemination of fake news during
the COVID-19?

In early 2020, the COVID-19 outbreak led to widespread lockdowns around the
globe. With billions of humans stranded at home, people are constantly turning to
social media, which is playing a critical role in the spread of false news: people have
shared COVID-19 posts on social media, even when they were inaccurate, in order to
remain informed, assist others, communicate with others, or keep busy [23].
Moreover, since COVID-19's discovery, fake news has spread across the Internet,
claiming to offer therapies for the virus and advice on how to handle it; a deluge of
false information about the deadly virus has led several people to believe that they can
be healed by drinking bleach or salty and ocean water [18, 24]. Furthermore, the
majority of people use social media to kill time and have fun, and users have been
using social media to keep busy during the total lockdown imposed by the COVID-19
pandemic. As a result, people are less willing to check COVID-19 data before
exchanging it, potentially leading to the proliferation of fake and false news [25].
Figure 1 shows users searching for COVID-19 news across a variety of applications.

Fig. 1 Searching for COVID-19 news [26]

3. How could artificial intelligence help to stifle the spread of COVID-19 and
protect us from false information during the pandemic?

Every minute of every day, people are surrounded with information: 98,000 tweets,
160 email messages, and 160 video clips are sent, received, and posted each minute.
As a result, the best way to tackle fake news is to develop artificial intelligence-
based automated systems [27]. Moreover, as the number of coronavirus cases
increased throughout China, hospitals turned to artificial intelligence to help them
diagnose infected patients more rapidly. With hospitals already overburdened by the
pandemic's scope, artificial intelligence has been used to "identify visual cues of
the pneumonia linked with COVID-19 on photographs from lung CT scans," as per
Wired [28]. Thus, according to Professor Andrew Hopkins, artificial intelligence
could be used to develop antibodies and vaccines, as well as design a medication, to
combat both present and future coronavirus outbreaks, due to its ability to cope with
large amounts of data [29].

4. In the fight against COVID-19, how could robots assist frontline workers?

Because the COVID-19 virus is a "new" phenomenon, people have no immunity
to it; anyone who is exposed to it can become infected, resulting in severe disease
and death. As a result, to decrease the risk of healthcare practitioners' interactions
with ill patients, robots have been used, instead of putting personnel at hazard at
intake points, to check for patients who may have symptoms such as extreme
temperatures or sneezing. Moreover, not only do robots improve the hospital's
abilities, but they also reduce the danger to both patients and staff; robotic assistants
could significantly enhance our ability to combat and eliminate this challenge to our
loved ones [30]. In addition, limiting populations' exposure to COVID-19 is among
the most effective efforts to tackle it. Robots allow businesses to keep functioning
while protecting their staff and customers in a time of social distancing [31]. Thus,
several South Korean companies have used robots to monitor temperature and
disseminate sanitizer in the wake of the COVID-19 outbreak. In a similar vein, the
Singapore government has begun employing spy robots to remind citizens to "remain
safe, stay at home" and to ensure that no public crowds in public areas spread the
devastating coronavirus [32, 33]. Figure 2 shows an example of using robots in
industry.

Fig. 2 Robots helping the health workers [30]

4 Discussion

The most important takeaway from this survey is that using artificial intelligence
to combat pandemics whenever they occur is the best way to go. Because artificial
intelligence is designed to mimic the human brain, whenever a task poses a risk
to human life, we can use artificial intelligence instead, particularly robotics, to
complete it [34]. Undoubtedly, the use of robotics and artificial intelligence can
aid in breaking the chain of human exposure to COVID-19, as well as reducing
the number of COVID-19 cases seen on a daily basis. Swiftly after COVID-19
was revealed, the World Health Organization (WHO) suggested that AI might be
a useful tool for dealing with the virus's crisis. In addition, AI has demonstrated
the best performance in accurately recognizing COVID-19 patients. In a nutshell,
artificial intelligence (AI) is a revolution for the world, particularly for human health.

5 Conclusion and Future Works

The proliferation of COVID-19 is one of the most dangerous events. So, people
look to social media for reliable information on how to defend themselves. Individ-
uals could even die as a result of misleading information. We conducted a systematic
review of tracking misleading information during the COVID-19 outbreak in this
article and discovered the following information. Consequently, when the pandemic
struck, people were confronted with two potentially dangerous phenomena: COVID-
19 and fake news. Individuals are using social media to pass the time and stay
informed about the latest pandemic news, which is one of the motives for the diffu-
sion of fake news during the pandemic. As a result, there is less willingness to eval-
uate COVID-19 information before expressing it, eventually leading to the spread
of fake news. In addition, artificial intelligence and robots are playing an increasingly
significant role during the pandemic. For example, most hospitals use artificial
intelligence to diagnose patients, and the majority of businesses use robots
to reduce the risk of humans coming into contact with the virus and to distribute
sanitizers. Researchers have also proposed methods for detecting COVID-19-related
false news within social media, utilizing deep learning and machine learning
algorithms to address the problem of false information dissemination.
In the future, we want to develop a model for identifying fake news and rumors
within various social digital networks in both text and video formats. In addition, we
will also look to expand the model to detect fake news written in languages other
than English.

References

1. WHO | World Health Organization, https://www.who.int/. Last accessed 27 Jan 2021


2. Naming the coronavirus disease (COVID-19) and the virus that causes it, https://www.who.int/
emergencies/diseases/novel-coronavirus-2019/technical-guidance/naming-the-coronavirus-
disease-(covid-2019)-and-the-virus-that-causes-it. Last accessed 27 Jan 2021
3. M.K. Elhadad, K.F. Li, F. Gebali, Detecting misleading information on COVID-19. IEEE
Access 8, 165201–165215 (2020). https://doi.org/10.1109/access.2020.3022867

4. W.H.O. Fights a Pandemic Besides Coronavirus: An ‘Infodemic’—The New York Times.


https://www.nytimes.com/2020/02/06/health/coronavirus-misinformation-social-media.html.
Last accessed 3 March 2021
5. Facebook, Reddit, Google, LinkedIn, Microsoft, Twitter and YouTube issue joint statement on
misinformation | TechCrunch. https://techcrunch.com/2020/03/16/facebook-reddit-google-lin
kedin-microsoft-twitter-and-youtube-issue-joint-statement-on-misinformation/. Last accessed
3 March 2021
6. T. McGonagle, “Fake news”: false fears or real concerns? Netherlands Q. Hum. Rights. 35,
203–209 (2017). https://doi.org/10.1177/0924051917738685
7. A. Duffy, E. Tandoc, R. Ling, Too good to be true, too good not to share: the social utility
of fake news. Inf. Commun. Soc. 23, 1965–1979 (2020). https://doi.org/10.1080/1369118X.
2019.1623904
8. N. Thompson, X. Wang, P. Daya, Determinants of news sharing behavior on social media. J.
Comput. Inf. Syst. 60, 593–601 (2020). https://doi.org/10.1080/08874417.2019.1566803
9. J.Y. Khan, M.T.I. Khondaker, A. Iqbal, S. Afroz, A benchmark study on machine learning
methods for fake news detection. arXiv. 1–14 (2019)
10. M.O. Davidenko, T.O. Biloborodova, Model-oriented fake news detection on social
media. Visnyk of the Volodymyr Dahl East Ukrainian National University, 31–36 (2019).
https://doi.org/10.33216/1998-7927-2019-253-5-31-36
11. R.K. Kaliyar, N. Singh, Misinformation detection on online social media—a survey, in 2019
10th International Conference on Computing, Communication and Networking Technologies
(ICCCNT 2019), pp 1–6 (2019). https://doi.org/10.1109/ICCCNT45670.2019.8944587
12. X. Zhang, A.A. Ghorbani, An overview of online fake news: characterization, detection, and
discussion. Inf. Process. Manag. 57, 102025 (2020). https://doi.org/10.1016/j.ipm.2019.03.004
13. B. Al Asaad, M. Erascu, A tool for fake news detection, in 20th International Symposium on
Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2018), pp. 379–386
(2018). https://doi.org/10.1109/SYNASC.2018.00064
14. M.D. Ibrishimova, K.F. Li, A Machine Learning Approach to Fake News Detection Using Knowledge Verification and Natural Language Processing (Springer International Publishing, 2020). https://doi.org/10.1007/978-3-030-29035-1_22
15. K. Shu, D. Mahudeswaran, H. Liu, FakeNewsTracker: a tool for fake news collection, detection,
and visualization. Comput. Math. Organ. Theory. 25, 60–71 (2019). https://doi.org/10.1007/
s10588-018-09280-3
16. W.Y. Wang, “Liar, liar pants on fire”: a new benchmark dataset for fake news detection, in
The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017) (Long
Pap. 2), pp. 422–426 (2017). https://doi.org/10.18653/v1/P17-2067
17. O. Maakoul, S. Boucht, K. El Hachimi, S. Azzouzi, Towards evaluating the COVID’19 related
fake news problem: case of Morocco, in 2020 IEEE 2nd International Conference on Elec-
tronics, Control, Optimization and Computer Science (ICECOCS 2020) (2020). https://doi.
org/10.1109/ICECOCS50124.2020.9314517
18. Abdullah-All-Tanvir, E.M. Mahir, S. Akhter, M.R. Huq, Detecting fake news using machine
learning and deep learning algorithms, in 2019 7th International Conference on Smart
Computing and Communications (ICSCC 2019), pp. 1–5 (2019). https://doi.org/10.1109/
ICSCC.2019.8843612
19. W. Han, V. Mehta, Fake news detection in social networks using machine learning and deep
learning: performance evaluation, in Proceedings of the IEEE International Conference on
Industrial Internet, ICII 2019, pp. 375–380 (2019). https://doi.org/10.1109/ICII.2019.00070
20. M.R. Islam, S. Liu, X. Wang, G. Xu, Deep learning for misinformation detection on online
social networks: a survey and new perspectives. Soc. Netw. Anal. Min. 10, 1–20 (2020). https://
doi.org/10.1007/s13278-020-00696-x
21. R. Katarya, M. Massoudi, Recognizing fake news in social media with deep learning: a
systematic review, in 4th International Conference on Computer, Communication and Signal
Processing, ICCCSP 2020. Institute of Electrical and Electronics Engineers Inc. (2020). https://
doi.org/10.1109/ICCCSP49186.2020.9315255

22. M. Massoudi, N.K. Jain, P. Bansal, Software defect prediction using dimensionality reduction
and deep learning, in Proceedings of the 3rd International Conference on Intelligent Commu-
nication Technologies and Virtual Mobile Networks, ICICV 2021, pp. 884–893 (2021). https://
doi.org/10.1109/ICICV50876.2021.9388622
23. How does fake news of 5G and COVID-19 spread worldwide? https://www.medicalnewst
oday.com/articles/5g-doesnt-cause-covid-19-but-the-rumor-it-does-spread-like-a-virus#Fac
tors-behind-the-spread-of-misinformation. Last accessed 22 April 2021
24. V. Lampos, M.S. Majumder, E. Yom-Tov, M. Edelstein, S. Moura, Y. Hamada, M.X. Rangaka,
R.A. McKendry, I.J. Cox, Tracking COVID-19 using online search. NPJ Digit. Med. 4 (2021).
https://doi.org/10.1038/s41746-021-00384-w
25. O.D. Apuke, B. Omar, Fake news and COVID-19: modelling the predictors of fake news
sharing among social media users. Telemat. Inf. 56, 101475 (2021). https://doi.org/10.1016/j.
tele.2020.101475
26. Our itch to share helps spread Covid-19 misinformation | MIT News | Massachusetts Institute
of Technology. https://news.mit.edu/2020/share-covid-19-misinformation-0709. Last accessed
22 April 2021
27. Role of AI in preventing fake news: weekly guide. https://www.mygreatlearning.com/blog/role-of-ai-in-preventing-fake-news-weekly-guide/. Last accessed 22 April 2021
28. The Role of AI during the Coronavirus Pandemic | Blue Fountain Media. https://www.bluefo
untainmedia.com/blog/role-ai-during-coronavirus-pandemic. Last accessed 22 April 2021
29. Coronavirus: How can AI help fight the pandemic? BBC News, https://www.bbc.com/news/
technology-51851292. Last accessed 22 April 2021
30. How robots are helping combat COVID-19. https://www.automate.org/blogs/how-robots-are-
helping-combat-covid-19. Last accessed 22 April 2021
31. 10 examples of robots helping to fight COVID. https://www.forbes.com/sites/blakemorgan/
2020/04/22/10-examples-of-robots-helping-to-fight-covid/?sh=768f77d0f4bf. Last accessed
22 April 2021
32. South Korea: Robot with artificial intelligence helps fight COVID-19 spread. https://www.rep
ublicworld.com/world-news/rest-of-the-world-news/robot-with-artificial-intelligence-helps-
fight-covid-19-spread.html. Last accessed 22 April 2021
33. Coronavirus: Will Covid-19 speed up the use of robots to replace human workers? BBC News,
https://www.bbc.com/news/technology-52340651. Last accessed 22 April 2021
34. M. Massoudi, S. Verma, R. Jain, Urban sound classification using CNN, in Proceedings of the
International Conference on Inventive Computation Technologies (ICICT 2021), pp. 583–589
(2021). https://doi.org/10.1109/ICICT50816.2021.9358621
Energy aware Multi-chain PEGASIS
in WSN: A Q-Learning Approach

Ranadeep Dey, Parag Kumar, and Guha Thakurta

Abstract A battery level-aware Q-learning [BLAQL] technique-based PEGASIS in wireless sensor networks is proposed to improve energy efficiency and, hence, the lifetime of the nodes of the network. The learning agent interacts with the environment through actions, and its Q-value is updated by the reward received from the working environment. The proposed routing method is introduced to all the sink-based multiple chains available in the network for reaching the gateway. The Q-value of the source node is updated by a reward derived from the neighbors' battery level information. The feedback section of the data packet has to carry the component of the Q-value. With this proposed routing technique, the source node learns preferable routes toward the gateway, and in turn a better way of transmitting packets is obtained as the rounds increase. The simulation results show the effectiveness of the proposed method over the existing techniques.

Keywords Wireless Sensor Network · PEGASIS · Q-Learning · Battery Level · Residual Energy · Node Lifetime

1 Introduction

Wireless sensor networks (WSNs) are groups of multiple low-powered sensor nodes that are responsible for collecting readings from the environment [1]. In order to obtain energy efficiency [2], the routing of information is performed by following a hierarchical structure of nodes in WSNs. In a hierarchical WSN, at least one gateway is present to assemble all sensed and processed data for future use. Sink nodes, along with the gateway, are responsible for collecting processed and aggregated raw data from the cluster heads (CHs) [3], as shown in Fig. 1. The sensor nodes send their data to their corresponding CHs, which in turn forward them to the respective sink nodes. Recently, chain-based hierarchical routing such as power-efficient gathering in sensor information systems (PEGASIS) has been used for its simplicity of setup and ease of maintenance

R. Dey (B) · P. Kumar · G. Thakurta


Department of Computer Science & Engineering, National Institute of Technology, Durgapur,
India


Fig. 1 Hierarchical WSN layout

[4]. In PEGASIS, all nodes are organized into a linear chain for data transmission, and more importantly, this chain can be formed via any sink node with a centralized approach [5]. Furthermore, PEGASIS supports a multi-chain topology for mobile or static hierarchical structures of nodes in a WSN [6]. However, simple PEGASIS is not particularly robust or scalable with multi-path options [5].
In this paper, a battery level-aware Q-learning [BLAQL] technique-based PEGASIS in WSNs is proposed to improve energy efficiency and, hence, the lifetime of the nodes of the network. Here, the learning agent interacts with the environment through actions, and its Q-value is updated by the reward received from the working environment. In order to reach the gateway, the proposed routing method is introduced to all the sink-based multiple chains available in the network, where the Q-value of the source node is updated by a reward derived from the neighbors' battery level information shown in Fig. 1. As the cost function for any route is based on the battery level, the feedback section of the data packet has to carry the component of the Q-value. As a result of this proposed routing technique, the source node learns preferable routes toward the destination gateway, and in turn a better way of transmitting data packets is obtained as the rounds increase. The simulation results show the effectiveness of the proposed method over the existing techniques.
The rest of this paper is organized as follows: Sect. 2 presents a brief review of related works for completeness. In Sect. 3, the system model is presented. Next, Sect. 4 discusses the proposed approach. The simulation studies are shown in Sect. 5, followed by the conclusions in Sect. 6.

2 Related Works

In recent times, WSNs have been used to collect raw data from a dynamic environment and to process and send them to a gateway for future use. A chain-based routing technique like PEGASIS is used in cluster-based hierarchical WSNs [1–7] for several scenarios [8]. In [9], PEGASIS is used to maximize the lifetime of a WSN using sink mobility. An improvement of the PEGASIS routing protocol has been attempted in [10]. The PEGASIS protocol constructs the chain, and each node delivers the sensed data to the nearest neighbor node [9]. However, PEGASIS is not useful in dynamic scenarios with multiple chains. Hence, a chain-based hierarchical routing protocol needs to be designed for obtaining better network performance [11] in terms of various quality of service (QoS) parameters such as packet loss tolerance, delay, network bandwidth, and energy awareness. Simultaneously, extensive research has been carried out on routing for WSNs using various machine learning (ML) techniques. A reinforcement learning (RL)-based energy-efficient routing protocol is discussed in [12]. When working with a WSN, it is always preferable to reduce the routing time and to improve the energy efficiency of the different levels of nodes; a Q-learning-based technique [13] provides a better preferable route by making a source node learn to select the precise way toward the destination. For completeness of the proposed work, a brief outline of Q-learning is discussed next.

2.1 Q-learning

In reinforcement learning (RL) [14], the learning agent takes an action (A_t) toward the environment, and the agent receives a reward (R_t) from the environment based on A_t. Using this reward, whether positive or negative, the agent takes its next action toward the environment. This procedure, as shown in Fig. 2, continues until the agent learns to take better actions for the future. With iterations, the environment also sends the state of the task (S_t) to the agent, which helps the agent learn the scenarios of the system as feedback in Fig. 2.

Fig. 2 RL technique
A Q-learning [15] technique is a popular method of RL. Here, the learning agent interacts with the environment through actions, and its Q-value is updated by the reward received from the working environment as follows.

Q_i ← (1 − α)Q_i + α(R_i + γ · (maximum of Q_i at any iteration))   (1)

Here, Q_i is the current Q-value of agent i, which is updated after taking an action. The parameters α and γ used in (1) denote the learning rate and the discount factor, respectively. One of the most important features of Q-learning is that it is a model-free RL technique which learns from the rewards of an action in a particular state. It does not require a fixed model of the environment, so it is used extensively and efficiently in WSN routing problems [16].
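As a minimal illustration of the update rule in (1), the following Python sketch applies the update to a tabular Q-value; all names and parameter values are hypothetical and not taken from the paper:

```python
# Minimal sketch of the Q-learning update in (1); names are illustrative.
def update_q(q_table, state, action, reward, alpha=0.1, gamma=0.9):
    """Blend the old Q-value with the reward plus the discounted
    maximum Q-value seen for this state, as in (1)."""
    best_q = max(q_table[state].values())      # maximum of Q_i at any iteration
    q_table[state][action] = ((1 - alpha) * q_table[state][action]
                              + alpha * (reward + gamma * best_q))

# Usage: q_table maps state -> {action: value}.
q = {"s0": {"a": 0.0, "b": 0.5}}
update_q(q, "s0", "a", reward=1.0)
```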

3 System Model

As discussed for Fig. 1, the sensor nodes in the network collect unprocessed data from the environment, and aggregated data are available at the CHs [17]. All nodes in the hierarchical WSN are assumed to be stationary and homogeneous for this setup. Here, four sink nodes are considered as S1, S2, S3, and S4, and these are directly connected with 12 CHs, CH1, CH2, …, CH12, as shown in Fig. 3.
In order to send data toward the gateway, the sink nodes are requested to deliver the processed data from the CHs. For example, CH1 sends data toward the gateway via its immediate sink node S1, as shown in Fig. 3. However, such a setup, where each CH is connected to only one sink, is not suitable in general: it may produce a catastrophic result when the sink fails to forward data packets toward the gateway for any reason.

Fig. 3 CH to gateway via sink



For example, if S2 fails, then CH5, CH6, and CH7 would not be able to send data packets to the destination gateway, as shown in Fig. 3. To overcome such a scenario, a hierarchical WSN structure with multiple available paths for routing data packets toward the gateway is considered in the proposed work, as shown in Fig. 4. Here, CH1 is connected with all four possible sink nodes of the network, and each of these four possible routes toward the gateway is reached via a sink node in one hop. By following all such possible connections between the CHs and the sinks, a layer-wise mesh connection for the WSN is obtained, as shown in Fig. 5.
By selecting the desired path among the multiple alternatives shown in Fig. 5, the transmission traffic through any sink at a given moment is reduced in the hierarchical WSN, which in turn decreases the energy required for communication. Hence, the network shown in Fig. 5 can be mapped into an equivalent representation as shown

Fig. 4 Multiple routes from CH to gateway via sink

Fig. 5 All possible connections from CH to gateway via sink



Fig. 6 Routing network with CHs, sinks, and gateway

in Fig. 6. The proposed work follows the network structure shown in Fig. 6. Here, the CHs and the gateway are the source and the destination, respectively. In addition, the sink nodes are the immediate neighbors of these source nodes, which signifies that the data packets are routed toward the destination via the neighbors.

4 Proposed Approach

In the proposed work, a multi-chain PEGASIS for WSNs depending on a Q-learning technique for routing packets is discussed. Here, the Q-value is updated at the source node by the reward received from the neighboring node using (1). Consider Q_S^t(D, N), the estimated time required for source node S to send a data packet to the destination node via a neighbor. Every node in the network stores a table which estimates the Q-value for each pair of neighbor and destination. So, when node S sends a packet to a neighbor node with the destination information, the neighbor immediately replies to node S with its own estimate of the time (T) for the remaining travel to the destination; node S then updates its Q-value according to the positive or negative reward it received. Henceforth, the new Q-value is obtained as follows.

Q_S^t(D, N) = Q_S^t(D, N) + ΔQ_S^t(D, N)   (2)

where D and N denote the destination node and the neighbor node, respectively. The Δ in (2) is the change factor after receiving the reward. Equation (2) can be further expressed as follows.

ΔQ_S^t(D, N) = α (p + q + T − Q_S^t(D, N))   (3)

In (3), p denotes the time between receiving a message and forwarding it, and q is the time the message spends traveling to the next node.
The proposed routing (the BLAQL technique) depends on the neighbors' battery levels as well as the battery levels of all sink nodes, which may not be the same; moreover, some of them may fail to transmit packets at any time. Here, the Q-value of the source node is updated by a reward based on the battery level information (B_S) of the neighboring nodes. So, if the source node S has m direct (one-hop) neighbors, then the neighbor selection is based on the following.
 
f_BL = maximum of Battery_Level_i ; i = 1, 2, 3, …, m   (4)

Whenever the network is reconstructed, the new route costs must be re-learned by analyzing and updating the Q-values based on the neighbors' battery levels. Hence, the proposed BLAQL technique makes the agent source node learn the preferable chain or route to follow toward the destination gateway. This procedure of BLAQL-based multi-chain routing is described in Algorithm 1 below; a code sketch follows the algorithm.
Algorithm 1: BLAQL-based multi-chain routing technique
Step 1: Sink nodes initiate the data packet transmission by sending a signal to all CHs.
Step 2: After receiving the signal, the CHs start to prepare the data packets for sending and maintain a Q-value (Q_i) for the cost to reach the destination, Q_i = Q_S^t(D, N).
//* D → destination node, N → neighbor sink nodes, S → source node/CH, and Q_S^t → the estimated time for transmission *//
Step 3: The CHs send the packets to all the possible neighbor sink nodes (m sink nodes) of the network with a battery level (B_S) feedback section.
Step 4: All sink nodes send feedback to all CHs along with the information B_S.
Step 5: After getting the feedback as a reward, each CH updates its Q_i using the selection function f_BL and chooses the preferable neighbor for data transmission.
Step 6: Data packets are received by the destination gateway (D) following the selected route from the source CH (S).
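A compact Python sketch of the selection and update steps is given below; the data structures, field order, and parameter values are assumptions for illustration, not the authors' implementation:

```python
# Illustrative sketch of Algorithm 1. `feedback` maps each neighbor sink to
# (battery_level, t_remaining), where t_remaining is the neighbor's estimate T
# of the remaining travel time to the gateway.

def select_neighbor(feedback):
    # Step 5, per (4): pick the neighbor reporting the maximum battery level.
    return max(feedback, key=lambda n: feedback[n][0])

def update_q(q_value, feedback, chosen, p, q_delay, alpha=0.1):
    # Update per (2)-(3): p is the local hold time before forwarding,
    # q_delay is the transit time of the packet to the chosen neighbor.
    _, t_remaining = feedback[chosen]
    delta = alpha * (p + q_delay + t_remaining - q_value)
    return q_value + delta

feedback = {"S1": (0.8, 4.0), "S2": (0.6, 3.5), "S3": (0.9, 5.0), "S4": (0.4, 2.0)}
best = select_neighbor(feedback)                # -> "S3" (highest battery level)
new_q = update_q(q_value=6.0, feedback=feedback, chosen=best, p=0.2, q_delay=0.5)
```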

5 Simulation Studies

To carry out the simulations for the proposed work, the setup shown in Table 1 is considered.
Figure 7 shows the difference in residual energy between the proposed and existing works with respect to the number of rounds. In Fig. 7, it is observed that the energy loss is higher at the early stages for the proposed BLAQL-based PEGASIS than for simple PEGASIS. This occurs because the source node must first learn the choice of the proper chain to reach the destination. For that reason, in the initial rounds, the residual

Table 1 Parameters used in simulations

Parameters                             Value
Network size                           100 m × 100 m
Number of nodes                        100
Initial energy of nodes for PEGASIS    50 Unit
Total number of rounds                 5000
Packet size                            2000 bits
Number of sink nodes                   4
Number of simulation runs              10
Tool used for implementation           MATLAB R2010a

Fig. 7 Comparison on residual energy

energy for BLAQL-based PEGASIS is on the lower side. However, the two merge onto the same scale later.
Now, Fig. 8 provides a comparison of the normalized average energy used per round for both the proposed and existing works. Here, it is clearly found that the energy used in the later rounds is lower for BLAQL-based PEGASIS compared to the existing work [9], as the source nodes learn to take better actions.

Fig. 8 Comparison on normalized average energy used per round

In Fig. 9, the number of alive nodes is shown against the number of rounds for both the proposed and existing works. In both cases, the number of alive nodes reduces with the number of rounds. Correspondingly, the number of dead nodes increases, as the simulation is initiated with 100 nodes. Hence, the number of dead nodes can be determined by comparing the alive nodes with the total number of initial nodes; an increasing number of dead nodes over the rounds counts as a shortcoming for any routing technique.
Now, across the 10 simulation runs, the scenarios change with respect to alive and dead nodes. Based on these multiple simulations, the average cases of first node dies (FND), half nodes die (HND), and last node dies (LND) [18] are determined accordingly for both the proposed and existing works. In Fig. 10, it is observed that the average FND with respect to the number of rounds is quite high for the proposed work compared to the existing one. However, the average HND and the average LND are marginally equal for both procedures.
With an increasing number of sink nodes along with sensor nodes, the possibility of multiple chains increases, and all the possible paths come under consideration for selection. From the above comparisons, it can easily be found that the proposed BLAQL-based PEGASIS, which makes the source node learn to take a

Fig. 9 Comparison on number of alive nodes

Fig. 10 Comparison on average FND, HND, and LND

better chain for delivering packets to the destination node, provides a better outcome than the existing one.

6 Conclusion

In this paper, an energy-efficient BLAQL-based PEGASIS is introduced as a multi-chain hierarchical routing protocol in a WSN. The concept of Q-learning is used to re-calculate the Q-values accordingly. The proposed work is initiated using simple PEGASIS to find the multiple chains available in the network, followed by choosing the preferable chain to route packets from the source to the destination node. With increasing network size, the re-calculation of Q-values at the agent node makes it learn to choose the preferable neighbor wisely. Simulation results show the improvement obtained by the proposed approach over the existing one. The energy efficiency of the routing approach can be improved further by using a network load-balancing approach.

References

1. C. Buratti, A. Conti, D. Dardari, R. Verdone, An overview on wireless sensor networks technology and evolution. Sensors 9, 6869–6896 (2009). ISSN 1424-8220
2. A.B. Patel, H.B. Shah, Reinforcement learning framework for energy efficient wireless sensor
networks. Int. Res. J. Eng. Technol. (IRJET) 02(03) (2015)
3. V.V. Ghate, V. Vijayakumar, Machine learning for data aggregation in WSN: a survey. Int. J.
Pure Appl. Math. 118(24) (2018). ISSN 1314-3395
4. L. Chan, K.G. Chavez, H. Rudolph, A. Hourani, Hierarchical Routing Protocols for Wireless
Sensor Network: A Compressive Survey. Springer Science + Business Media, LLC, part of
Springer Nature (2020)
5. X. Liu, Atypical hierarchical routing protocols for wireless sensor networks: a review. IEEE
Sensors J. (2015). https://doi.org/10.1109/JSEN.2015.2445796
6. B.G. Min, J.S. Park, H.G. Kim, J.G. Shon, Improvement of Multi-chain PEGASIS Using Relative
Distance. Springer Nature Singapore Pte Ltd. (2019)
7. L. Malathi, R.K. Gnanamurthy, Cluster based hierarchical routing protocol for WSN with
energy efficiency. Int. J. Mach. Learn. Comput. 4(5) (2014)
8. S.-M. Jung, Y.-J. Han, T.-M. Chung, The concentric clustering scheme for efficient energy
consumption in the PEGASIS, in ICACT2007 (2007). ISBN 978-89-5519-131-83560
9. M.R. Jafri, N. Javaid, A. Javaid, Z.A. Khan, Maximizing the lifetime of multi-chain PEGASIS
using sink mobility. World Appl. Sci. J. 21(9), 1283–1289 (2013)
10. A.A. Hussein, R.A. Khalid, Improvements of PEGASIS routing protocol in WSN. Int. Adv. J.
Eng. Res. (IAJER) 2(11), 01–14. ISSN 2360-819X
11. M. Haque, T. Ahmad, M. Imran, Review of Hierarchical Routing Protocols for Wireless Sensor
Networks. Springer © Nature Singapore Pte. Ltd. (2018)
12. S.E. Bouzid, Y. Serrestou, K. Raoof, M.N. Omri, Efficient routing protocol for wireless
sensor network based on reinforcement learning, in 5th International Conference on Advanced
Technologies For Signal and Image Processing, ATSIP’ 2020, Tunisia (2020)

13. K.-L. Alvin Yau, H.G. Goh, D. Chieng, K.H. Kwong, Application of Reinforcement Learning to
Wireless Sensor Networks: Models and Algorithms. Springer-Verlag Wien, © Springer (2014)
14. X. Wang, Q. Zhou, C. Qu, G. Chen, J. Xia, Location updating scheme of sink node based on
topology balance and reinforcement learning in WSN. IEEE Access 7 (2019)
15. A. Arya, A. Malik, R. Garg, Reinforcement learning based routing protocols in WSNs: a survey.
Int. J. Comput. Sci. Eng. Technol. (IJCSET) 4(11) (2013). ISSN 3345
16. M.A. Alsheikh, S. Lin, D. Niyato, H.-P. Tan, Machine learning in wireless sensor networks:
algorithms, strategies, and applications. IEEE Commun. Surv. Tutor. 16(4) (2014)
17. A. Diop, Y. Qi, Q. Wang, S. Hussain, An efficient and secure key management scheme for
hierarchical wireless sensor networks. Int. J. Comput. Commun. Eng. 1(4) (2012)
18. A. Mansura, M. Drieberg, A.A. Aziz, V. Bassoo, Multi-energy threshold-based routing protocol
for wireless sensor networks, in 2019 IEEE (ICSGRC 2019), Shah Alam, Malaysia (2019)
Textlytic: Automatic Project Report
Summarization Using NLP Techniques

Riya Menon, Namrata Tolani, Gauravi Tolamatti, Akansha Ahuja, and R. L. Priya

Abstract Academic project reports can be very verbose and lengthy since they
include comprehensive descriptions, diagrams, tables, charts, graphs, and illus-
trations. Such reports tend to be too long and detailed for quick assessment or
perusal. The proposed system aims to generate a concise extractive summary of
technical project reports. As each section of the report contains important details and
contributes to a sequence, it must be summarized separately. To achieve this objec-
tive, the system accepts a multi-page document as input and performs section-wise
segregation before processing the contents. It summarizes each section, retaining
the topic structure of the original document in the resulting output. Additionally, the
proposed system implements figure and caption extraction for respective sections
and also generates a downloadable summary output file. The resultant summary was
evaluated using the BLEU metric on an open-source dataset. An average score of
38.996% was obtained.

Keywords Textlytic · Natural language processing · Report summarization · Extractive text summarization · K-means clustering · Word2vec

R. Menon (B) · N. Tolani · G. Tolamatti · A. Ahuja · R. L. Priya


Computer Department, Vivekanand Education Society’s Institute of Technology, Mumbai, India
e-mail: 2018.riya.menon@ves.ac.in
N. Tolani
e-mail: 2018.namrata.tolani@ves.ac.in
G. Tolamatti
e-mail: 2018.gauravi.tolamatti@ves.ac.in
A. Ahuja
e-mail: 2018.akansha.ahuja@ves.ac.in
R. L. Priya
e-mail: priya.rl@ves.ac.in


1 Introduction

Across institutions and types of academic training, professors are found to spend roughly half [1] of their working hours on non-teaching activities like planning and collaborating. Corrections and various forms of assessment are the major activities among these and occupy a huge chunk of their valuable time. This added load is often multifold in the scientific field.
Writing detailed reports is central to any scholarly pursuit, be it research, project
implementations, or any other academic work. Such reports are used to assess a
candidate for their academic year. These documents are typically lengthy and abound
with images, diagrams, graphs, and detailed descriptions of every stage and aspect
in the implementation of the project, making them tedious to assess.
In-depth coverage of a project is crucial for documentation and publication.
However, a succinct version usually suffices for the trained and experienced eye
of a mentor or professor who would be grading these. With piles of submissions to
assess and deadlines to meet, a summarization tool dedicated to and customized for
in-house project reports could prove to be useful.
In most of the similar existing systems, the structure and sequence of sections in
the report are not retained, and many of the tools do not allow PDF documents as
input, let alone multi-page documents. Moreover, these systems are usually not very
user-friendly. The proposed approach attempts to overcome these shortcomings by
implementing an interface that is easily navigable and user-friendly.
The paper presents the proposed solution as a web application that aims at summa-
rizing a project report by dividing the detailed project report into sections and then
preprocessing and summarizing each section to retain the important points under
respective headings.
The methodology used divides the uploaded PDF of the report into sections using PyPDF2 and Camelot, both Python libraries. An extractive summarization technique is then applied using word2vec followed by K-means clustering. Figures and diagrams, if properly tabulated and indexed, are also extracted and retained in the resulting summary. This avoids the accidental exclusion of any useful information and, hence, allows a precise inference to be drawn.
The final output summary is generated by combining the summaries and respective figures or diagrams of each of the sections formed, with a provision to download the summary as a PDF for convenience. A brief survey of existing systems and algorithms is given in the literature review (Sect. 2). Further, Sect. 3 elucidates the workflow of the proposed model. Section 4 explains the various stages of implementation and evaluation. Section 5 analyzes the final system and the evaluation results. Finally, Sect. 6 states the conclusion and discusses possible future improvements.

2 Literature Review

Several existing systems aim to summarize text effectively without missing out on
important details of the text and also doing justice to the gold standards. A few of
the prevalent and widely used algorithms are discussed below based on the survey.

2.1 Recurrent Neural Network (RNN)

Abstractive text summarization was implemented using a two-layered bidirectional RNN [2]. This model is restricted to short texts and demands a strong hardware configuration to train the dataset. Besides the high processing cost, abstractive summaries are often riddled with grammatical and syntactic errors [3], which makes them unsuitable for practical use as of now.

2.2 TF-IDF

TF-IDF is used to determine the importance of sentences and to pick the top-ranked ones. This gave good results [4] with respect to feature extraction while scoring sentences. Another implementation [5] ranked a sentence by calculating the product of the TF-IDF score and an index feature. Though the performance of TF-IDF is better than that of the TextRank algorithm, scoring the sentences does not always end up selecting the most important ones, and important details given a lower rank may be lost.

2.3 Cross-entropy Method

This technique implemented a clean method of summarizing research papers section-wise [6] and used a library not only to separate the sections but also to extract images and captions. The cross-entropy method implemented uses several iterations to determine the importance of a sentence. Though this is great for obtaining optimum summaries, as there is an improvement at every step, the iterations can be costly and time-consuming for longer documents such as reports.
122 R. Menon et al.

2.4 LSA

Latent semantic analysis uses SVD [7]. It determines the similarities between sentences and weights each word accordingly. The most heavily weighted sentences are then selected. Selecting sentences that are responsible for introducing a new topic in the document makes this method efficient. However, it considers a particular word to have the same meaning in every sentence; its context is not considered.

2.5 TextRank

The TextRank algorithm is based on PageRank. It is implemented by adding weights to the relation between two vertices, which is treated as an addition to the PageRank score [7]. This method helps to connect the sentences based on their similarities. However, if a sentence does not have much similarity with the rest of the text, its score might be low despite possibly containing important details. Thus, it may omit keywords that have a lower chance of appearing even though they are meaningful in context.

2.6 K-Means and Word2vec

Implementation of this method [9] resulted in better BLEU scores, and the sentence-based model gave better results than graph-based models. However, the model concentrated only on news articles and did not implement image or caption extraction for the final summary.

3 Proposed Approach

The proposed approach focuses on splitting the PDF into sections and then summa-
rizing each resulting PDF to ensure the inclusion of a summary of every section.
The steps entailed in the proposed approach can be further understood as depicted in Fig. 1.

3.1 Data Input

Any document in PDF format containing an index of sections and their page numbers
is considered as data input for the proposed system. Besides this, the user is required

Fig. 1 Modular diagram for Textlytic

to enter the page number of the index table and the page number of the first section. The proposed approach is aimed at providing section-wise summarization, to retain the structure of the summary and to ensure that important details of every section are included in the final report.

3.2 Document Splitting

The very first step in the proposed approach is to split the main PDF into sections
based on the topics mentioned in the index table using Camelot. Once the document is
divided into respective sections, cleaning the data is essential in the proposed system.
Before cleaning, PDF files are converted into text files using Python’s PyPDF2
library. Preprocessing of extracted data is further explained in the next section.
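A hedged sketch of the splitting step follows; the file names and the assumption that the extracted index table has (title, start page, end page) columns are illustrative only:

```python
# Sketch of section-wise splitting with Camelot and PyPDF2 (illustrative).
import camelot
from PyPDF2 import PdfReader, PdfWriter

index = camelot.read_pdf("report.pdf", pages="2")[0].df  # index table as a DataFrame
reader = PdfReader("report.pdf")
for _, row in index.iterrows():
    title, start, end = row[0], int(row[1]), int(row[2])  # assumed column layout
    writer = PdfWriter()
    for i in range(start - 1, end):                       # page numbers are 1-based
        writer.add_page(reader.pages[i])
    with open(f"{title}.pdf", "wb") as out:
        writer.write(out)
```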

3.3 Preprocessing of Text

Data cleaning forms the base of any summarizer to remove noise from the given text
while preparing the data. There are different preprocessing steps, and the ones used
are mentioned below:
Tokenization—“Tokenization [8] is essentially splitting a phrase, sentence, para-
graph, or an entire text document into smaller units, such as individual words or
terms. Each of these smaller units is called tokens.” This step helps to interpret the
words present in the text and increases the efficiency of the summarization process.

Decapitalizing—Since the proposed idea aims at summarizing academic project reports, there is no mandate for the statements to be case-sensitive. Thus, converting the entire text into lowercase increases the efficiency of the subsequent text cleaning.
Stop word removal—This step involves the removal of words that are used so frequently that they lose their semantic meaning. In the summarization of project reports, keywords are of greater importance, so the removal of stop words does not affect the final summary. The code checks the words in the text against a predetermined list of stop words, which are then removed from the list of tokens/tokenized sentences.
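The three steps can be sketched with NLTK as follows (a minimal illustration; the system's actual cleaning code may differ):

```python
# Minimal sketch of the preprocessing pipeline using NLTK.
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

STOP_WORDS = set(stopwords.words("english"))

def preprocess(text):
    tokens = word_tokenize(text.lower())  # tokenization + decapitalizing
    # Keep alphanumeric tokens that are not stop words.
    return [t for t in tokens if t.isalnum() and t not in STOP_WORDS]
```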

3.4 Methodologies Applied

After the preprocessing of text, word2vec and K-means algorithms are employed by
the proposed method. Word2vec generates vectors of words considering the syntactic
and semantic similarity of the words. These vectors are basically vectors of numbers
that represent the word. K-means clustering algorithm is used to group the word
vectors and form clusters containing similar word vectors. Along with these algo-
rithms, steps for image extraction and caption extraction are also carried out which
are elaborated upon in the upcoming sections.

3.5 Final Report Generation

Once the summarization of every section is done, the summaries of all the sections are combined, and a final PDF is generated as output. The text files of the summaries of all the sections are combined into a Word file, and finally, the Word document is converted into a PDF.

3.6 Development of Web Application

The final step in the proposed approach is to develop a web application with a user-
friendly interface. The functionalities included in the application are user registration
and authentication, a report upload form, a display of the final summary, and an option
to download the report summary.

4 Implementation

4.1 Preprocessing

Figure 2 depicts the document input for the proposed system before cleaning the text.
Tokenization, decapitalization, and stop word removal are the three preprocessing
techniques implemented, and the cleaned text is represented in Fig. 3. The natural
language toolkit (NLTK) libraries in Python were utilized. Preprocessing was also

Fig. 2 Before preprocessing



Fig. 3 After preprocessing

done section-wise. Further, the resulting text was split along sentences, and new lines
were removed.

4.2 Methodology Employed

• Word2vec:
The word2vec model is a combination of the continuous bag of words (CBoW)
model and the Skip-gram model. Both the models are neural networks [10, 11]. The
CBOW model takes the context of each word as the input and predicts the target
word. The Skip-gram model predicts the context of a word by taking the target
word as input. Gensim’s word2vec model is used to generate vector representations
of all the words belonging to a particular section’s text [12]. The vectors are formed
in such a way that the words having syntactic and semantic similarities will be
close together in the vector space. Each sentence is associated with the average
of the word vectors present in that sentence.
• K-means clustering:
K-means clustering is an unsupervised learning algorithm that groups unlabeled data into a predefined number of clusters. The sentence vectors computed with the help of the word2vec model act as input to the K-means clustering algorithm, implemented using Python's Scikit-learn machine learning library. The output is a list of centroids computed for the predefined number of clusters in the vector space. The Euclidean distance between a cluster's centroid and the sentences belonging to that cluster is calculated using Python's SciPy library [13]. The sentence closest to each centroid is included in the final summary (a code sketch of this pipeline is given at the end of this subsection).
• Image and Caption Extraction:
Images and captions are an equally important part of the project report. For
extracting images, the PyMuPDF library in Python is used. For extracting the
captions, two methods are implemented—one using regular expressions (re library
in Python) and the other using an index table of figures that are a part of the project
report.

The first method, using regular expressions, was found to be too rigid and made mapping a bit difficult, as even in-text mentions of the figures were extracted, though it captured all captions perfectly. The index table for figures is typically a part of the standard format of a project report. Using Camelot, just like the main index, this table was also extracted, and the captions were saved into a list before being mapped to the respective images. This method was found to be cleaner and more effective.
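As referenced above, the following sketch combines the word2vec and K-means steps; the parameter values, such as vector size and cluster count, are assumptions, not the authors' settings:

```python
# Hedged sketch of extractive selection via word2vec sentence vectors + K-means.
import numpy as np
from gensim.models import Word2Vec
from scipy.spatial import distance
from sklearn.cluster import KMeans

def summarize(sentences, tokenized, n_clusters=5):
    # tokenized[i] is the cleaned token list of sentences[i]; assumed non-empty.
    model = Word2Vec(tokenized, vector_size=100, min_count=1)
    # Each sentence is represented by the average of its word vectors.
    vecs = np.array([np.mean([model.wv[w] for w in s], axis=0) for s in tokenized])
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(vecs)
    # For each centroid, keep the closest sentence (SciPy Euclidean distance).
    picked = sorted({int(np.argmin([distance.euclidean(v, c) for v in vecs]))
                     for c in km.cluster_centers_})
    return " ".join(sentences[i] for i in picked)
```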

4.3 Generating the Final Output

The in-built Document function from the docx library in Python is used to combine the text files generated as summaries of all the sections. This Word file is then converted into the final PDF using the convert method from the docx2pdf library.
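A small sketch of this assembly step, with placeholder file and section names, might look as follows:

```python
# Sketch of combining section summaries into one PDF (names are placeholders).
from docx import Document
from docx2pdf import convert

doc = Document()
for section in ["Introduction", "Implementation", "Results"]:
    doc.add_heading(section, level=1)
    with open(f"{section.lower()}_summary.txt") as f:
        doc.add_paragraph(f.read())
doc.save("summary.docx")
convert("summary.docx", "summary.pdf")  # docx2pdf requires MS Word on the host
```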

4.4 Evaluation Metrics

The proposed system uses an unsupervised approach owing to the need for speedy development and a dearth of relevant data: not enough handwritten reference summaries were available for the technical project reports which the system was intended to process. Currently, the most prevalent evaluation metrics for text summarization models are ROUGE and BLEU [2–10, 14–16]. Unfortunately, these rely on the degree of overlap between the machine-generated summary and the reference (gold standard) summary.
Scientific research papers and their reference summaries from an open-source corpus by the WING NUS group [14] were used. The evaluation metric used to analyze the performance of the proposed summarization tool is bilingual evaluation understudy (BLEU). Unigrams have been considered to measure the overlap. The BLEU score lies between 0 and 1 for any given machine-generated summary and its reference(s). Table 1 provides a helpful guide to interpreting these scores [15].
The previously mentioned corpus had research papers in XML format. Input text
was extracted from this format and written into .txt files. These were then given

Table 1 Rough guidelines to interpret BLEU scores

BLEU score (%)   Interpretation
< 10             Almost useless
10–19            Hard to get the gist
20–29            Gist is clear, but significant grammatical errors
30–40            Understandable to good translations
40–50            High-quality translations
50–60            Very high quality, adequate, and fluent translations
> 60             Quality often better than human

as input against the reference summaries available in the corpus. The BLEU score was calculated using the nltk.translate package for Python.
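A minimal sketch of this unigram BLEU computation is given below (the token handling is illustrative; the actual evaluation script is not shown in the paper):

```python
# Unigram BLEU between a reference summary and a generated one, via NLTK.
from nltk.translate.bleu_score import sentence_bleu

def unigram_bleu(reference, candidate):
    # weights=(1, 0, 0, 0) restricts the overlap measure to unigrams.
    return sentence_bleu([reference.split()], candidate.split(),
                         weights=(1, 0, 0, 0))

score = unigram_bleu("the model extracts key sentences",
                     "the system extracts key sentences")
```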

5 Results and Discussion

The aforementioned functionalities and output are made accessible through a user-friendly interface, whose homepage is shown in Fig. 4. Upon first using Textlytic, the user sees the homepage. Besides being able to navigate to registration and login, a brief explanation of how Textlytic works and what makes it different is available for reading. The description has been written keeping in mind that the system will mostly be used by faculty and examiners.
The system can be scaled to store a repository of an institute’s project reports. In
such cases, authentication is essential before the system is used.
Having successfully logged into the system, the user needs to provide a few additional details before the summarization can begin. These are entered as shown in Fig. 5. The page numbers where the index table starts and ends are required so that the document can be split into the corresponding sections correctly. A sizable

Fig. 4 Home page for Textlytic



Fig. 5 Required inputs before document summarization

number of reports contain a tabulated list of figures at the beginning of the docu-
ment. This list is utilized for image and diagram extraction later. It also helps capture
respective captions since the sequence and number are now known. Most reports
have several common pages at the beginning like cover page, certificate(s), acknowl-
edgment, and the like. These are not to be considered for generating the summary.
Therefore, the user is asked for the page number that marks the beginning of the
actual contents/first section. After these inputs, the user can upload the PDF docu-
ment of the report to be summarized within a few clicks. Textlytic begins processing
the document after the “summarize” button is clicked in Fig. 5.
After the various steps of splitting, cleaning, preprocessing, processing, and
assembly have been applied to the input data, the final summary is available to
view section-wise as shown in Fig. 6. Each section, as mentioned earlier, has under-
gone processing independently. Each resulting summary is displayed beneath the
corresponding heading, as in the original document. Retaining the original structure
of the document helps with consistent outputs. Images, figures, or graphs present in
any section are also captured and make it to the relevant portion of the summary.
Images in sections like “implementation” and “results” are prioritized.
The summary visible in Fig. 6 is available to peruse only as long as the user is
logged in and is viewing that particular page. However, if the user wishes to save the
summary for later or keep a record, they can also download the generated summary
as a PDF on their local machine. The PDF would also contain the relevant figures
and diagrams. The format of the resulting document is as shown below in Fig. 7.
On an AMD Ryzen 5 4500U CPU with Radeon Graphics (2.38 GHz) and 8 GB of RAM, the system took 15.50 s to generate a summary for a 93-page input document. Finally, the proposed system was evaluated using state-of-the-art evaluation metrics. The testing was carried out on five research papers, with one reference (gold standard) summary available for each. Figure 8 puts forth the resulting BLEU scores for the respective documents in the scisumm-corpus by the WING NUS group.
These scores are to be interpreted keeping in mind that even human translators do

Fig. 6 Section-wise summarized output

not achieve a perfect score of 1.0 or 100% [15]. Every score falls within a range, and
the significance of which has been listed in Table 1.

6 Conclusion and Future Work

Detailed and comprehensive project reports are frequently drafted in all sorts of
academic work. They help the reader to understand, step by step, the author's work and accomplishments. However, such a detailed account is often a hassle to go through
during assessment. The professor or mentor assessing the document is usually looking
only for the salient points and figures/charts if any. Instead, most have to go through
long documents and scan for the main points. The proposed system strives to elim-
inate this inconvenience by providing a crisp section-wise summary including the
important figures/charts. An extractive approach proves to be efficient for the same.
As interpreted in Table 1, the evaluation carried out suggests that the proposed system
is capable of generating relevant summaries.

Fig. 7 Downloadable output summary as PDF document

Fig. 8 Resulting BLEU scores



To further improve the existing system, more effort can be directed toward reducing the number of constraints and prerequisites for the input document. Also, the efficiency of the algorithm could be improved by reducing or limiting the number of temporary or intermediate files created during processing. A dataset of technical book reports and their respective human summaries can be constructed to further train and refine the system.

References

1. OECD, How much time do teachers spend on teaching and non-teaching activities?, in Educa-
tion Indicators in Focus 29 (OECD Publishing, 2015). https://ideas.repec.org/p/oec/eduaaf/29-
en.html
2. A.K. Mohammad Masum, S. Abujar, M.A. Islam Talukder, A.K.M.S. Azad Rabby, S.A.
Hossain, Abstractive method of text summarization with sequence to sequence RNNs, in 2019
10th International Conference on Computing, Communication and Networking Technologies
(ICCCNT), Kanpur, India, pp. 1–5 (2019). https://doi.org/10.1109/ICCCNT45670.2019.894
4620
3. S. Modi, R. Oza, Review on abstractive text summarization techniques (ATST) for single and
multi documents, in 2018 International Conference on Computing, Power and Communication
Technologies (GUCON), Greater Noida, India, pp. 1173–1176 (2018). https://doi.org/10.1109/
GUCON.2018.8674894
4. S.S. Naik, M.N. Gaonkar, Extractive text summarization by feature-based sentence extraction
using rule-based concept, in 2017 2nd IEEE International Conference on Recent Trends in
Electronics, Information & Communication Technology (RTEICT), Bangalore, pp. 1364–1368
(2017)
5. T. Zhang, C. Chen, Research on Automatic Text Summarization Method based on TF-IDF,
ed. by F. Xhafa et al. (ed.): IISA 2019, AISC 1084 (Springer Nature Switzerland AG, 2020,
pp. 206–212)
6. S. Erera, M. Shmueli-Scheuer, N. Guy, B. Ora, H. Roitman, D. Cohen, B. Weiner, Y. Mass, O. Rivlin, G. Lev, A. Jerbi, J. Herzig, Y. Hou, C. Jochim, M. Gleize, F. Bonin, D. Konopnicki, A summarization system for scientific documents (2019)
7. D. Shen, Text summarization, in Encyclopedia of Database Systems, ed. by L. Liu, M.T. Özsu
(Springer, New York, NY, 2018). https://doi.org/10.1007/978-1-4614-8265-9_424
8. https://www.webstep.se/an-introduction-for-natural-language-processing-nlp-for-beginners
9. M. Haider, M. Hossin, H. Mahi, H. Arif, Automatic text summarization using Gensim Word2Vec and K-means clustering algorithm, pp. 283–286 (2020). https://doi.org/10.1109/TENSYMP50017.2020.9230670
10. T. Hailu, J.-Q. Yu, T. Fantaye, A framework for word embedding based automatic text summa-
rization and evaluation. Information (Switzerland) 11, 78 (2020). https://doi.org/10.3390/inf
o11020078
11. https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
12. https://radimrehurek.com/gensim/auto_examples/tutorials/run_word2vec.html
13. https://towardsdatascience.com/understanding-k-means-clustering-in-machine-learning-6a6
e67336aa1
14. https://github.com/WING-NUS/scisumm-corpus/tree/master/data/Training-Set-2019/Task2/
From-ScisummNet-2019
15. https://cloud.google.com/translate/automl/docs/evaluate
16. J.N. Madhuri, R. Ganesh Kumar, Extractive text summarization using sentence ranking, in
2019 International Conference on Data Science and Communication (IconDSC), Bangalore,
India, pp. 1–3
Management of Digital Evidence
for Cybercrime Investigation—A Review

Chougule Harshwardhan, Dhadiwal Sunny, Lokhande Mehul, Naikade Rohit, and Rachana Patil

Abstract With developments in technology, the scale of cybercrime is increasing drastically, which in turn increases the workload of managing digital evidence. Besides managing the evidence, ensuring its integrity and security is crucial for delivering correct verdicts. With the traditional system, the evidence is vulnerable to tampering; hence, using a chain of custody is beneficial. In this paper, we analyze and compare various systems proposed over the past years and identify their pros and cons. This study would be beneficial in future for proposing a better system for evidence management.

Keywords Blockchain · Chain of custody · Digital forensics · Digital evidence · Cybercrime

1 Introduction

Laws play an important part in the judicial system of any country. They help in maintaining peace and harmony. A crime, in simple terms, is an unlawful act punishable by a state or other relevant authority. With advancements in technology and various other problems, there is a continual increase in the diversity and frequency of crimes [1, 2]. This increases the workload of the authorities right from the bottom to the top level. Evidence plays an important role in proving the facts or convicting the person involved in the crime.

C. Harshwardhan (B) · D. Sunny · L. Mehul · N. Rohit · R. Patil


Department of Computer Engineering, Pimpri Chinchwad College of Engineering, Pune, India
e-mail: harshwardhan.chougule18@pccoepune.org
D. Sunny
e-mail: sunny.dhadiwal18@pccoepune.org
L. Mehul
e-mail: mehul.lokhande18@pccoepune.org
N. Rohit
e-mail: rohit.naikade19@pccoepune.org
R. Patil
e-mail: rachna.patil@pccoepune.org


Fig. 1 Statistics of cybercrime cases registered versus arrests made in India

With the increase in workload, it becomes difficult to manage and handle the evidence, as there are great chances of evidence mishandling and tampering. This could lead to false decisions and punishments. Hence, it becomes crucial to ensure the integrity and authenticity of the evidence. As shown in Fig. 1, according to NCRB India, the number of cybercrime cases registered is much larger than the number of arrests made. This discrepancy is due to the lack of a proper evidence management system.

1.1 Evidence

Evidence can be defined as anything one sees, experiences, or reads, or any fact that proves something is true or has really happened. The management of evidence is critical with respect to the outcomes of criminal proceedings. If any aspect of evidence management fails in protecting the evidence required for a prosecution, it can compromise the outcome of the judicial proceedings.

Types of Evidence
I. Scientific evidence—This evidence either supports or rejects a scientific theory, law, or hypothesis.
II. Digital evidence—The electronic devices or gadgets related to the crime form the electronic evidence. The data extracted in digital form from these sources act as digital evidence, which can be used in trials.

III. Personal experience—This includes the first-person experience of an individual who was actively present during the crime.
IV. Physical evidence—Any material object that plays a role in proving or disproving a fact in a trial forms physical evidence.
V. Testimonial evidence—This includes individuals who act as witnesses and contribute meaningful statements or facts related to the judicial lawsuit. These are third-party members.

1.2 Digital Evidence and Its Features

With advancements in technology, the amount of data generated as digital evidence is enormous. Hence, it becomes critical to manage this evidence effectively [3]. Digital evidence demands special attention compared to physical evidence due to its characteristics: it can be encrypted to conceal information, it is easy to transmit, and it is highly fragile, which makes it vulnerable to damage, destruction, or modification.

1.3 Cybercrime Investigation Process

The phases involved in crime scene investigation are: preserving the scene; surveying and searching for evidence; documenting the evidence and the scene; and reconstructing the scene. Documentation of evidence is the most important step and is crucial for maintaining a chain of custody. The actors involved in the management of digital evidence are the victims and suspects, first responders, forensic investigators, police officers, court experts, and the judicial authority. In the life cycle of evidence, the evidence must be managed and administered over its entire lifetime, which is divided into various phases, from evidence collection to its disposal, as described in Fig. 2.
• Acquisition, which involves collection and capture in one place.
• Description, which involves combining, describing, and arranging the evidence.
• Analysis, which involves meaningful interpretation, scientific testing, and investigation.
• Assessment, which involves evaluation and judgment of the outcomes, in this case facts.
• Presentation or disclosure in court trials and proceedings.
• Disposal, which involves the return, destruction, sale, or donation of the evidence.

Fig. 2 Cybercrime investigation process

1.4 Chain of Custody

Given the challenges associated with handling digital evidence, maintaining a chain of custody is helpful. A chain of custody (CoC) is an ordered documentation which records the series of events in an investigation, from the collection of evidence to its presentation in court. It helps to track and verify the current ownership of the evidence. Documentation of evidence forms a crucial part of the CoC; it records the modifications to and ownership of the evidence, which prevents its contamination. The documentation includes the state in which the evidence was obtained, details of the issuing authority, the time span for which the evidence was issued, the medium in which the evidence is stored, maintained, and handled, and the timestamp and medium used while transferring the evidence. An actor, say a detective, issues evidence, documents it, and hands it over to storage, where it is protected. It is crucial to maintain the documentation of these transactions throughout the investigation process to ensure the integrity of the evidence.

2 Literature Review

The authors in this paper [4] propose a blockchain-based system that brings transparency to the chain of custody and ensures the integrity of evidence while it is transferred from one participant to another within the blockchain. Here, the evidence is encoded into a string using the Base64 algorithm and transferred to the recipient, who decodes it to retrieve the original evidence. Base64 is a good fit because it can encode various files, such as audio, images, and video, into string format that can be transported over the network without any data loss. The system uses chaincode to facilitate interaction between the application and the blockchain ledger and to validate transactions.
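The Base64 step described above is straightforward to reproduce; the minimal sketch below, with a hypothetical file name, shows how binary evidence becomes a string and is recovered without loss.

import base64

# Any binary evidence file (audio, image, video) becomes a plain string
# that can be sent over the network; "evidence.jpg" is a placeholder name.
with open("evidence.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

decoded = base64.b64decode(encoded)   # the recipient recovers the exact bytes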

The authors in this paper [5] propose a model based on blockchain technology that helps secure evidence from external agents. A chain of blocks is maintained, where each block stores the cryptographic hash of the previous block. The chain of custody process is implemented with this model, which ensures data integrity, authenticity, and security of the evidence, making the process tamper-proof. The system is implemented as a set of components that interact with each other. The core modules execute the main functions that change the state of the blockchain. The participants are under a consensus agreement and connected to each other through a blockchain peer-to-peer network. The blockchain-implemented chain of custody guarantees security, integrity, and authenticity to authorized users.
The authors in this paper [6] present a valid time-stamping algorithm for the digital signature of evidence, to bring transparency at each stage of the investigation process. The time stamp, obtained from a secured third party, helps to identify each individual accessing the evidence. A hashing function generates a unique numeric value, called a hash value, based on its input. This hash value is generated for each piece of evidence and sent to a time stamp authority, which signs it with a private key and returns it to the client. The received time-stamped value is verified with a public key and stored locally. This system depends on a third party to generate timestamps because implementing a time-stamping service is complex.
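The sign-and-verify flow described in [6] can be sketched as follows. This is a minimal illustration using the Python cryptography package, with RSA standing in for whatever signature scheme the time stamp authority actually employs, so the key size and message layout are assumptions.

import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Key pair held by the time stamp authority (TSA); illustrative only.
tsa_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

evidence_hash = b"<sha-256 digest of the evidence>"   # placeholder digest
stamped = evidence_hash + str(time.time()).encode()   # hash + timestamp

# The TSA signs with its private key and returns the token to the client.
signature = tsa_key.sign(stamped, padding.PKCS1v15(), hashes.SHA256())

# The client verifies with the TSA's public key and stores the token locally;
# verify() raises InvalidSignature if the token was tampered with.
tsa_key.public_key().verify(signature, stamped, padding.PKCS1v15(),
                            hashes.SHA256())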
The authors in this paper [7] describe the different phases of the chain of custody followed in different countries. This is implemented by incorporating various layers in the technical domain, which brings accountability, legality, and authenticity. The system is implemented in two parts. First, the procedural safeguards ensure transparency and privacy. Second, the data protection safeguards bring accountability measures into the system. The whole system safeguards privacy by using encrypted data at all stages, right up to the court of justice, by providing an encryption key.
The authors in this paper [7] propose the following idea. Modern technology is advancing in terms of portability and power, so preserving evidence online is challenging. The study uses blockchain technology, which provides the integrity and security needed to collect, store, and analyze digital evidence, with a proof of concept in Hyperledger Composer. Digital evidence is considered admissible if it satisfies the properties of being authentic, complete, reliable, and believable. It should also satisfy technical aspects of admissibility such as transparency and explainability. The objective is to develop a private blockchain, with roles such as creator, forensic investigator, prosecutor, defense, and court, using Hyperledger to store the history of handling digital evidence.
The authors in this paper [8] propose the following idea. As the existing system imposes a weak security model, this study supervises the entire evidence flow, from the collection of evidence by police investigators to court trials and juror votes. In this process, jurors can vote securely based on the evidence, and their data is recorded for the decision. The main objective of this study is to develop a secure, integrated evidence management process from police investigation to court hearings.

The authors in this paper [9] propose a study that focuses on developing a blockchain-based system, as a proof of concept, to assure legal agreements. The application provides reception, storage, and maintenance of collected digital evidence. In this study, users must not take actions that would modify the evidence contents; in case of modification, the user must explain its implications to expert authorities.
The authors in this paper [10] propose the following idea. Most existing evidence management systems are centralized and not tamper-proof, so evidence is prone to tampering before the actual law proceedings. As a result, the authors propose a solution using blockchain technology built over a private Ethereum network and maintaining a chain of custody (CoC), viz. a log file used to store the chronological sequence of evidence collection. The main objectives are to maintain integrity and evidence admissibility in the court of law and to allow only certain people to access the evidence. They use the Raft consensus algorithm to achieve these objectives. However, the Raft algorithm is slower compared to other algorithms (such as IBFT).
The authors in this paper [4] aim to maintain the integrity and trustworthiness of digital evidence by using blockchain technology and a chain of custody. In this system, evidence is stored in blocks, and each block stores the cryptographic hash value of the previous block. The authors focus on the integrity, traceability, authenticity, and verifiability of the evidence.
The authors in this paper [11] focus on the trustworthiness and greater transparency of the evidence management process. Their proposed system implements three functionalities. Digital Evidence Inventory (DEI): based on blockchain technology, it is used to collect the evidence; it is immutable, and everyone can access it. Forensics Confidence Rating: all evidence is given a rating (score) to gauge its trustworthiness. Global Digital Timeline: it provides the time ordering of the evidence. Thus, their framework provides better transparency and confidence in the evidence for the court and investigators.
The authors in this paper [12] use the chain of custody with blockchain technology. Their framework comprises a Blockchain Digital Evidence Cabinet (B-DEC), and the system is built over Ethereum.
The authors in this paper [13] propose the following idea. Generally, an investigation is initiated with the collection of evidence. Investigating officials analyze the evidence to determine why and how the crime occurred. The evidence is uploaded to the blockchain to make it tamper-proof, so that it remains admissible in the court of law [14, 15]. The authors present a friendly approach for access, so anybody can view the details. It satisfies all the requirements of the chain of custody. It provides integrity and authentication by issuing an identity to every user who logs in to use the database, and each block undergoes mining to make sure it is secure, transparent, and tamper-proof.
The authors in this paper [16] propose a blockchain-based system to bring transparency to the CoC. Each block contains a cryptographic hash of the previous block and a timestamp. The authors consider implementing one of the types of blockchain: a public blockchain, in which every transaction is checked by each of its nodes, making it a fully distributed network; or a private/permissioned blockchain, whose transactions are faster and cheaper than those of public blockchains, which is accessible only to certain verified users, and whose transaction details are private.
The authors in this paper [17] propose that Hyperledger Fabric provides a system between participating organizations in the blockchain and can manage privilege rights, identities, and access roles. It has the advantage of providing privacy for institutions and users, and it helps to improve the reliability and integrity of data.

3 Analysis of Review

From the analysis and review of the existing schemes (Table 1), it is observed that arrests made in cybercrime cases are still on the rise. To overcome these issues, we are currently working on the development of a blockchain-based evidence management system that manages the evidence in such a way that its integrity is preserved.

4 Challenges in Digital Evidence Management

1. Security and integrity of evidence and protection against data breaches is a primary concern; breaches may lead to leaked identities or loss of evidence.
2. Data storage and volume—Digital evidence in the form of videos, images, and other digital forms generates huge amounts of data. Due to storage constraints, it becomes difficult to retain the information and analyze it.
3. Detection of tampering—The risk of cyber-attacks is high, and such attacks can make tampered evidence appear intact.
4. Access management—Access must be restricted to authorized users according to their roles, which ensures a secured and controlled physical location.
5. Errors and mishaps—Accidental access or modification of evidence due to errors may affect the proceedings negatively.
6. Transfer of data between different entities—Evidence is highly vulnerable and exposed to hacking attacks or data breaches during transfer.
7. Presentation of evidence securely in court without loss of integrity [18].

5 Conclusion

In this paper, we have successfully analyzed and compared various systems and their algorithms. We were able to identify the challenges present in the current systems and architectures.

Table 1 Analysis of existing schemes

| References | Blockchain platform (consensus used) | Contributions | Advantages | Shortcomings |
|---|---|---|---|---|
| Ahmad et al. [4] | Ethereum | Focused on the integrity and admissibility of the evidence in the court of law; able to handle a realistic workload | Keeps the evidence confidential | Raft algorithm is slower compared to other algorithms |
| Harihara et al. [5] | PoW | User-friendly and more transparent EMS | Integrity, traceability, authenticity, and verifiability of the evidence | Some limitations of PoW |
| Billard [6] | Scrybe | System allows a provenance framework | Trustworthy, secure; better confidence rating of the evidence for the investigators and the courts | Complex to implement |
| Yunianto et al. [7] | Ethereum | Use of a digital evidence bag (DEB) with blockchain technology | More secure | Complex |
| Lone et al. [7] | Hyperledger Composer, PoC, Hyperledger Caliper | Forensic chain with Hyperledger to develop a private blockchain to store the history of handling digital evidence | Reduction in fraud due to transparency in the audit trail | High complexity due to use of assets, transactions, and roles |
| Li et al. [8] | Blockchain | Extends the cycle of evidence collection and access from police investigation to jury voting in the court trial | Covers the complete cycle of evidence management, from investigation to voting by jurors | Maintaining the security of jurors and votes must still be addressed |
| Petroni et al. [9] | Blockchain, chain of custody | Develops a CoC of digital evidence, a proof of concept of a blockchain-based system to assure legal agreements | Reduces malicious modifications of evidence | Centralized approach where only an expert authority has access to modify evidence |
| Raorane et al. [10] | Blockchain | Develops a blockchain-based CoC, stored in a database like MongoDB with timestamps | The private blockchain enables even distribution of access to authorized personnel, thus guaranteeing security | Current system is not capable of storing large amounts of evidence |
| Chopade et al. [4] | Blockchain | Develops a CoC based on blockchain, where the evidence is encoded using the Base64 algorithm | Security from data breaches and attacks is guaranteed, due to the use of encoded evidence while transferring within a network | The size of the output generated using the Base64 algorithm is large, which is not feasible |
| Ćosić et al. [11] | Time stamp authority | Develops an improved system that uses timestamps generated by a third party for the digital signature of evidence | Maintains a detailed and transparent chain of custody, providing security to evidence through cryptography | The system is not equipped to handle evidence safely and needs research to add more authenticity |
| Gopalan et al. [12] | PoW | Develops a CoC in a tenable way that is user-friendly for access and assures tamper-proofing of evidence | The evidence is stored as a distributed ledger so as to maintain data that is admissible in the court of law | Due to the rise in data volume, there is a reduction in flexibility and capability of the system |
| Khateeb et al. [13] | Blockchain, CoC | Develops a system that supports DFIR by implementing a CoC | Better CoC documentation improves data availability and legibility, with proof of validity, so data is not altered | High complexity |
| Jeo et al. [17] | Hyperledger Fabric | Develops a blockchain-based model using Hyperledger Fabric | It assures the integrity of forensic data, as it is shared with all peers in the network | Complex to implement |

These challenges form the base for further development and research on a system that would guarantee the security, integrity, and authenticity of the evidence at all stages of the investigation process. This would bring transparency to court proceedings and trials.

References

1. R.Y. Patil, S.R. Devane, Network forensic investigation protocol to identify true origin of cyber
crime. J. King Saud Univ. Comput. Inf. Sci. (2019)
2. P.R. Yogesh, S.R. Devane, Primordial fingerprinting techniques from the perspective of digital
forensic requirements, in 2018 9th International Conference on Computing, Communication
and Networking Technologies (ICCCNT ) (IEEE 2018), pp. 1–6
3. R.Y. Patil, S.R. Devane, Unmasking of source identity, a step beyond in cyber forensic, in
Proceedings of the 10th International Conference on Security of Information and Networks
(2017), pp. 157–164
4. L. Ahmad, S. Khanji, F. Iqbal, F. Kamoun, Blockchain-based chain of custody: towards real-
time tamper-proof evidence management, in Proceedings of the 15th International Conference
on Availability, Reliability and Security (2020), pp. 1–8
5. S. Rao, S. Fernandes, S. Raorane, S. Syed, A novel approach for digital evidence management
using Blockchain (2020). Available at SSRN 3683280
6. J. Ćosić, M. Bača, (Im)proving chain of custody and digital evidence integrity with time stamp,
in The 33rd International Convention MIPRO (IEEE, 2010), pp. 1226–1230
7. J. Rajamäki, J. Knuuttila, Law enforcement authorities’ legal digital evidence gathering:
legal, integrity and chain-of-custody requirement, in 2013 European Intelligence and Security
Informatics Conference (IEEE, 2013), pp. 198–203
8. A.H. Lone, R.N. Mir, Forensic-chain: Blockchain based digital forensics chain of custody with
PoC in hyperledger composer. Digit. Investig. 28, 44–55 (2019)
9. M. Li, C. Lal, M. Conti, D. Hu, LEChain: A blockchain-based lawful evidence management
scheme for digital forensics. Futur. Gener. Comput. Syst. 115, 406–420 (2021)
10. B.C.A. Petroni, R.F. Gonçalves, P.S. de Arruda Ignácio, J.Z. Reis, G.J.D.U. Martins, Smart
contracts applied to a functional architecture for storage and maintenance of digital chain of
custody using blockchain. Forensic Sci. Int. Digit. Invest. 34, 300985 (2020)
11. S.H. Gopalan, S.A. Suba, C. Ashmithashree, A. Gayathri, V.J. Andrews, Digital forensics using
Blockchain. Int. J. Recent Technol. Eng. (IJRTE) 8(2S11) (2019). ISSN: 2277-3878
12. D. Billard, Weighted forensics evidence using blockchain, in Proceedings of the 2018
international conference on computing and data engineering (2018), pp. 57–61
13. E. Yunianto, Y. Prayudi, B. Sugiantoro, B-DEC: digital evidence cabinet based on Blockchain
for evidence management. Int. J. Comput. Appl. 975, 8887 (2019)

14. R.Y. Patil, S.R. Devane, Hash tree-based device fingerprinting technique for network forensic
investigation, in Advances in Electrical and Computer Technologies (Springer, Singapore,
2020), pp. 201–209
15. P.R. Yogesh, Formal verification of secure evidence collection protocol using BAN logic and
AVISPA. Proc. Comput. Sci. 167, 1334–1344 (2020)
16. H. Al-Khateeb, G. Epiphaniou, H. Daly, Blockchain for modern digital forensics: the chain-
of-custody as a distributed ledger, in Blockchain and Clinical Trial (Springer, Cham, 2019),
pp. 149–168
17. J. Jeong, D. Kim, B. Lee, Y. Son, Design and implementation of a digital evidence management
model based on hyperledger fabric. J. Inf. Process. Syst. 16(4) (2020)
18. P.R. Yogesh, Backtracking tool root-tracker to identify true source of cyber crime. Proc.
Comput. Sci. 171, 1120–1128 (2020)
Real-Time Human Pose Detection
and Recognition Using MediaPipe

Amritanshu Kumar Singh, Vedant Arvind Kumbhare, and K. Arthi

Abstract The significance of human action recognition has increased manifold due to its wide-scale application in fields such as public security and gaming, following the introduction of various new technologies. We propose a framework that detects human action under different conditions and viewing angles, enabling the identification of divergent patterns based on different spatiotemporal trajectories. In this paper, we use MediaPipe Holistic, which provides pose, face, and hand landmark detection models. Frames obtained from the real-time device feed via OpenCV are parsed through our MediaPipe Holistic model, yielding a total of 501 landmarks that are exported as coordinates to a CSV file, on which we train a custom multi-class classification model to learn the relationship between the class and the coordinates and thereby classify and detect custom body language poses. The machine learning classification algorithms implemented in this paper are random forest, linear regression, ridge classifier, and gradient boosting classifier.

Keywords Human pose · MediaPipe · Machine learning

1 Introduction

In our paper, we perform real-time human action detection through live video analysis using MediaPipe. MediaPipe is an open-source framework built for constructing sophisticated pipelines that take advantage of GPU or CPU acceleration. It provides accurate, fast, and tailored machine learning applications for students, developers, and researchers. Its key use is quick prototyping of perception pipelines

A. K. Singh (B) · V. A. Kumbhare · K. Arthi


School of Computing, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur,
Chengalpattu District, Tamil Nadu 603203, India
e-mail: aa1815@srmist.edu.in
V. A. Kumbhare
e-mail: va1494@srmist.edu.in
K. Arthi
e-mail: arthik1@srmist.edu.in


with inference models and other reusable components [1]. It provides custom Python solutions [1] that can be easily used by installing the pandas, NumPy, and scikit-learn libraries.
MediaPipe essentially consists of three parts: (a) a framework for inference over sensory data, (b) tools for performance evaluation, and (c) a collection of reusable computational nodes for inference and processing. This paper proposes to extract actions from video data across different poses through MediaPipe Holistic. MediaPipe Holistic consists of built-in models that relate all of the landmark components, classifying the connections between different body parts to predict human body pose and emotions (refer to Fig. 2).
To classify actions based on the data, we use several machine learning classifiers. The random forest classifier is widely used in the fields of image classification, human pose detection, action recognition, etc. Ridge regression is another classification algorithm that can distinguish the different actions in a video sequence. Linear regression is a class of supervised machine learning (ML) algorithms that carries out regression tasks; the goal is to forecast a dependent variable value (y) from a given independent variable (x). The gradient boosting classifier incrementally combines base models, fixing errors in consecutive iterations by adding regression trees that fit the residuals (the errors of the previous stage) to optimize an arbitrary loss function [1, 2].
OpenCV 4.0 permits the specification of sequences of OpenCV image processing functions in graph form. MediaPipe, in contrast, provides native data streaming support that is far more suitable for audio and video analysis [1].
The major contributions are:
1. An efficient and robust four-stage human pose tracking pipeline that can detect various human actions and emotions in real time.
2. A face, pose, and hand prediction model that is adept at estimating a 3-D landmark model with only RGB input.
3. An open-source hand, pose, and face tracking pipeline framework that is ready to use and supports customizable machine learning models in JavaScript as well as in Python.

2 Literature Review

Gupta et al. [3]

This work comprises scene/event analysis, the analysis of human gestures, recognition of manipulated objects, and observation of the impact of human activity on these objects. The method moves beyond conventional approaches and imposes spatial and functional constraints for a consistent semantic representation of each perceptual feature.

Liu et al. [4]

A new approach is proposed for understanding human actions that focuses on the selection and inclusion of key poses. Video frames are represented using extensive pyramidal features (EPFs) [4], which include Gabor and wavelet pyramids. These attributes can encode details about the body, strength, and contours and thus offer an insightful depiction of human poses.

Ramanan and Forsyth [5]

This work outlines a system capable of annotating a video clip, namely producing a rundown of each actor's presence and when the actor is in sight. The system does not need a fixed, automated background. It operates by (1) tracking people in two dimensions and (2) synthesizing an annotated 3D motion sequence from the 2D tracks, using 3D motion capture data that are annotated automatically with a descriptive class system.

Niu et al. [6]

This paper presents an environment for the detection and recognition of human activities in outdoor video surveillance data using frame differencing and feature correlation. Human action is recognized using simple statistics, without building complicated Markov models.

3 Proposed Methodology

In this paper, we decode human body pose and implement MediaPipe Holistic, a solution provided by the MediaPipe ML framework, made up of up to eight different models that coordinate with each other in real time while minimizing memory transfer, integrating pose estimation, face detection, and hand tracking into a single efficient end-to-end pipeline [7, 8].
As discussed in the workflow diagram (Fig. 1), we create a live dataset consisting of 501 landmarks made up of pose, face, and hand landmark coordinates and then perform a train–test split in a 70:30 ratio to obtain random train and test subsets, respectively. We train a custom machine learning pipeline including four separate machine learning classification models. Finally, using our best performing ML classification model (Table 1), we render the predictions onto the real-time device feed using OpenCV (as shown in Fig. 5) [9, 10].
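A minimal sketch of this capture step is given below, assuming the mediapipe and opencv-python packages; the class label and CSV file name are illustrative. The 33 pose landmarks plus the 468 face landmarks give the 501 landmarks (2004 coordinate values); the visibility field of face landmarks is typically unset in MediaPipe and is kept only to preserve the uniform (x, y, z, v) layout.

import csv
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
CLASS_NAME = "happy"                      # label for the pose being recorded

cap = cv2.VideoCapture(0)                 # real-time device feed
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks and results.face_landmarks:
            row = []
            for lm in (list(results.pose_landmarks.landmark) +
                       list(results.face_landmarks.landmark)):
                row += [lm.x, lm.y, lm.z, lm.visibility]
            with open("coords.csv", "a", newline="") as f:
                csv.writer(f).writerow([CLASS_NAME] + row)
cap.release()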

3.1 Pipeline Design and Detection

MediaPipe Holistic incorporates separate independent models for pose, face, and hand detection (as shown in Fig. 2), where each model utilizes its own input frame from the real-time captured video feed. As a result, we obtain a multistage pipeline that treats each model with a different region of interest, using a resolution deemed appropriate for that specific region.
First in the pipeline is the pose estimation model: MediaPipe Holistic estimates human pose using the BlazePose pose estimation model.

Fig. 1 Workflow diagram

Table 1 Accuracy scores for the machine learning classification models in the pipeline

| Machine learning model | Accuracy score |
|---|---|
| Random forest | ~96.78 |
| Linear regression | ~92.68 |
| Ridge classifier | ~92.2 |
| Gradient boosting | ~90.8 |

Fig. 2 MediaPipe Holistic pipeline overview



Using the deduced pose landmarks, it obtains three ROI crops: one each for the left and right hands and one for the face. It also employs a re-crop and tracking model in the pipeline, which allows the pipeline to crop full-resolution input frames to achieve a better region of interest (ROI) and presumes that the detected person does not move significantly from frame to frame, using the result from the preceding frame as a guide to the body ROI in the current one.
Further, the low resolution of the captured frames (256 × 256) in the human body pose model means that the resulting regions of interest for the hands and face are too imprecise to guide the subsequent models. The re-crop models [7] act as spatial transformers that correct this and are designed to remain lightweight, costing only around 10% of the corresponding model's inference time.

3.2 Structure

The pipeline is architecturally constructed as a MediaPipe graph that uses the holistic landmarks as a subgraph. The holistic landmark subgraph of the MediaPipe library employs three different models:

3.2.1 BodyPose Landmark Model

For the purpose of identifying a human body, we use BlazePose's pose detector. Using this model, we are able to identify 33 pose landmarks (as shown in Fig. 3), i.e., 3D landmarks of a single body image from RGB video frames, which is more than the current standard COCO topology provides.
Fig. 3 BlazePose model 33 keypoints topology



Fig. 4 Hand landmark model

This method achieves real-time performance on mobile devices in Python. It utilizes a two-step ML pipeline: the pipeline first detects a person's region of interest (ROI) within the frame and then re-crops the frame to predict the pose landmarks. This pipeline is implemented as a subgraph of the MediaPipe graph.

3.2.2 Hand Landmark Model

MediaPipe Hands is a dedicated solution for hand and finger tracking. It can deduce from a single picture up to 21 3D landmarks of a hand (as shown in Fig. 4) and scales to multiple hands. It combines a palm detection model that works on the whole picture and returns an oriented hand bounding box with a hand landmark model that operates on the cropped image region defined by the palm detector and returns 3D hand key points with high reliability. This pipeline is implemented as a subgraph of the MediaPipe graph and renders its output using a pose renderer subgraph. There is also the freedom to build it for CPU or GPU.

3.2.3 Face Landmark Model

The MediaPipe Face Mesh calculates face geometry and estimates 468 three-dimensional facial landmarks [1]. It uses machine learning to deduce a three-dimensional surface configuration that requires only a single camera feed and no separate depth sensor [1]. With potential hardware acceleration, it can track landmarks on individual faces using a lightweight model framework across the processing pipeline. Furthermore, a three-dimensional measurement space is established, and the facial landmark screen positions are used to measure the facial morphology throughout the facial region. To promote a durable, effective, and mobile solution, Procrustes analysis is used. The analysis is conducted on the central processing unit, and inference of the machine learning model is designed for minimal speed and memory cost.

Fig. 5 Real-time predictions on live feed

4 Results and Discussion

We extract actions from video data across different poses through MediaPipe Holistic. MediaPipe Holistic incorporates separate models, namely pose, face, and hand (as shown in Fig. 1). These separate models provide landmarks that, when merged, yield 501 landmarks. Each landmark consists of four coordinates (x, y, z, and visibility). We then train a custom machine learning model on the relationships among all of the landmark components, classifying the connections between different body parts to predict human body pose and emotions.

4.1 Dataset

The data are presented in the form of landmarks, each consisting of four coordinates (x, y, z, and visibility) that are exported as numerical coordinates (x, y, z, and v) to a CSV file. Here, 'v' represents the visibility coordinate, indicating whether the particular landmark is visible on the screen, with a range from 0 to 1. Our dataset contains 2004 columns, representing the 501 landmarks (501 × 4 coordinates).

4.2 Implementation

4.2.1 Load the Dataset

Once we have captured our live dataset consisting of 501 landmarks (including 33 pose landmarks and 468 face and hand landmarks), it is stored in a '.csv' file. We then use the pandas library to read the dataset from the CSV file into a dataframe. Once the dataframe consisting of all the landmark components has been created, we perform a train–test split in the proportion of 30% testing and 70% training using the 'train_test_split' module.

4.2.2 Train Custom Machine Learning Classification Model

In this paper, we train four machine learning models (linear regression, random forest, ridge classifier, and gradient boosting) rather than relying on one.
Here we set up pipelines using the 'sklearn.pipeline' library, one for each of the four machine learning models. The pipelines are organized as a dictionary in which each machine learning model can be accessed using a key: for example, the key 'lr' maps to the linear regression model, and the key 'rf' maps to the random forest classification model. In each pipeline, the steps are defined in the following way: [('standardscaler', StandardScaler()), ('randomforest', RandomForestClassifier())].
So every machine learning model pipeline is divided into two steps: first, the data passes through the standard scaler (discussed in the next section), which normalizes it; then the respective machine learning model is trained. Using a for loop, we iterate through each of the individual pipelines and fit each machine learning model by passing the training data (X_train and y_train) to it. We then perform predictions with the trained models by passing the test data through each of the respective pipelines.
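A sketch of this pipeline is shown below. Note that scikit-learn has no "linear regression" classifier, so LogisticRegression is used here as the usual stand-in for the 'lr' entry; the CSV file and the 'class' column name follow the capture sketch above (with a header row assumed) and are also assumptions.

import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("coords.csv")                 # 1 class column + 2004 coords
X, y = df.drop("class", axis=1), df["class"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=1234)

# Each pipeline normalizes the data first, then trains its classifier.
pipelines = {
    "lr": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "rc": make_pipeline(StandardScaler(), RidgeClassifier()),
    "rf": make_pipeline(StandardScaler(), RandomForestClassifier()),
    "gb": make_pipeline(StandardScaler(), GradientBoostingClassifier()),
}

fit_models = {key: pipe.fit(X_train, y_train) for key, pipe in pipelines.items()}
for key, model in fit_models.items():
    print(key, model.score(X_test, y_test))    # test-set accuracy per model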

4.2.3 StandardScaler()

It normalizes the data by making the mean = 0 and scaling the data to unit variance. The standard score of a sample 'x' is calculated as:

z = (x − u)/s (1)

where u is the mean of the training samples (or 0 if with_mean = False) and s is the standard deviation (SD) of the training samples (or 1 if with_std = False) [8].

Centering and scaling occur independently on each feature by computing the relevant statistics on the samples in the training set. The mean and standard deviation (SD) are then stored to be used on later data via transform [8].
Standardization of a dataset is a typical requirement for many machine learning estimators: they might behave poorly if the individual features do not more or less look like standard normally distributed data [8].
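A tiny worked example of Eq. (1): for the one-feature training set {60, 70, 80}, u = 70 and s = √(200/3) ≈ 8.165, so the standardized values are approximately −1.2247, 0, and 1.2247.

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[60.0], [70.0], [80.0]])
scaler = StandardScaler().fit(X)        # learns u and s from the training data
print(scaler.mean_, scaler.scale_)      # [70.] [8.16496581]
print(scaler.transform(X).ravel())      # [-1.22474487  0.  1.22474487]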

4.2.4 Evaluation of the Model

Performing predictions using each one of the machine learning models in the pipeline, we obtain the accuracies shown in Table 1.
In this project, we prefer the random forest classification algorithm over the other algorithms in the pipeline for the following reasons:
1. After executing the model multiple times in a real-time environment (as shown in Table 1), we were able to achieve the highest accuracy, approximately 96.78%, with this algorithm.
2. The random forest classification model is more robust due to its randomness and reduced overfitting on the training data, making it more suitable than the other machine learning models used in this pipeline.

4.2.5 Predictions

Finally, using the pickle library we dump and save our best machine learning model, selected on accuracy metrics, in '.pkl' format. Then, again using pickle, we load our best performing ML classification model to render landmarks and predict body pose on the real-time device feed using OpenCV.
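Continuing the pipeline sketch from Sect. 4.2.2, the save/load step might look as follows; the '.pkl' file name is an assumption.

import pickle

# Persist the best-performing model (random forest in this work).
with open("body_language.pkl", "wb") as f:
    pickle.dump(fit_models["rf"], f)

# Later, reload it for real-time prediction on one frame's coordinates.
with open("body_language.pkl", "rb") as f:
    model = pickle.load(f)

row = X_test.iloc[[0]]                      # a single 2004-value landmark row
print(model.predict(row)[0])                # predicted class label
print(model.predict_proba(row)[0].max())    # probability of that class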
The results are shown in Fig. 5. The model combines all landmarks (pose, face, and hand) into one large array of landmark coordinates (x, y, z, and v) and uses this to create a new dataframe, on which it makes further detections. The model determines the class label and also predicts the probability of the detected class. It then employs OpenCV's cv2 module to render the results to the screen over the live prediction window (as shown in Fig. 5). As showcased in the results (Fig. 5), this model correctly detects emotions as well as making predictions on the human pose, which demonstrates the versatility of this custom model. The random forest classification model used in the final step to make predictions on the real-time device feed achieved the highest classification accuracy of 96.78% on the testing sequences.

5 Conclusion

This paper describes how modern image processing can be used for a vast number of use cases. In particular, we have shown how human actions and emotions can be analyzed in real time. To make our model more realistic, we created our own dataset so that the model assimilates an authentic, real-life environment rather than a generic one. This work can serve as a base for many forthcoming technologies in the fields of artificial intelligence, security, augmented reality, and video analysis. Other machine learning classification algorithms can also be used with this model. We suggest that this work be extended to the detection of multi-human actions, subject to further improvements in the MediaPipe Holistic library.
For this paper, accuracy may not be the best evaluation metric, since the objective has been a real-time implementation (as shown in Fig. 5). Accuracy may vary with different surroundings, discrete datasets made by separate users, the user's device capability, lighting conditions, etc.

References

1. C. Lugaresi, J. Tang, H. Nash, C. McClanahan, E. Uboweja, M. Hays, F. Zhang, C.-L. Chang, M. Yong, J. Lee, W.-T. Chang, W. Hua, M. Georg, M. Grundmann, MediaPipe: a framework for building perception pipelines (2019)
2. M. Sun, P. Kohli, J. Shotton, Conditional regression forests for human pose estima-
tion, in Proceeding/CVPR, IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, 2012), pp. 3394–3401. https://doi.org/10.1109/CVPR.2012.6248079
3. A. Gupta, A. Kembhavi, L.S. Davis, Observing human-object interactions: using spatial and functional compatibility for recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31(10), 1775–1789 (2009). https://doi.org/10.1109/TPAMI.2009.83
4. L. Liu, L. Shao, X. Li, K. Lu, Learning spatio-temporal representations for action recognition:
a genetic programming approach. IEEE Trans. Cybernet. 46(1), 158–170 (2016). https://doi.
org/10.1109/TCYB.2015.2399172
5. D. Ramanan, D. Forsyth, Automatic annotation of everyday movements (2004)
6. W. Niu, J. Long, D. Han, Y. Wang, Human activity detection and recognition for video
surveillance 1, 719–722 (2004). https://doi.org/10.1109/ICME.2004.1394293
7. I. Grishchenko, V. Bazarevsky, MediaPipe holistic—simultaneous face, hand and pose predic-
tion, on device. Google AI Blog, Google, 10 Dec 2020. https://ai.googleblog.com/2020/12/
mediapipe-holistic-simultaneous-face.html
8. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Pret-
tenhofer, R. Weiss, V. Dubourg, J. Vanderplas, D. Passos, M. Brucher, M. Perrot, E. Duchesnay,
Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
9. S. Gautam, An improved mammogram classification approach using back propagation neural
network, in Data Engineering and Intelligent Computing (Springer, Singapore, 2018), pp. 369–
376
10. M. Navyasri, Robust features for emotion recognition from speech by using Gaussian mixture
model classification, in Information and Communication Technology for Intelligent Systems
(ICTIS 2017), vol 2 (Springer International Publishing, 2018), pp. 437–444
Charge the Missing Data
with Synthesized Data by Using SN-Sync
Technique

Yeswanth Surya Srikar Nuchu and Srinivasa Rao Narisetty

Abstract The performance of an algorithm depends on the data we feed to it, particularly when training machine learning models. If the data we send to the model is inconsistent or missing, it might lead to false predictions. For example, if the model is part of a healthcare system that needs to predict a patient's condition over a period of time, the predictions must be accurate, and this accuracy depends on the data we feed to the ML model. In the process of training an ML model, data preprocessing is a crucial step, in which handling missing data is essential; it is vital to handle it correctly, or it might lead to inconsistent results. This work aims to introduce a new method named SN-Sync for charging the missing data, to analyze the algorithm, and to compare its efficiency with some traditional techniques.

Keywords Machine learning · Data preprocessing · Missing data

1 Introduction

Many researchers have found that filling in missing data without proper estimates gives inconsistent results. The following are the traditional techniques most often used to charge the missing values.
Literature study-1: deleting an entire column or row in a given data set. If the percentage of null values is greater than or equal to 50%, the entire column is dropped. Similarly, if a row contains one or more null values, the entire row is dropped [1]. The disadvantages of this technique are the loss of valuable information and its inefficiency as the percentage of missing values increases.

Y. S. S. Nuchu (B)
Reputation.com, Hyderabad, India
S. R. Narisetty
Assistant Professor, Department of CSE, Lakireddy Bali Reddy College of Engineering
(Autonomous), Mylavaram, Krishna District, Andhra Pradesh 521230, India


Literature study-2: imputing missing values for continuous data. For a given data set, this technique involves calculating the mean or median and replacing all the null data blocks with the calculated mean or median [2]. This technique only works if the data is numerical, and it can cause data leakage issues.
mean(X) = (1/n) Σ_{i=1}^{n} x_i (1)

Literature study-3: imputing missing values for categorical data. For a given data set, the missing values are replaced by calculating the mode. This technique works only with categorical data.
Literature study-4: implementing a model based on an algorithm that accepts missing values [3]. Example algorithms are k-nearest neighbors (K-NN),

similarity(x, y) = −Σ_{i=1}^{n} f(x_i, y_i) (2)

K-means clustering (K-MC),

J(v) = Σ_{i=1}^{c} Σ_{j=1}^{c_i} (||x_j − v_i||)² (3)

and Naive Bayes and random forest, which produce much more accurate results than models built on data with missing values [4]. In this paper, we introduce a new method for charging the missing values using synthesized data.

2 Synthetic Data and Its Computations

2.1 Synthetic Data Introduction

In data science, synthetic data is a fast-growing trend and an increasingly valuable tool. What exactly does synthetic data mean? Synthetic data is data that is not based on any real-world readings or events [5]. It is generated purely by a computer program based on use cases, scenarios, or a real-world data set.
The primary goal of generating a synthetic data set is to be flexible and powerful enough to train a machine learning model. There are numerous advantages to using synthetic data, mainly in data science [6]. The central use case of synthetic data in ML and data science is reducing the need to record real-world data

and events; it is much faster to generate and construct data than to wait for a data set produced by real-world events.

2.2 Use Cases

This advantage is especially relevant for events that occur very frequently. Synthetic data has numerous use cases and can be applied in many machine learning tasks [7]. Some of the common use cases for synthetic data are self-driving vehicles, health care, robotics, and security.
It is effortless and fast to generate synthetic data: once the environment is ready, it is very cheap to generate as much data as needed. Synthetic data also helps to generate labels that are very costly to produce from real-world events.
Synthetic environments are very flexible to modify and improve model training. Synthetic data can be used to replace certain parts of the data, especially the most sensitive data. For example, in some activities, personal data such as personally identifiable information and personal health information are protected [8]. To avoid such issues, it is recommended to create and use synthetic data.

2.3 Computations on Synthetic Data

Generating any data involves a process with a few steps. Synthetic data is generated programmatically by machine learning techniques; traditional machine learning techniques like decision trees, as well as deep learning techniques, can be used. We used an R library named "synthpop" to generate the synthetic data. The syn() function returns synthesized data for a given original data set, and we can set the bucket size and the generation method. Figure 1 shows one such example, commonly used when the input fields contain numerical data.
A real-time data set [9] that contains body temperature, sex, and heart rate as labels is passed as input to syn(). Figures 2 and 3 show the comparison between the original and synthetic data. By varying the input parameters, the accuracy may change. With a small number of event records, we can generate a large data set.
Figures 2 and 3 are generated by passing the method property value as "cart" and a minimum bucket size of 10. Figure 2 shows the comparison between the original and synthetic data for body temperature; the correlation between observed and synthetic values is very close.

Fig. 1 Generating synthetic data in cart mode:

Synthetic_data <- syn(df_original, m = 10, method = "cart", cart.minbucket = 10)

Fig. 2 Comparing observed data and synthetic data for body temperature in cart mode

Fig. 3 Comparing observed data and synthetic data for heart rate in cart mode

This gives a tremendous advantage for training various ML models quickly [10]. Figure 3 shows the same for heart rate.
Figures 5 and 6 are generated by passing the method property value as "sample", which accepts categorical values in the input; the comparison is shown in Fig. 7.

Fig. 4 Generating synthetic data in sample mode:

Synthetic_data <- syn(df_original, m = 10, method = "sample")

Fig. 5 Comparing observed data and synthetic data for heart rate in sample mode

Fig. 6 Comparing observed data and synthetic data for body temperature in sample mode

Fig. 7 Z-value comparison between the synthetic and observed data:

glm.synds(I(HeartRate == "YES") ~ BodyTemp + HeartRate, data = df_generated, family = "binomial")

3 Proposed System

In this paper, we propose a new technique named SN-Sync to charge the missing data. Figure 9 shows the process flow of the SN-Sync technique. To test this process flow, we collected a real-time healthcare data set [9] that contains body temperature, sex, and heart rate as labels and performed the computations in R.
Step-1: Load the data into the R workspace.

df_original <- read.csv(file = "input.csv") (4)

Step-2: Identify the columns and rows that contain missing data or null values, and compute the correlation matrix for the input data set, as shown in Table 1.
Step-3: Generate synthetic data for the columns and rows that contain missing data and null values. Based on the correlation matrix from Step-2, find the clusters using the K-means clustering algorithm [11].
Figure 8 shows the clusters of heart rate versus body temperature. Using the elbow method, the number of clusters is determined to be 4:

km = KMeans(n_clusters = 4) (5)

Table 1 Correlation matrix for body temp and heart rate

| | Body temp | Heart rate |
|---|---|---|
| Body temp | 1.0000000 | 0.7536564 |
| Heart rate | 0.7536564 | 1.0000000 |

Fig. 8 Group of clusters from heart rate and body temperature

Fig. 9 Proposed system flow diagram

Table 2 Mean cluster values of body temp and heart rate

| Cluster | Body temperature | Heart rate |
|---|---|---|
| 0 | 98.017391 | 71.130435 |
| 1 | 98.517073 | 78.048780 |
| 2 | 98.095833 | 63.208333 |
| 3 | 98.426316 | 84.210526 |

Midpoints for each of the four clusters are marked in Fig. 8; they determine the average cluster values for both heart rate and body temperature. Table 2 gives the mean values of body temperature and heart rate in each cluster.
Step-4: Replace the missing values with the mean value of the cluster formed from the synthesized data. To replace a null value in a row, use that row's value in the column with the highest correlation. For example, to replace a null body temperature value at row 14, take the value at row 14 of the most correlated column, in this case heart rate, and compare that heart rate value with the cluster means shown in Table 2. Replace the null with the corresponding cluster's body temperature value, as in Fig. 9.
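A minimal Python sketch of Steps 2–4 is given below, assuming columns named BodyTemp and HeartRate; for brevity, the complete records stand in for the synthesized data on which the clustering is actually performed.

import pandas as pd
from sklearn.cluster import KMeans

df = pd.read_csv("input.csv")

# Step-2: correlation matrix of the numeric columns
corr = df[["BodyTemp", "HeartRate"]].corr()

# Step-3: cluster the complete records; k = 4 was chosen via the elbow method
complete = df.dropna(subset=["BodyTemp", "HeartRate"]).copy()
km = KMeans(n_clusters=4, n_init=10, random_state=0)
complete["cluster"] = km.fit_predict(complete[["BodyTemp", "HeartRate"]])
means = complete.groupby("cluster")[["BodyTemp", "HeartRate"]].mean()

# Step-4: fill a null BodyTemp from the cluster whose HeartRate mean is
# closest to the row's HeartRate (HeartRate is the most correlated column).
def fill_body_temp(row):
    if pd.isna(row["BodyTemp"]):
        nearest = (means["HeartRate"] - row["HeartRate"]).abs().idxmin()
        return means.loc[nearest, "BodyTemp"]
    return row["BodyTemp"]

df["BodyTemp"] = df.apply(fill_body_temp, axis=1)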
Step-5: Pass the final data set to a machine learning algorithm and compare the results with some traditional techniques.

4 Results

The decision tree classification algorithm is used to compare the SN-Sync technique against some standard traditional techniques: replacing the missing data with the mean, replacing the missing data with the mode, and deleting the entire row or column that contains null values or missing data.
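Each comparison below follows the same pattern, sketched here in Python; the file name is a placeholder for a data set produced by one of the charging techniques, and treating HeartRate as the classification target is an assumption for illustration.

import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("charged.csv")            # output of one charging technique
X, y = df.drop("HeartRate", axis=1), df["HeartRate"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(confusion_matrix(y_test, y_pred))    # 2 x 2 matrix as reported below
print(accuracy_score(y_test, y_pred))      # accuracy as reported below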
Mean: All the missing data and null values are replaced by the mean calculated for that attribute of the given data set. After replacing all the blanks with the mean value and passing the data to the decision tree classification algorithm, the confusion matrix and accuracy are

[[ 6 5]
[10 12]]
0.54545454545454

Mode: All the missing data and null values are replaced by the mode calculated for that attribute of the given data set. After replacing all the blanks with the mode value and passing the data to the decision tree classification algorithm, the confusion matrix and accuracy are

[[ 4 7]
[ 8 14]]
0.54545454545454

Delete row or column: All rows or columns containing missing data or null values are dropped from the given data set. After dropping all the blank values and passing the input data to the decision tree classification algorithm, the confusion matrix and accuracy are

[[ 4 3]
[ 8 13]]
0.607142857142857

SN-Sync: All the missing data and null values are replaced with the mean value of the cluster formed from the synthesized data. After replacing all the blank values and passing the data to the decision tree classification algorithm, the confusion matrix and accuracy are

[[ 2 2]
[ 7 13]]
0.625

Fig. 10 Comparing model accuracy between different techniques

Figure 10 shows the comparison graph between the mean, mode, dropping rows and columns, and SN-Sync techniques.

5 Conclusion

This paper reviews a substantial number of approaches to handling missing data under different missing data mechanisms. Their drawbacks are clearly discussed, and a new method named SN-Sync is introduced for charging the missing data. Additionally, SN-Sync is compared with some traditional techniques. In the future, this technique can be applied to real-world systems where observed data is missing, to produce accurate results.

References

1. M.S. Santos, R.C. Pereira, A.F. Costa, J.P. Soares, J. Santos, P.H. Abreu, Generating synthetic
missing data: a review by missing mechanism. IEEE Access 7, 11651–11667 (2019)
2. P. McMahon, T. Zhang, R.A. Dwight, Approaches to dealing with missing data in railway asset
management. IEEE Access 8, 48177–48194 (2020)
3. R. Madhuri, M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy, S.C. Satapathy, Cluster anal-
ysis on different data sets using K-modes and K-prototype algorithms, in International Confer-
ence and Published the Proceeding in AISC and Computing, vol. 249 (Springer, indexed by
SCOPUS, ISI proceeding etc., 2014), pp. 137–144. ISBN 978-3-319-03094-4
4. T. Kose, S. Ozgur, E. Coşgun, A. Keskinoglu, P. Keskinoglu, Effect of missing data imputation
on deep learning prediction performance for vesicoureteral reflux and recurrent urinary tract
infection clinical study. BioMed Res. Int., 15 (2020). Article ID 1895076
5. B.S. Panda, R.K. Adhikari, A method for classification of missing values using data mining
techniques, in 2020 ICCSEA (Gunupur, India, 2020), pp. 1–5

6. P.J. García-Laencina, P.H. Abreu, M.H. Abreu, N. Afonoso, Missing data imputation on the
5-year survival prediction of breast cancer patients with unknown discrete values. Comput.
Biol. Med. 59, 125–133 (2015)
7. D.C. Howell, The treatment of missing data in The Sage Handbook of Social Science
Methodology (Sage, London, UK, 2007), pp. 208–224
8. S.R. Narisetty, S. Farzana, P. Maheswari, L-semi-supervised clustering for network intrusion
detection. IJEAT 8(3S) (2019). ISSN: 2249-8958
9. https://tuvalabs.com/datasets/body_temperature_sex__heart_rate/activities.
10. J.P. Reiter, J. Drechsler, Releasing multiply-imputed synthetic data generated in two stages to
protect confidentiality. IAB Discussion Paper 200720 (2007)
11. A. Naik, S.C. Satapathy, K. Parvathi, Improvement of initial cluster center of c-means using
teaching learning based optimization. Proc. Technol. 6, 428–435 (2012)
12. M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy et al., Dimensionality reduction text data
clustering with prediction of optimal number of clusters. IJARITAC 2(2), 41–49 (2011)
Discovery of Popular Languages
from GitHub Repository: A Data Mining
Approach

K. Jyothi Upadhya, B. Dinesh Rao, and M. Geetha

Abstract Usage of Open Source Software (OSS) has increased over the past fifteen years among programmers and computer users. OSS communities work as a "Bazaar" where project constructors and end-users meet and search for suitable matches to their skills and requirements. OSS is emerging as a strong competitor to commercial or closed software. GitHub is an OSS forge started in 2008 in order to simplify code sharing. It is a Web site and cloud-based service that aids software developers to store, manage, track, and control changes to their code. When a GitHub project fails, it results in the loss of time, effort, and resources of this large community. The current need is to build models that find the interesting factors that contribute to the success of these projects. The massive repositories make this domain a good candidate for exploratory research using a data mining approach. In this work, the FP-Growth method is used to find popular two-programming-language combinations and is validated using the SPSS tool. The outcome of this work benefits the OSS community in terms of time and resources.

Keywords Open source software · GitHub repository · FP-Growth

Usage of Open Source Software (OSS) has increased over the past fifteen years among programmers and computer users. These communities work as a "Bazaar" where project developers and end-users meet and search for suitable matches to their skills and requirements. Members of this community can view and update the software for its improvement. They can also detect and fix bugs and
K. J. Upadhya (B) · M. Geetha


Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal
Academy of Higher Education, Manipal, Karnataka 576104, India
e-mail: jyothi.k@manipal.edu
M. Geetha
e-mail: geetha.maiya@manipal.edu
B. D. Rao
Manipal School of Information Sciences, Manipal Academy of Higher Education, Manipal,
Karnataka 576104, India
e-mail: dinesh.rao@manipal.edu


contribute to the software. It is said that this evolutionary process can work faster than the traditional hierarchical and closed model and produce better quality software. Linux, Apache Server, and Mozilla Firefox are some of the best outcomes of OSS development [1, 2].
GitHub is an OSS forge set up in 2008 with the aim of simplifying code sharing. It is a Web site and cloud-based service that assists software developers to store, manage, track, and control changes to their code. Usage of GitHub is rapidly increasing; in 2017, the GitHub community connected 24 million people working across 67 million repositories. The current need is to build models that can detect the factors that are critical to the most successful projects among GitHub repositories. This will help users and corporations make the right choice of projects [3].
Pattern mining is an important area within data mining that discovers interesting patterns from a database. Rules found in a database can be classified as frequent or rare. The support count, or frequency, of an item set is the total number of records in the database file that contain the item set. A frequent item set is an item set whose support count satisfies a user-defined minimum threshold; otherwise, it is called a rare item set, one that occurs infrequently and may represent unexpected or previously unknown associations [4].
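As a toy illustration of support counting, the item set {a, b} below appears in two of four transactions, so with a minimum support threshold of 2 it is frequent:

# Each record is a transaction; the support of an item set is the number of
# transactions that contain it as a subset.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "c"}]
itemset = {"a", "b"}
support = sum(itemset <= t for t in transactions)   # subset test per record
print(support)   # 2 -> frequent if the minimum threshold is 2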

1 Problem Statement

Keeping with the idea of free sharing, OSS communities allow access to the project source code and other artifacts (e.g., e-mail communications, number of downloads, information about developers, and bug reports) of these projects. As the GitHub user community grows day by day, the need for tested models and hypotheses about successful OSS projects also increases. The massive repositories make this domain a good candidate for exploratory research using a data mining approach. When an OSS project fails, it results in the loss of time, effort, and resources of the community. There is a requirement for models that can predict the outcome of OSS projects and find the features that contribute to the success of these projects.
The remainder of the paper is organized as follows: Sect. 2 presents the literature review of the research carried out. Section 3 describes the research methodology. Section 4 puts forth the results and analysis. Inferences and future directions are highlighted in Sect. 5.

2 Related Works

The papers reviewed have been surveyed in the direction of data mining techniques
used, types of association rules discovered, and different sources of data. Sanjay et al.
[1] researched OSS by collecting data from the Sourceforge Web site, www.sourceforge.net, with the goal of extracting the success patterns of the OSS projects. The
research discovered interesting classes of association rules by making use of a novel
concept called association rules network (ARN). The research concentrated on rules
with singleton consequents and the outcome of the research is validated using factor
analysis.
Raja et al. [2] developed a robust model which finds the factors that lead to success
of the OSS projects. The highlight of this research is the combined effects of logistic
regression (LR), neural networks (NN), and decision trees (DT). The research uses
SAS Enterprise Miner, which is useful in the creation and validation of new models. This
work has shown that projects developed after 2003 are in greater demand than older projects because of the growing momentum of the OSS movement.
Andi et al. [5] collected projects from the Sourceforge Web site, www.sourceforge.net, to study the success factors. In this research, two item set association rules are
extracted to find the success factors of OSS projects. It has considered the number
of downloads as the critical parameter for success and formulated six success factors
for OSS projects.
Fragkiskos et al. [3] gathered the projects from GitHub with the intention of
finding six different association rules for successful OSS projects. Here, the main
focus was on GitHub user behavior. Association rules are discovered using the Apriori
algorithm, and the collected data are discretized using the k-means algorithm.
Hu et al. [6] researched GitHub repositories with the focus of understanding
the importance and influence of GitHub repositories. Using GitHub user data and monthly repository star data, a star graph was constructed. A social analysis was performed by applying the HITS algorithm on the constructed star graph. This research demonstrated how the influence value of repositories changes every month.
Pattern Mining Approaches: Pattern mining approaches can be divided into
Apriori-based or tree-based algorithms. In Apriori-like algorithms [7], a large number of items are allowed to participate in item set generation. Scanning the database multiple times and generating large candidate sets decrease mining performance.
Most of the pattern mining approaches follow tree-based [8–10] method that
employs the traditional FP-Growth [11] approach which constructs a frequent-pattern
tree (FP-tree). Here, the transactions are arranged in frequency order, and during
insertion of a transaction if there is any common prefix found, then the count of
all the items in that common prefix is incremented. The remaining part (if any) is
attached to the tree from the last node of that common prefix. If the entire transaction
is not found in the tree, then a new path is constructed in the tree by adding it to
the root node and initializing the support count of every node with value 1. This
tree is traversed in order to mine all the frequent patterns. The conditional pattern
base is generated for every item which is a small database of pattern counts that
appear with this item. This database is converted to a conditional FP-tree that is
recursively processed to discover the required pattern. Anindita et al. [12] performed
a remarkable comparison of the tree-based approaches based on several decisive parameters.
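To make the insertion logic described above concrete, the following is a minimal Python sketch of FP-tree construction (the class and function names are chosen here purely for illustration, and the transactions are assumed to be pre-sorted in descending frequency order):

class FPNode:
    def __init__(self, item, parent):
        self.item = item          # item label, None for the root
        self.parent = parent      # link back toward the root
        self.count = 1            # support count of this prefix path
        self.children = {}        # item -> child FPNode

def insert_transaction(root, transaction):
    """Insert one frequency-ordered transaction into the FP-tree."""
    node = root
    for item in transaction:
        if item in node.children:
            # Common prefix found: increment the count along the path
            node.children[item].count += 1
        else:
            # Remaining part: attach a new node initialized with count 1
            node.children[item] = FPNode(item, node)
        node = node.children[item]

root = FPNode(None, None)
root.count = 0
for t in [["Python", "C++", "Javascript"], ["Python", "Javascript", "HTML"]]:
    insert_transaction(root, t)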

3 Research Methodology

The sequence of the research process is shown in Fig. 1.


Steps followed in the research methodology are:
Step 1: Data is preprocessed, and the different programming languages used by each
user are identified.
Data Collection and Identification of Variables: From the GitHub Repository, 60
different GitHub user accounts are collected using the Random Sampling method.
From the GitHub repository schema, two different suitable variables named Program-
ming Languages and Number of followers are identified. The conceptual model is
as shown in Fig. 2. The variable Number of Downloads is identified as the extra-
neous variable. From the data obtained from GitHub user accounts, ten different
programming languages are selected.

Fig. 1 Sequence of research process: (1) data collection and data preprocessing; (2) identify dependent, independent and extraneous variables (if any); (3) build model based upon appropriate data mining technique; (4) discover frequent itemsets by giving different samples to the model; (5) analyze and validate the extracted frequent itemsets

Fig. 2 Conceptual model

Step 2: As the Apriori method suffers from the drawback of multiple database scans for finding support counts, the FP-Growth method is employed for building the model.
A model is built with the following specifications:
• Input: Programming languages used by each user
• Output: 2-programming language combinations and their support count
• Algorithm Used: Frequent pattern growth algorithm
– Input: Transaction database
– Output: 2-programming language combinations with support count
– Preprocessing: The transaction items (programming languages in text form)
are converted into numbers
– Programming Language Used: C language.
Sample Table 1 shows the transaction-id and the various programming languages used by some users of the GitHub repository. In the first scan, the support counts of the different programming languages are calculated. These programming languages are
sorted based upon the support count in descending order and stored in the FP-tree
as shown in Fig. 3. Then, using the FP-Growth method, only the two programming
language combinations are generated.
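The authors implemented the model in C; purely as an illustrative sketch, the same 2-programming language combinations for the transactions of Table 1 could be mined in Python with the mlxtend library (assuming mlxtend is installed; the absolute support threshold of 2 is an assumption for this example):

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

# Transactions from Table 1
transactions = [
    ["Python", "C++", "Javascript"],
    ["Python", "Javascript", "HTML"],
    ["Python", "C++", "HTML"],
    ["Python", "Javascript", "C++"],
    ["Python", "Javascript"],
]

# One-hot encode the transactions (numeric form, as in the preprocessing step)
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

# FP-Growth with an absolute minimum support of 2 (min_support is a fraction)
itemsets = fpgrowth(onehot, min_support=2 / len(transactions), use_colnames=True)

# Keep only the 2-programming-language combinations, as in the model output
pairs = itemsets[itemsets["itemsets"].apply(len) == 2]
print(pairs.sort_values("support", ascending=False))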
Step 3: The hypothesis is built for the 2-programming language combinations
discovered and tested as shown in Sect. 4.

Table 1 Static dataset

TID | Programming languages
T1 | Python, C++, Javascript
T2 | Python, Javascript, HTML
T3 | Python, C++, HTML
T4 | Python, Javascript, C++
T5 | Python, Javascript

Fig. 3 FP-tree construction for the dataset shown in Table 1

4 Results and Analysis

The GitHub user accounts are divided into two groups depending upon fan followers.
Top 2-programming language combinations and their support count extracted by
the model for the two groups are given in Tables 2 and 3. Independent sample
T-Test [13] is performed for the two different groups as shown in the first hypoth-
esis test analysis. Another independent sample T-Test is performed within a single
group with fan followers >70. According to this test, the 2-programming language combinations having either Javascript or Python show higher support counts.
The 2-programming language combinations discovered from these two program-
ming languages are considered popular languages compared with the remaining 2-
programming language combinations. The result of the test is validated in the second
hypothesis analysis.
The 2-programming language combinations discovered for two different groups
are analyzed using the following hypothesis:
• Null Hypothesis H0: Usage of popular languages has no significant contribution to the increase in the number of fan followers
• Alternate Hypothesis Ha: Usage of popular languages has a significant contribution to the increase in the number of fan followers
As there are two different datasets corresponding to the two different groups of fan followers, the independent sample T-Test is used to test the hypothesis. The corresponding result obtained using the SPSS tool for the independent sample T-Test is used during the validation process.
The calculated and tabulated independent sample T-Test is obtained as below:
• Degrees of freedom: 29
• Level of significance: 0.05
• Significance obtained: two-tailed p = 0.049; one-tailed p = 0.049/2 ≈ 0.024 < 0.05
• Tabulated independent sample T-Test value t_tabulated: 1.699
• Calculated independent sample T-Test value t_calculated: 1.892

Table 2 2-programming language combinations with support count for the more fan followers group

2-programming language combinations | Support count | Fan followers > 70
Python-C++ | 7 | 12,285
Javascript-Python | 6 | 14,198
Javascript-C++ | 5 | 24,893
Python-Go | 4 | 1516
C++-HTML | 4 | 874
Javascript-HTML | 3 | 1564
Javascript-Go | 3 | 961
Python-HTML | 3 | 967
Python-Shell | 3 | 428
Javascript-Shell | 2 | 190
Javascript-CSS | 2 | 220
Javascript-Ruby | 2 | 1421
Python-PHP | 2 | 310
C++-Shell | 2 | 538
C++-PHP | 2 | 310
HTML-PHP | 2 | 310
Shell-CSS | 2 | 252
Javascript-PHP | 1 | 273
Since t_calculated ≥ t_tabulated, the null hypothesis is rejected. So usage of popular languages has a significant contribution to the increase in the number of fan followers.
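The authors ran the test in SPSS; purely as an illustrative sketch, an independent sample T-Test over the support counts of Tables 2 and 3 could be run in Python with SciPy (the exact grouping used in SPSS is not specified here, so the numbers below need not reproduce the reported t value):

from scipy import stats

# Support counts from Table 2 (fan followers > 70) and Table 3 (fan followers < 70)
more_followers = [7, 6, 5, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1]
less_followers = [2, 6, 3, 1, 0, 4, 0, 6, 1, 1, 5, 2, 0, 0, 2, 0, 1, 31]

t_stat, p_two_tailed = stats.ttest_ind(more_followers, less_followers)
p_one_tailed = p_two_tailed / 2  # one-tailed test, as in the paper

print(f"t = {t_stat:.3f}, one-tailed p = {p_one_tailed:.3f}")
if p_one_tailed < 0.05:
    print("Reject H0 at the 0.05 level")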
The 2-programming language combinations generated by considering the single group with a higher number of fan followers, given in Table 2, are analyzed using the following hypothesis:
• Null Hypothesis H0: Usage of popular languages has no significant contribution to the increase in the number of fan followers
• Alternate Hypothesis Ha: Usage of popular languages has a significant contribution to the increase in the number of fan followers
As there are two different datasets according to the popularity of the 2-programming language combinations, the independent sample T-Test is used to test the hypothesis.
The corresponding result obtained using SPSS Tool for independent sample T-Test
is used during the validation process.
The calculated and tabulated independent sample T-Test is obtained as below:
• Degrees of freedom: 25
• Level of significance: 0.05
• Significance obtained: two-tailed p = 0.079; one-tailed p = 0.079/2 ≈ 0.04 < 0.05
• Tabulated independent sample T-Test value t_tabulated: 1.708
• Calculated independent sample T-Test value t_calculated: 1.892

Table 3 2-programming language combinations with support count for the less fan followers group

2-programming language combinations | Support count | Fan followers < 70
Python-C++ | 2 | 16
Javascript-Python | 6 | 175
Javascript-C++ | 3 | 43
Python-Go | 1 | 26
C++-HTML | 0 | 0
Javascript-HTML | 4 | 83
Javascript-Go | 0 | 0
Python-HTML | 6 | 89
Python-Shell | 1 | 26
Javascript-Shell | 1 | 51
Javascript-CSS | 5 | 217
Javascript-Ruby | 2 | 39
Python-PHP | 0 | 0
C++-Shell | 0 | 0
C++-PHP | 2 | 29
HTML-PHP | 0 | 0
Shell-CSS | 1 | 51
Javascript-PHP | 31 | 85
As t_calculated ≥ t_tabulated, the null hypothesis is rejected. So usage of popular languages has a significant contribution to the increase in the number of fan followers.

5 Inferences and Future Directions

The model designed using the FP-Growth algorithm extracts popular 2-programming
language combinations along with support count. These combinations are analyzed
and validated using the variable fan followers. The analysis shows that the extraction
of popular language combinations contributed to the success of the GitHub reposi-
tory. This work can be extended to discover popular triplet programming language
combinations and also frequent item sets using the other variables from the GitHub
repository which contribute to the success of the project. A new research direction can be followed to search for association rules that find factors behind the failure of a project.

References

1. S. Chawla, B. Arunasalam, J. Davis, Mining open source software (OSS) data using association
rules network, in Pacific-Asia Conference on Knowledge Discovery and Data Mining (Springer,
2003), pp. 461–466
2. U. Raja, M. Tretter, Investigating open source project success: a data mining approach to model
formulation, validation and testing, in Proceedings of SUGI, vol. 31 (2006)
3. F. Chatziasimidis, I. Stamelos, Data collection and analysis of github repositories and users,
in 2015 6th International Conference on Information, Intelligence, Systems and Applications
(IISA) (IEEE, 2015), pp. 1–6
4. Y.S. Koh, S.D. Ravana, Unsupervised rare pattern mining: a survey. ACM Trans. Knowl. Discov.
Data (TKDD) 10(4), 45 (2016)
5. A.W.R. Emanuel, R. Wardoyo, J.E. Istiyanto, K. Mustofa, Success factors of OSS projects from
source forge using data mining association rule, in International Conference on Distributed
Framework and Applications (DFmA) (IEEE, 2010), pp. 1–8
6. Y. Hu, J. Zhang, X. Bai, S. Yu, Z. Yang, Influence analysis of github repositories. SpringerPlus
5(1), 1268 (2016)
7. R. Agrawal, R. Srikant, et al., Fast algorithms for mining association rules, in Proceedings
of the 20th International Conference on Very Large Data Bases, VLDB. vol. 1215 (1994),
pp. 487–499
8. G. Grahne, J. Zhu, Fast algorithms for frequent itemset mining using fp-trees. IEEE Trans.
Knowl. Data Eng. 17(10), 1347–1362 (2005)
9. S.K. Tanbeer, M.M. Hassan, A. Almogren, M. Zuair, B.S. Jeong, Scalable regular pattern mining
in evolving body sensor data. Future Gener. Comput. Syst. 75, 172–186 (2017)
10. S. Tsang, Y.S. Koh, G. Dobbie, Rp-tree: rare pattern tree mining, in International Conference
on Data Warehousing and Knowledge Discovery (Springer, 2011), pp. 277–288
11. J. Han, J. Pei, Y. Yin, Mining frequent patterns without candidate generation. ACM Sigmod
Rec. 29, 1–12 (2000)
12. A. Borah, B. Nath, Tree based frequent and rare pattern mining techniques: a comprehensive
structural and empirical analysis. SN Appl. Sci. 1(9), 972 (2019)
13. C.R. Kothari, Research Methodology: Methods and Techniques (New Age International Publications, 2004)
Performance Analysis of Flower Pollination Algorithms Using Statistical Methods: An Overview

Pratosh Bansal and Sameer Bhave

Abstract Flower pollination algorithm and its variants are bio-inspired metaheuris-
tics. The performance analysis of flower pollination algorithm and variants of the
same has been carried out with the help of statistical analysis to a certain degree.
Their comparison with other metaheuristic algorithms has also been done; some-
times, with the help of statistical methods and mostly with the help of benchmarking
functions. More exploration can be done in this regard. This paper is an attempt to
take a bird’s eye view of some of the work done in the context of flower pollination
algorithm and a few of its variants and the insights gained in the context of each one
of them, along with an overview of the statistical methods that have been used so far
in carrying out the performance analysis of the same. The insights listed herein also
point toward further research that can be possibly conducted in this context.

Keywords Flower pollination algorithm (FPA) · Statistical methods · Performance analysis

1 Introduction

Artificial intelligence and computational intelligence are ruling the computing scenario today. There is a lot of overlap between these two fields. In the context
of intelligence-based problem solving via computers, heuristics or rules of thumb
are utilized at times to infuse intelligence in our algorithms and programs. General
purpose heuristics are also known as metaheuristics and they primarily belong to the
domain of computational intelligence.
A major chunk of bio-inspired metaheuristics belongs to the domain of Stochastic
Methods. These algorithms mimic some natural phenomenon via mathematical
implementations. Ant Colony Optimization, Particle Swarm Optimization, Bat
Algorithm and Firefly Algorithm are some of the well-known bio-inspired algo-
rithms. The fitness functions used in these algorithms help to ascertain whether the
problem-solving process is heading in the correct direction.
Xin-She Yang proposed the flower pollination algorithm in 2012 [1]. It mimics
the natural phenomenon of transferring pollen grains between different flowers of
the same type that is commonly known as cross pollination or biotic pollination.
Birds, wind, etc., play an important role in transferring pollen grains from one flower
to another. They are known as pollinators. The activity of transferring pollens from one flower to another is termed global pollination and has been modeled using the Levy distribution, popularly known as Levy flight. Local pollination takes
place within a single flower itself. Flower constancy ratio is related with the degree
of similarity between two flowers. It has been observed that the standard FPA works
because of flower constancy and long-distance pollinators. Both local and global
pollination mechanisms are controlled by a parameter P that lies in the interval [0,
1].
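For reference, the two update rules of the standard FPA proposed in [1] can be written as follows (a standard formulation; notation may vary slightly across papers):

Global pollination: x_i(t+1) = x_i(t) + L(λ)(g∗ − x_i(t))
Local pollination: x_i(t+1) = x_i(t) + ε(x_j(t) − x_k(t))

where g∗ is the current best solution, L(λ) is a step size drawn from a Levy distribution, ε is drawn uniformly from [0, 1], and x_j(t), x_k(t) are pollen from different flowers of the same species. The parameter P decides which of the two rules is applied to each solution in each iteration.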
The FPA was basically proposed for solving continuous optimization problems.
There is a possibility to utilize it in its original form or a modified form to solve other
types of problems as well. There have been several attempts in this direction as well,
and there is a plethora of possibilities to further explore with FPA and its variants.
Population diversity needs to be maintained while using the FPA, as it helps to obtain precise solutions. The flowchart of the FPA is shown in Fig. 1.

2 An Overview of Some of the Work Done in the Context of Flower Pollination Algorithm/Its Variants

Many hybrid versions and modifications of the original FPA have been proposed
in the recent years and comparative studies have been conducted by running these
variants and a few other bio-inspired algorithms on the same data set pertaining to a
specific domain or have been executed on the same benchmarking functions to gain
insight about their performance. Nabil [2] added the cloning operator from clonal
selection algorithm to flower pollination algorithm. Twenty-three test functions were
used with this modification and the modified approach worked well with them.
Cui and He [3] used Orthogonal Learning Strategy and Catfish Effect Mechanism
for solving global optimization problems. The Orthogonal Learning Strategy is based
on Orthogonal Experiment Design, and it is embedded into the local pollination
operator of flower pollination algorithm. Catfish Effect Mechanism helped them to
maintain population diversity. There is still room for exploring the orthogonal learning
method on various sources of information. Moreover, the application of OL based
flower pollination algorithm to real world problems in engineering, for example,
identification of parameters and selection of features, is still to be done.
Abdel-Basset and Shawky [4] have carried out a detailed study of the original
flower pollination algorithm comparing it with Genetic Algorithm, Particle Swarm Optimization, Ant Colony Optimization, Cuckoo Search, Gray Wolf Optimizer and Grasshopper Optimization Algorithms. They concluded that flower pollination algorithm needs to be improved on the grounds of avoiding premature convergence and time required to execute.

Fig. 1 Flowchart of flower pollination algorithm
Lukasik et al. [5] used Calinski-Harabasz Index in the cost function of flower polli-
nation algorithm and performed cluster analysis on various data sets. They concluded
that flower pollination algorithm based solution provides high clustering accuracy.
They have suggested the usage of other metaheuristic mechanisms with flower polli-
nation algorithm and comparative analysis of algorithms as future work in their paper.
Rodrigues et al. [6] have shown that the binary version of flower pollination algo-
rithm in which the positions of agents are converted into binary strings to denote the
presence or absence of features; and a discretization mechanism using a constrained
sigmoid function to map them to the Boolean lattice is employed, has results compa-
rable to some of the evolutionary techniques in vogue. They suggest exploration of
various offshoots of flower pollination algorithm for analytical purpose as well.

The binary flower pollination algorithm has been used to solve antenna posi-
tioning problem also and has performed well according to Dahi et al. [7]. The
mapping techniques used affect the performance of the nature inspired optimiza-
tion algorithm under consideration. A modified flower pollination algorithm with
two quadratic objective functions has been used for solving multi-objective environ-
mental/economic dispatch problems by Gonidakis [8]. This version of flower pollina-
tion algorithm was able to find lower values for the concerned conflicting objectives.
Gonidakis [8] indicates flower pollination algorithm as a powerful metaheuristic that
can be further applied and explored in various domains.
Shambour et al. [9] endeavored to improve the exploration rate of flower polli-
nation algorithm by guiding the search process toward more promising areas of the
search space via modification to original flower pollination algorithm. The perfor-
mance of the new version of flower pollination algorithm was evaluated for Artifi-
cial Neural Network Weight Adjustment and numerical benchmark functions. Six
nature inspired algorithms including the modification proposed were compared. The
modified algorithm gave better or equal performance to standard flower pollination
algorithm in 80% of the total experiment cases.
Zhou et al. [10] applied the discrete greedy flower pollination algorithm for
spherical traveling salesman problem and also compared their approach with a few
version(s) of genetic algorithm and tabu search and concluded that their discrete
greedy flower pollination algorithm works faster in most cases and is relatively stable.
They had modified the biotic pollination process by using order-based crossover
and pollen discarding behavior. Order-based crossover accelerates the convergence
of discrete greedy flower pollination algorithm and pollen discarding step adapted
from artificial bee colony algorithm helps to avoid local minima trap and improves
the global search ability of the discrete greedy flower pollination algorithm. It has
been pointed out by these researchers that the efficiency of discrete greedy flower
pollination algorithm may decrease due to increase in the number of cities.
According to Yang et al. [11] nature inspired algorithms have been applied effec-
tively in the areas of telecommunication, image processing, engineering design,
vehicle routing, etc. These algorithms are simple and flexible and can be applied
to hard problems but their computational cost is high owing to several internal evalu-
ations in these algorithms. They have even been used to solve hard problems like the
Traveling Salesman Problem where suboptimal solutions given by these algorithms
have been found to be very useful.
Li et al. [12] have proposed a new adaptive version of the flower pollination
algorithm based on opposition-based learning strategy and t-distribution. They have
named this new algorithm as OTAFPA. The t-distribution variation has been utilized
in OTAFPA to create a new search direction so that the population may be diverse,
and this assists this new algorithm to avoid getting trapped in the local optimum. The
initial pollen population has been optimized using the opposition-based learning
strategy. Eight test functions were used in simulation runs, and they support the
fact that OTAFPA has better optimization ability as compared to the standard flower
pollination algorithm and a few of its variants.

Ma and Wang [13] used the concept of Random Walk in local pollination, instead of the Levy Flight used in the standard flower pollination algorithm, to create a modified algorithm. They have also used the Clonal Selection Algorithm (CSA) for
generating the population. The solutions generated from random walks drawn from
a random uniform distribution in [0, 1] were found to converge faster as compared
to those generated via Levy Flights. They concluded that different functions have
varied requirements for the parameters and the rate of convergence and precision of
optimization are related very closely to the setting of parameters.
Galvez et al. [14] proposed the multimodal flower pollination algorithm that is a
modified version of the original flower pollination algorithm with multimodal capa-
bilities so that it can find all possible optima for a given optimization problem. Exper-
imentation has indicated that this version is more accurate and robust as compared
to some other multimodal optimization algorithms.
Several benchmarking functions like Rastrigin function, Griewank function,
Shwefel function, Rosenbrock function, Sphere function, etc., have been used to
compare the performance of the flower pollination algorithm and its off-shoot
algorithms with other nature inspired algorithms. These benchmark functions help
to gauge the effectiveness and robustness of optimization algorithms like flower
pollination algorithm and of course its variants.
There are yet many possibilities for designing better variants based on the stan-
dard flower pollination algorithm, to address specific needs of different optimization
problems. See Table 1 for a quick reference.

3 Importance of Statistics in Performance Analysis of Flower Pollination Algorithm and Its Off-Shoots

Abdel-Basset and Shawky [4] studied the flower pollination algorithm and its variants
in detail along with other evolutionary algorithms and stressed on the requirement of
better statistical analysis of results generated by these algorithms. Statistical analysis
may assist a lot in gauging the performance of nature inspired algorithms, including
the flower pollination algorithm and its variants.
According to Chiroma et al. [15], as of 2015 the binary variant of the flower pollination algorithm had not been explored much, and statistical validation of experimental results had rarely been carried out for the existing variants of the flower pollination algorithm. Very few research papers present statistical analysis with reference to this algorithm and algorithms based on it. Moreover, even where it has been carried out, only a few statistical aspects have been taken into consideration.
Most of the studies done so far on the flower pollination algorithm and its variants
on the basis of statistics have focused on mean, mode and standard deviation. Some
researchers have included one or more of statistical measures like Friedman’s two-
way analysis of variances by ranks, Confidence Intervals, Wilcoxon Signed Rank Test, Markov Chains, Kruskal–Wallis Test, etc., to analyze the solution(s) obtained by applying FPA and/or its variants.

Table 1 A glimpse of a few advancements with respect to the family of flower pollination algorithm(s)

Serial number | Researcher(s) | Observation | Number of benchmarking functions used
1 | Nabil [2] | The modified algorithm works well with the benchmark functions used | 23
2 | Cui and He [3] | Catfish effect gives population diversity | 14
3 | Abdel-Basset and Shawky [4] | FPA performs optimally with respect to the algorithms selected for comparative study | None
4 | Lukasik et al. [5] | Performed cluster analysis on various data sets | None
5 | Rodrigues et al. [6] | Has results comparable to some of the evolutionary techniques | None
6 | Dahi et al. [7] | Optimal behavior as compared to a few existing evolutionary algorithms | None
7 | Gonidakis [8] | Works at par with some existing evolutionary algorithms | None
8 | Shambour et al. [9] | Endeavored to improve the exploration rate of flower pollination algorithm | 23
9 | Zhou et al. [10] | Discrete greedy flower pollination algorithm works faster in most cases and is relatively stable | None
10 | Li et al. [12] | The t-distribution variation results in population diversity and helps the algorithm to jump out of the local optima trap | 8
11 | Ma and Wang [13] | Solutions converged faster with random walks as compared to Levy flights; rate of convergence and precision of optimization are related very closely to the setting of parameters | 6
12 | Galvez et al. [14] | This version is more accurate and robust as compared to some other multimodal optimization algorithms | 14
13 | He et al. [17] | Behavior of standard FPA was satisfactory | None

There are many statistical tests that may give
more insight regarding the performance of algorithms. Not all of the statistical tests
have been applied in every case or research so far and hence there is a chance of
obtaining more insights by using a greater number of tests uniformly over a set of
algorithms of this type (see Table 1). It is not feasible to apply all of the tests
in every situation as some tests may be more suitable to analyze the output of the
algorithm in a given scenario, while other statistical tests may not fit well in the same
context.
Here is an overview of the statistical techniques that have been employed at
times by researchers to evaluate the performance of flower pollination algorithm and
algorithms that have branched out of it:
Mean, mode and standard deviation of the results obtained by executing the standard flower pollination algorithm and its variants using different benchmarking functions have been tabulated by a few researchers, but the domain of statistics has many other mechanisms for gauging the overall performance of algorithms.
Statistical tests are of two types: Parametric and Non-Parametric. Parametric tests
focus on ratio data and interval data. Non-Parametric tests focus on rank-based data,
ordinal data and categorical data.
Friedman’s two-way analysis of variances by ranks is a non-parametric test. This
method is used to compare, or rather find the difference between, multiple related samples.
Nabil et al. [2] used this test for comparing the performance of their modified version
of the flower pollination algorithm with other algorithms. This test has also been
used by Dahi et al. [7] in the context of their modified version of flower pollination
algorithm for antenna positioning.
Confidence interval is a range of values within which there is a greater possibility
that the estimates would lie. Liu et al. [16] have proposed a mechanism based on
visualization of confidence intervals to benchmark stochastic algorithms for global
optimization problems. This has relevance in the context of flower pollination algo-
rithm and other algorithms based on it as this group of algorithms is also stochastic
in nature.
Wilcoxon’s Signed Rank Test is a non-parametric test that is used in the context
of paired data. Nabil et al. [2] used this test as well to compare the performance of
their modified version of the flower pollination algorithm with other algorithms.
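As an illustrative sketch only, both of these non-parametric tests can be run in Python with SciPy (the fitness values below are hypothetical numbers chosen for demonstration, not results from any cited study):

from scipy import stats

# Hypothetical best-fitness values of three algorithms over five benchmark
# functions (one list per algorithm; lower is better)
fpa      = [0.012, 1.35, 0.88, 2.10, 0.05]
variant  = [0.010, 1.20, 0.79, 1.95, 0.04]
baseline = [0.020, 1.60, 1.02, 2.40, 0.09]

# Friedman's two-way analysis of variances by ranks across the three samples
chi2, p_friedman = stats.friedmanchisquare(fpa, variant, baseline)

# Wilcoxon's signed rank test for a pairwise comparison of two algorithms
w_stat, p_wilcoxon = stats.wilcoxon(fpa, variant)

print(f"Friedman: chi2 = {chi2:.3f}, p = {p_friedman:.3f}")
print(f"Wilcoxon (FPA vs variant): W = {w_stat:.3f}, p = {p_wilcoxon:.3f}")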
Markov Chains loosely belong to the category of models based on statistics and
rely on historical data. Stochastic processes can be modeled using Markov Chains.
He et al. [17] have performed the global convergence analysis of flower pollination
algorithm using discrete time Markov Chain approach. They assumed the parameter
values to be fixed and used simple vectors as solution vectors to keep the analysis
simple. According to them, there is a gap between the theory and practice of bio-
inspired algorithms.
A lot of theoretical analysis needs to be carried out further to understand the
convergence of these algorithms. The rate of convergence clearly depends on the
parameter settings and the structure of the algorithm (see Table 2). Parameter
tuning is also an important concern in this context. The standard flower pollination
algorithm has been shown to have global convergence for sure with the help of Markov models.

Table 2 A glimpse of some research work with respect to the family of flower pollination algorithm(s) and the use of statistical analysis therein

Serial number | Researcher(s) | Limitation | Details of the statistical analysis done
1 | Nabil [2] | Cloning ratio is fixed | Friedman's two-way analysis of variances by ranks and Wilcoxon's signed rank test
2 | Cui and He [3] | Application to real world problems such as parameter identification and feature selection is pending | Wilcoxon's signed rank test
3 | Abdel-Basset and Shawky [4] | Limited statistical analysis | Friedman's test
4 | Lukasik et al. [5] | Hybridized versions are yet to be explored | Rand index, standard deviation and pairwise T-tests
5 | Rodrigues et al. [6] | Parameters of flower pollination algorithm need to be considered | None
6 | Dahi et al. [7] | Better mapping techniques are required | Friedman's two-way analysis of variances by ranks, Kruskal–Wallis one-way analysis of variance test, Bartlett test for testing the homogeneity of variance and Kolmogorov–Smirnov test for testing the normality of distribution
7 | Gonidakis [8] | Hybridization with other approaches needs to be studied | None
8 | Shambour et al. [9] | Real world application is lacking | Mean and standard deviation
9 | Zhou et al. [10] | Performance may decrease as the number of cities is increased; this needs to be tackled | Mean, standard deviation and rank
10 | Li et al. [12] | No statistical analysis | None
11 | Ma and Wang [13] | No statistical analysis | None
12 | Galvez et al. [14] | No statistical analysis | None
13 | He et al. [17] | Restricted to original flower pollination algorithm | Markov Chains
Dahi et al. [7] have also used Kruskal–Wallis one-way analysis of variance test,
Bartlett test for testing the homogeneity of variance and Kolmogorov–Smirnov test
for testing the normality of distribution for the statistical analysis of their approach.
Table 2 summarizes some of the research endeavors with respect to the flower polli-
nation algorithm and variants along with the limitations that can act as pointers for
future research work and the role of statistical tests in analyzing their respective
performance.

4 Future Work and Conclusion

The future work in this context includes the usage of a greater number of statistical
tests for obtaining a clearer insight about the overall performance of the nature
inspired algorithm like flower pollination algorithm at hand. A special strategy for
using a set of widely applicable statistical tests may be created for this purpose. This
may include computerized as well as manual statistical analysis of algorithms. This
detailed statistical analysis may lead to better conclusions regarding the application
of the nature inspired algorithms like flower pollination algorithm and its variants to
real world problems.
The performance of algorithms also relates with their exploration as well as
exploitation of the search space. These aspects need to be studied in detail and
statistical tests/measures may assist in this aspect as well.
Each statistical value or estimate gives some insight about the performance of the
algorithm. For instance, standard deviation is connected with the credibility of the
algorithm. Many insights may be gained by delving deeper into the analysis of flower
pollination algorithm and algorithms based on this basic strategy with the help of
statistical mechanisms.
Statistical analysis of an algorithm gives us concrete results regarding the perfor-
mance of the algorithm and also helps us to know which algorithm performs better
than the other under which set of conditions. An important point to note is that not
all statistical tests can be applied in all situations. Different statistical tests are to be
applied in different scenarios for a variety of reasons. A greater number of appli-
cable tests may be used to reveal deeper and better insights so that better versions of
algorithms may be designed to solve a wider range of problems.

References

1. X.S. Yang, Flower pollination algorithm for global optimization, in Unconventional Computa-
tion and Natural Computation. UCNC 2012, ed. by J. Durand-Lose, N. Jonoska. Lecture Notes
in Computer Science, vol. 7445 (Springer, Berlin, Heidelberg, 2012). https://doi.org/10.1007/978-3-642-32894-7_27
2. E. Nabil, A modified flower pollination algorithm for global optimization, in Elsevier Expert
Systems with Applications (2016)
3. W. Cui, Y. He, Biological flower pollination algorithm with orthogonal learning strategy and
catfish effect mechanism for global optimization problems. Hindawi J. Math. Probl. Eng. (2018)
4. M. Abdel-Basset, L.A. Shawky, Flower Pollination Algorithm: A Comprehensive Review
(Springer Science+Business Media B.V. part of Springer Nature, 2018)
5. S. Lukasik et al., Clustering using flower pollination algorithm and Calinski-Harabasz index,
in IEEE Congress on Evolutionary Computation (CEC) (2016)
6. D. Rodrigues, X.-S. Yang, A.N. de Souza, J.P. Papa, Binary flower pollination algorithm and
its application to feature selection, in Recent Advances in Swarm Intelligence and Evolutionary
Computation (Springer, 2015), pp. 85–100
7. Z. Abd El Moiz Dahi, C. Mezioud, A. Draa, On the efficiency of the binary flower pollination
algorithm: application on the antenna positioning problem, in Applied Soft Computing (Elsevier,
2016)
8. D. Gonidakis, Application of flower pollination algorithm to multi-objective environ-
mental/economic dispatch. Int. J. Manag. Sci. Eng. Manag. (2015)
9. M.K.Y. Shambour, A.A. Abusnaina, A.I. Alsaibi, Modified global flower pollination algorithm
and its application for optimization problems, in Interdisciplinary Sciences: Computational
Life Sciences (Springer, 2018)
10. Y. Zhou, R. Wang, C. Zhao, Q. Luo, M.A. Metwally, Discrete greedy flower pollination algo-
rithm for spherical travelling salesman problem, in The Natural Computing Applications Forum
(2017)
11. X.-S. Yang, S. Deb, S. Fong, X. He, Y.-X. Zhao, From swarm intelligence to metaheuristics:
nature- inspired optimization algorithms, in Cover Feature in Emerging Computing Paradigms.
COMPUTER, 2016 (IEEE Computer Society, 2016)
12. W. Li, Z. He, J. Zheng, Z. Hu, Improved flower pollination algorithm and its application in
user identification across social networks, in IEEE Access, Special Section on AI-Driven Big
Data Processing: Theory, Methodology and Applications (2019)
13. X.-X. Ma, J.-S. Wang, An improved flower pollination algorithm to solve function optimization
problem. IAENG Int. J. Comput. Sci. 45(3), 364–370 (2018)
14. J. Galvez, E. Cuevas, O. Avalos, Flower pollination algorithm for multimodal optimization.
Int. J. Comput. Intell. Syst. 10, 627–646 (2017)
15. H. Chiroma et al., A review of the applications of bio-inspired flower pollination algorithm,
in The 2015 International Conference on Soft Computing and Software Engineering (SCSE
2015) (2015)
16. Q. Liu, W.-N. Chen, J.D. Deng, T. Gu, H. Zhang, Z. Yu, J. Zhang, Benchmarking stochastic
algorithms for global optimization problems by visualizing confidence intervals. IEEE Trans.
Cybernet. 47(9) (2017)
17. X. He, X.-S. Yang, M. Karamanoglu, Y. Zhao, Global convergence analysis of the flower
pollination algorithm: a discrete time markov chain approach, in International Conference on
Computational Science (ICCS, 2017) (Zurich, Switzerland, 2017)
Counterfactual Causal Analysis on Structured Data

Swarna Kamal Paul, Tauseef Jamal Firdausi, Saikat Jana, Arunava Das,
and Piyush Nandi

Abstract Data generated in a real-world business environment can be highly connected with intricate relationships among entities. Studying relationships and
understanding their dynamics can provide deeper understanding of business events.
However, finding important causal relations among entities is a daunting task with
heavy dependency on data scientists. Also due to fundamental problem of causal
inference, it is impossible to directly observe causal effects. Thus, a method is
proposed to explain predictive causal relations in an arbitrary linked dataset using
counterfactual type causality. The proposed method can generate counterfactual
examples with high fidelity in minimal time. It can explain causal relations among
any chosen response variable and an arbitrary set of independent causal variables to
provide explanations in natural language. The evidence of the explanations is shown
in the form of a summarized connected data graph.

Keywords Counterfactual causality · Genetic algorithm · Explainable AI

1 Introduction

Data collected in a business enterprise contains the whole story. Insights gained by obtaining causal answers for business events will help in devising a better future strategy. However, causality is a fleeting concept. Absolute causality in a chaotic world is

extremely difficult to find. Counterfactual type causality is at the highest level on
the ladder of causation [1]. Constructing a black box model from the data can reveal
causal relations by predicting consequences on simulated interventions and thereby
finding the counterfactuals. A condensed set of causal explanations are derived from
counterfactuals and provided in natural language for easy interpretation. The usability
of the solution is three-fold. It reveals the causal structure in the business process. It helps find important features influencing a response variable and their range of influence. It provides evidence and details of explanations in the form of a summarized data
graph and enables deeper understanding of business dynamics.
The rest of the paper is organized as follows: Sect. 2 presents some related works. Section 3 provides a description of the causal analysis method. Section 4 discloses experimental results, followed by the conclusion.

2 Related Work

TETRAD [13] is a diverse collection of causal analysis methods for climate study
which focuses heavily on the statistical behavior of the data provided as input,
allowing one to investigate elements like causal graphs, causal effects, feature engi-
neering, and simulations to derive a number of causal inferences. Rulex Explain-
able AI [12] focuses on explaining complex machine learning models using plain
language. Its proprietary algorithms create predictive models in the form of first-
order conditional logic rules. Causality can also be found by generating counterfac-
tual examples which in turn can be generated by solving the optimization problem of
minimizing distance between original and counterfactual samples [7]. PermuteAt-
tack [9] uses genetic algorithm to create counterfactuals which in turn can be used to
establish causality. Artelt [10] created a python toolbox called ceml for generating
counterfactuals. It can be used to explain machine learning models and find causal
relations between variables. Mothilal et al. [11] proposed a framework for gener-
ating counterfactuals which can satisfy feasibility conditions based on constraints
and diversity among counterfactuals.

3 Counterfactual Causal Analysis

In the context of Artificial Intelligence (AI), Explainable AI (XAI) [2] can be defined as a set of methods and techniques to explain the outcome of a black box ML model. In
this context, XAI has been used to find and explain causal influence of multiple
independent variables on the response variable. The proposed method creates the best possible black box model in terms of accuracy of predicting the response variable with
respect to the causal variables. Later, the same model is used to find causal influence
of each causal variables on the response variable by generating counterfactuals using
perturbation method.

3.1 Generating Counterfactual Examples

Counterfactual examples are samples which are minimally modified with respect
to the original sample to alter the predicted value by a model. Thus, counterfac-
tual explanations provide statements as smallest changes required to alter certain
predicted value or decision. The majority of current well-known XAI methods are feature attribution based [6, 12]. Wachter et al. [7] proposed that generating counterfactual examples can be represented as an optimization problem by minimizing the distance between original and counterfactual samples.
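That optimization can be sketched as follows (the formulation of [7], written here with f as the black box model, y′ the desired prediction, d a distance function, and λ a balancing weight):

x′ = arg min over x′ of max over λ of [ λ(f(x′) − y′)² + d(x, x′) ]

where the first term drives the counterfactual x′ toward the target prediction y′ and the second term keeps x′ close to the original sample x.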
There are several limitations in the existing methods of generating counterfac-
tuals [8] for serving the purpose in this context. Thus, an algorithm is proposed for
generating counterfactuals which is model agnostic and based on gradient-free opti-
mization, named as Genfact. It can generate multiple counterfactuals at once and can
do amortized inference [8], thus making the process fast. Given a dataset, it can find
counterfactual pairs closest to each other, and the pairs may not exist in the original
dataset. This feature is useful in this context as the given dataset used for generating
counterfactuals may not contain enough samples around the classification boundary,
but the proposed method can generate samples around the boundary.
Algorithm 1 states the Genfact algorithm for generating counterfactuals. The algo-
rithm works for both categorical and numerical values. If the response variable is
numeric, it is divided into C classes by defining ranges for each class. The encoded
feature data is clustered into K clusters. This is done to group the nearest neigh-
bors which in turn can be used as initial population for the genetic algorithm. Each
cluster is assigned a normalized diversity score which is proportional to the entropy
of the predicted classes of all samples in a cluster. Higher diversity score signi-
fies better mixture of samples from different classes. Genetic algorithm is run on
each cluster in the order of normalized diversity score until 40% of the samples are
covered. The crossover operation handles both categorical and numerical variables
and adjusts them in a way to avoid creating non-feasible samples. The mutated numer-
ical feature values are bounded by the range defined by maximum and minimum
values in the sample set within the cluster. The categorical values are shuffled among
available values in the samples within the cluster. This way it satisfies the actionability
feature mentioned in [8]. The final output consists of counterfactual pairs of samples.
PermuteAttack [9] also uses genetic algorithm to create counterfactuals; however,
they cannot do amortized inference and generate counterfactual sample only for the
input sample. Also, no separate way of handling categorical and numerical values is
mentioned.

Algorithm 1

genfact(samples, model, C, maxoffspringsize, maxiterations, maxpopulationsize)

response variable ← model.predict(samples)
if response variable is numeric
    divide response variable in C classes such that each bucket range contains at least N/|C| samples, where N = total #samples
create K clusters with feature samples, where K = round(#feature samples/(clustersize ∗ 2))
entropy of each cluster ← −∑_c p_c log(p_c), where p_c = probability of class c in the cluster
cluster normalized diversity score ← entropy/log(size of cluster + 1)
sort clusters in descending order based on normalized diversity score
initialize factuals, counterfactuals
processeddatasize ← 0
for each cluster k in K
    newfactuals, newcounterfactuals ← genetic_algo(k, model, maxiterations, maxoffspringsize, maxpopulationsize)
    append newfactuals to factuals and newcounterfactuals to counterfactuals
    if processeddatasize >= 0.4N
        break
    processeddatasize ← processeddatasize + |k|
return (factuals, counterfactuals)

genetic_algo(facts, model, maxiterations, maxoffspringsize, maxpopulationsize)

for i in 1 to maxiterations
    facts ← crossover(facts, model, maxoffspringsize)
    fitness, counterfacts ← calculate_fitness(facts)
    facts, counterfacts ← select_best(fitness, facts, counterfacts, populationsize = min(maxpopulationsize, |facts| ∗ 2))
return (facts, counterfacts)

crossover(facts, model, maxoffspringsize)

initialize offspringsize ← 0
while offspringsize < maxoffspringsize
    parent1 ← randomly select one from facts
    feature ← randomly select a feature
    parent2 ← randomly select one from facts having different predicted class from parent1
    if feature is categorical
        swap feature of parent1 with parent2
    else if feature is numeric
        feature of parent1 ← (feature of parent1 + feature of parent2)/2
    offspring ← parent1
    class of offspring ← model.predict(offspring)
    if offspring present in facts
        continue
    add offspring in facts
    offspringsize ← offspringsize + 1
return facts

calculate_fitness(facts)

for each sample in facts
    fitness ← minimum Euclidean distance from other samples in facts having a different predicted class
    counterfacts ← other sample in facts having minimum distance from sample and a different predicted class from that of sample
return (fitness, counterfacts)

select_best(fitness, facts, counterfacts, populationsize)

facts ← select top populationsize facts based on fitness
counterfacts ← select top populationsize counterfacts based on fitness
return (facts, counterfacts)

3.2 Generating Causal Explanations

Causal explanations can be obtained by training a simple surrogate decision tree on counterfactual examples. The generated tree is represented by internal nodes acting
as decision points and each leaf node as final prediction of a unique class. Each path
from root to leaf node outlines the features involved in that classification and the
probability of the outcome occurring. All that information can now be formatted into
facts/statements in natural language for ease of understanding.
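As an illustrative sketch of such a surrogate (using scikit-learn; the data and feature names below are placeholders, not the paper's actual variables), the counterfactual examples and their predicted classes can be fit with a shallow decision tree and its root-to-leaf paths printed as rules:

from sklearn.tree import DecisionTreeClassifier, export_text

# X_cf: feature matrix of generated counterfactual examples (placeholder values)
# y_cf: classes predicted for them by the black box model
X_cf = [[920684, 1.5], [453000, 1.2], [100000, 0.9], [950000, 1.6]]
y_cf = ["high", "mid", "low", "high"]

surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_cf, y_cf)

# Each root-to-leaf path is one candidate explanation statement
print(export_text(surrogate, feature_names=["Impressions", "Spentperclick"]))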
The number of explanation statements generated is proportional to the number
of leaf nodes present in the decision tree. It may become difficult to comprehend all
the statements if the size of the explainer tree increases. So, reducing the number of
statements and shortening the length of a single statement is necessary for summa-
rizing the explanations so that causal explanations can be quickly comprehended by
a user. Different summarization approaches are proposed for each problem class.
In case of binary classification problem, there are only two classes. But same
class appears multiple times in the statements. To summarize, for a specific class
all conditions are grouped together from different statements. Within a group same
feature may appear multiple times with different boundary condition. To shorten the
length of a statement, conditions having common features are merged by superim-
posing the boundary conditions. Thus, multiple statements with repeating classes are
converted to summarize statements consisting only two unique classes. In case of
multiclass classification problems, statements for each unique class are sorted based
on the probability score of the class which is derived from a specific statement. The
statements with top 3 probability scores are finally selected as summarized state-
ments. In regression problems, a statement does not give any probability score like
classification. It gives some estimated value of the response variable. As all of these
values are some estimations, so a confidence interval is calculated using the root
mean squared error of each estimation. The statements with top 3 sample sizes are
only considered.
Filtering out relevant statements are followed by shortening of statement length
if required. A statement containing some categorical feature may have a long list of
values which can lengthen the statement. So, the list of values is filtered based on
their frequency of occurrence in the original data. Top 5 unique values with highest
frequencies are considered, and other values are omitted.

3.3 Generating Evidence for Explanations

To provide evidence for the explanations a data graph is generated by applying the
filter conditions obtained from the explanations on top of the actual dataset and
thereby summarizing it. The data evidence provides justification of the explanations
and also at the same time allows users to get a deeper understanding of the entity
relationship dynamics. The entity-feature map and entity relationship of the data
needs to be supplied to the evidence data graph generation algorithm. The entity
relationship is used to create edges among feature variables such that edges run
between only those features whose corresponding entities are connected. However,
edges always run between response variable and each of the feature variable in the
data. The following algorithm is used to generate the evidence graph.

Algorithm 2

Filter dataset based on the top n conditions generated by the explainer tree
For each column C in dataset
if C is numeric
divide values of C in k ranges such that each bucket contains at least N/k sam-
ples, where N is total number of samples
add each range in the nodelist with sample size as node size
else if C is categorical
select top k values of C based on sample size and add them into nodelist with
sample size as node size
For each column C in dataset
if C is response variable
For each node of type C in nodelist
find and add edges in edgelist with respect to all other node types in nodelist
make edgeweight as number of observed samples for the relation
else if C is feature variable
For each node of type C in nodelist
find and add edges in edgelist with respect to other node types in nodelist and
satisfying entity relation with respect to C
make edgeweight as number of observed samples for the relation
Normalize node size and edge weight
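A minimal sketch of the graph-building portion of this algorithm using the networkx library follows (the binning and filtering steps are assumed to have already produced nodelist and edgelist; the node labels and attribute names are chosen here for illustration):

import networkx as nx

# nodelist: (node label, sample size); edgelist: (node a, node b, #observed samples)
nodelist = [("Total_Conversion: 10-15", 40), ("Impressions: >920683", 35)]
edgelist = [("Total_Conversion: 10-15", "Impressions: >920683", 28)]

G = nx.Graph()
for label, size in nodelist:
    G.add_node(label, size=size)
for a, b, weight in edgelist:
    G.add_edge(a, b, weight=weight)

# Normalize node size and edge weight to [0, 1] for display
max_size = max(s for _, s in nodelist)
for label, size in nodelist:
    G.nodes[label]["size"] = size / max_size
max_w = max(w for _, _, w in edgelist)
for a, b, w in edgelist:
    G[a][b]["weight"] = w / max_w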


4 Experimental Results

Experiments have been done on a Facebook advertisement dataset [5]. Experiments are done to evaluate the counterfactual generation algorithm against prior arts. A case
study also demonstrates how to find key performance indicators and causal relations
with respect to value delivered by the advertisements. “Total_conversion” has been
chosen as the response variable. “clicks,” “spentperclick,” “age,” “gender,” “interest”
and “impression” are chosen as feature variables. Categorical variables are encoded
using m-estimator [3] encoding.
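For reference, m-estimator (target) encoding replaces a category c by a smoothed target mean (a standard formulation; m is the smoothing parameter):

enc(c) = (n_c · ȳ_c + m · ȳ)/(n_c + m)

where n_c is the number of samples in category c, ȳ_c is the mean of the response over those samples, and ȳ is the global mean of the response.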

4.1 Evaluating Counterfactual Generation Algorithm

The counterfactual generation algorithm is compared with two of the existing methods, namely DiCE [11] and ceml [10]. For generating counterfactuals in DiCE an
artificial neural network model has been trained with the dataset which in turn serves
as the base model. Counterfactuals are generated for randomly sampled original
dataset. For ceml and Genfact, random forest classifier model has been used as base
model. The response variable is converted to categorical variable using the method
mentioned in algorithm 1. For ceml, counterfactuals are generated for randomly
sampled original dataset with randomly sampled different target class. Comparison
has been done based on total runtime, average Euclidean distance between counterfactual pairs and entropy of predicted classes in all counterfactual pairs. The entropy measures the diversity of the counterfactual pairs. The results are given in Table 1.

Table 1 Comparison of counterfactual generation algorithms

Algorithm | Runtime in seconds | Average distance | Entropy
DiCE | 7675.48 | 0.029 | 0.848
Ceml | 1297.47 | 255,153.44 | 0.9729
Genfact | 2.172 | 0.5030 | 0.8831
Overall, considering the runtime and the other measures, it can be concluded that Genfact outperforms the prior arts.

4.2 Case Study on Generating Causal Explanations

In a second set of experiments, the dataset has been run through the proposed method
with XGBoost [4] serving as the black box model. It provided the causal explanations
and the evidence data graph. The top 3 generated causal statements are stated as below.
It is evident that impressions are affecting the Total_Conversion most. Interest and
Spentperclick are the 2nd most influencing factors. In general, the higher the impressions, the higher the Total_Conversion.

Fig. 1 Evidence data graph generated on Facebook advertisement data

Explanation 1: If Impressions > 920,683.0 & Spentperclick > 1.483 and <= 1.673, then Total_Conversion will be in the range 10.63–15.29 [sample size: 16.8%].
Explanation 2: If Impressions > 453,229.125 and <= 920,683.0 & interest is one of 16, 10, 29, 27, 15, then Total_Conversion will be in the range 3.35–7.13 [sample size: 28.9%].
Explanation 3: If Impressions <= 453,229.125 & interest is one of 16, 10, 29, 27, 15, then Total_Conversion will be in the range 1.66–3.20 [sample size: 42.8%].
Figure 1 illustrates different sections of the data graph generated as evidence. Figure 1a illustrates the data graph with respect to the nodes "Total_Conversion" and "Impression"; in general, higher "Total_Conversion" values are related to higher "Impressions." Figure 1b shows relations between "Total_Conversion" and "interest"; lower "Total_Conversion" values mostly have strong relations with "interest 16," "interest 15" and "interest 10."

5 Conclusion

As claimed, the proposed method has been demonstrated to find and explain causal relations of a KPI with respect to an arbitrary set of feature variables. The performance superiority of the counterfactual generation algorithm has also been established, as it was able to generate high-quality counterfactuals in a very short time. The efficiency is measured by the Euclidean distance between generated counterfactual pairs and the entropy of the predicted classes of the counterfactuals. The current work can be extended to encompass causal analysis of time series data generated from complex dynamical systems.

References

1. J. Pearl, D. Mackenzie, The Book of Why: The New Science of Cause and Effect (Basic Books,
2018)
2. R. Goebel, A. Chander, K. Holzinger, F. Lecue, Z. Akata, S. Stumpf, A. Holzinger, Explain-
able AI: the new 42? in International Cross-Domain Conference for Machine Learning and
Knowledge Extraction (Springer, Cham, 2018), pp. 295–303
3. R. Andersen, Modern Methods for Robust Regression, in Quantitative Applications in the
Social Sciences (Sage Publications, Los Angeles, CA, 2008), pp. 152
4. T. Chen, C. Guestrin, Xgboost: a scalable tree boosting system, in Proceedings of the 22nd acm
sigkdd International Conference on Knowledge Discovery and Data Mining (2016), pp. 785–
794
5. https://www.kaggle.com/chrisbow/an-introduction-to-facebook-ad-analysis-using-r
6. R.K. Mothilal, D. Mahajan, C. Tan, A. Sharma, Towards Unifying Feature Attribution and
Counterfactual Explanations: Different Means to the Same End. arXiv preprint arXiv:2011.
04917 (2020)
7. S. Wachter, B. Mittelstadt, C. Russell, Counterfactual explanations without opening the black
box: automated decisions and the GDPR. Harv. JL Tech. 31, 841 (2017)
8. S. Verma, J. Dickerson, K. Hines, Counterfactual Explanations for Machine Learning: A
Review. arXiv preprint arXiv:2010.10596 (2020)
9. M. Hashemi, A. Fathi, Permute Attack: Counterfactual Explanation of Machine Learning
Credit Scorecards. arXiv preprint arXiv:2008.10138 (2020)
10. A. Artelt, CEML-Counterfactuals for Explaining Machine Learning models-A Python Toolbox
(2019)
11. R.K. Mothilal, A. Sharma, C. Tan, Explaining machine learning classifiers through diverse
counterfactual explanations, in Proceedings of the 2020 Conference on Fairness, Account-
ability, and Transparency (2020), pp. 607–617
12. https://www.rulex.ai/rulex-explainable-ai-xai/
13. J.D. Ramsey, K. Zhang, M. Glymour, R.S. Romero, B. Huang, I. Ebert-Uphoff, C. Glymour,
TETRAD—a toolbox for causal discovery, in 8th International Workshop on Climate
Informatics (2018)
Crime Analysis Using Machine Learning

Sree Rama Chandra Murthy Akuri, Manikanta Tikkisetty,


Nandini Dimmita, Lokesh Aathukuri, and Shivani Rayapudi

Abstract Crime eradication and prevention remain a major challenge even for developed countries. This paper deals with the analysis of a criminal data record from Kaggle, namely the San Francisco crime dataset. Multiple machine learning approaches are implemented on the data and compared to find the model with the best accuracy and performance. It is shown that Linear SVC achieves the best results of all the models considered. The inclusion of these methodologies in the investigation broadens the search and lessens the risks for the police.

Keywords Classification · Regression · Dataset · Accuracy · Performance

1 Introduction

Crime eradication and prevention have been a major challenge even for developed countries. Incorporating technical knowledge into all areas of improvement is a winning strategy for any country, and in the same spirit we apply machine learning to crime prediction and prevention. Machine learning techniques forecast the future by considering past crime records and other parameters, which is very useful in suppressing crimes in the localities. The usage of machine learning for real-time problems is not something new; it has been used by various countries in various fields and has succeeded on multiple occasions (Fig. 1).
Crime data analysis is also not a new innovation; data analysis and prediction have been done at various stages of forecasting, but the use of machine learning techniques is a recent addition to the field and is very accurate in comparison with traditional data forecasting techniques. This process uses previous data to predict future crimes and stimulates a procedure to lessen

S. R. C. M. Akuri · M. Tikkisetty · N. Dimmita · L. Aathukuri (B) · S. Rayapudi


Department of Computer Science and Engineering, Lakireddy Bali Reddy College of
Engineering, Mylavaram, India


Fig. 1 Machine learning classification

the risk and loss to the people by incorporating these techniques in the crime world [1–4]. We deduce that this can be further extended by incorporating the process into the investigation model and extending the scope of the investigation for the police. Hence, we propose the same inclusion, and with comparative analysis we find the best approach which can be used for the prediction (Table 1).
Earlier, crime departments used to keep tabs on suspects who were likely to commit crimes and on areas where crime happens fairly often. Later on, surveillance systems came into existence, which changed the scope of keeping high-crime areas on a leash and safeguarding the surroundings (Fig. 2).

2 Methodology

2.1 Classification

Random forest is a multi-functional algorithm from the supervised learning branch of ML which can be used for regression as well as classification. The algorithm is an ensemble of randomly generated decision trees, which gives more accuracy because the features of each tree are selected at random. A major reason for using this algorithm is the feature importance and prediction utilities offered by the sklearn package in Python. With the usage of bagging, the generated trees not only become uncorrelated but also have the advantage of being trained on multiple subsets of the data with random features while forming each decision tree. Being able to form trees using random features also
Table 1 Analysis of review

Main priorities
- Traditional reactive model: Process-focused—responding to calls; investigating and solving crimes
- NYC model: Outcome-focused—reducing crime and disorder
- Community policing model: Process-focused—improving police community relations; addressing community concerns
- Problem oriented policing: Outcome and process-focused—identifying and solving policing problems

Extent of involvement of others
- Traditional reactive model: Low—policing seen as a specialized activity
- NYC model: Relatively low—police primarily responsible for developing and implementing crime reduction strategies
- Community policing model: High—emphasis on working with 'the community' and its representatives
- Problem oriented policing: High—emphasis on establishing 'partnerships' with other agencies to address problems

Utilization of information
- Traditional reactive model: Low—delivery of policing services highly routinized
- NYC model: High—information used to identify problem areas, target resources and evaluate impacts
- Community policing model: Moderate—emphasis on using information at local level, rather than using it to drive organization-wide responses
- Problem oriented policing: High—information used to identify problems, develop strategies, and evaluate responses

Utilization of coercive policing strategies
- Traditional reactive model: Moderate—occasional 'crackdowns' and 'blitzes'; some tolerance of minor offenses
- NYC model: High—extensive use of arrest, stop, and search powers; vigorous enforcement of minor offenses
- Community policing model: Low—emphasis on policing by consensus; locating police within the community
- Problem oriented policing: Low to moderate—coercive strategies only one of a menu of options; whether utilized depends on nature of problem

Fig. 2 Architecture diagram

grants the flexibility of having more variation among the trees and achieves more accuracy in the model (Fig. 3).
Linear SVC is a classifier which returns the best-fit hyperplane computed from the input data while considering the selected features. The method only supports a linear kernel, which is one of its drawbacks, but it offers more flexibility with respect to the penalty values as well as the loss functions, which gives it scope to perform well in scenarios with large data.
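As an illustration of how these classifiers are trained and compared, a minimal scikit-learn sketch is given below. The feature matrix X and label vector y are assumed to be the pre-processed San Francisco crime data; the split ratio and hyperparameters are illustrative, not the exact pipeline of this paper.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# X, y: pre-processed features and crime-category labels (assumed prepared earlier)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

models = {
    'Decision tree': DecisionTreeClassifier(),
    'Random forest': RandomForestClassifier(n_estimators=100),
    'Linear SVC': LinearSVC(),
    'GaussianNB': GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)                        # train on the training subset
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f'{name}: {acc:.4f}')
```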

Fig. 3 Random forest algorithm working



2.2 Regression

The linear regression model is a type of algorithm under the supervised learning algorithms which targets the output value with the help of the independent variables. It predicts values and the relationship between the input and output, which can be explained by the equation

$$y = \theta_1 + \theta_2 x$$

The model is fitted by minimizing the mean squared error between predictions and targets,

$$\text{minimize} \;\; \frac{1}{n}\sum_{i=1}^{n}\left(\text{pred}_i - y_i\right)^2$$

with the cost function

$$J = \frac{1}{n}\sum_{i=1}^{n}\left(\text{pred}_i - y_i\right)^2$$

We use the tuning algorithm RidgeCV for the eradication of multi-collinearity, analyzing the data with the penalized cost function

$$\min\left(\left\|Y - X\theta\right\|^2 + \lambda\left\|\theta\right\|^2\right)$$

where the alpha parameter, represented as λ, is varied to control the penalty term.
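A minimal sketch of fitting the two regressors with scikit-learn is shown below; the alpha grid for RidgeCV and the variables X_train, y_train, X_test, y_test are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.metrics import mean_squared_error

# Fit plain linear regression and ridge regression with built-in cross-validation
lin = LinearRegression().fit(X_train, y_train)
ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_train, y_train)

for name, model in [('Linear regression', lin), ('RidgeCV', ridge)]:
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f'{name}: MSE = {mse:.4f}')
```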

3 Proposed Model and Results

The data is pre-processed and converted to a comma-separated file for further handling. The inconsistent/missing data is then filled in, and normalization of the data takes place, which gives us the training set for further analysis.
On the training set, we now apply the models, namely the decision tree, random forest, GaussianNB and Linear SVC algorithms, and compare their results with each other to find which model has the best accuracy and performance. It is shown that Linear SVC achieves the best results of all the models considered. We deduce that this can be further extended by incorporating the process into the investigation model (Fig. 4).
After the comparative analysis of the classification methodologies, we move on to the regressive analysis of the data, which is done by incorporating two models: linear regression and RidgeCV. We have found that the linear regression model has higher efficiency in comparison with the latter; linear regression achieves the best results of the models considered (Tables 2 and 3).

Fig. 4 Data flow diagram

Table 2 Classification algorithm results

Algorithm       Accuracy (%)   Precision (%)   Recall (%)   F-measure (%)
Decision tree   77.45          82.34           80.4         84.3
Random forest   83             86.67           87.87        85.6
Linear SVC      85             92.65           86           76.7
GaussianNB      75             82              68           78

Table 3 Regression algorithm results

Algorithm           MSE      MSE of training set   MSE mean
Linear regression   0.0192   0.46                  0.12
RidgeCV             0.0189   1.77                  0.36

4 Conclusion

The incorporation of machine learning into real-time problems is not something new; it has been used by various countries in various fields and has succeeded on multiple occasions. Hence, we proposed the same inclusion and, with comparative analysis of multiple machine learning approaches on the data, found the best approach for the prediction. It is shown that Linear SVC achieved the best accuracy and performance of all the models considered. We deduce that this can be further extended by incorporating the process into the investigation model and extending the scope of the investigation for the police; the inclusion of these methodologies broadens the search and lessens the risks for the officers. As a future extension, additional modules can be added and an application can be developed for the crime department.

References

1. F. Afroz, S. Rajashekara Murthy, M.L. Chayadevi, Crime analysis and prediction using data
mining—cap a survey
2. A.A. Shmais, R. Hani, in Data Mining for Fraud Detection, Prince Sultan University, Saudi
Arabia
3. K. Deepika, S. Vinod, Crime analysis in India using data mining techniques. Int. J. Eng. Technol.
7(2.6) (2018)
4. G. Borowik, Z.M. Wawrzyniak, P. Cichosz, Time series analysis for crime forecasting (European
Union, 2018)
Multi-model Neural Style Transfer
(MMNST) for Audio and Image

B. Vishal, K. G. Sriram, and T. Sujithra

Abstract Neural style transfer (NST) was created to give a new look to images, audio and video through optimization and manipulation techniques. This field has lately picked up pace amongst techniques that deal with neural networks, and it has emerged as one of the most efficient means of producing style transfer. In order to address the shortcomings in the existing systems, a multi-model neural style transfer (MMNST) approach for image and audio is proposed. It focuses on two kinds of data: audio and image. The main objective of the proposed system is to create artistic imagery by separating and recombining image content and style. For audio style transfer, the two inputs are broken down, optimized, enhanced and finally combined in a fulfilling manner. Specifically, local and global features can be transferred using both parametric and non-parametric neural style transfer algorithms, which results in an outcome where the content and style inputs coalesce in equal portions. For experimentation, VGG-19 (CNN) and TensorFlow Lite models are used. The proposed model outperforms the existing models in terms of accuracy, execution speed and the total loss incurred during the process.

Keywords Neural style transfer (NST) · Convolutional neural network (CNN) · Visual geometric group (VGG) · TensorFlow Lite · PyTorch

1 Introduction

Neural style transfer refers to a family of software algorithms which manipulate digital images, audio and video so as to adopt the appearance and visual style of other data. These algorithms are characterized by their use of deep neural networks to achieve image, audio and video transformation/manipulation. The fundamental uses of NST are the creation of artificial artwork from photographs, for instance

B. Vishal (B) · K. G. Sriram · T. Sujithra


Department of Computer Science Engineering, SRMIST, SRM Nagar, Chennai, India
T. Sujithra
e-mail: sujithrt@srmist.edu.in


by transferring the look of famous paintings to user-supplied photographs. This method has been employed by artists and designers around the globe to develop new artwork with the help of existing styles. By analogy, a transformation may be learned from a training pair of photos, a picture and an artwork representing that picture, and then applied to create new artwork from a substitute photo. The disadvantage of this approach is that, in reality, such training pairs are uncommon: original source material, such as photographs, is seldom accessible for well-known artworks. NST does not need such a pairing; the algorithm requires only one piece of artwork to transfer its attributes. Style transfer is usually associated with images, but what about audio? Here, audio style transfer is achieved with the help of a convolutional neural network (CNN), adapting the image approach to audio. In this method, the inputs, the content and style audio, are modified by stripping each of their attributes and training them separately in order to conduct style transfer efficiently. Finally, all the singular fragments are put together to obtain an efficient audio style transfer output. The inputs for this method can be anything from a tune or song to a beat.

2 Literature Survey

Cheng et al. [1] proposed a "structure-preserving neural style transfer" model. It uses state-of-the-art methodologies prevalent in neural networks to achieve maximum success in the field of neural style transfer, focusing primarily on deep learning algorithms to achieve maximum efficiency in style transfer. The model is trained using stochastic gradient descent in order to minimize the loss functions. Although it uses state-of-the-art methodologies prevalent in this field, what the paper does not do is test these methodologies across the various models available for style transfer. The research paper [2] primarily focuses on two things: effectiveness (E), which measures the degree to which a given style has been imprinted/transferred to the content image, and coherence (C), a statistic which measures the extent to which the content of the original image is preserved.
Chen et al. [3] use a faster DCNN approach to divide the attributes present in the given inputs (a divide-and-conquer method) and create a singularity: it caters to every single divided entity and finally provides a meaningful outcome by putting all singular entities together. Rathi et al. [4] use the SANET model to capture both local and global style patterns within a single framework while preserving the originality of the content. This model uses a sound method, but with the emergence of more capable and efficient architectures like VGG-16, VGG-19 and ResNet, the SANET model became routine and mundane, without room for improvement and growth. Moreover, the approach of [5] uses data augmentation strategies such as data warping and oversampling to tackle the primary issue of overfitting. Even though this paper effectively delivers image transformation, it does not engage with the underlying principle at hand,

namely style transfer. It still contributes to areas such as image manipulation and geometric transformations, but it does not bode well for style transfer.
In [6] and [7], the content and style inputs were separated and trained, and finally an optimization technique was used to coalesce them to produce an efficient audio style transfer output. The only drawback of this methodology is that the model used has been overshadowed by newly arrived models which prove more efficient in the style transfer domain. Luan et al. [8] and Jing et al. [9] extended the work of Gatys on style transfer. These approaches used deep learning algorithms and employed these methodologies in the VGG-16 model to yield a perfect style-transferred output; however, with the arrival of more efficient style transfer networks, this method became dated.
To summarize, most of the papers mentioned above use CNNs to achieve neural style transfer, but none of them really discusses which model is more accurate and efficient. We therefore test these methodologies in more than one model and report the efficiency of each, so that those who attempt neural style transfer in the future will have the necessary data to choose an efficient model for producing accurate results [10, 11]. We also plan to employ video style transfer in order to increase the scalability of our model.

3 Proposed Work

To address the aforementioned shortcomings in the literature survey, MMNST is proposed. This model produces an efficient style transfer outcome by developing a neural network with respect to two different methodologies, namely VGG-19 (PyTorch) and TensorFlow Lite. The existing systems that we have come across do not actually give users a plethora of options regarding which style transfer is suitable for an individual and his needs. Several models used in existing systems produce style transfer, but not with a high degree of accuracy and efficiency [12, 13]. Furthermore, not many works have dealt with a multi-model approach. Hence, the objective is to achieve neural style transfer with the help of diverse models, in order to give people who attempt style transfer all the necessary data to pick the best possible model according to their needs and views [14, 15]. The proposed work focuses primarily on the two models mentioned below.

3.1 Image Style Transfer

There are two models used for image style transfer in this paper which are as follows:

3.1.1 VGG-19 (PyTorch Approach)

VGG-19 is a convolutional neural network (CNN) which consists of 19 layers (16 convolution layers, three fully connected layers, five MaxPool layers and one SoftMax layer) (Fig. 1). The network is trained on more than one million images from the ImageNet database, using 224 × 224 pixel colour images. Style transfer here is an optimization technique that takes a content and a style image and blends them together, so that the output image is transformed into something new with traces of both the content and style images (Fig. 2).
The layers used to break down the content and style images are conv1_1, conv2_1, conv3_1, conv4_1, conv4_2 and conv5_1. Here, conv4_2 is used as the content extractor, because higher layers of the CNN are preferred when dealing with the content image. The rest of the above-mentioned layers are used to break down the style image, as lower layers are sufficient to extract the style (Fig. 3).
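As an illustration, collecting activations at these layers with a pre-trained VGG-19 in PyTorch can be sketched as below. The numeric layer indices are an assumption based on torchvision's standard sequential layout of VGG-19, not something defined by this paper.

```python
import torch
import torchvision.models as models

# Convolutional part of the pre-trained VGG-19; weights stay frozen
vgg = models.vgg19(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Mapping from torchvision's sequential indices to the layer names used above
LAYERS = {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1',
          '19': 'conv4_1', '21': 'conv4_2', '28': 'conv5_1'}

def get_features(image, model=vgg, layers=LAYERS):
    """Pass a (1, 3, H, W) tensor through VGG-19, storing the chosen activations."""
    features, x = {}, image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x
    return features
```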

Fig. 1 VGG-19 architecture

Fig. 2 A general representation of image style transfer



Fig. 3 Workflow of image style transfer process

3.1.2 TensorFlow Lite Approach

TensorFlow Lite (TF Lite) is an open-source, cross-platform, production-ready deep learning framework that converts a pre-trained TensorFlow model into a special format that can be tuned for speed or storage. The special-format model is frequently deployed on edge devices, such as smartphones running Android or iOS, or Linux-based embedded devices such as the Raspberry Pi or microcontrollers, to run inference. A TF Lite model is efficient in terms of accuracy and is also a lightweight version that occupies less space; these features make TF Lite models a proper fit for mobile and embedded devices (Fig. 4).
This style transfer model consists of two sub-models:

Fig. 4 TensorFlow Lite workflow



1. Style Prediction Model: A MobileNetV2-based neural network capable of transforming an input style image into a 100-dimension style bottleneck vector.
2. Style Transform Model: A neural network that generates a stylized image by applying the bottleneck vector to the content image; a sketch of chaining the two models is given below.
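A minimal sketch of chaining the two sub-models with the TF Lite interpreter follows. The file names, input shapes and the input ordering of the transform model are assumptions for illustration; the real models define their own expected tensors.

```python
import numpy as np
import tensorflow as tf

# Placeholder inputs: float32 images scaled to [0, 1] (shapes are assumptions)
style_img = np.random.rand(1, 256, 256, 3).astype(np.float32)
content_img = np.random.rand(1, 384, 384, 3).astype(np.float32)

predict = tf.lite.Interpreter(model_path='style_predict.tflite')
transform = tf.lite.Interpreter(model_path='style_transform.tflite')
predict.allocate_tensors()
transform.allocate_tensors()

# 1. Style prediction: style image -> 100-dimension style bottleneck vector
predict.set_tensor(predict.get_input_details()[0]['index'], style_img)
predict.invoke()
bottleneck = predict.get_tensor(predict.get_output_details()[0]['index'])

# 2. Style transform: content image + bottleneck -> stylized image
inputs = transform.get_input_details()
transform.set_tensor(inputs[0]['index'], content_img)
transform.set_tensor(inputs[1]['index'], bottleneck)
transform.invoke()
stylized = transform.get_tensor(transform.get_output_details()[0]['index'])
```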
Here, MobileNetV2 is a convolutional neural network that is 53 layers deep. It is a compact architecture that implements lightweight deep convolutional neural networks via depth-wise separable convolutions and provides an efficient model for mobile and embedded vision applications. Calculation of the loss functions is also an important factor in image style transfer, and it applies to all the images, namely the content, style and output images. The general formulae for the style transfer loss functions are given below.
Content Loss

$$\mathcal{L}_{\text{content}}\left(\vec{p}, \vec{x}, l\right) = \frac{1}{2}\sum_{i,j}\left(F_{ij}^{l} - P_{ij}^{l}\right)^{2}$$

where
\vec{p}: the original image
\vec{x}: the generated image
l: layer
F^l_{ij}: activation of the ith filter at position j in the feature representation of \vec{x} in l
P^l_{ij}: activation of the ith filter at position j in the feature representation of \vec{p} in l

Layer-wise Style Loss

$$E_{l} = \frac{1}{4N_{l}^{2}M_{l}^{2}}\sum_{i,j}\left(G_{ij}^{l} - A_{ij}^{l}\right)^{2}$$

where
N_l: number of distinct feature maps
M_l: the height times the width of the feature map
G^l_{ij}: pairwise feature maps i and j in the style representation of \vec{x} in l
A^l_{ij}: pairwise feature maps i and j in the style representation of \vec{a} in l

Final Style Loss

$$\mathcal{L}_{\text{style}}\left(\vec{a}, \vec{x}\right) = \sum_{l=0}^{L} w_{l}E_{l}$$

where
\vec{a}: the original image
\vec{x}: the generated image
L: total number of all layers
w_l: weighting factor of the contribution of each layer to the total loss
E_l: loss in layer l

Total Loss

$$\mathcal{L} = \alpha\,\mathcal{L}_{\text{content}} + \beta\,\mathcal{L}_{\text{style}}$$

where α and β are user-defined hyperparameters. By controlling α and β, the quantity of content and style injected into the generated image can be controlled.
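These losses translate almost directly into code. The sketch below shows one possible PyTorch implementation under the assumption that features have already been extracted into dictionaries keyed by layer name (as in the VGG-19 sketch earlier); the layer weights w_l are illustrative.

```python
import torch

def gram_matrix(feat):
    # feat: (1, C, H, W) activation -> (C, C) Gram matrix of pairwise feature maps
    _, c, h, w = feat.size()
    f = feat.view(c, h * w)
    return f @ f.t()

def content_loss(F, P):
    # Squared error between generated (F) and content (P) activations at conv4_2
    return 0.5 * torch.sum((F - P) ** 2)

def style_loss(gen_feats, style_feats, weights):
    # Weighted sum of per-layer Gram-matrix differences, normalized by 4*N^2*M^2
    loss = 0.0
    for layer, wl in weights.items():
        G = gram_matrix(gen_feats[layer])
        A = gram_matrix(style_feats[layer])
        _, c, h, w = gen_feats[layer].size()
        loss += wl * torch.sum((G - A) ** 2) / (4 * c ** 2 * (h * w) ** 2)
    return loss

# total_loss = alpha * content_loss(...) + beta * style_loss(...)
```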

3.2 Audio Style Transfer

The main objective here is to adapt the "style transfer" concept to the audio domain. Specifically, the aim is to transfer the style of an audio clip labelled as the "style" to a different audio clip labelled as the "content", and synthesize a brand-new audio clip with the overall characteristics of the "style" while also remaining loyal to the "content". Through this approach, we take a step towards understanding the features of raw music audio signals, such as style, melody, rhythm and tempo, and efficiently produce style transfer without losing any of the aforementioned parameters. In this method, VGG-19 (CNN) is used to carry out audio style transfer: the model separates the attributes of both content and style inputs, optimizes them using the parameters designed for this approach, and eventually concatenates the inputs to get the output required by the user. Alongside the required libraries, two additional libraries, librosa and soundfile, were included and used for audio input/output (Fig. 5).
To increase the user’s control over the outcome, an optimisation technique is
introduced though which one can customize the output according to his/her needs.
Three parameters, namely: ALPHA, learning_rate and iterations, were introduced to
enable the user to control the outcome in his favour. For instance, a larger ALPHA,

Fig. 5 Architecture of this method. Input A is content audio, input B is style audio then, we initialize
the output with input A. To optimize the output, we use both content loss and style loss
212 B. Vishal et al.

Fig. 6 Workflow of audio style transfer

for example, implies more content in the output, however, ALPHA = 0 indicates no
content, minimizing stylization to texture generation Fig. 6.
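As an illustration, reading the two inputs as log-magnitude spectrograms with librosa and writing the optimized result back out with soundfile might look as follows; the FFT size and file names are assumptions, and Griffin–Lim is one common choice for phase reconstruction rather than the paper's prescribed method.

```python
import librosa
import numpy as np
import soundfile as sf

N_FFT = 2048  # assumed STFT size

def read_spectrogram(path):
    x, sr = librosa.load(path)           # waveform and sample rate
    S = librosa.stft(x, n_fft=N_FFT)     # complex spectrogram
    return np.log1p(np.abs(S)), sr       # log-magnitude fed to the network

content, sr = read_spectrogram('content.wav')
style, _ = read_spectrogram('style.wav')

def write_audio(mag, sr, path):
    # Invert the (optimized) magnitude spectrogram back to a waveform
    audio = librosa.griffinlim(np.expm1(mag), n_fft=N_FFT)
    sf.write(path, audio, sr)
```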

4 Results

The approaches used in this experiment have been rewarding, as they allowed us to control the outcome of the process and to restrict the losses incurred during audio style transfer. The execution speeds have been higher than those of previous models such as VGG-16, ResNet and SANet (Figs. 7 and 8).
The audio style transfer technique also uses parameters to shape the kind of output desired by the user. Using the parameters ALPHA, learning_rate and iterations, we can control the amount of content or style audio used to obtain the output. In this way, we can get multiple outputs for the same set of input data on the same platform, giving rise to scalability (Figs. 9, 10, 11, 12, 13, 14 and 15).

VGG-19 (PyTorch)

Fig. 7 Content and style images

Fig. 8 Result obtained with epoch values of 100 and 1000, respectively

Fig. 9 Average execution speed for the given data

5 Conclusion

Thus, the multi-model neural style transfer (MMNST) approach for style transfer was designed with the VGG-19 and TensorFlow Lite models and has been executed successfully. This approach is comparatively more accurate and efficient, and it also keeps the losses incurred during the process to a minimum. Thus, the objective of producing a style transfer outcome has been attained in an efficient and fulfilling manner.

Fig. 10 Iterations vs average time (in secs) for given data

TensorFlow-Lite

Fig. 11 Content and style images



Fig. 12 Results obtained with blending_ratios 0.18 and 0.97, respectively

Audio Style Transfer

Fig. 13 Spectrograms depicting content (left), style (middle) and result (right)

Fig. 14 Parameters used to control the output as shown in Fig. 11



Fig. 15 Alternate spectrograms of content (1st row), style (2nd row) and result (3rd row) audio
files

References

1. M.-M. Cheng, X.-C. Liu, J. Wang, S.-P. Lu, Y.-K. Lai, P.L. Rosin, Structure-preserving neural
style transfer, in IEEE Transactions on Image Processing, vol. 29 (2020)
2. M.-C. Yeh, S. Tang, A. Bhattad, C. Zou, D. Forsyth, Improving style transfer with calibrated
metrics, in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV) (2020)
3. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A.L. Yuille, Deeplab: semantic image segmentation with deep convolutional nets, Atrous convolution, and fully connected CRFs (2016)
4. P. Rathi, P. Adarsh, M. Kumar, Deep learning approach for arbitrary image style fusion
and transformation using SANET model, in 2020 4th International Conference Trends in
Electronics and Informatics (ICOEI) (2020)
5. C. Khosla, B.S. Saini, Enhancing performance of deep learning models with different
data augmentation techniques: a survey, in 2020 International Conference on Intelligent
Engineering and Management (ICIEM) (2020)
6. E. Grinstein, N.Q.K. Duong, A. Ozerov, P. Perez, Audio style transfer, in ASSP—IEEE
International Conference on Acoustics, Speech and Signal Processing (2018)
7. Z. Huang, S. Chen, B. Zhu, Deep leaning for audio style transfer
8. F. Luan, S. Paris, E. Schechtman, Deep photo style transfer, in 2017 IEEE Conference on CVPR
(July, 2017)

9. Y. Jing, Y. Yang, Z. Feng, J. Ye, Y. Yu, M. Song, Neural style transfer: a review. IEEE Trans.
Vis. Comp. Graphics 26(11) (2020)
10. P. Li, D. Zhang, L. Zhao, D. Xu, D. Lu, Style permutation for diversified arbitrary style transfer.
IEEE Access 8 (2020)
11. A.J. Champandard, Semantic style transfer and turning two-bit doodles into fine artworks, in
nucl.ai Conference (Mar, 2016.)
12. Y. Zhu, Y. Niu, F. Li, C. Zou, G. Shi, Channel-grouping based patch swap for arbitrary style
transfer, in 2020 IEEE International Conference on Image Processing (ICIP) (2020)
13. W. Ma, Z. Chen, C. Ji, Block shuffle: a method for high-resolution fast style transfer with
limited memory. IEEE Access 8 (2020)
14. A. Levin, D. Lischinski, Y. Weiss, A closed-form solution to natural image matting. IEEE
Trans. Pattern Anal. Mach. Intell. (2008)
15. M. Pasini, MelGAN-VC: voice conversion and audio style transfer on arbitrarily long samples
using Spectrograms (2019)
Forecasting of COVID-19 Using
Supervised Machine Learning Models

Y. Vijay Bhaskar Reddy, Vyshnavi Adusumalli,


Venkata Bharath Krishna Boggavarapu, Mahesh Babu Bale,
and Archana Challa

Abstract Machine learning (ML) models have proved significant in forecasting to improve decision-making, and have long been used across application domains, including the identification of adverse risk factors for a hazard. This research demonstrates the ability of machine learning models to predict the number of patients who will be infected by COVID-19, a virus that poses a danger to humanity. The three standard forecasting models used in this study were Linear Regression, Support Vector Machine (SVM), and Exponential Smoothing. Each of these models produces forecasts for the next 20 days: newly confirmed cases, new deaths, and new recoveries. The results of the study suggest that these models are well suited to the most recent COVID-19 analysis, with ES performing better than all the others.

Keywords COVID-19 · Supervised machine learning · Future forecasting · R2


score · Exponential smoothing

1 Introduction

1.1 Machine Learning

Over the last decade, machine learning has established itself as a prominent field of study by solving complex problems, and it has applications in many fields (Table 1).
ML algorithms play a major role in processing complex datasets such as COVID-19 data; some follow an if-else approach, while others learn by trial and error. Forecasting is one of the most important applications of ML [1]. Many forecasting algorithms have been used for predicting specific diseases, such as

Y. V. B. Reddy (B) · V. Adusumalli · V. B. K. Boggavarapu · M. B. Bale · A. Challa


Department of Computer Science and Engineering, Lakireddy Bali Reddy College of
Engineering, Mylavaram, Krishna District, Andhra Pradesh 521230, India


Table 1 Applications of machine learning

Business   Healthcare           Smart vehicle applications   Natural language processing
Sports     Entertainment        Image processing             Climate change
Robotics   Voicing prediction   Stock market prediction      Disease prediction

coronary artery disease [2, 3], cardiovascular disease [3], breast cancer prediction [4] and, in particular, COVID-19 prediction [5]. This study forecasts the COVID-19 outbreak and early responses, helping decision-making and effective management of the disease.

1.2 COVID-19

The aim of this study is to create a model for predicting the spread of COVID-19, a disease caused by the novel coronavirus SARS-CoV-2. COVID-19 is short for coronavirus disease 2019; formerly, the disease was referred to as the "2019 novel coronavirus". The virus was first identified in the Chinese city of Wuhan at the end of 2019; its common symptoms are listed in Table 2.
Because of the causes of its spread and the threat it poses, almost every country has declared either partial or complete lockdowns. Some people show severe symptoms while others show none at all. Medical researchers are working on vaccines for this virus; some countries have succeeded in vaccine trials and are administering vaccines, yet people are still affected by the virus even when vaccinated, and severe cases ultimately lead to death [6]. To contribute to this situation, various researchers are studying different dimensions of the pandemic to help humanity.
We aimed to design a COVID-19 forecasting system to help with the global humanitarian crisis. For the next 20 days of the outlook, we look at three main variables [7, 8]:
I. The overall confirmed cases
II. The overall recoveries
III. The overall deaths.
Supervised machine learning models are used for this forecasting. They are

Table 2 Symptoms of COVID-19

Fever                    Coughing    Shortness of breath
Trouble breathing        Headache    Sore throat
Loss of smell or taste   Tiredness   Loss of speech

1. Linear Regression
2. Support Vector Machine
3. Exponential Smoothing.
The learning models were trained using Johns Hopkins University’s COVID-19
patient statistics dataset, which is available on GitHub. The dataset was pre-processed
and split into two subsets: Training and Testing. In terms of significant factors, the
performance has been achieved. They are
• R-square score
• Mean Square Error
• Mean Absolute Error
• Root Mean Square Error.

2 Materials and Methods

2.1 Dataset

The aim of this research is to provide better insight into detecting and diagnosing COVID-19 based on the overall cases. Several ML algorithms are utilized to analyze these datasets; they are applied to the GitHub repository dataset provided by the Johns Hopkins Whiting School of Engineering.
This dataset covers countries across the world with day-wise confirmed cases, day-wise death cases and day-wise recovery cases. All data comes from the regular case report, which is revised once a day [7] (Figs. 1, 2 and 3).
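As an illustration, the day-wise global series can be assembled from the repository with pandas along these lines; the raw file paths are those published in the Johns Hopkins CSSE repository at the time of writing and should be verified before use.

```python
import pandas as pd

BASE = ('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/'
        'csse_covid_19_data/csse_covid_19_time_series/')

confirmed = pd.read_csv(BASE + 'time_series_covid19_confirmed_global.csv')
deaths = pd.read_csv(BASE + 'time_series_covid19_deaths_global.csv')
recovered = pd.read_csv(BASE + 'time_series_covid19_recovered_global.csv')

# One row per region, one column per date; summing the date columns
# (after the four metadata columns) gives world-wide day-wise totals
dates = confirmed.columns[4:]
world_confirmed = confirmed[dates].sum(axis=0)
world_deaths = deaths[dates].sum(axis=0)
world_recovered = recovered[dates].sum(axis=0)
```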

2.1.1 Sample Data of Confirm Cases

See Fig. 1.

Fig. 1 Sample data of confirm cases



Fig. 2 Sample data of death cases

Fig. 3 Sample data of recover cases

2.1.2 Sample Data for Death Cases

See Fig. 2.

2.1.3 Sample Data for Recovery Cases

See Fig. 3.

2.2 Supervised Machine Learning Models

Supervised machine learning models deal with labeled data, i.e., datasets that contain both input and output parameters. Supervised models are of two types: (1) regression and (2) classification. A regressor is trained using regression; the trained model then makes predictions based on the input data or dataset. For model development, the learning methods may use regression or classification algorithms.

For this study, three regression models are used. They are
• Linear Regression (LR)
• Support Vector Machine (SVM)
• Exponential Smoothing (ES).

2.2.1 Linear Regression (LR)

In LR, the target is determined from the independent variable. Linear regression models the relationship between two variables, one dependent and one independent, and is a type of regressive model useful as a statistical technique for predictive analysis. Each observation involves two values, one dependent and one independent, and the model determines the relation between them. For the linear regression model, two variables (x, y) are required; the relationship can be shown as

$$y = mx + c$$

To get a good regression result, the best values of c and m must be found. This is posed as a minimization problem: the gap between the real and predicted values should be made as small as possible.

2.2.2 Support Vector Machine (SVM)

SVM is a kind of supervised machine learning technique with which both regression and classification can be accomplished, though it is most commonly used to solve classification problems. The SVM model separates different groups with a hyperplane in a multi-dimensional space [8]; the hyperplane is constructed in an iterative process so as to reduce the error. For regression, the algorithm fits a linear function; when dealing with non-linear regression problems, it first maps the input vector x to an n-dimensional feature space z via a non-linear mapping, and linear regression is then applied in that space.

2.2.3 Exponential Smoothing (ES)

Exponential smoothing is a rule-of-thumb technique for smoothing time series data using the exponential window function, and it is often used for analyzing time-series data. Forecasting in this model uses data from previous periods, where the weight of past observations decays exponentially over time. The current-time forecast is given by

$$F(t) = \alpha A(t-1) + (1-\alpha)F(t-1)$$
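A minimal sketch of producing the 20-day forecasts with the three models is given below; the SVM hyperparameters and the additive-trend choice for ES are illustrative assumptions, and y is the day-wise cumulative series (e.g., world_confirmed from the loading sketch above).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from statsmodels.tsa.holtwinters import ExponentialSmoothing

y = world_confirmed.values                    # cumulative day-wise totals
X = np.arange(len(y)).reshape(-1, 1)          # days since start as the feature
future = np.arange(len(y), len(y) + 20).reshape(-1, 1)

lr = LinearRegression().fit(X, y)
svm = SVR(kernel='poly', C=1, degree=5).fit(X, y)   # assumed hyperparameters
es = ExponentialSmoothing(y, trend='add').fit()

lr_forecast = lr.predict(future)
svm_forecast = svm.predict(future)
es_forecast = es.forecast(20)                 # next 20 days
```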



2.3 Evolution Parameters

2.3.1 R2 Score

R-squared is a statistical measure that provides insight into the fit of a model, calculated on a scale of 0–100%. It is also called the coefficient of determination or, with multiple predictors, the coefficient of multiple determination. The goodness of the trained model is determined by its R2 value, which ranges from 0 to 100%:
• 0% indicates a model that explains none of the variability of the response data around its mean.
• 100% indicates a model that explains all of the variability in the response data around its mean.

2.3.2 Mean Absolute Error (MAE)

This is the average absolute difference between the model predictions and the real results on test data. Its range of values is 0 to infinity. Lower values indicate more effective learning models, which is why MAE is known as a negatively oriented score.

2.3.3 Mean Square Error (MSE)

MSE averages the squared gap between the data points and the regression line. Since squaring eliminates the negative sign from each value, it gives greater weight to larger differences. The lower the mean squared error, the closer one is to finding the line of best fit.

2.3.4 Root Mean Square Error (RMSE)

RMSE is defined as the standard deviation of the prediction errors. The errors, also called residuals, are computed between the best-fit line and the original data points; RMSE thus measures how the individual data points cluster around the best-fit line.
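For reference, with $y_i$ the observed values, $\hat{y}_i$ the predictions and $\bar{y}$ the mean of the observations, the four evaluation measures can be written compactly as

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|, \qquad \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$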

3 Methodology

This research addresses forecasting for COVID-19, which has proven to be a current and imminent danger to human life, resulting in thousands of deaths every day. This project aims to produce forecasts of death cases, recovery cases, and confirmed cases in order to help monitor the pandemic situation. After the initial planning phase, the data collection is divided into two subsets, a training set and a test set. LR, SVM, and ES are the models used in this research to forecast the number of infections, deaths, and recoveries for the next twenty days. The analytical data includes summary tables for each day of the time series: the number of confirmed cases in the days following the pandemic's spread, deaths, and recovery data. The global data on the daily number of cases, deaths, and recoveries was processed at the start of this study (Figs. 4, 5 and 6).

Fig. 4 Day wise total confirm cases

Fig. 5 Day wise total death cases

Fig. 6 Day wise total recover cases

3.1 Day Wise Total Confirm Cases

See Fig. 4.

3.2 Day Wise Total Death Cases

See Fig. 5.

3.3 Day Wise Total Recovery Cases

See Fig. 6.

4 Proposed Work Flow

The information was gathered from a GitHub source and then pre-processed. The data is divided into two sets, training and testing: the model is trained on the training data and then tested, and evaluation parameters are used to assess the accuracy (Fig. 7).

Fig. 7 Proposed work flow



Fig. 8 Accuracy

Fig. 9 LR forecasts infected cases for the next 20 days

5 Results

5.1 Confirmed Cases Forecasting

See Figs. 8, 9, 10, 11, 12, 13, and 14.

5.2 Death Cases Forecasting

See Figs. 15, 16, 17, 18, 19, and 20.

5.3 Recovery Cases Forecasting

See Figs. 21, 22, 23, 24, 25, 26, 27, 28, and 29.

Fig. 10 SVM forecasts infected cases for the next 20 days

Fig. 11 ES forecasts infected cases for the next 20 days

6 Conclusion

Using machine learning models, this study builds a forecasting method for predicting the number of cases affected by COVID-19. The datasets used in this analysis provide information from regular reports on newly confirmed cases, new recoveries, and new death cases. As the death and confirmation rates rise day by day, the world's condition is deteriorating, and the number of people who could be affected by COVID-19 in various countries around the world is unknown. The aim of this analysis is to forecast the number of confirmed cases, recovered cases, and deaths over the next 20 days. The three machine learning models LR, SVM, and ES are used to make this forecast of the number of new cases, recoveries, and deaths.

Fig. 12 Mortality rate

Fig. 13 Comparison of cases over top-5 countries in infected cases

Fig. 14 Moving average of 20 days on confirmed cases

Fig. 15 LR forecasts death cases for the next 20 days



Fig. 16 SVM forecasts death cases for the next 20 days

Fig. 17 ES forecasts death cases for the next 20 days



Fig. 18 Moving average of 20 days on death cases

Fig. 19 Comparison of cases over top-5 countries in death cases

Fig. 20 Accuracy

Fig. 21 Accuracy

Fig. 22 LR forecasts recovery cases for the next 20 days

Fig. 23 SVM forecasts recovery cases for the next 20 days



Fig. 24 ES forecasts
recovery cases for the next
20 days

Fig. 25 Level of recovery



Fig. 26 Moving average of 20 days on recovery cases

Fig. 27 Comparison of cases over top-5 countries in recovery cases



Fig. 28 Recoveries over deaths

Fig. 29 Comparison between confirm, recovery, death cases

References

1. G. Bontempi, S.B. Taieb, Y.-A. Le Borgne, Machine learning strategies for time series
forecasting, in Proceedings of European Business Intelligence Summer School (2012), pp. 62–77
2. P. Lapuerta, S.P. Azen, L. Labree, Use of neural networks in predicting the risk of coronary
artery disease. Comput. Biomed. Res. 28(1), 38–52 (1995)
3. K.M. Anderson, P.M. Odell, P.W. Wilson, W.B. Kannel, Cardiovascular disease risk profiles.
Amer. Heart J. 121(1), 293–298 (1991)
4. H. Asri, H. Mousannif, H.A. Moatassime, T. Noel, Using machine learning algorithms for breast
cancer risk prediction and diagnosis. Procedia Comput. Sci. 83, 1064–1069 (2016)

5. F. Petropoulos, S. Makridakis, Forecasting the novel coronavirus COVID-19. PLoS ONE 15(3)
(2020)
6. L. van der Hoek, K. Pyrc, M.F. Jebbink, W. Vermeulen-Oost, R.J. Berkhout, K.C. Wolthers et al., Identification of a new human Coronavirus. Nat. Med. 10(4), 368–373 (2004)
7. R.S.M. Lakshimi, B.T. Rao, M.R. Murty, Predictive time series analysis and forecasting of
COVID-19 dataset. in Recent Patents on Engineering (2021)
8. V. Bhateja, A. Gautam, A. Tiwari, S.C. Satapathy, N.G. Nhu, D.N. Le, Haralick features-based
classification of mammograms using SVM (2018)
Feature Extraction from Radiographic
Skin Cancer Data Using LRCS

V. S. S. P. Raju Gottumukkala, N. Kumaran, and V. Chandra Sekhar

Abstract Identification of an object in nature depends on its behaviour as well as its features, and computer vision has the adequacy and capability for much of this identification. In the last decade, cancer has emerged as a hazardous and fatal disease affecting both the inside and the outside of the body. For cases located inside the body, detection is relatively easy; cases on the outside are termed skin cancer. There are two types of tests for identifying skin cancer. Radiography involves scanning X-ray images for features such as lopsidedness, Rim irregularity, colour intensity and span (LRCS) and diameter, which are useful for grading diagnoses. Skin cancer prediction is done by an algorithm for determining the malignancy of the lesions. Furthermore, dermoscopic images can help in the identification of the tumour, and artificial intelligence techniques can be used for the classification of cancer. A pre-processing stage precedes the extraction of features following the LRCS model, which is the main objective of this paper. The results of feature extraction are discussed, with encouraging findings for the LRCS type of processing.

Keywords Asymmetry · Skin cancer · Feature selection · Feature detection ·


LRCS (Lopsidedness, Rim irregularity, Colour intensity, Span)

1 Introduction

In the current era, skin cancer has emerged as one of the most dangerous types of cancer diagnosed in human beings. By classification, skin cancer can be divided into several categories, including melanoma, basal cell carcinoma and squamous cell carcinoma. Of these, melanoma is the most dangerous as well as the most unpredictable.

V. S. S. P. R. Gottumukkala (B) · N. Kumaran


Department of Computer Science and Engineering, Annamalai University, Chidambaram, Tamil
Nadu, India
V. C. Sekhar
Department of Computer Science and Engineering, SRKR Engineering College, Bhimavaram,
Andhra Pradesh, India


Even though its occurrence is only 4% of all types of skin cancer, its fatality rate crosses 70%. It may be noted that image processing is widely used for the diagnosis of such cancer. Dermoscopy is a non-invasive analytical procedure in which oil immersion technology helps in the visual examination of skin surface structure [1].
The diagnostic accuracy in melanoma detection is quite often dependent on the human aspect of the dermatologist's training and assessment. It is pertinent to note that the diagnosis of melanoma is relatively complex to work out in the early stages; automated diagnostic tools can hence be more effective. Beyond current dermoscopic techniques, there are computer-aided identifications of melanoma based on artificial neural networks [1, 2].
It is quite distressing to note that incidence has risen in the USA over the past three decades, with fatalities due to accelerated metastases. Invasive melanoma is nearly 1% of all types of skin cancer; however, it is responsible for a major proportion of the actual deaths related to skin cancer in the USA. As many as 91,270 people were affected by melanoma in 2018, accounting for over 91% of diagnosed cases of skin cancer, linked to 9320 fatalities. In this context, it is essential to develop technologies for detecting malignant melanoma in the early stages, as even experienced dermatologists find it difficult to diagnose properly. Research work has been carried on continuously on various techniques since the late 1990s for advancing automated analysis of dermatology images, required to assist physicians in enhancing clinical diagnosis [1, 2]. With the advent of computer vision, many computerized tools are deployed in hospitals to aid the doctor in early diagnosis of skin cancer. The current approach incorporates key phases for the identification of skin lesions, covering comparison with grouping [2, 3].
It is expected that system-based research could reduce the time of diagnosis and improve the precision of selection. Given greater diversity coupled with limited expert knowledge, skin diseases are quite challenging for reliable diagnosis, particularly in developing and underdeveloped countries having weak health infrastructure. It is worth noting that early detection reduces the chance of fatalities, while environmental deterioration generally aggravates the possibility of skin cancer. The general phases of these diseases are as follows:
Stage one: in situ disease, survival 99.9%
Stage two: high-risk disease, survival 45–79%
Stage three: regional metastasis, survival 24–30%
Stage four: remote metastasis, survival 7–19% [1, 4].

2 Associated Mechanisms

The literature survey conducted for this paper included twelve research papers covering various authors and implementation approaches. Significant contributions are summarized as follows.
Angunane et al. [1, 5] have addressed the issues of artificial neural networks and their AI divisions covering computer vision. Artificial neural networks (ANNs) are useful in radiology, urology, cardiology and oncology, and neural networks can help in handling highly variable features. The main parameters of melanoma are asymmetry, boundary, colour, diameter, etc.; based on a MATLAB procedure, the approach enhances the success rate, and the ABCD rules for melanoma skin cancer can be coupled with an ANN for classification once the network is trained for the task. The precision achieved in classification is around 96.9%. Lee et al. [2, 6] used 88 images of melanoma for analysis and suitable segmentation. The three steps of their method correspond to the traditional stages, affecting the sensitivity, precision and consistency. It may be noted that, for evaluating segmentation, expert-marked segmentation is often used as the ground truth. The three measures effected a significant change in the process of diagnosis.
Celebi et al. [7, 8] built an open-source application for convolutional neural networks. With the availability of graphics processing units, image analysis of diabetic retinopathy has of late become an essential area of study. The authors have presented a short review of the exciting sub-fields of segmentation and classification which can be used as a partial directive for researchers.
Sugar et al. [9, 10] used a feature-extraction-first approach for texture-based feature extraction, achieving a high level of precision with an ANN classification method using a multilayer perceptron (MLP). Based on 23 melanoma pictures, the results for four unseen images ranged from 80% accuracy at the lowest to 88.8% at the highest.
Pham et al. [11, 12] evaluated the classification outcomes of six classifiers along with seven attribute techniques and four data pre-processing steps on two of the most extensive skin cancer datasets. The framework, applying linear normalization of the input as a data processing step, HSV as a feature extraction device and balanced random forest as a classifier on the HAM10000 dataset, achieved 81.464% AUC and 74.5% precision, with a further reported score of 90.09%.
It is worth noting that many studies have discussed melanoma skin cancer detection based on feature extraction techniques; after noise-aware segmentation, border identification methods can be used, with ANNs and support vector machines focused especially on the identified regions. The first of the five parts of this research endeavour is the introduction, followed by the literature survey. The third part discusses feature extraction, whilst the fourth part elaborates on the results and inference of the feature selection mechanisms. The last part covers the conclusion.

3 Proposed System

It is known that the root cause of skin cancer is the abnormal growth of cells on the skin. The major types of skin cancer are basal cell carcinoma, squamous cell carcinoma and melanoma. If not detected early, it spreads in such a way that it is hard to treat, leading to fatalities. Mainly, it is detected early by visual diagnosis along with clinical screening, followed by dermoscopic analysis, histopathological assessment and biopsy. With the advent of artificial intelligence and machine learning, the chance of building an ideal machine for early detection of cancer has become fully realizable. Due to the fine-grained differences in skin lesion appearance, automated classification from images is quite difficult. For ideal operation, finely grained objects are categorized: images are pre-processed and features are extracted by the lopsidedness, Rim irregularity, colour intensity and span (LRCS) method. For developing a skin cancer prediction algorithm, it is pertinent to note that spreading to different parts of the body can make the disease very dangerous. The LRCS features are used for prediction, and dermoscopic images are of use in identifying tumours. To achieve performance on a par with experts across the globe, such a system must be used for skin cancer classification; it performs pre-processing and segmentation before the extraction of features.

3.1 About Dataset

The dataset used is from the International Skin Imaging Collaboration (ISIC) and contains images of melanoma skin cancer. The ISIC project aims to improve early diagnosis in all the categories. There are 2300 photographs, from which 100–150 images of each category are obtained, processed or enhanced.

3.2 System Architecture and Algorithms

This section discusses the feature extraction from the skin cancer images; the extracted features are stored as training set data, and the LRCS architecture is shown in Fig. 1. The first step is to take the raw images and apply segmentation techniques that divide the image into regions so that the correct spots of cancer are identified. Segmentation is done by (a) OTSU segmentation, (b) modified OTSU segmentation or (c) watershed segmentation, and feature extraction is then based on colour, shape, size and texture. Pre-processing and hair removal are done by a transform, followed by shade/shadow removal; hair, lines and elliptical/circular artefacts are removed so that they do not distort the lesion in case of heavy coverage. A clear image of the tumour is then available for further enhancement.

Fig. 1 System architecture for feature extraction mechanism (LRCS)
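To make this step concrete, the following minimal Python sketch (our own illustration using the OpenCV library; it is not taken from the paper) applies Otsu thresholding and a watershed refinement to obtain the lesion mask described above:

import cv2
import numpy as np

def otsu_segment(image_bgr):
    # Otsu thresholding automatically separates the darker lesion from skin.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)  # suppress hair and noise
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return mask  # 255 marks lesion pixels

def watershed_segment(image_bgr, mask):
    # Watershed refinement seeded from the Otsu mask.
    kernel = np.ones((3, 3), np.uint8)
    sure_bg = cv2.dilate(mask, kernel, iterations=3)
    dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist, 0.6 * dist.max(), 255, cv2.THRESH_BINARY)
    sure_fg = sure_fg.astype(np.uint8)
    unknown = cv2.subtract(sure_bg, sure_fg)
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1            # background becomes label 1
    markers[unknown == 255] = 0      # 0 = region for watershed to resolve
    markers = cv2.watershed(image_bgr, markers)
    return (markers > 1).astype(np.uint8) * 255  # refined lesion mask

The modified OTSU variant mentioned above would replace the plain Otsu threshold with a locally adapted one; the watershed step then sharpens the lesion border before feature extraction.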

3.3 LRCS Algorithm

The LRCS standard features (lopsidedness, rim irregularity, colour intensity and span) are used, in which lopsidedness means an uneven area and rim irregularity means irregular boundaries; the colour intensity and the span of the area are the other two parameters. The algorithm steps are as follows.
1. Apply the segmentation

\[ A_i = \frac{|X_i \cap G|}{|X_i \cup G|}, \quad i = 1, 2 \tag{1} \]

G ground truth segmentation of the expert
X1 segmentation by 'Method 1'
X2 segmentation by 'Method 2'

2. To apply the LRCS

Assume B, pB and B* are the segmented lesion, the centre of the circumscribed rectangle of B and the region symmetric to B with respect to pB, respectively. Then n_a is defined as follows:

\[ n_a = \frac{1}{2}\,\frac{|B \setminus B^{*}| + |B^{*} \setminus B|}{|B|} \tag{2} \]

Assume B̂ is the convex hull of the lesion B, and ∂B, ∂B̂ are the boundaries of B and B̂, respectively; then

\[ n_b = \frac{1}{|\partial B|} \sum_{x \in \partial B} \operatorname{dist}\left(x, \partial \hat{B}\right) \tag{3} \]

3. The features are extracted to find the attributes

The above algorithm determines the skin cancer image attributes: segmentation is applied to identify the affected area, from which the lopsidedness and rim irregularity are computed. The RGB values and density of the skin's affected area are also obtained, and these attributes are stored.
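As a hedged illustration of Eqs. (2) and (3), the sketch below (our own code, assuming an 8-bit binary lesion mask as input and taking B* as the point reflection of B about its centroid) computes the two shape features:

import cv2
import numpy as np

def lopsidedness(mask):
    # n_a of Eq. (2): symmetric difference between the lesion B and its
    # point reflection B* about the centroid, normalized by 2|B|.
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    M = cv2.getRotationMatrix2D((cx, cy), 180, 1.0)  # reflection through centroid
    reflected = cv2.warpAffine(mask, M, (mask.shape[1], mask.shape[0]))
    b, b_star = mask > 0, reflected > 0
    return 0.5 * np.logical_xor(b, b_star).sum() / b.sum()

def rim_irregularity(mask):
    # n_b of Eq. (3): mean distance from points on the lesion boundary
    # to the boundary of the lesion's convex hull.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea)   # boundary of B
    hull = cv2.convexHull(contour)                 # boundary of B-hat
    dists = [abs(cv2.pointPolygonTest(hull,
                                      (float(p[0][0]), float(p[0][1])), True))
             for p in contour]
    return float(np.mean(dists))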

4 Results and Discussion

In this section, the LRCS algorithm is applied to skin cancer-affected images to obtain attribute information and extract specific features by applying segmentation and colour detection; the area of the affected region and the irregularity of the skin cancer are determined using OpenCV. The affected area of skin cancer is identified by applying the GT mask with the segmentation technique, as shown in Fig. 2, which clearly illustrates identifying the area of skin cancer with the normal mask, GT mask and predicted mask under the segmentation technique.

The second step is to determine the location affected by skin cancer using segmentation, followed by applying lopsidedness, rim irregularity, colour intensity and span as the feature extraction algorithm. In this second step, first, the asymmetry area of the skin cancer is found, with the corresponding diagrams shown in Fig. 3a, b. Second, the area density is found using segmentation with masks, with the corresponding diagram shown in Fig. 3c; finally, the centre of the location is found, as shown in Fig. 3d.
After completion of the third and fourth steps, the values are obtained and the features of the image are extracted: image id, area of the affected region, perimeter, maximum diameter and minimum diameter, horizontal asymmetry and vertical asymmetry, max red, max green, max blue, minimum red, minimum green, minimum blue, hue, saturation and value, as listed in Table 1.
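A minimal sketch of how such an attribute record could be assembled with OpenCV is given below (our own illustration; the keys mirror the Table 1 columns, and the asymmetry columns would come from the lopsidedness computation shown earlier):

import cv2
import numpy as np

def extract_attributes(image_bgr, mask):
    # Illustrative extraction of Table 1 style attributes for the lesion.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)
    lesion = image_bgr[mask > 0]                        # N x 3 BGR pixels
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)[mask > 0]
    b, g, r = lesion[:, 0], lesion[:, 1], lesion[:, 2]
    return {
        "area": float(cv2.contourArea(contour)),
        "perimeter": float(cv2.arcLength(contour, True)),
        "max_r": int(r.max()), "max_g": int(g.max()), "max_b": int(b.max()),
        "min_r": int(r.min()), "min_g": int(g.min()), "min_b": int(b.min()),
        "hue": int(hsv[:, 0].mean()),
        "saturation": int(hsv[:, 1].mean()),
        "value": int(hsv[:, 2].mean()),
    }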
Table 2 indicates 11 features of the image for different areas randomly applied to the dataset. Generally, the area of a skin cancer image gives information about whether that particular part has skin cancer or not, and it is determined using the 11 features mentioned in Table 2. Figure 4 indicates the 11 features of the image for areas randomly applied to the dataset. The best feature values are obtained for the area 152,221 and the worst feature values for the area 258,643. As indicated in the graph, different colours of each rectangular box represent the 11 features obtained by applying different areas to the dataset, as shown in Fig. 4.

The perimeter indicates the exact extent of the skin cancer, and different perimeter values are obtained for different areas. Table 1 indicates the 11 features of the image with the different perimeter values obtained by randomly applying different areas to the dataset. Figure 5 indicates the 11 features of the image with different perimeter values for areas randomly applied to the dataset. The best feature values are observed for the perimeter 9.66 and the worst values for the perimeter 30.73. As shown in the graph, different colours of each rectangular box represent the 11 features obtained by applying different perimeter values, as shown in Fig. 5.
Figures 6 and 7 show the features of the image of skin cancer taking the maximum diameter as 600 and the minimum diameter as 450. The maximum and minimum diameters indicate the range and depth of the skin cancer. In this algorithm, a maximum diameter of 600 is considered because this is the highest value for severe skin cancer, and a minimum diameter of 450 because this is the highest value for normal skin cancer. Table 1 shows the 11 feature values for the maximum diameter 600 with different area values, with the graphical view shown in Fig. 6, and the 11 feature values for the minimum diameter 450 with different area values, with the graphical view shown in Fig. 7.

Fig. 2 Applying the normal mask, GT mask and pred mask segmentation technique to find out the affected area

Fig. 3 a, b Finding the asymmetry of the area of skin cancer, c applying the segmentation masking to find the area density, d the colour density and centre of the location are found

The results of the extraction on the dataset prove the effectiveness of the LRCS parameters. With these datasets, it is convenient for classification to find out the exact nature of the cancer. Also, we can assess the effective area of the cancer and its severity and depth at a particular location.

5 Conclusions

In skin cancer, early detection is the most critical factor for the health sector as well as for the patient; an object detection method is suitable for such diagnosis of skin cancer characteristics. This paper has focussed mainly on feature extraction of the LRCS parameters to identify attributes like maximum diameter, minimum diameter, horizontal asymmetry, vertical asymmetry, maximum green, maximum blue, maximum red and texture values. A customized algorithm has been developed for convenient extraction of the uneven area of the affected skin, and the colour density of the skin cancer-affected area is extracted from around 2500 images; the determined attributes are stored as datasets. In the future, classification of the feature-enhanced data can be carried out by machine learning algorithms on top of this technique.
Table 1 Extracted features of the image of skin cancer on the dataset
Area Perimeter Maxdia Mindia h_asym v_asym Maxr Maxg Maxb Minr Ming Minb H S V
210,136 881.74 600 450 0.367 0.562 255 255 255 0 2 25 14 140 90
188,162 48.63 600 450 0.079 0.079 254 241 255 0 0 0 12 140 89
190,534 30.73 600 450 0.27 0.27 216 209 238 0 0 0 11 140 90
246,518 49.56 600 450 0.742 0.192 255 243 250 11 11 44 15 140 90
152,221 71.94 600 450 0.894 0.631 214 211 255 0 0 52 9 255 147
239,795 709.31 600 450 0.4 0.418 249 231 245 0 2 45 9 255 147
221,596 15.9 600 450 0.279 0.687 211 193 252 35 24 101 2 187 147
221,633 9.66 600 450 0.634 0.012 210 199 253 6 3 27 5 221 147
215,275 14.83 600 450 0.394 0.754 190 182 240 17 20 77 6 207 147
250,307 574.5 600 450 0.719 0.154 255 243 242 7 0 7 7 137 88
258,643 406.37 600 450 0.281 0.313 234 214 226 2 9 37 14 251 147
171,692 32.38 600 450 0.361 0.452 231 222 229 3 4 28 1 122 88
233,875 61.94 600 450 0.099 0.27 242 228 255 0 7 25 11 139 90
193,000 43.21 600 450 0.013 0.013 187 172 251 23 16 98 5 201 147

Table 2 Features of the image of skin cancer by variation of the area taken into consideration
Area h_asym v_asym Maxr Maxg Maxb Minr Ming Minb H S V
210,136 0.367 0.562 255 255 255 0 2 25 14 140 90
188,162 0.079 0.079 254 241 255 0 0 0 12 140 89
190,534 0.27 0.27 216 209 238 0 0 0 11 140 90
246,518 0.742 0.192 255 243 250 11 11 44 15 140 90
152,221 0.894 0.631 214 211 255 0 0 52 9 255 147
239,795 0.4 0.418 249 231 245 0 2 45 9 255 147
221,596 0.279 0.687 211 193 252 35 24 101 2 187 147
221,633 0.634 0.012 210 199 253 6 3 27 5 221 147
215,275 0.394 0.754 190 182 240 17 20 77 6 207 147
250,307 0.719 0.154 255 243 242 7 0 7 7 137 88
258,643 0.281 0.313 234 214 226 2 9 37 14 251 147
171,692 0.361 0.452 231 222 229 3 4 28 1 122 88
233,875 0.099 0.27 242 228 255 0 7 25 11 139 90
193,000 0.013 0.013 187 172 251 23 16 98 5 201 147

Fig. 4 Features of the image of skin cancer by variation of area

Fig. 5 Features of the image of skin cancer by variation of perimeter

Fig. 6 Comparison of features of the image of skin cancer by taking maximum diameter as 600

Fig. 7 Features of the image of skin cancer by taking minimum diameter as 450

References

1. H.R. Mhaske, D.A. Phalke, Melanoma skin cancer detection and classification based on super-
vised and unsupervised learning, in Proceeding of the International Conference on Circuits,
Controls, and Communications (CCUBE), 1–5 Jan 2013, Bangalore, India (IEEE, 2013)
2. H. Lee, K. Kwon, Diagnostic techniques for improved segmentation, feature extraction, and
classification of malignant melanoma. Biomed. Eng. Lett. 10, 171–179 (2020)
3. M.A. Khan, M. Sharif, T. Akram, S.A.C. Bukhari, R.S. Nayak, Developed Newton-Raphson
based deep features selection framework for skin lesion recognition. Pattern Recogn. Lett. 129,
293–303 (2020)
4. M.M. Vijayalakshmi, Melanoma skin cancer detection using image processing and machine
learning. Int. J. Trend Sci. Res. Dev. (IJTSRD) 3, 780–784 (2019)
5. C. Magalhaes, J. Mendes, R. Vardasca, The role of AI classifiers in skin cancer images. Skin
Res. Technol. 25, 750–757 (2019)
6. A. Murugan, S.A.H. Nair, K.S. Kumar, Detection of skin cancer using SVM, Random Forest,
and kNN classifiers. J. Med. Syst. 43, 269 (2019)
7. M.E. Celebi, N. Codella, A. Halpern, Dermoscopy image analysis: overview and future
directions. IEEE J. Biomed. Health Inform. 23, 474–478 (2019)
8. S. Majumder, M.A. Ullah, Feature extraction from dermoscopy images for melanoma diagnosis.
SN Appl. Sci. 1, 753 (2019)
9. Y. Sugiarti, J. Na’am, D. Indra, J. Santony, An artificial neural network approach for detecting
skin cancer. TELKOMNIKA Telecommun. Comput. Electron. Control 17, 788–793 (2019)

10. M.Q. Khan, A. Hussain, S.U. Rehman, U. Khan, M. Maqsood, K. Mehmood, M.A. Khan,
Classification of melanoma and nevus in digital images for diagnosis of skin cancer. IEEE
Access 7, 90132–90144 (2019)
11. T.C. Pham, G.S. Tran, T.P. Nghiem, A. Doucet, C.M. Luong, V.D. Hoang, A comparative
study for classification of skin cancer, in Proceeding of the International Conference on System
Science and Engineering (ICSSE), 20–23 July 2019, Dong Hoi, Vietnam (IEEE, 2019)
12. T. Sreelatha, M.V. Subramanyam, M.G. Prasad, Early detection of skin cancer using melanoma
segmentation technique. J. Med. Syst. 43, 190 (2019)
Shared Filtering-Based Advice of Online
Group Voting

Madhari Kalyan and M. Sandeep

Abstract Public voting in online social networks is a recent feature. It presents special problems and recommendation opportunities. In order to make recommendations in the social voting setting, we create a series of matrix factorization (MF) and nearest-neighbour (NN) recommender systems (RSs) that exploit information on social network users and community membership. Via experimentation with actual traces of social voting, we find that the accuracy of popularity-based voting recommendation is greatly improved by social network and group membership information, and that social network information dominates group membership information in NN-based approaches. We can see that social and group information is much more valuable for cold users than for heavy users. In our tests, simple metapath-based NN models outperform computation-intensive MF models in hot-voting recommendation, while MF models can better exploit user interest in non-hot votings. We also offer a hybrid RS that combines the various single approaches to achieve the best hit rate.

Keywords Online social networks (OSNs) · Recommender systems (RSs) · Social voting

1 Introduction

A user not only shares his updates with direct friends in terms of text, imagery and video, but can also spread such updates easily to a much broader range of indirect friends, using the rich connectivity and worldwide scope of common OSNs. Many OSNs now provide a social voting capability that enables users to express their views to peers, for example, like or dislike, on different topics, from user statuses to profile photos, played sports, bought goods, visited websites and so on. Some OSNs go one step further by empowering users to run their own voting campaigns, with user-specified voting choices on any issue of their preference. In reality, social voting often offers several potential trade values beyond enriching immediate social experiences. Advertisers may launch votes for the brands to be advertised. Product managers will initiate market research votes. In order to draw more online buyers, electronic commerce owners can strategically launch votes [1]. As social voting becomes more and more common, the issue of "information overload" arises: the consumer is quickly overwhelmed by the different elections conducted by direct and indirect peers. The presentation of "right votes" to "right people" is crucial and difficult for optimizing user experiences and increasing user involvement in social votes. Recommendation systems (RSs) address the overloading of information by recommending things that may be of interest to consumers. In this article, we present our recent efforts to develop RSs for social voting, that is to say, to suggest interesting campaigns to users. Unlike the usual recommendation products, such as books and films, social votes spread via social connections. A person is more likely to be exposed to a vote if his or her friends initiate, participate in or repost it [2, 3]. The popularity of a vote in one's social neighbourhood is heavily associated with one's voting practices. Social propagation also emphasizes social impact: if a person's friends participate in a vote, he or she will be more likely to vote as well. The action of a consumer in voting is closely associated with social friends because of social transmission and social influence. The use of social propagation knowledge creates special obstacles and opportunities for RSs. Moreover, voting data are binary, without negative samples. The development of RSs for social voting is, therefore, fascinating. To overcome these challenges, we create a range of novel RS models, including MF models and NN models, to learn user interest while also mining knowledge on user voting, user friendship and user-group affiliation [4–6].

M. Kalyan · M. Sandeep (B)
Computer Science and Engineering, Malla Reddy College of Engineering and Technology, Hyderabad, Telangana, India
e-mail: m.sandeep@mrcet.ac.in

2 Problem Statement

Previous work suggested a semi-supervised transfer-learning approach for RSs, in particular to tackle the problem of cross-platform activity prediction, that takes full advantage of a limited number of overlapping crowds for cross-platform knowledge transfer. Jiang et al. studied the use of enriching data by modelling a social network as a star-structured hybrid graph centred on a social domain that links to other item domains, helping to increase prediction accuracy. Online social voting, in comparison, differs from conventional social dissemination recommendations. In addition to social relations, our models also examine consumer group membership details, in contrast to existing social-based RSs. We examine how to concurrently exploit social network and community information to boost recommendation for social voting.

3 Proposed Methodologies

To our knowledge, online social voting has not been studied much before. We build MF-based and NN-based RS models. Experiments on real social voting traces reveal that information about both the social network and group affiliations can be used to greatly boost the accuracy of popularity-based voting recommendation [4]. Our NN-based model studies show that social network information dominates group membership information, and for cold users, social and community information is more valuable than for heavy users. We demonstrate that simple path-based NN models outperform computation-intensive MF models in hot-voting recommendation, while MF models can more efficiently exploit user interest for non-hot votings.
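For concreteness, a minimal sketch of the basic MF ingredient is shown below (our own illustrative code; the actual models in this work additionally fuse social network and group membership information). Because voting data are binary with no negative samples, unobserved user-vote pairs are sampled as negatives during training:

import numpy as np

def train_mf(votes, n_users, n_items, k=16, lr=0.02, reg=0.05,
             epochs=20, seed=0):
    # votes: list of (user, vote) pairs meaning the user joined the vote.
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, k))  # user latent factors
    V = 0.1 * rng.standard_normal((n_items, k))  # vote latent factors
    observed = set(votes)
    for _ in range(epochs):
        for u, i in votes:
            # one positive update plus one sampled-negative update
            # (assumes each user leaves at least one vote unobserved)
            j = rng.integers(n_items)
            while (u, j) in observed:
                j = rng.integers(n_items)
            for item, label in ((i, 1.0), (j, 0.0)):
                err = label - U[u] @ V[item]
                U[u] += lr * (err * V[item] - reg * U[u])
                V[item] += lr * (err * U[u] - reg * V[item])
    return U, V  # score(u, i) = U[u] @ V[i]; recommend top-scoring unseen votes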

4 Enhanced System

The admin logs in using a valid username and password. After a successful login, operations such as the following may be conducted: account authorization; listing users and authorizing them; viewing all friend requests and responses; adding posts; viewing all video posts; displaying all recommended posts; viewing all reviewed posts; collaborative filtering history; finding the top-K hit rate in a table; and viewing all user search histories. Both friend requests and responses can be seen by the admin. For each request and response, the ID, the requesting user's photo, the requesting user's name, the requested user's name, the status, and the time and date are shown. If the recipient approves the request, the status will be changed to accepted; otherwise, the status will stay as pending. All friends who come from the same website can be seen by the manager, with details including Request From, Website Request, Name Requests and Website Requests. The admin can view all posts shared by friends on the same and other network pages, with information like the post image, cover, description, recommended name and recommending user. The admin attaches information such as the title, summary and the posting picture [5]. Details like title and summary are encrypted and saved in a folder. Any number of users is supported. Users must register before carrying out any activities; upon registration, the information is saved in the database. After successful registration, a user logs in using the approved username and password. Once login is successful, operations such as the following can be performed: viewing one's profile, sending friend requests, searching for friends, seeing all of one's friends, post search, viewing one's search history, viewing recommendations, viewing user interests in posts and viewing the top-K hit rate. Users browse the same website for other users and send friend requests across various websites; a user can find people on other pages only if they have permission to make friends [6], as shown in Fig. 1.

Fig. 1 System design

5 Conclusion

We have developed a range of MF-based and NN-based RSs for online group voting. Via real data studies, we have found that the accuracy of popularity-based voting recommendation, particularly for cold users, can be considerably increased by social network information and group affiliation information, and that social network information dominates group affiliation data within NN-based approaches. This paper shows that social and community knowledge is significantly more useful for increasing the accuracy of recommendations for cold users than for heavy users. This is because cold users are more likely to take part in popular votes. In our tests, simple path-based NN models outperform computation-intensive MF models for hot votings, although non-hot-voting preferences of users are best exploited by MF models. This paper is only our first step toward a comprehensive analysis of recommendation for social voting. As future work, we would like to research how the voting content, particularly for cold votes, can be mined for recommendation. Given the availability of multi-channel information about users' social neighbourhoods and activities, we are also interested in creating RSs for individual customers.

References

1. G. Adomavicius, A. Tuzhilin, Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
2. X. Su, T.M. Khoshgoftaar, A survey of collaborative filtering techniques. Adv. Artif. Intell.
(2009). Art. no. 421425. https://doi.org/10.1155/2009/421425
3. Y. Koren, Factorization meets the neighborhood: a multifaceted collaborative filtering model, in
Proceedings ACM KDD (2008), pp. 426–434
4. Y. Koren, Collaborative filtering with temporal dynamics, in Proceedings KDD, Paris, France
(2009), pp. 447–456
5. A. Paterek, Improving regularized singular value decomposition for collaborative filtering, in
Proceedings KDDCup (2007), pp. 39–42
6. R. Salakhutdinov, A. Mnih, Probabilistic matrix factorization, in Proceedings NIPS, vol. 20
(2008), pp. 1257–1264
Mining Challenger from Bulk
Preprocessing Datasets

A. Sreelekha and P. Dileep

Abstract The success of a company depends on its ability to make an item more attractive to its consumers than the competition. In the framework of this challenge, we face some questions: How can competition between two items be formalized and quantified? Who are a given item's key competitors? What are the most competitive characteristics of an item? While this topic has influence and importance in so many fields, only a small amount of work has been devoted to an appropriate solution. In this paper, the competition between two items is defined formally, depending on the market segments they both cover. Our competitiveness assessment uses consumer feedback, a widely available source of knowledge in a variety of fields. We propose efficient methods to assess competition in extensive datasets and to resolve the natural problem of finding the top-K competitors of a particular item. Finally, we assess the validity of our findings and the scalability of our approach with several datasets from various fields.

Keywords Phrase mining · Information search and retrieval · Electronic commerce

1 Introduction

Click information from search engine logs, query session data and query topic templates are the basis of effective keyword suggestion approaches. New keyword suggestions are determined based on their semantic relevance to the original keyword query. The semantic relevance between two keyword queries can be determined via the overlap of their clicked URLs in a query log, via their proximity in a bipartite graph that links keyword queries and their clicked URLs in the query log, or based on their similarity in a topic distribution space. However, none of the available methods provide location-aware query suggestions; keyword queries should be suggested not only in connection with the user's knowledge needs, but also with respect to documents in the vicinity of the user location. This necessity comes about thanks to the prevalence of spatial keyword search, which takes the user location and the supplied keyword query as arguments, returning spatially near and textually relevant objects. In 2011, Google handled a daily average of 4.7 billion queries, a substantial fraction of which have local intent and target Web site items (i.e., Web sites with text descriptions) or geo-documents (i.e., documents associated with geo-locations). In addition, 53% of Bing mobile searches in 2011 were found to have a local intent. We propose a location-aware keyword query suggestion mechanism (LKS) for the purpose of filling this void. We show the advantage of LKS with a toy example. Consider five geo-documents d1–d5; each document is associated with a location. Assume a user issues the keyword query kq = "seafood" at location λq. Note that the documents d1–d3 (containing "seafood") are far from λq. A suggested query "lobster" can retrieve the documents d4 and d5 nearby, which are both related to the initial user search intention. Earlier models for suggesting keyword queries disregard the user location and would not retrieve the documents in the vicinity. Note that LKS has a different purpose and, therefore, differs from other location-aware recommendation approaches. The first challenge of our LKS architecture is how keyword query similarity can be measured accurately while the spatial distance element is captured. In line with previous query suggestion approaches, LKS uses a bipartite keyword-document graph (KD-graph for short) that connects keyword queries to their documents, as shown in Fig. 1. Contrary to all past methods, which neglect locations, LKS adjusts the weights of the KD-graph edges such that both the semantic relevance of the keyword queries and the spatial distance between the document locations and the query issuer's location λq are captured. In order to find the set of m keyword queries with the highest semantic relevance to kq and spatial proximity to the user location, we use a random walk with restart (RWR) mechanism, starting from the query kq given by the user. RWR on a KD-graph has been deemed superior to alternative methods and is a common technique used in multiple (location-independent) keyword suggestion studies.

A. Sreelekha · P. Dileep (B)
Department of Computer Science and Engineering, Malla Reddy College of Engineering and Technology, Dhulapally, Hyderabad, India
e-mail: p.dileep@mrcet.ac.in

Fig. 1 System design
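A minimal sketch of the RWR computation described above is given below (our own illustration, assuming the KD-graph is supplied as a row-stochastic transition matrix whose edge weights have already been adjusted for spatial distance):

import numpy as np

def rwr_scores(P, start, alpha=0.15, tol=1e-8, max_iter=200):
    # Random walk with restart on the KD-graph.
    # P: (n, n) row-stochastic transition matrix over keyword-query and
    #    document nodes, with location-adjusted edge weights.
    # start: index of the user's query node; alpha: restart probability.
    n = P.shape[0]
    restart = np.zeros(n)
    restart[start] = 1.0
    p = restart.copy()
    for _ in range(max_iter):
        p_next = (1 - alpha) * (P.T @ p) + alpha * restart
        if np.abs(p_next - p).sum() < tol:
            break
        p = p_next
    return p  # rank keyword-query nodes by p to obtain the top-m suggestions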

2 Problem Statement

In terms of its impact and significance across so many areas, only a small amount of work has been devoted to an appropriate approach. We propose in this article a formal definition of the competition between two items, based on the market segments they both cover. Our competitiveness assessment uses consumer feedback, a rich pool of knowledge in a variety of fields. We propose effective approaches to evaluate competition in large-scale review datasets and to resolve the natural problem of identifying the top competitors of one particular item. Finally, the accuracy of our findings and the scalability of our strategy are assessed with several domain datasets. An experimental assessment on actual datasets from various domains tested the efficiency of our approach; we present effective competition assessment approaches for broad review datasets and deal with the natural problem of identifying the top-K competitors of a certain commodity. We are given a set I of n items and a set of features F. For a specified single item i ∈ I, we then want to identify the k items that maximize the competitiveness function CF with respect to i [1–3]. Furthermore, a naive MapReduce implementation would face the difficulty of moving everything to the reducer to account for the integration into the computation.
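To make the top-k step concrete, the following sketch (our own illustration; competitiveness and the optional upper_bound stand in for the pairwise CF evaluation and a cheap overestimate of it, and item identifiers are assumed comparable) maintains the k best competitors in a min-heap and prunes candidates that cannot enter the result:

import heapq

def top_k_competitors(item, candidates, k, competitiveness, upper_bound=None):
    # Keeps the k best competitors of `item` in a min-heap; heap[0] is
    # always the current k-th best score, so weaker candidates are pruned.
    heap = []  # entries are (score, candidate_id)
    for j in candidates:
        if j == item:
            continue
        if (upper_bound is not None and len(heap) == k
                and upper_bound(item, j) <= heap[0][0]):
            continue  # cannot beat the k-th best, skip the full evaluation
        score = competitiveness(item, j)
        if len(heap) < k:
            heapq.heappush(heap, (score, j))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, j))
    return sorted(heap, reverse=True)  # best competitor first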

3 Proposed Methodologies

We give a formal specification of the competition between two items, based on their appeal to the different consumer segments of their market. Our approach overcomes the reliance of previous work on scarce comparative evidence mined from text [4–6]. A systematic methodology is given for recognizing the various types of customers in a given market, as well as for calculating the proportion of customers who belong to each type [7]. A highly efficient method is given for identifying the top-K competitors of a given item in very large datasets. We consider factors such as the location of the items in the multi-dimensional feature space and the interests and opinions of the users: for a user who wants an item i, an item j that is much superior to i with respect to the user's criteria (and, therefore, very different) is a better suggestion candidate than an item j′ that is highly similar.

4 Enhanced System

A correct username and password are needed for admin. After successfully logged
in, he will perform such procedures including viewing and authorizing all users, their
data. Hotels are added (hotel name, location, area name, item name, item price, item
description, item image, no. of rooms available, room charge distance from location)
and add malls (name of mall, location, name of zone, definition of mall, specialized
mall, mall picture, location distance), take a look at all hotel info, comments, rating,
comments, all mall info, see all hotel booking info, payment details, see the results
chart of hotels and mall rating, see the top-K keywords searched for. There are n
numbers of people. Until performing such activities, users can register and submit
your position when registering. Using a correct username and password and loca-
tion, it can log in after active registration. After login is complete, he will conduct
several operations such as profile view information, account creation and manage-
ment, check of neighboring hotels and centers from your site, GMap, comment, book
hotels, display top-K keywords checked. Numerical users are available. Until doing
such activities, users can register and add their location during registration. Use the
correct username and password and location to login after registry is successful.
After active login, he will perform such operations, including viewing profile infor-
mation, generating and administering account, searching nearest hotels, malls, GPS,
commentary, book hotels, showing top-K keywords checked. Name of mall, loca-
tion, area name, description of mall, specialization of mall, mall image, and location
distance in Fig. 1.

5 Conclusion

We consider a variety of factors, such as the location of the items in the multi-dimensional space and the consumer expectations and opinions, which have been widely ignored in the past. Our thesis incorporates a comprehensive approach for collecting such information from massive customer review databases. We also tackled the computationally challenging issue of finding the top competitors of a given item on the basis of our definition of competitiveness. The framework introduced is effective and can be applied in domains with very large item populations. An experimental assessment on actual datasets from various fields confirmed the reliability of our approach. Our studies have also shown that only a small number of reviews is enough to approximate the various kinds of users in a given market, as well as the number of users belonging to each type.

References

1. M. Bergen, M.A. Peteraf, Competitor identification and competitor analysis: a broad-based managerial approach. Manag. Decis. Econ. (2002)
2. J.F. Porac, H. Thomas, Taxonomic mental models in competitor definition. Acad. Manage. Rev.
(2008)
3. M.-J. Chen, Competitor analysis and interfirm rivalry: toward a theoretical integration. Acad.
Manage. Rev. (1996)
4. R. Li, S. Bao, J. Wang, Y. Yu, Y. Cao, Cominer: an effective algorithm for mining competitors
from the web, in ICDM (2006)
5. Z. Ma, G. Pant, O.R.L. Sheng, Mining competitor relationships from online news: a network-
based approach. Electron. Commerce Res. Appl. (2011)
6. R. Li, S. Bao, J. Wang, Y. Liu, Y. Yu, Web scale competitor discovery using mutual information,
in ADMA (2006)
7. S. Bao, R. Li, Y. Yu, Y. Cao, Competitor mining with the web. IEEE Trans. Knowl. Data Eng.
(2008)
Prioritized Load Balancer
for Minimization of VM and Data
Transfer Cost in Cloud Computing

Sudheer Mangalampalli, Pokkuluri Kiran Sree, K. V. Narayana Rao, Anuj Rapaka, and Ravi Teja Kocherla

Abstract Cloud computing is one of the rapidly growing technologies in the IT industry. Load balancing is a huge challenge in cloud computing, as the incoming requests to the cloud console vary from time to time, and so a load balancer needs to balance these requests by assigning a new VM. The main challenge in load balancing arises whenever a new request arrives at the cloud console while the load there is high; the request must be placed based on its characteristics, i.e., the length of the task and the processing capacity required by the task. In existing work, many authors discussed various load balancing algorithms, i.e., round robin, equally spread execution load, and throttled algorithms, but none of the authors considered the priority of the tasks. In this work, we propose a prioritized load balancer that is based on the incoming priority of tasks and assigns these tasks onto corresponding VMs. The simulation is carried out using the cloud analyst simulator, and the metrics response time, data transfer cost and virtual machine cost are evaluated against the existing algorithms named RR and throttled; the proposed prioritized load balancer greatly minimizes response time, virtual machine cost and data transfer cost.

Keywords Cloud computing · Task scheduling · Load balancing · Prioritized load balancer · Throttled · Round Robin

1 Introduction

Cloud computing is one of the fast-growing technologies, which can be used to provide different types of services, i.e., storage, compute and network, to all subscribed users in a seamless fashion by using virtualization. According to NIST [1], "cloud computing can be defined as an on-demand network access to a shared pool of configurable computational resources." This paradigm mainly gives access to the resources in the cloud in the form of different types of services. Cloud computing architecture mainly needs an application which runs from a browser connected to a network. Users of the cloud submit requests to the cloud console; on behalf of the users, a broker submits them to the task scheduler, which in turn submits them to the resource manager. The job of the task scheduler is to map incoming tasks onto appropriate virtual machines based on the SLA made between the cloud provider and the user. The task scheduler is connected to a resource manager, which keeps track of all virtual resources given to users in the form of services, along with the underlying physical machines. The load balancer is a component connected to the resource manager; when a huge number of tasks arrives at the cloud console, the resource manager invokes the load balancer to balance the load on the VMs by migrating tasks to other VMs or automatically spinning up new VMs based on the demand of the applications. In order to handle these huge requests, which vary from time to time, we need an efficient task scheduler, and to balance these tasks and effectively handle the requests based on the size and capacity of each task, we need an efficient load balancer [2].

S. Mangalampalli (B) · P. K. Sree · K. V. N. Rao · A. Rapaka · R. T. Kocherla
Department of Computer Science and Engineering, Shri Vishnu Engineering College For Women, Bhimavaram, Andhra Pradesh, India
The highlights of the paper are as follows:
1. A prioritized load balancer is designed which is based on the length and processing capacity of the tasks.
2. The cloud analyst simulator is used for simulating the cloud environment.
3. The proposed algorithm is evaluated against the existing RR and throttled algorithms.

2 Related Works

In [3], an algorithm was proposed which focuses on response time. It was modeled using a modified throttled algorithm, implemented using cloud analyst, and evaluated against the RR and throttled algorithms, showing a strong improvement over them on the specified metrics. In [4], a load balancing strategy was developed which focuses on response time, data transfer cost and data center processing time. It was modeled by hybridizing the throttled and ESCE algorithms, implemented on the cloud analyst simulator, and compared to the existing RR, ESCE and throttled algorithms. Simulation results revealed that it outperformed the existing algorithms in view of the specified parameters. In [5], an algorithm was formulated which balances heterogeneous tasks by forming clusters of VMs, thereby focusing on execution time, waiting time and turnaround time. It was modeled using a modified throttled algorithm and implemented on a customized Java simulator. Compared against the existing RR and throttled algorithms, it showed a strong improvement on the specified metrics. In [6], an algorithm was designed which aims at distributing tasks to corresponding VMs using a dynamic load management algorithm. It was implemented using the cloud analyst simulator and compared against the existing RR, throttled and ESCE algorithms, outperforming them by efficiently balancing load. In [7], an algorithm was developed which intelligently maps tasks onto VMs by minimizing underutilization in the cloud environment. It was modeled using the VM-assign load balancer algorithm, implemented on the cloud analyst simulator, and evaluated against the existing active load balancer algorithm, showing a strong improvement by balancing load efficiently among VMs. In [8], an algorithm was developed which focuses on response time. It was modeled using the throttled algorithm, implemented on the cloud analyst simulator, and compared with the existing RR and ESCE algorithms, greatly minimizing overall response time. In [9], an algorithm was developed which minimizes response time. It was modeled using the MQLO algorithm and implemented in CloudSim; the workload was generated randomly and given as input to the algorithm, which was evaluated against the STM and SWDP algorithms, with MQLO outperforming them in terms of average success rate and response time. In [10], an algorithm was formulated which focuses on makespan, resource utilization and waiting time. It was modeled using the BPSO algorithm, implemented on the CloudSim simulator with randomly generated workload, and evaluated against standard PSO, showing a strong improvement on the specified parameters. In [11], an algorithm was designed which focuses on efficient balancing of tasks among different VMs with minimized response time. It was modeled using a GA algorithm and implemented on CloudSim, with input generated randomly from CloudSim [12]. It was compared against the existing CLB algorithm and outperformed it in view of the specified parameter [13, 14].

From the existing literature, most authors used different algorithms, i.e., nature-inspired ones like PSO and GA, and some authors used the throttled [3] and modified throttled [3] algorithms for load balancing of tasks. We can also identify that authors mostly use cloud analyst as the simulation environment for load balancing algorithms. In the literature, most authors addressed metrics like makespan, waiting time, execution time, degree of imbalance of tasks and processing time of tasks, but did not address metrics like data transfer cost and VM cost. These parameters are important because, when huge and heterogeneous requests arrive at the cloud console and the resource manager seeks the help of the load balancer to assign a task to a new VM, spinning up a new VM incurs a cost for data transfer as well as a cost for the VM handling that task. Whenever load balancing of tasks needs to be done, these parameters need to be addressed.
Table 1 User base settings for simulation environment


User base Region Number of users in peak hours Number of users in non-peak hours
North America 0 145,000 145,000
South America 1 135,000 135,000
Europe 2 265,000 265,000
Asia 3 545,000 545,000
Africa 4 30,000 3000
Oceana 5 10,000 1000

In this paper, we propose a prioritized load balancer for minimization of VM and data transfer costs in cloud computing [15–17].

3 Proposed Prioritized Load Balancer

The proposed prioritized load balancer maps the tasks arriving at the cloud console to the corresponding VMs based on the calculated priorities of the tasks. Task priority is calculated from the length of the task and the processing capacity required by the task. Tasks arriving at the cloud console first reach the data center controller, and these tasks are given as input to the prioritized load balancer, which calculates their priorities based on the size of each task and the processing capacity it requires. Initially, the tasks are inserted into a queue; the load balancer then checks the length of each task and its runtime processing capacity and assigns the tasks to appropriate VMs based on the calculated priority.

If a task arrives with high priority based on its length and processing capacity, the load balancer checks for a VM that is suitable for the current task; all tasks in the queue are assigned based on this priority calculation, and the process continues until all tasks in the queue are completed, as shown in Fig. 1.

Fig. 1 Flow of proposed prioritized load balancer



Algorithm: Prioritized Load Balancer

Input: Incoming requests R1, R2, R3, R4, ..., Rn and VMs V1, V2, V3, V4, ..., Vn
Output: Mapping of the incoming requests onto available VMs based on the calculated priority of tasks.

1. Input the incoming requests to the data center (DC) controller.
2. The prioritized load balancer maintains an index of all VMs.
3. After receiving requests, the DC controller probes the prioritized load balancer.
4. The prioritized load balancer calculates the priority of each task using its length and runtime processing capacity.
5. After calculating the task priorities, the prioritized load balancer checks for a VM using the index.

Case 1: if a VM is found
a. The prioritized load balancer returns the id of that VM to the DC controller.
b. The DC controller sends the request to the VM identified by the prioritized load balancer using that id.
c. The DC controller also notifies the prioritized load balancer about the VM allocation.
d. The prioritized load balancer updates the allocation of the VM.

Case 2: if no VM is found
a. The prioritized load balancer looks for the next prioritized VM for the corresponding request.
b. If none of the VMs is suitable for the task, it returns -1.
c. This process repeats from step 5 until all tasks are allocated to corresponding VMs by the prioritized load balancer.
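The following minimal Python sketch (our own illustrative rendering of the steps above, not the cloud analyst Java implementation) treats a task's priority as its estimated runtime and walks the VM index exactly as in cases 1 and 2:

def prioritized_load_balancer(tasks, vms):
    # tasks: list of (task_id, length_mi); vms: list of (vm_id, mips, max_slots).
    # Priority is taken here as the task's estimated runtime on a reference VM
    # (length / mips), so the most demanding tasks are placed first (step 4).
    ref_mips = max(m for _, m, _ in vms)
    queue = sorted(tasks, key=lambda t: -(t[1] / ref_mips))  # high priority first
    allocations = {vm_id: 0 for vm_id, _, _ in vms}          # the VM index table
    mapping = {}
    for task_id, _ in queue:
        # step 5: probe the VM index, most capable VM first
        for vm_id, mips, slots in sorted(vms, key=lambda v: -v[1]):
            if allocations[vm_id] < slots:   # case 1: suitable VM found
                allocations[vm_id] += 1      # index updated, DC controller notified
                mapping[task_id] = vm_id
                break
        else:                                # case 2: none suitable
            mapping[task_id] = -1
    return mapping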

4 Simulation and Results

The simulation is carried out in the cloud analyst [17] simulator, a GUI-based simulator built on CloudSim; the user base configuration is given in Table 1.

In this paper, we have integrated the prioritized load balancer algorithm into cloud analyst by extending the data center class in cloud analyst. The primary objective of the proposed algorithm is to map incoming jobs using the priority of tasks and then assign an appropriate VM in the data center. We have addressed the metrics named response time, virtual machine cost and data transfer cost at the data centers.

The data center settings for this simulation consist of 4 GB RAM, 100 GB data storage, 4 CPUs per host and a processing capacity of 10 K MIPS. Initially, we analyzed the load balance of the VMs and the response time and then identified the VM and data transfer cost at the corresponding data center.

Fig. 2 Load balancing and usage of VMs for different load balancers

Table 2 Usage of VMs for different load balancing algorithms

S. No. Round Robin Throttled Prioritized load balancer
VM0 278 1296 278
VM1 278 88 296
VM2 285 8 285
VM3 286 6 286
VM4 285 0 286

4.1 Load Balancing by Using Prioritized Load Balancer

In this algorithm, our primary objective is to balance the VMs by assigning tasks based on the priorities of the tasks (Fig. 2), which are calculated from the length and processing capacity of the tasks. In this simulation, we have compared our proposed load balancer with the existing algorithms named RR and throttled. Table 2 presents the usage of the different VMs under each load balancer.

4.2 Response Time Using Prioritized Load Balancer

After calculating the usage of VMs, we calculated the response time, which is an important parameter for any scheduling and load balancing algorithm, as shown in Fig. 3. We compared our algorithm against the RR and throttled load balancers, and the simulation results revealed that our proposed algorithm outperforms the existing algorithms in terms of response time, as shown in Table 3.

Fig. 3 Average response time of different load balancers

Table 3 Response times for different load balancing algorithms

Algorithm Avg. response time (ms)
Round Robin 369.98
Throttled 369.86
Prioritized load balancer 365.25

4.3 VM Cost and Data Transfer Cost Using Prioritized Load Balancer

Finally, we calculate the total VM cost and data transfer cost for the above 60-min simulation; we have also calculated the costs for the existing RR and throttled load balancers in Fig. 4. Table 4 presents the total cost of the data center, i.e., the sum of the data transfer cost and the VM cost.
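For clarity, the total in Table 4 combines the two components as sketched below (our own illustration; the per-hour and per-GB rates are assumed cloud analyst style defaults, not values reported in this paper):

def datacenter_cost(vm_hours, data_gb, vm_rate=0.10, transfer_rate=0.10):
    # Total data center cost = VM cost + data transfer cost.
    # vm_hours: total VM hours used; data_gb: total data transferred (GB).
    # Rates are assumed defaults in USD per VM-hour and per GB.
    return vm_hours * vm_rate + data_gb * transfer_rate

# e.g., five VMs running for one hour plus 100 GB of transferred data:
# datacenter_cost(vm_hours=5.0, data_gb=100.0)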

5 Conclusion and Future Work

Load balancing is one of the prominent challenges in cloud computing, as the incoming requests to the cloud console vary with time. In order to handle this issue, a prioritized load balancing technique was introduced. All incoming requests are handled by calculating the priorities of tasks based on the length of the task and its processing capacity, and the tasks are assigned to the corresponding VMs. The simulation was carried out on cloud analyst, and the proposed prioritized algorithm was evaluated against the RR and throttled algorithms; the proposed technique outperformed these algorithms in terms of response time, VM cost and data transfer cost, and it efficiently balances tasks among VMs. In the future, we want to evaluate this algorithm using synthetic datasets available online.

Fig. 4 Data center cost of different load balancers

Table 4 Data center costs for different load balancing algorithms

Algorithm Data center cost (VM cost + data transfer cost)
Round Robin 77.58 $
Throttled 78.56 $
Prioritized load balancer 68.84 $

References

1. P. Mell, T. Grance, The NIST definition of cloud computing (2011)


2. M.S. Sudheer, M. Vamsi Krishna, Dynamic PSO for task scheduling optimization in cloud
computing. Int. J. Recent Technol. Eng. 8(2), 332–338 (2019)
3. S.G. Domanal, G.R.M. Reddy, Load balancing in cloud computing using modified throttled
algorithm, in 2013 IEEE International Conference on Cloud Computing in Emerging Markets
(CCEM) (IEEE, 2013)
4. V. Bagwaiya, S.K. Raghuwanshi, Hybrid approach using throttled and ESCE load balancing
algorithms in cloud computing, in 2014 International Conference on Green Computing
Communication and Electrical Engineering (ICGCCEE) (IEEE, 2014)
5. S. Kapoor, C. Dabas, Cluster based load balancing in cloud computing, in 2015 Eighth
International Conference on Contemporary Computing (IC3) (IEEE, 2015)
6. R. Panwar, B. Mallick, Load balancing in cloud computing using dynamic load management
algorithm, in 2015 International Conference on Green Computing and Internet of Things
(ICGCIoT) (IEEE, 2015)
7. S.G. Domanal, G.R.M. Reddy, Optimal load balancing in cloud computing by efficient utiliza-
tion of virtual machines, in 2014 Sixth International Conference on Communication Systems
and Networks (COMSNETS) (IEEE, 2014)
8. V. Tyagi, T. Kumar, ORT broker policy: reduce cost and response time using throttled load
balancing algorithm. Proc. Comp. Sci. 48, 217–221 (2015)
9. V. Priya, C.S. Kumar, R. Kannan, Resource scheduling algorithm with load balancing for cloud
service provisioning. Appl. Soft Comput. 76, 416–424 (2019)

10. J.P. Mapetu, Z.C. Buanga, L. Kong, Low-time complexity and low-cost binary particle swarm
optimization algorithm for task scheduling and load balancing in cloud computing. Appl. Intell.
49(9), 3308–3330 (2019)
11. M. Lagwal, N. Bhardwaj, Load balancing in cloud computing using genetic algorithm, in 2017
International Conference on Intelligent Computing and Control Systems (ICICCS) (IEEE,
2017)
12. S. Mohanty et al., A novel meta-heuristic approach for load balancing in cloud computing, in
Research Anthology on Architectures, Frameworks, and Integration Strategies for Distributed
and Cloud Computing (IGI Global, 2021), pp. 504–526
13. Y. Fahim et al., Load balancing in cloud computing using meta-heuristic algorithm. J. Inf.
Process. Syst. 14(3) (2018)
14. A. Ragmani et al., An improved hybrid fuzzy-ant colony algorithm applied to load balancing
in cloud computing environment. Proc. Comput. Sci. 151, 519–526 (2019)
15. B. Mallikarjuna, P. Venkata Krishna, A nature inspired bee colony optimization model for
improving load balancing in cloud computing. Int. J. Innov. Technol. Exp. Eng. 8, 51–54
(2018)
16. M. Lawanyashri, S. Subha, B. Balusamy, Energy-aware fruitfly optimisation algorithm for load
balancing in cloud computing environments. Int. J. Intell. Eng. Syst. 10(1), 75–85 (2017)
17. B. Wickremasinghe, R.N. Calheiros, R. Buyya, Cloudanalyst: a cloudsim-based visual modeller
for analysing cloud computing environments and applications, in 2010 24th IEEE international
conference on advanced information networking and applications (IEEE, 2010)
Smart Underground Drainage
Management System Using Internet
of Things

K. Venkata Murali Mohan, K. M. V. Madan Kumar, Sarangam Kodati, and G. Ravi

Abstract Among municipal corporation infrastructures, the drainage system plays a significant role. This drainage system is undoubtedly a crucial part of every one of our day-to-day lives. At present, many people, and all the drainage workers in particular, face various problems in cleaning such drainages. This paper delineates different capabilities utilized for the maintenance and monitoring of underground drainage systems. There is, therefore, a need to focus on present technologies that ensure more safety for the drainage workers, and designing a system to address the various problems of underground water and harmful gases is a significant task. To this end, a smart drainage system is developed using several sensors, such as level, flow and gas sensors, by interfacing them to the ARM7 processor. This smart drainage system is designed to monitor the drainage system, the water flow, the underground water level, gases and the manhole system using Internet of Things (IoT) technology. The sensors used in the smart drainage system can detect any blockage that occurs in the drainage system or in the underground water flow and then display the respective information on the 16 × 2 LCD display when the sensors detect values above their threshold levels. GSM technology is also used in the system to communicate the information to the nearest municipality service center for further corrective action.

Keywords Internet of Things · Drainage · Sensors · GPS · ARM7

1 Introduction

The drainage system plays a significant role in huge urban areas where millions of people live. The fundamental purpose of providing the drainage system is to keep land dry by carrying away unused water and the overflow of rain water. In order to maintain appropriate functionality, the drainage environment has to be observed. Indeed, not every area has a drainage monitoring team, which leads to irregular checking of the drainage condition. Irregular monitoring contributes to obstructions in the drainage going unnoticed, which triggers flooding in the area. Since manual checking requires the commitment of many people who are only able to document limited reports with low accuracy, it is also not very effective. The drainage checking system is not automated; along these lines, whenever there is a blockage, it is hard to work out its particular location. Likewise, early warnings concerning a blockage are not received, and it becomes very inconvenient to deal with the situation once the pipes are blocked totally. People face many problems and need support because of the leakages in the drainage lines [1].

K. V. M. Mohan (B)
Department of ECE, Teegala Krishna Reddy Engineering College, Hyderabad, Telangana, India
K. M. V. M. Kumar · S. Kodati
Department of CSE, Teegala Krishna Reddy Engineering College, Hyderabad, Telangana, India
G. Ravi
Department of CSE, MRCET, Hyderabad, Telangana, India
The problems encountered with this type of drainage line can turn daily city life into serious trouble. Problems like clogging due to waste, sudden rises in the water level and various polluting gases can arise if the best possible cleaning actions are not carried out in time. The current drainage system is not automatic, as it is difficult to tell whether there is a blockage at a particular position. In addition, various gases such as methane (CH4) and carbon monoxide (CO) can be generated in these drainage pipes from time to time, which are destructive and can cause serious problems if inhaled by humans in large quantities. Such incidents are common among the drainage workers concerned and can lead to death as a result. Likewise, no warning of clogging, of rising levels of these gases or of expanding water levels is received. Recognizing and repairing the blockage is therefore tedious and hectic [2].

2 Proposed Work for Underground Drainage Monitoring System

Developing clean and smart urban areas with an automated monitoring system is the
main aim of the system proposed in this paper, as shown in Fig. 1. The main operation
of this system is to continuously monitor the wastewater level, the rate of water flow,
and any leakages in the drainage channels, and to send an alert message with the
concerning information through GSM and the LCD. This system offers a low-cost
and adaptable monitoring solution under all sensor conditions. As shown in Fig. 2,
ARM7 is a family of processors widely used in embedded system applications [3,
4]. It is manufactured by Philips and comes with numerous integrated peripherals,
which makes it an effective and reliable choice for both beginners and high-end
application developers.

Fig. 1 Block diagram of underground drainage monitoring system

Fig. 2 ARM7 board

2.1 Alphanumeric LCD Display 16 × 2

The 16 × 2 display has 32 characters in total, i.e., 16 in the first line and 16 in the
second line. Each character is made up of 50 pixels, so all the pixels must work
together to show the character correctly; this function is controlled by a dedicated
controller (HD44780) in the display unit (Fig. 3). The LCD is used to display the
information obtained from the different sensors.

Fig. 3 Alphanumeric LCD display

Fig. 4 Flow sensor

2.2 Flow Sensor

The water flow sensor consists of a plastic valve body, a water rotor, and a hall-effect
sensor. When water moves through the valve, the rotor rotates, and its speed changes
with the rate of water flow (Fig. 4). The hall-effect sensor produces a corresponding
pulse signal as output. This sensor is well suited to measuring flow in a water
dispenser or coffee machine [5].
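As a hedged illustration of how the pulse output translates into a flow reading (the
pulses-per-liter calibration constant below is an assumed value typical of low-cost
turbine flow sensors, not a figure from the paper):

```python
# Hedged sketch: converting hall-effect pulses to a flow rate.
# The calibration constant is an assumed value, not from the paper.
PULSES_PER_LITER = 450.0   # assumed sensor calibration

def flow_rate_lpm(pulse_count: int, window_s: float) -> float:
    """Flow in liters/minute from pulses counted over window_s seconds."""
    liters = pulse_count / PULSES_PER_LITER
    return liters * (60.0 / window_s)

print(flow_rate_lpm(pulse_count=225, window_s=2.0))  # prints 15.0 (L/min)
```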

2.3 Temperature Sensor

As shown in Fig. 5, the LM35 is a temperature sensor that produces an analog signal
proportional to the instantaneous temperature [6]. The output voltage can easily be
converted to a temperature reading in Celsius. The advantage of the LM35 over a
thermistor is that it does not need any external calibration. Its coating also shields
it from self-heating [7].

Fig. 5 Temperature sensor

2.4 Gas Sensor

Industries, fuel stations, etc., are the main sources from which various harmful gases
are produced. Such gases are detected using the gas sensor shown in Fig. 6. These
sensors can easily sense the level of H2S in the environment and can give an alert
in the form of sound or visual information when incorporated into the system. These
sensors interface easily with the system and produce a fast response [8, 9]. Blockages
that occur in the drainage channels are cleared and worked on through manholes
that are provided at various places along the channels [10].

Fig. 6 Gas sensor



Fig. 7 GSM

2.5 GPS

GPS (Global Positioning System) is a space-based navigation system. It provides the
latitude and longitude of any location, together with the time, in real time. Using
information from the satellites, it can give the location of any place on Earth at
every second. The place where a blockage occurs in the drainage channel can be
found with GPS by determining its latitude and longitude values; the point at which
the latitude and longitude intersect gives the exact location. This location can then
be sent to the municipal corporation authorities in text form through an SMS.

2.6 GSM

The GSM module supports communication in the 900 MHz band; most of the mobile
network providers in India operate in this band. The GSM module is shown in Fig. 7.

3 Algorithm

• Power up the equipment
• Initialize the equipment modules
• Display "DRAINAGE MONITORING SYSTEM" on the LCD
• Read the sensor values using the microcontroller
• Display the temperature on the LCD by sensing it with the temperature sensor
• Check the level of carbon dioxide in the environment using the CO2 sensor
• If the water level increases and the water flow decreases
• Then trace the location of that area using GPS
• Send the location as an SMS through GSM
• If the threshold value of any sensor is exceeded, send an SMS using GSM
• Update the real-time sensor data on the Web server using IoT
• Display the entire information of the operation continuously on the LCD
• STOP.
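As a rough, runnable sketch of this control flow (sensor reads are simulated here,
and the thresholds and helper names are illustrative assumptions; the actual firmware
would run in C on the ARM7):

```python
# Minimal, runnable sketch of the monitoring loop described above.
# Sensor reads are simulated with random values; on the real system these
# would be ADC reads on the ARM7 and send_sms would drive the GSM module.
# All thresholds and helper names here are illustrative assumptions.
import random
import time

GAS_THRESHOLD = 400          # assumed gas alert level (ADC counts)
WATER_LEVEL_THRESHOLD = 80   # assumed % of channel depth
FLOW_THRESHOLD = 10          # assumed minimum healthy flow (L/min)

def read_sensor(low, high):
    """Stand-in for an ADC read; returns a simulated sensor value."""
    return random.uniform(low, high)

def lcd_print(text):
    print(f"[LCD] {text}")

def send_sms(text):
    print(f"[GSM->municipality] {text}")

def monitor_once():
    temp = read_sensor(20, 45)    # LM35 temperature, deg C
    gas = read_sensor(0, 600)     # gas sensor
    level = read_sensor(0, 100)   # water level, %
    flow = read_sensor(0, 30)     # flow rate, L/min
    lcd_print(f"T={temp:.0f}C G={gas:.0f} L={level:.0f}% F={flow:.0f}")
    # Rising level with falling flow suggests a blockage downstream.
    if level > WATER_LEVEL_THRESHOLD and flow < FLOW_THRESHOLD:
        lat, lon = 17.3850, 78.4867   # would come from the GPS module
        send_sms(f"Blockage suspected near {lat},{lon}")
    if gas > GAS_THRESHOLD:
        send_sms(f"Gas alert: level {gas:.0f}")

if __name__ == "__main__":
    for _ in range(3):
        monitor_once()
        time.sleep(1)
```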

4 Working of ARM7

The ARM processor is interfaced to various water level, water flow, and gas
sensors. The processor reads the signals coming from the sensors attached to it and
detects when they exceed their threshold levels. Using the information from these
sensors, the ARM7 monitors and controls the drainage system by performing the
required actions, and it also notifies the nearest municipal corporation of the condition
by sending a text SMS using GSM technology. This helps to locate blockages in the
water level and drainage easily so that further corrective action can be taken. The
real-time sensor values detected by the ARM7 are continuously updated to a Web
server over IoT by interfacing the sensors to different ports of the processor. The
complete information of the system with respect to the sensors is likewise displayed
on the 16 × 2 LCD.

5 Results

The implementation of the ARM-based drainage sensing system is described below.
The ARM7 processor is programmed to detect leakage in the drainage, blockage
of water flow, and the underground water level. A gas sensor is also connected to
the system to detect gas leakages and to notify users through messages. If the
sensors detect values above their threshold levels, the respective information is
displayed on the 16 × 2 LCD. In the case of gas detection, in addition to displaying
the information on the LCD, a text message is also sent through SMS to the number
registered in the processor using GSM technology. The same process is repeated to
give an alert in the water level detection condition (Figs. 8 and 9).

Fig. 8 Gas displaying message on LCD

Fig. 9 Water displaying message on LCD

6 Conclusion

Underground drainage e-monitoring is a challenging problem. Various techniques
are utilized in this paper for checking and managing a drainage system in various
underground applications. The system covers different applications such as real-time
leakage detection in drainage, underground, and manhole systems. IoT is used in the
system to monitor parameters like water level and water flow and to update them on
the IoT network. This enables the person in charge to take the important actions for
the corresponding condition. With this approach, the labor and time needed to
confirm manhole blockages and inspect underground drainage pipelines can be
reduced, and the associated dangers can be avoided. When the sensors detect values
above their threshold levels, the values are displayed on the LED or LCD display.
When a gas is detected, the information is passed to and displayed on the LED or
LCD display, and at the same time a text message is sent to the number predefined
in the program using the GSM module. The same process is repeated for the high
water level detection condition.

References

1. M.T. Lazarescu, Design of a WSN platform for long-term environmental monitoring for IoT
applications. IEEE Emerg. Sel. Topics Circuits Syst. 3(1), 45–54 (2013)
2. Y. Narale, A. Jogal, S.P. Bhosale, H. Chowdhary, Underground drainage monitoring system
using IoT. Int. J. Adv. Res. Ideas Inno. Technol. 4(1) (2018)
3. S. Rao, S.K. Muragesh, Automated Internet of Things for underground drainage and manhole
monitoring systems for metropolitan cities. Int. J. Inno. Sci. Eng. Technol. 2(4) (2015)
4. A. Suvarna, S.A. Shaik, Sonawane, Monitoring smart city application using Raspberry PI based
on IOT. Int. J. Adv. Res. Ideas Inno. Technol. 5(VIL) (2017)
5. G.A. Naidu, S. Kodati, J. Selvaraj, A review report of smart health care applications and benefits
using Internet of Things. Int. J. Recent Technol. Eng. (IJRTE) 8(3) (2019). ISSN: 2277-3878
6. S. Velliangiri, R. Sekar, P. Anbhazhagan, Using MLPA for smart mushroom farm monitoring
system based on IoT. Int. J. Netw. Virt. Organ. 22(4), 334–346 (2020)
7. S. Velliangiri, S.A. Kumar, P. Karthikeyan (eds.), in Internet of Things: Integration and Security
Challenges (CRC Press, 2020)
8. D.P. Rajan, D. Baswaraj, S. Velliangiri, P. Karthikeyan, Next generations data science applica-
tion and its platform, in 2020 International Conference on Smart Electronics and Communi-
cation (ICOSEC), Trichy, India, pp. 891–897 (2020). https://doi.org/10.1109/ICOSEC49089.
2020.9215245
9. S.H. Khan et al., Statistics-based gas sensor, in 2019 IEEE 32nd International Conference
on Micro Electro Mechanical Systems (MEMS), Seoul, Korea (South), pp. 137–140 (2019).
https://doi.org/10.1109/MEMSYS.2019.8870821
10. A. Baviskar, A. Mulla, A. Bhovad, J. Baviskar, GPS assisted standard positioning service for
navigation and tracking: review and implementation, in 2015 International Conference on
Pervasive Computing (ICPC), Pune, pp. 1–6 (2015)
IoT-based System for Health Monitoring
of Arrhythmia Patients Using Machine
Learning Classification Techniques

Sarangam Kodati, Kumbala Pradeep Reddy, G. Ravi, and Nara Sreekanth

Abstract The Internet of Things (IoT), an emerging technology in wireless
communication and wearable sensing, enables effective monitoring of patients. Using
the huge volume of data from such wearable health sensors, the health condition of
in/out patients can be monitored periodically and repeatedly by processing, analyzing,
and classifying the data using machine learning algorithms. This paper proposes an
IoT-based system for health monitoring of arrhythmia patients using machine learning
classification techniques. The presented method can be used for continuous
monitoring of arrhythmia patients using wearable sensors and for classifying the data
from such sensors into various groups using machine learning algorithms. Initially,
data sensing is performed using several wearable sensors with the microcontroller;
then, using IoT technology, the sensed data are transmitted to cloud storage. After
that, classification of the sensed data is performed using multiple machine learning
algorithms together with the previous clinical data and dataset. Finally, prediction of
arrhythmia is done based on the classified feature data. From the result analysis, it
can be seen that the support vector machine (SVM) algorithm provides higher
accuracy than all the other algorithms in predicting the heart disease of arrhythmia.
Therefore, this system has proven robust for predicting arrhythmia.

Keywords Wearable sensors · Internet of Things (IoT) · Machine learning ·
Arrhythmia · Health monitoring

S. Kodati (B)
Department of CSE, Teegala Krishna Reddy Engineering College, Hyderabad, Telangana, India
K. P. Reddy
Department of CSE, CMR Institute of Technology (Autonomous), Hyderabad, Telangana, India
G. Ravi
Department of CSE, MRCET, Hyderabad, Telangana, India
N. Sreekanth
Department of CSE, BVRIT Hyderabad College of Engineering for Women, Hyderabad,
Telangana, India


1 Introduction

Healthcare plays a significant role in day-to-day life around the world. With
appropriate treatment, diagnosis and early-stage prevention of diseases can be
achieved; serious events such as stroke and heart attack are much easier to prevent
when detected at an early stage. The burden on health systems needs to be reduced
in order to keep the level and quality of medical care high. Arrhythmia is a condition
in which the heart beats irregularly, and it is categorized under the pathologies of the
heart [1]. The condition of arrhythmia patients has to be continuously monitored, as
information about the heartbeat is necessary to diagnose the type of arrhythmia
and to follow appropriate medical procedures. Usually, patients are diagnosed
using the electrocardiogram (ECG) monitors available in clinics. This is where the
problem arises: for diagnosing arrhythmia, the ECG of a patient has to be continuously
monitored for a minimum of 24–48 h. Hence, wearable sensors have been used for
monitoring the patient's health condition in their everyday environment. Interest in
wearable sensors has grown in recent years, and much research has demonstrated
that recording and managing a patient's physiological information over a long
duration of time can be achieved with wearable sensor technologies [2].
A possible solution for reducing the burden on healthcare systems is the Internet
of Things (IoT). Such a healthcare system should incorporate an appropriate amount
of data preprocessing, analysis, and storage, since an enormous quantity of data
exists in the present Internet world [3]. In recent years, the healthcare environment
has been described in terms of knowledge and information built on wireless sensor
network technology [4]. Patients face uncertain outcomes when heart problems and
heart attacks occur as a direct consequence of the lack of good therapeutic care at
the necessary moments; this affects elderly and young patients alike, and specialists,
friends, and relatives need to be kept informed. Monitoring the patient's condition
using sensor technology and the Internet, and passing the information on to friends
and family as soon as a problem arises, is therefore a resourceful approach that can
achieve unexpectedly high success rates. Furthermore, it has been observed that ML
processes are used for continuous improvement in various areas of the Internet
of Things (IoT) [5]. An IoT-based arrhythmia patient monitoring system with wearable
sensors, classification of arrhythmia using multiple machine learning algorithms, and
data visualization through prediction on the classified data are the main objectives
of the work presented in this paper.

2 IoT and ML in Healthcare

2.1 IoT in Healthcare

IoT provides detection, management, and analysis of remote sensor devices through
a physical system with object network connection. For smooth communication
between intelligent devices and wearable sensors, edge computers are linked through
the development of a computational architecture. Such smart devices depend
significantly on the middleware layers of the IoT to preprocess the information.
Application areas of IoT include smart homes, smart cities, smart healthcare, smart
grids, smart agriculture, and smart transportation. The fundamental architecture of
IoT consists of three layers: the application, network, and perception layers; business
and middleware layers extend this architecture to cover more advanced high-level
architectures. Personal care systems and healthcare systems use machine learning
techniques in addition to IoT technology, several implantable devices, and wearable
sensors [6]. The most significant points about the two major types of personalized
healthcare devices, wearable and implantable devices, are summarized below.
Wearable products that can easily be fitted to the human body include pins, pendants,
smart watches, bracelets, smart rings, t-shirts, workout trackers, shoes, and other
portable public health gadgets. The health condition and diseases of a patient or a
person can be tracked with wearable devices that are in direct contact with the body
[7]. Wearable device technology includes sensing, analysis, and visualization
approaches [8]. Information related to the biological signals of the body, such as
calories consumed, heartbeat, walking, workout time, and blood pressure, can be
generated through these wearable devices. Restoring part or all of a biological system
and its structure is the main objective of implantable devices, which are inserted into
the human body. These devices are widely required in applications like microchips,
radiology, neural implants, and heart-attack care, in which providing a secure network
for the data is a most significant task [9].

2.2 Machine Learning in Healthcare

Machine learning is another recent technology for information processing; it
implements algorithms that learn from datasets [10]. In machine learning, algorithms
are built according to the machine's earlier observations. The main aim of machine
learning is to recognize patterns in the data and then to use the learnt patterns.
Machine learning is the fundamental artificial intelligence methodology that extracts
information during data training [11]. These ML algorithms are useful for the difficult
task of distinguishing and extracting information from wide data or record patterns
[12]. In particular, this technique is well suited to medical and healthcare purposes
for diagnosing and identifying various diseases.

3 IoT-based Arrhythmia Patient's Health Monitoring Using ML

Figure 1 shows the framework of the presented IoT-based arrhythmia patient's health
monitoring system using machine learning. The main objective of this health
monitoring system is to provide health condition monitoring of arrhythmia patients
using IoT technology and then to classify the data sensed from the patients with the
help of machine learning classification techniques, improving the diagnosis and
prevention of arrhythmia through its early prediction. This system can be used to
provide advanced healthcare monitoring to patients in hospitals, especially
arrhythmia patients. The entire framework, shown in Fig. 1, comprises four sections:
data sensing, data transmission, cloud storage, and classification and prediction.

Fig. 1 IoT-based arrhythmia patient's health monitoring system using machine learning (block
diagram: data sensing via body area network sensors for heartbeat, temperature, and ECG with a
controller and GPS; data transmission via a Wi-Fi module; cloud storage; preprocessing and
multi-machine learning classification using patient clinical data and the dataset; prediction of the
presence or absence of arrhythmia)

3.1 Data Sensing

Initially, wearable sensors are used at the patients' location for continuous monitoring
of their health condition, and the sensed data are transmitted to the cloud. The
transmitted data can be seen and monitored by the patients' family members as well
as doctors. In the presented framework, the system is used to continuously monitor
patients who are suspected to suffer from arrhythmia. Physiological parameters such
as heartbeat and body temperature are monitored through the respective wearable
sensors. These parameters are managed, and continuous monitoring is achieved, by
using the controller. Real-time data monitoring, including the patients' location, is
provided using the global system for mobile communication (GSM) and global
positioning system (GPS) technologies. The sensed data transmitted to the cloud are
then visible to the patients and doctors through a software platform called
"ThingSpeak."
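As a hedged illustration of this step, the sketch below pushes two readings to a
hypothetical ThingSpeak channel via its public update endpoint (the write API key
and the field-to-parameter mapping are assumptions; the requests package is used
for HTTP):

```python
# Hedged sketch: pushing heartbeat and temperature readings to a
# ThingSpeak channel. The write API key and field numbering are
# placeholders for a hypothetical channel; ThingSpeak's /update
# endpoint accepts field1..field8 values per request.
import requests

WRITE_API_KEY = "XXXXXXXXXXXXXXXX"  # hypothetical channel write key

def push_reading(heartbeat_bpm: float, body_temp_c: float) -> int:
    resp = requests.get(
        "https://api.thingspeak.com/update",
        params={
            "api_key": WRITE_API_KEY,
            "field1": heartbeat_bpm,   # assumed mapping: field1 = heartbeat
            "field2": body_temp_c,     # assumed mapping: field2 = temperature
        },
        timeout=10,
    )
    # ThingSpeak returns the new entry ID, or 0 if the update failed.
    return int(resp.text)

if __name__ == "__main__":
    entry_id = push_reading(76, 36.8)
    print("ThingSpeak entry:", entry_id)
```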

3.2 Data Transmission and Cloud Storage

The data transmission section makes use of a Wi-Fi module to obtain the data from
the different wearable sensors attached to the patients; the module then transmits the
obtained sensor data to the cloud section. Secure transmission is used here to protect
the privacy of the patient's data. The system is designed to effectively and feasibly
store the patient's bidirectional data in the cloud for a long duration of time. This
long-duration storage of patient data can help doctors and medical professionals in
diagnosing arrhythmia. The components used in this section provide access to
monitor the data transmitted to the cloud through the ThingSpeak platform, which
displays the sensed physiological parameters such as heartbeat and temperature in
graphical form. The data transmission method used in this system provides improved
scalability and additional advantages such as on-demand accessibility of the data to
both doctors and patients.

3.3 Classification and Prediction

After transmitting the data to the cloud, it can be accessed by medical professionals
and doctors through a platform so they can make decisions on the sensed data when
diagnosing the patient's health. If there are any abnormalities in the sensed
physiological data stored in the cloud, such as an irregular heartbeat or high
temperature, classification is performed to make a specific decision on it. In this
system, sensor data on the heartbeat and body temperature of patients suspected to
suffer from arrhythmia are monitored and then stored in the cloud. Machine learning
techniques are used to classify the data if any abnormalities occur in the sensed data
and then to predict whether the patient suffers from arrhythmia or not. Multiple
machine learning techniques are used in this system to perform the classification on
the data. Then, using the classified data, prediction of arrhythmia can be accomplished.
The complete classification and prediction process is explained below.

3.3.1 Information Preprocessing

This information preprocessing step handles the improvements needed before the
actual processing of the data to be analyzed in this system. In this preprocessing
framework, raw, tattered information is turned into an error-free training dataset.
Some machine learning models require fully specified inputs; for example, in the
random forest algorithm, invalid (null) attributes must be handled in the master
training dataset before the random forest calculation is run, as the algorithm does
not tolerate invalid attributes. The other main objective of this preprocessing step is
to organize the huge collected information, with the goal of applying more than one
machine learning or deep learning classifier to a single training dataset and choosing
the best among them.
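A small illustrative sketch of this null-handling step, using pandas with hypothetical
column names and a toy inline dataset (in the real system, the records would come
from the cloud-stored sensor data and clinical files):

```python
# Illustrative preprocessing sketch: turning records with missing values
# into an error-free training set. Column names and the tiny inline
# dataset are hypothetical stand-ins.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "heartbeat": [72, 110, None, 95, 180, 88],
    "body_temp": [36.6, 38.2, 37.0, None, 39.1, 36.9],
    "label":     [0, 1, 0, 1, 1, 0],   # 1 = arrhythmia suspected
})

# Random forest (and most sklearn models) cannot accept null attributes,
# so fill missing numeric features with the column median.
features = ["heartbeat", "body_temp"]
df[features] = df[features].fillna(df[features].median())

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["label"], test_size=0.33, random_state=42)
print(X_train.shape, X_test.shape)
```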

3.3.2 Multi-machine Learning Algorithms

In general, artificial intelligence techniques are used for diagnosis in healthcare
systems to classify patients' health data when predicting diseases, and this most
widely makes use of machine learning and deep learning techniques. Therefore, five
well-known machine learning classifiers, discussed one by one below, are used in
this system to effectively classify the data. Using these multiple machine learning
algorithms, the data are classified and arrhythmia is then predicted based on the
classified data. Modeling and processing of the collected and stored patient health
data are carried out in this paper to predict arrhythmia using the multiple machine
learning algorithms. This multi-classification process is done in collaboration with
IoT analytics and the cloud environment with automated analysis. The machine
learning classification algorithms used in this paper are the logistic regression
algorithm, support vector machine (SVM), naïve Bayes classifier, K-nearest neighbor
(KNN) classifier, and decision tree algorithms.
1. Logistic Regression Algorithm: This algorithm can predict arrhythmia in
   patients by learning the posterior probability distributions of each class in the
   dataset. It is a supervised machine learning algorithm that uses L labels to
   classify K classes. Using probability regression analysis of the given dataset,
   the most likely classes can be selected. This classification algorithm is used on
   datasets with a variable or regression nature; i.e., the data are coded in binary,
   with 1 and 0 for yes and no quantities, respectively. The dataset can then be
   divided into classes based on these natures in order to predict the likelihood of
   the required features from the regression coefficients of those features.
2. Support Vector Machine (SVM): This classification approach can be used
   where pattern-based and nonlinear regression-based classification of data is
   required. SVM is a precise mathematical machine learning algorithm that
   performs linear binary classification even on a huge dataset. A hyperplane that
   can exactly divide the high-dimensional dataset into two parts is created using
   the required features or attributes. In the framework presented in this paper,
   SVM provides better prediction efficiency than the other algorithms.
3. Naïve Bayes: A theory based on probability analysis is used in the supervised
   learning approach of the naïve Bayes (NB) classification algorithm. In this
   approach, the probability of occurrence of the required feature is computed over
   the entire dataset from the individual factors, using the label probabilities
   assigned prior to classification. This classification algorithm is most widely used
   to classify text documents into one of multiple predefined categories.
4. K-Nearest Neighbor (KNN): The KNN algorithm is also a supervised learning
   approach that can be used to classify data as well as to predict results for
   regression problems. The algorithm predicts a new data point added to the
   dataset by using a similarity measure. It is used when doctors search for data
   similar to records already in the dataset. A new point is labeled according to its
   nearest neighbors, and K denotes the number of nearest neighbors considered
   for the new point. Voting among these K neighbors gives the prediction output.
5. Decision Tree: This technique constructs decision trees on data samples,
   obtains a prediction from each of them, and selects the best solution by voting.
   Random samples are selected from the given dataset, a decision tree is created
   for each sample, and a predicted result is obtained from each tree. Voting is
   then carried out on the predicted outcomes, and the outcome with the most
   votes is selected as the final prediction.
After training the dataset for the target features using the above machine learning
algorithms, the resulting classified data are tested against the current sensor data
stored in the cloud, and the best feature is then chosen during feature selection.
Classification of the dataset is done by the multiple machine learning algorithms
discussed above based on the nature of the available dataset. From the resulting
feature, arrhythmia is predicted by checking for its presence or absence.
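As a hedged sketch of this multi-classifier comparison, the following scikit-learn
code trains the five classifiers named above and reports accuracy; synthetic data
stands in for the ECG feature dataset, and the hyperparameters are illustrative
defaults rather than the authors' settings:

```python
# Sketch of the five-classifier comparison described above.
# Synthetic data replaces the real ECG features; defaults are assumed.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf"),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Train each classifier and report test accuracy, mirroring the paper's
# accuracy-based comparison (where SVM scored highest at 86%).
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(f"{name}: {accuracy_score(y_te, clf.predict(X_te)):.2%}")
```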

4 Results

The system developed in this paper initially uses IoT to sense data from the patients
through the wearable sensors; real-time monitoring data such as heartbeat and body
temperature are then recorded through the processor by the ThingSpeak software.
Figures 2 and 3 show the responses of the heartbeat and body temperature sensors
from ThingSpeak. Using this current monitored data along with the raw datasets,
heart diseases such as arrhythmia in humans can be predicted by applying the
machine learning techniques.
The current system considers a raw ECG dataset for the result analysis and
evaluation. There are roughly 16 different arrhythmia types, and every ECG beat is
recorded at its R-peak location. The dataset contains five arrhythmia groups covering
22 beat types under the Association for the Advancement of Medical Instrumentation
(AAMI) standards, but only ten ECG beat types are used for the evaluation of the
present framework. These ten ECG beat types, with the number of samples used for
training and testing, are shown in Table 1: nodal escape beat (j), fusion of paced and
normal beat (f), fusion of ventricular and normal beat (F), ventricular flutter wave
(!), paced beat (P), left bundle branch block beat (L), right bundle branch block beat
(R), premature ventricular contraction (V), atrial premature contraction (A), and
normal beat (N).

Fig. 2 Response of heartbeat sensor from ThingSpeak

Fig. 3 Response of body temperature from ThingSpeak
After training and testing the dataset with the multiple machine learning models, the
performance of early-stage arrhythmia prediction is evaluated by selecting the best
feature through feature selection and then classifying the presence or absence of
arrhythmia as the prediction result.

Table 1 Types of arrhythmia with number of samples used

Si. No.  Beat label  Training  Testing
1        j           225       22
2        !           430       56
3        F           746       82
4        f           856       102
5        V           6584      692
6        R           6796      715
7        P           6250      695
8        N           66,945    6754
9        L           72,680    870
10       A           2326      220

Fig. 4 Prediction accuracy of multiple machine learning algorithms

In this paper, different machine learning methods, namely the logistic regression,
SVM, naïve Bayes, decision tree, and KNN classifier algorithms with varying
parameters, are used for the classification of the data. Accuracy is used to evaluate
these classification algorithms in predicting arrhythmia. From the result analysis,
shown in Fig. 4, it was observed that logistic regression achieved 79% accuracy,
SVM 86%, naïve Bayes 82%, decision tree 76%, and the KNN algorithm 74%. This
illustrates that SVM achieved the highest accuracy among the classification
algorithms in predicting arrhythmia in patients.

5 Conclusion

It is very important to detect and diagnose arrhythmia appropriately in order to
prevent the loss of human life. In this paper, an IoT-based system for health
monitoring of arrhythmia patients using machine learning classification techniques
is proposed for monitoring and effectively predicting arrhythmia at early stages. This
system, combining an IoT and ML framework, provides long-duration observation
and recording of arrhythmia. The diagnosis process is improved with the classified
and visualized data. Therefore, with IoT and machine learning techniques, the
monitoring system presented in this paper can reduce the risk of arrhythmia patients
losing their lives by predicting the condition. From the result analysis, it was observed
that the SVM algorithm achieved higher accuracy than the other ML algorithms. The
overall framework of the current system can help patients by predicting heart disease
at early stages. In rural areas such as villages where no hospital facilities are
available, the developed system can be very useful for mass health screenings.

References

1. A.M. Rahmani, Z.N. Aghdam, M. Hosseinzadeh, The role of the Internet of Things in
healthcare: future trends and challenges. Comp. Methods Prog. Biomed., 105903 (2020)
2. T. Poongodi, P. Sanjeevikumar, B. Balamurugan, J. Holm-Nielsen, Internet of Things (IoT)
and eHealthcare system—a short review on challenges (2019)
3. M.G. Khari, Securing data in Internet of Things (IoT) using cryptography and steganography
techniques. IEEE Trans. Syst. 50(1), 73–80 (2019)
4. M.I. Khan, M.M. Alam, T. Pardy, A. Kuusik, Y.J.I.A. Le Moullec, H. Malik, A survey on the
roles of communication technologies in IoT-based personalized healthcare applications. IEEE
Access 6, 36611–36631 (2018)
5. S. Askar, S. Sulaiman, Investigation of the impact of DDoS attack on network efficiency of the
University of Zakho. J. Univ. Zakho 3(2), 275–280 (2015)
6. T. Ince, S. Kiranyaz, L. Eren, M. Askar, M. Gabbouj, Real-time motor fault detection by 1-D
convolutional neural networks. IEEE Trans. Ind. Electron. 63, 7067–7075 (2016)
7. M. Pellegrini, P. Pierleoni, L. Pernini A. Belli, S. Valenti, L. Palma, A high reliability wearable
device for elderly fall detection. IEEE Sens. J. 15(8) (2015)
8. L. Palmerini, L. Cattelani, S. Bandinelli, F. Chesani, C. Becker, P. Palumbo, L. Chiari, FRAT-Up,
a rule-based system evaluating fall risk in the elderly, in 2014 Proceedings of IEEE 27th Inter-
national Symposium on Computer-Based Medical Systems, vol 204 (IEEE Computer Society,
2014), pp. 38–41
9. K. Ren, M. Li, W. Lou, S. Yu, Y. Zheng, Scalable and secure sharing of personal health records
in cloud computing using attribute-based encryption. IEEE Trans. Parallel Distrib. Syst. 24(1),
131–143 (2013)
10. H.-C. Shin, M.R. Orton, D.J. Collins, S.J. Doran, M.O. Leach, Stacked autoencoders for unsu-
pervised feature learning and multiple organ detection in a pilot study using 4D patient data.
IEEE Trans. Pattern Anal. Mach. Intell. 35(8) (2013)
11. T.C. Silva, L. Zhao, Network-based stochastic semisupervised learning. IEEE Trans. Neural
Netw. Learn. Syst. 23(3) (2012)
12. L. Atallah, B. Lo, R. Ali, R. King, G.Z. Yang, Real-time activity classification using ambient
and wearable sensors. IEEE Trans. Inf. Technol. Biomed. 13(6), 1031–1039 (2009)
EHR-Sec: A Blockchain Based Security
System for Electronic Health

Siddhesh Deore, Ruturaj Bachche, Aditya Bichave, and Rachana Patil

Abstract The technological advancements in healthcare management have resulted


in increased digitization of health records; however, issues relating to lack of interoperability,
security and proper infrastructure cause a hindrance to the complete
and seamless delivery of healthcare services. Electronic Health Record (EHR)
management systems developed on principles of secure data sharing and storage
can help healthcare professionals for better collaboration and in making sound deci-
sions regarding patient diagnosis and treatment. We propose an EHR management
system based on objectives such as secure storage, increased availability, effective
sharing and trust between stakeholders in the healthcare ecosystem like patients,
doctors and government agencies. The proposed EHR system works in a permis-
sioned blockchain framework based on the Hyperledger Fabric platform. Patients
and medical professionals can interact with the EHR system using a web-based
interface for managing EHRs.

Keywords Blockchain · Electronic health records · Hyperledger fabric · Medical
history · Security

1 Introduction

In recent years, technology has had a massive impact on the healthcare sector such
as disease prediction, digitization of health records, settlement of insurance claims,
etc. Electronic Health Record (EHR) is a digitalized version of a patients’ health
record that contains information like medical history in terms of diagnosis, treatment,
medical prescriptions, laboratory reports, demographic information and insurance
details [1]. The Ministry of Health and Family Welfare (MoHFW) of the Govern-
ment of India notified the Electronic Health Record (EHR) Standards for India in
2013 and revised in 2016 [2]. According to the EHR Standards, ‘Any person in India

S. Deore (B) · R. Bachche · A. Bichave · R. Patil


Department of Computer Engineering, Pimpri Chinchwad College of Engineering, Pune, India
R. Patil
e-mail: rachana.patil@pccoepune.org


can go to any health service provider, any diagnostic center or any pharmacy and
yet be able to access and have fully integrated and always available health records
in an electronic format.’ The standards lay down a framework for EHR systems to
have secure storage, increased availability, efficient data exchange and proper access
control mechanism [3]. EHR systems available today in India follow these stan-
dards to varying degrees of conformance. Efforts are being made by various State
as well as Union Governments to use advancements in Information and Communi-
cation Technologies (ICT) in developing EHR systems. Also, in the private sector
corporate hospitals like Max Health, Apollo, Sankara Nethralaya have successfully
implemented EHR systems [4]. However, EHRs are rarely exchanged between
hospitals, causing poor interoperability among healthcare entities. As a result, data
sharing among healthcare entities is often inefficient and time-consuming. EHR
systems are also prone to security breaches [5, 6]. In 2019 alone, over 41 million
health records were stolen, exposed or disclosed without permission, 195% more
than in 2018 [7]. Current EHR systems also suffer from a lack of total availability,
which makes it harder to deliver healthcare services in times of natural disasters
like earthquakes, floods, etc. Apart from the issues discussed above, such as
interoperability, privacy and availability, EHR systems fail to be patient-centric. The
EHR Standards for India make it imperative for healthcare entities to store data on
behalf of the patient, with ownership of the data residing with the patient [2]. EHR
systems with poorly designed interfaces have been shown to cause time pressure and
psychological stress among nurses, leading to inconsistent healthcare delivery [8].
The integration of technology and digitization in healthcare creates both opportunities
and challenges. The ease of handling patient data comes with the risk of data
compromise due to cyber-attacks [9, 10]. A data breach of EHRs can impact the
healthcare system as well as the availability of data [11].
Blockchain has shown promise with its decentralized peer-to-peer network that
provides an immutable and transparent ledger of transactions. It enables the creation
of trust, transparency and accountability among the various stakeholders of the
Blockchain network.
In this paper, we propose EHR-Sec, a secure platform for preserving patient
privacy, increased availability, efficient data sharing for enhanced collaboration
among healthcare providers based on a permissioned Blockchain network of partic-
ipating healthcare entities. The EHR system works in a permissioned Blockchain
consortium of healthcare organizations, government agencies, etc., based on the
open-source Hyperledger Fabric platform hosted by the Hyperledger project. Hyper-
ledger Fabric platform supports a permissioned membership based modular archi-
tecture model with higher transaction throughput than other Blockchain platforms
without the need of any cryptocurrency for transaction execution [12].

2 Background

Blockchain
Blockchain is a decentralized peer to peer network with an immutable distributed
ledger which is replicated over multiple nodes containing transactions performed
over the network [13, 14]. Initially proposed as a framework for peer to peer transfer
of Bitcoin it has since found its use in many other non-financial use cases. Apart
from being decentralized, it is secure and resilient to node failures resulting in
high availability of any Blockchain based network, which makes it preferable to
use in developing applications for healthcare, supply chain management, gover-
nance, etc. Blockchain broadly can be classified as public, private and permissioned
Blockchain. Bitcoin uses public Blockchain where any peer can participate in the
network without revealing its identity, whereas private Blockchain restricts entry of
public peers on the network without validation by the network operator. Permissioned
Blockchain allows for peer to join the network after verification with designated
permissions to perform specific set of activities on the network. The ledger is a chain
of blocks that contains transactions validated by nodes using a consensus protocol.
The consensus protocol ensures that all participating nodes have similar view of the
ledger. Blockchain is designed to be Byzantine Fault Tolerant (BFT), meaning that
even if some peers behave against the network or faulter, and the network reaches
to a final consensus. This consensus is achieved by consensus algorithms like the
Proof of Work (PoW) consensus used in the Bitcoin Blockchain that requires nodes
to perform some computational task ‘work,’ to achieve consensus and prevent mali-
cious nodes form disturbing the integrity of the network [13]. Newer Blockchains
like Ethereum have proposed smart contracts: a program containing some business
logic running on the Blockchain which users can interact with [14].

Hyperledger Fabric
Hyperledger Fabric is an open-source permissioned Blockchain platform hosted
under the Hyperledger project. Hyperledger Fabric platform provides a frame-
work for developing enterprise-grade applications for healthcare, supply chain
management, governance, etc. Hyperledger Fabric supports a modular architecture
that comprises components like the ordering service, Membership Service Provider
(MSP), peer-to-peer gossip service, smart contract, ledger and pluggable endorsement
and validation policy [12]. Hyperledger Fabric proposes the idea of organizations,
also known as members, with peers as their endpoints. Peers on the network host the
ledger and the smart contracts that query the ledger. The ledger comprises a blockchain
that contains transactions executed on the network and a world state, implemented
as a database, that holds the current values of business attributes. The idea that
multiple organizations can come together and form a consortium enables secure data
sharing across them. The MSP is a trusted authority that provides identities to actors
on the network that determine access permissions to information and resources. The
Certificate Authority (CA) issues digital certificates providing an identity to the actor
for executing transactions on the network. Chaincode holds the business logic that

enables application users to create new facts and query the ledger. Transaction
execution in Hyperledger Fabric follows the execute-order-validate architecture
[12]. The endorsers execute a transaction, after which the client, on collecting enough
endorsements, submits it to the ordering service. The ordering service then establishes
an order among the submitted transactions, leading to the creation of a block, and
sends it to the peers on the network. In the validation phase, every committing peer
validates the block against the endorsement policy. The valid transactions are then
applied to the world state in the form of CouchDB or LevelDB.

3 System Design Model of EHR-Sec

EHR-Sec is a privacy-preserving and patient-centric EHR management system built
upon a permissioned Blockchain network of participating healthcare entities based
on the Hyperledger Fabric platform (Fig. 1). EHR-Sec is based upon three main
objectives: availability of the system, and confidentiality and integrity of stored EHRs.
Participating healthcare entities are primarily hospitals, which can be extended to
include government agencies, pharmacies, insurance companies and independent
diagnostic laboratories. In the consortium, hospitals function as organizations that
can house one or more peer nodes. All hospitals are connected to each other on
the network via a main channel. Hospitals that belong to the same company can
exchange company specific information on a separate channel not visible to other
hospitals outside of the company. Primary actors of the EHR management system
are patients and doctors. Each hospital has an admin that has the responsibility of

Fig. 1 System architecture for EHR-Sec; sample network of four hospitals where actors interact
using a web-based UI that invokes API calls to the Blockchain network using SDK provided by the
Hyperledger Fabric project

registering the actors using the Certificate Authority (CA). These actors can interact
with the Blockchain network using a web-based application interface. The idea that
a patient reserves the right to provide access to his/her EHR to a doctor makes it
patient-centric, providing ownership to the patient, which is as per the guidelines
notified by the EHR Standards for India (2013).
However, in situations of emergency when a patient is incapable of providing
access to the consulting doctor, EHR-Sec makes necessary provisions for access
control. EHR storage in EHR-Sec follows a two-tier approach: EHR metadata and
data about access permissions are stored on-chain, and actual EHR data is stored
on an off-chain cloud server. The off-chain cloud server is chosen as per the Health
Insurance Portability and Accountability Act of 1996 (HIPAA) [15]. The data stored
in the cloud server is encrypted to disallow information leak to unauthorized access.

4 Component Specifications

Hospitals
Hospitals, which resemble organizations in the Hyperledger Fabric network, should
ideally house two peers (Fig. 2). API calls made to the network invoke the chaincode
installed on the peers to query the ledger. CouchDB is chosen as the ledger
world state database as it supports a rich set of queries and data values modeled in
JSON format [16]. The CouchDB is the on-chain data store. The hospital admin has

Fig. 2 Block diagram of a hospital (organization) and its comprising components



the responsibility of registering doctors and patients specific to that hospital. The
digital identity stored in the wallet is of the form X.509 standard and exposes the
public key that acts as an authentication anchor for transactions executed by the actor.

Ordering Service
Hyperledger Fabric provides three types of ordering service nodes: Solo, Kafka and
RAFT. With the recent version of Hyperledger Fabric 2.x, Solo and Kafka have been
deprecated [16]. RAFT is the recommended ordering service by Hyperledger Fabric
and is crash fault tolerant (CFT). RAFT follows a ‘leader-and-follower’ model where
the dynamically elected leader forwards the messages to the rest of the follower nodes
[17]. RAFT requires 2n + 1 nodes to tolerate n faults. Considering availability of the
EHR management system as a prime objective without compromising throughput,
EHR-Sec uses a 5 node RAFT ordering service.
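A quick worked check of this sizing rule (a sketch, not part of the actual Fabric
configuration):

```python
# RAFT sizing rule quoted above: a cluster of 2n + 1 nodes tolerates n
# crash faults, so the 5-node ordering service chosen for EHR-Sec stays
# available with up to 2 crashed orderers.
def raft_cluster_size(faults_tolerated: int) -> int:
    return 2 * faults_tolerated + 1

assert raft_cluster_size(2) == 5
print("Nodes needed to tolerate 2 faults:", raft_cluster_size(2))
```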

Off-Chain Data Storage


Every new transaction created on the Blockchain is added to the ledger replicated
on every peer of the Blockchain network. EHR data consist of medical prescrip-
tions, patient personal information like name, date of birth, address, etc., diagnostic
reports like CT scan, MRI scan, sonography, blood reports, etc., and insurance details,
etc. Storing data of this large size on the Blockchain network will not be feasible
and will have serious performance issues [18]. And so, it becomes imperative to
find ways to store data off-chain. EHR-Sec proposes cloud server for off-chain
data storage needs. The fact that cloud services have become scalable, accessible
and highly available in recent times make it optimal for storing sensitive informa-
tion like EHRs. Various cloud service providers provide HIPAA compliant storage
services like Google Cloud, Microsoft OneDrive, Amazon AWS and Box. According
to HIPAA, cloud service providers need to sign business associate agreement, have
capability for data classification, infrastructure to ensure confidentiality, integrity
and availability for storing EHRs. The choice of HIPAA compliant cloud server is
to be made as per the prevailing conditions regarding service provider availability,
demographics, legal restrictions, etc.

4.1 On-Chain Data Storage

The fact that a patient is in control of his/her data and can provide access to the
consulting doctor makes EHR-Sec patient-centric. EHR metadata and data related to
access permissions are stored on-chain. Doctors need to request a patient to gain the
ability to access, edit and update the patient's EHR. Upon request, the patient can
then grant the necessary access to the doctor. This transaction of granting access is
recorded on the Blockchain. The access permission data consists of attributes like
the Doctor-ID, the access type (read, update or append), the timestamp at which
access was granted and the timestamp until which access is granted. In emergency
situations, when a patient is unable to provide access, the administrator of that
hospital can execute a transaction that temporarily

grants automatic access to the doctor. This is achieved by using a Boolean state
variable whose value is 'false' in the normal state and 'true' in the emergency state.
EHR metadata also resides on-chain and gives information about the EHR data
stored on the off-chain cloud server. It includes attributes like the patient's unique
identification number and the type and size of the EHR data stored. The EHR
Standards for India make it obligatory for patients to have a unique identification,
such as the UIDAI-issued Aadhaar number, which is used by EHR-Sec.
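As an illustrative sketch of this on-chain record (not actual Hyperledger Fabric
chaincode, which would be written in Go, Node.js, or Java), the following Python
mock shows the access-grant structure and the emergency override described above;
all field names and values are hypothetical:

```python
# Mock of the on-chain access-permission record and the emergency check.
# Field names follow the attributes listed in the text; the values and the
# helper functions are hypothetical, not the system's real chaincode.
import json
import time

def make_access_grant(doctor_id: str, access_type: str, hours_valid: int) -> str:
    """Builds the JSON record a grant-access transaction would store on-chain."""
    assert access_type in {"read", "update", "append"}
    now = int(time.time())
    grant = {
        "doctorId": doctor_id,
        "accessType": access_type,
        "grantedAt": now,
        "validUntil": now + hours_valid * 3600,
        "emergency": False,   # set True by the hospital admin's override
    }
    return json.dumps(grant)

def has_access(grant_json: str, doctor_id: str) -> bool:
    g = json.loads(grant_json)
    not_expired = time.time() <= g["validUntil"]
    return g["emergency"] or (g["doctorId"] == doctor_id and not_expired)

print(has_access(make_access_grant("DOC-042", "read", 24), "DOC-042"))  # True
```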

4.2 EHR Data Encryption

The highly sensitive EHR data, both when stored on the cloud server and when
transferred among actors on the network, must be encrypted to prevent information
leaks to any unauthorized party. Encrypting the EHR data such that it can only be
decrypted by the actor to whom permission is granted requires cryptographic
techniques. EHR-Sec proposes a combination of symmetric-key and asymmetric-key
cryptography. The EHR data is encrypted using a symmetric key, which itself is
encrypted using the public key of the patient. This encrypted EHR data is then stored
on the cloud server. When a patient wants to access his/her EHR data, the encrypted
symmetric key is decrypted using the patient's private key, and the EHR data is then
decrypted using this recovered symmetric key. For cases where access is to be given
to different doctors, EHR-Sec proposes a proxy re-encryption scheme. Proxy
re-encryption allows a proxy to transform a ciphertext encrypted for one party into a
ciphertext that can be decrypted by another party using a re-encryption key. When a
patient grants access to a doctor, the patient sends the proxy a re-encryption key
generated from the patient's private key and the doctor's public key. This
re-encryption key is then used by the proxy to re-encrypt the already encrypted
symmetric key, and the re-encrypted symmetric key is decrypted with the doctor's
private key. EHR-Sec uses NuCypher for data sharing using proxy re-encryption.
NuCypher uses a decentralized network to delegate encryption and decryption rights,
splitting trust among multiple nodes and allowing a minimum number of nodes to be
specified for the provision of the decryption key [19, 20]. This way, data can only be
decrypted by the actor who has been given access, without revealing the private
credentials of the patient. Any other actor, malicious or not, who has not been given
access can never access the EHR data in plain form (Fig. 3).
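The following minimal Python sketch, written with the cryptography package,
illustrates the first stage of this hybrid scheme: a fresh symmetric key encrypts the
EHR, and the patient's RSA public key wraps that symmetric key. It is a simplified
stand-in under stated assumptions; the NuCypher proxy re-encryption step for
delegating access to a doctor is not shown:

```python
# Hybrid-encryption sketch: symmetric (Fernet) key for the EHR payload,
# wrapped with the patient's RSA public key (RSA-OAEP). Keys are generated
# inline here; in EHR-Sec they would come from the patient's wallet.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# 1. Encrypt the EHR with a fresh symmetric key.
sym_key = Fernet.generate_key()
ciphertext = Fernet(sym_key).encrypt(b'{"patient": "...", "report": "..."}')

# 2. Wrap the symmetric key with the patient's public key.
oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(sym_key, oaep)

# 3. To read: unwrap with the private key, then decrypt the EHR.
recovered = Fernet(private_key.decrypt(wrapped_key, oaep)).decrypt(ciphertext)
assert recovered.startswith(b'{"patient"')
```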

5 Conclusion

Fig. 3 Flow of EHR data encryption in EHR-Sec

Blockchain technology, with its core features of secure data storage, immutability of
recorded transactions, fault tolerance, availability and transparency, enables its use in
developing healthcare-related applications. The proposed EHR management system,
EHR-Sec, ensures the privacy, security, interoperability and availability of EHRs.
EHR-Sec, built as per the EHR Standards for India and HIPAA, can be implemented
within a consortium of healthcare institutions, later extending to other stakeholders
in the ecosystem. Hospitals implementing this system for daily EHR management
can expect better collaboration, increased quality of healthcare delivery and better
decision making by healthcare professionals.

References

1. C.S. Kruse, M. Mileski, A.G. Vijaykumar, S.V. Vishwanathan, U. Suskandla, Y. Chidambaram,


Impact of electronic health records on long-term care facilities: systematic review. JMIR Med.
Inform. 5 (2017)
2. EHR Standards for India, https://www.nrces.in/standards/ehr-standards-for-india
3. R.Y. Patil, S.R. Devane, Unmasking of source identity, a step beyond in cyber forensic, in
Proceedings of the 10th International Conference on Security of Information and Networks
(2017), pp. 157–164
4. S.K. Srivastava, Adoption of electronic health records: a roadmap for India. Healthc. Inf. Res.
22, 261–269 (2016)
5. P.R. Yogesh, S.R. Devane, Primordial fingerprinting techniques from the perspective of digital
forensic requirements, in 2018 9th International Conference on Computing, Communication
and Networking Technologies (ICCCNT ) (IEEE 2018), pp. 1–6
6. D. Bhole, A. Mote, R. Patil, A new security protocol using hybrid cryptography algorithms.
Int. J. Comput. Sci. Eng. 4(2), 18–22 (2016)
7. Healthcare Data Breach Report, Dec 2019. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC
5116537
8. T. Vehko, H. Hyppönen, S. Puttonen, S. Kujala, E. Ketola, J. Tuukkanen, A.M. Aalto, T.
Heponiemi, Experienced time pressure and stress: electronic health records usability and infor-
mation technology competence play a role. BMC Med. Inform. Decis. Mak. 19(160), 1–9
(2019)
9. R.Y. Patil, S.R. Devane, Network forensic investigation protocol to identify true origin of cyber
crime. J. King Saud Univ. Comput. Inf. Sci. (2019)

10. P.R. Yogesh, Backtracking tool root-tracker to identify true source of cyber crime. Proc.
Comput. Sci. 171, 1120–1128 (2020)
11. R.Y. Patil, S.R. Devane, Hash tree-based device fingerprinting technique for network forensic
investigation, in Advances in Electrical and Computer Technologies (Springer, Singapore,
2020), pp. 201–209
12. E. Androulaki, Hyperledger fabric: a distributed operating system for permissioned
blockchains, in EuroSys’18 (2018)
13. S. Nakamoto, Bitcoin P2P e-cash paper. https://bitcoin.org/en/bitcoin-paper
14. Ethereum whitepaper. https://ethereum.org/en/whitepaper/
15. J. Kulynych, D. Korn, The new HIPAA (health insurance portability and accountability act of
1996) medical privacy rule: help or hindrance for clinical research. 108, 912–914 (2003)
16. Hyperledger fabric documentation. https://hyperledger-fabric.readthedocs.io/en/release-2.2
17. D. Ongaro, J. Ousterhout, In search of an understandable consensus algorithm (extended
version), https://raft.github.io/raft.pdf
18. Why new off-chain storage is required in blockchains, in IBM Storage (IBM Corporation,
2018)
19. R. Gondhali, A. Patil, O. Kharatmol, R.Y. Patil, Electronic medical records using blockchain.
Int. J. Adv. Res. Sci. Commun. Technol. 5(5) (2020)
20. M. Egorov, M. Wilkison, D. Nunez, Nucypher kms: decentralized key management system.
https://arxiv.org/abs/1707.06140
End-to-End Speaker Verification
for Short Utterances

S. Ranjana, J. Priya, P. S. Reenu Rita, and B. Bharathi

Abstract Speaker verification is the process used to verify a speaker from his/her
voice characteristics. Given a speech segment as input and the target speaker data, the
system automatically determines whether the target speaker spoke the test segment.
There are many methods of biometric verification, such as fingerprints, iris scans and
signatures. Among these, speech-based authentication is not as reliable as the other
methods. Hence, we would like to develop a reliable speaker verification model.
Recent advances in deep learning have facilitated the design of speaker verification
systems that directly input raw waveforms. Though developing a model with raw
waveforms is complex in speech processing, it would yield an end-to-end system,
which reduces the time and power of feature extraction. To achieve the motive of
end-to-end speaker verification, we propose to use Raw Waveforms as input. The
development of such a system is possible without much domain knowledge of feature
extraction. Moreover, the availability of a large dataset eases the development of the
end-to-end system. The later part of the proposed system also includes analyzing the
model’s performance using a short utterance dataset to make the model more user-
friendly and reduce computation power. Hence, we plan to analyze and improvise
RawNet (Jung et al. in Proceedings of Interspeech, pp. 3583–3587, 2020 [1]) for
short utterances.

Keywords Speaker verification · End-to-end system · Raw waveforms · RawNet ·


VoxCeleb dataset · Short utterance

S. Ranjana (B) · J. Priya · P. S. Reenu Rita · B. Bharathi


Department of CSE, Sri Siva Subramaniya Nadar College of Engineering, Chennai, India
e-mail: ranjana17127@cse.ssn.edu.in
J. Priya
e-mail: priya17117@cse.ssn.edu.in
P. S. Reenu Rita
e-mail: reenurita17128@cse.ssn.edu.in
B. Bharathi
e-mail: bharathib@ssn.edu.in


1 Introduction

Speech processing is the study of speech signals and their processing methods. Aspects
of speech processing are acquisition, manipulation, storage, transfer and output of
speech signals. Subdomains of speech processing include speech coding, synthesis
[2], enhancement [3, 4], speech and speaker recognition, and speaker identification
and verification. The crucial aspects of a speaker verification system are feature
extraction and model building. Extraction of the features from the sound signal is
essential as speaker verification systems largely depend on speaker-specific char-
acteristics. MFCC [5] is a low-dimensional representation of the audio input. The
signal is divided into smaller frames, and corresponding frequencies are identified.
These are passed through a Mel filterbank to obtain a feature vector. The performance of
MFCCs degrades rapidly in real-world noise [6]: a noise signal alters all MFCC
coefficients even if only one frequency band is distorted. Moreover, the FFT yields only
frequency values, so time information is lost. Linear Frequency Cepstral Coef-
ficients (LFCCs) consistently outperform MFCCs, mainly due to their better perfor-
mance in female trials [7], which can be explained by the relatively shorter vocal
tract of female speakers and the resulting higher formant frequencies in speech. In the presence
of different forms of additive noise and reverberant conditions, Power Normalized
Cepstral Coefficients (PNCC) processing offers significant improvements in recogni-
tion performance compared to MFCC and PLP processing, with just a minor increase
in computational cost over traditional MFCC processing [8]. Another commonly used
feature extraction technique is the spectrogram [9]. Unlike MFCC, here the time domain is
represented by window numbers: a 2D matrix holds the frequency magnitudes
(columns) against time (rows, indexed by window number) for a given signal. Waveform-based
CNNs that directly take the raw speech signal as input have been used in many studies
to resolve this issue, such as in speaker verification [1], speaker recognition and voice
activity detection.
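To make the contrast concrete, the following minimal Python sketch (assuming the librosa library is available; the file path and parameter values are placeholders, not those of the proposed system) loads a raw waveform and derives the two classical intermediate representations discussed above.

import librosa
import numpy as np

# Raw waveform: the direct input used by end-to-end systems such as RawNet.
y, sr = librosa.load("utterance.wav", sr=16000)

# MFCC: low-dimensional representation obtained via a Mel filterbank.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)

# Magnitude spectrogram: rows index frequency bins, columns index windows.
spec = np.abs(librosa.stft(y, n_fft=512, hop_length=160))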

2 Related Work

For a long time, Gaussian Mixture Models (GMMs) trained on low-dimensional


feature vectors dominated speaker identification [10, 11]. It is a clustering-based
approach that does not use labels. The UBM technique reduces the time for recog-
nition and verification significantly (GMM-UBM) [3, 9]. The i-vector represents an
utterance in a low-dimensional space named total variability space. This approach
reduces high-dimensional sequential input data to a low-dimensional fixed-length
feature vector while retaining the most relevant information. The i-vector system
consists of a front-end and back-end. The former consists of cepstral extraction and
the latter includes dimensionality reduction and scoring. DNN or CNN architectures
can be applied directly to raw spectrograms and trained in an end-to-end
manner. The feature extraction output is a variable-length feature vector depending
End-to-End Speaker Verification for Short Utterances 307

Table 1 Comparison of various models in speaker verification


Paper Model Feature extraction Dataset Error (%)
Paper [9] GMM-UBM Spectrogram VoxCeleb1 15.0
Paper [9] VGG Spectrogram VoxCeleb1 7.80
Paper [5] GMM-UBM + i-vector MFCC NIST’08 16.62
Paper [12] RACNN-LSTM RAWNET VoxCeleb1 4.80
Paper [1] CNN-GRU RAWNET VoxCeleb2 3.52

on the length of the input utterance. Average pooling layers have been used [9] to
aggregate frame-level feature vectors to obtain a fixed-length utterance-level embed-
ding. The network is further trained for verification using the contrastive loss [9]
or other metric learning losses such as the triplet loss. Similarity metrics like the
cosine similarity [12] or PLDA are often adopted to generate a final pairwise score.
The literature survey consolidated in Table 1 shows that raw waveforms as input and
CNN-GRU architecture yield better performance for the speaker verification system.
Hence, we propose to develop an end-to-end speaker verification system using raw
waveforms [1] and to extend it to short utterances [11].

3 Proposed System

After a thorough literature survey of existing models, an end-to-end system with a
CNN-GRU front-end architecture is considered; it takes raw waveforms as input
and generates speaker embeddings for the utterances, which are later analyzed. The
architecture of the model [1] is given in Fig. 1.

3.1 Raw Waveforms

In speaker verification systems, models are fed with intermediate features like MFCCs
and spectrograms [9]. However, such input carries only limited spectral information,
determined by the chosen filter bank type and magnitude compression, which constrains
the model architecture. A waveform-based CNN that directly takes the raw speech signal
as input has been used in studies of speaker verification [1], speaker recognition and
voice activity detection to resolve this issue.

Fig. 1 CNN-GRU architecture of RawNet [1]

3.2 Sinc-Convolution Layer

SincNet processes raw audio samples and learns powerful features using a Deep
Learning model. The parametrized sinc functions, which replace the DNN's first
layer, implement band-pass filters and convolve the waveform to
extract low-level features. Here, the network learns more relevant features and
improves the model’s convergence time due to its significantly fewer parameters.
The SincNet uses a Softmax layer at the top responsible for mapping the network’s
final features into a multi-dimensional space corresponding to various speakers. In
contrast to standard CNNs, which learn all filter elements, the proposed method
learns only the low and high cutoff frequencies directly from data. This provides an
efficient way of generating a customized filter bank.
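As an illustration of this idea, the following sketch (an illustrative toy in NumPy, not the SincNet implementation) builds one band-pass kernel as the difference of two windowed sinc low-pass filters, parametrized only by its two cutoff frequencies.

import numpy as np

def sinc_bandpass(f1, f2, sr, width=251):
    # Band-pass impulse response as the difference of two ideal low-pass
    # sinc filters with cutoffs f1 < f2 (Hz); SincNet learns only f1 and f2.
    n = np.arange(-(width // 2), width // 2 + 1)
    lowpass = lambda fc: 2 * (fc / sr) * np.sinc(2 * (fc / sr) * n)
    h = lowpass(f2) - lowpass(f1)
    return h * np.hamming(width)  # window to smooth the truncated response

kernel = sinc_bandpass(300.0, 3400.0, sr=16000)  # example telephone band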

3.3 RawNet CNN-GRU Architecture

RawNet is a speaker embedding extractor that takes raw waveforms as input and
produces speaker embeddings for speaker verification without using any prepro-
cessing techniques. RawNet adopts a convolutional neural network-gated recurrent
unit (CNN-GRU) architecture. The scale of a given feature map is modified using the
filter-wise feature map scaling (FMS) technique to derive more discriminative repre-
sentations. The front CNN layers comprise residual blocks followed by a max-
pooling layer, which extract frame-level features. The GRU layer then aggregates the
frame-level features into an utterance-level representation.
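A minimal NumPy sketch of the multiplicative FMS variant as we read it (the projection weights W and b stand in for learned parameters; shapes are illustrative):

import numpy as np

def fms_scale(feature_map, W, b):
    # feature_map: (n_filters, n_frames); W: (n_filters, n_filters); b: (n_filters,)
    pooled = feature_map.mean(axis=1)             # average over time per filter
    s = 1.0 / (1.0 + np.exp(-(W @ pooled + b)))   # sigmoid gives scales in (0, 1)
    return feature_map * s[:, None]               # scale each filter's map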

3.4 Speaker Embeddings

The system uses neural networks to encode the speaker attributes of an utterance into a
fixed-length vector, irrespective of the length of the utterance.
Verification by cosine similarity: This calculates the similarity of two vectors by measuring
the cosine of the angle between them. For the sample audio passed into the network,
its vector is calculated, and the cosine similarity is computed between the vector of the
sample audio and the vector of the claimed speaker. The score lies between 0 and 1.
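A minimal sketch of this scoring step in plain NumPy (variable names are ours):

import numpy as np

def cosine_score(test_emb, claimed_emb):
    # Cosine of the angle between the test embedding and the claimed
    # speaker's enrolled embedding; a higher score means a likelier match.
    return np.dot(test_emb, claimed_emb) / (
        np.linalg.norm(test_emb) * np.linalg.norm(claimed_emb))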

4 Experimental Setup and Implementation

4.1 Dataset Collection and Preprocessing

VoxCeleb1 Dataset: As shown in Table 2, VoxCeleb1 holds over 100,000 speech segments
for 1251 celebrities, obtained from YouTube videos. The dataset is reasonably
gender balanced (55% male, 45% female), and the videos were filmed in challenging
visual and auditory environments. The VoxCeleb1 dataset is used for testing the
model.
VoxCeleb2 Dataset: VoxCeleb2 (Table 3) comprises nearly 1 million utterances for 6112
celebrities, obtained from YouTube videos. The dataset contains audio and video
files, of which only the audio parts are considered. The VoxCeleb2 dataset is used
for training the model.
Data Format Conversion: The audio files in the VoxCeleb2 dataset are in .m4a format,
which is incompatible with further processing. The Pydub module in Python helps to
work with audio files: pydub.AudioSegment.from_file() reads the given .m4a audio
sample, which is then exported in .wav format to the specified path.
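A minimal sketch of this conversion (paths are placeholders; Pydub relies on an FFmpeg installation to decode .m4a):

from pydub import AudioSegment

# Read one .m4a utterance and write it back out as .wav.
audio = AudioSegment.from_file("id00012/utterance.m4a", format="m4a")
audio.export("id00012/utterance.wav", format="wav")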
Generation of Short Utterance Dataset: The audio files (average duration of four
seconds) of the VoxCeleb2 dataset were divided into two-second audio files using the
Pydub library. Some parts of the audio had a zero energy level, i.e., silence; these
portions were eliminated as they could affect performance.
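A hedged sketch of this step with Pydub (the silence thresholds and paths are assumptions to be tuned, not the exact values used here):

from pydub import AudioSegment
from pydub.silence import detect_nonsilent

audio = AudioSegment.from_wav("id00012/utterance.wav")

# Keep only the non-silent stretches, then concatenate them back together.
spans = detect_nonsilent(audio, min_silence_len=300, silence_thresh=-40)
speech = sum((audio[s:e] for s, e in spans), AudioSegment.empty())

# Slice into two-second chunks (Pydub slices in milliseconds).
for k in range(0, len(speech) - 1999, 2000):
    speech[k:k + 2000].export(f"short/utt_{k // 2000}.wav", format="wav")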
Ground Truth Values: Ground truth values are used for testing the model; they give
the correct classification against which the system's output is evaluated. In the text file,
pairs of utterances from the same speaker are labeled 1 and pairs from different
speakers are labeled 0.

Table 2 Data statistics for VoxCeleb1

No. of speakers 1251
No. of male speakers 690
No. of female speakers 561
No. of hours 352
No. of utterances 153,516
Avg. no. of utterances per speaker 116
Avg. length of utterances (s) 8.2

Table 3 Data statistics for VoxCeleb2

No. of speakers 6112
No. of videos 150,480
No. of utterances 1,128,246

4.2 Training the Model

Figure 2 elaborates on the various layers in CNN-GRU architecture. Input and output
vector shapes of each layer are mentioned on the respective arrows.
Convolutional Neural Network (CNN): The layers of a CNN are organized in dimen-
sions of width, depth and height. The neurons in one layer connect to only a small
portion of the neurons in the next layer rather than connecting to all of them. The
output is reduced to a single vector of probability scores, organized along the depth
dimension. To classify an input, it moves through a series of layers with filters,
pooling, fully connected layers and the softmax function.
Gated Recurrent Unit (GRU): GRU is an improved variant of conventional recur-
rent neural networks. GRU uses an update and a reset gate to mitigate the vanishing
gradient problem. They can be trained to retain only the required information from the
past and discard information unrelated to the prediction.
Max Pooling: It selects the maximum element from each region of the feature map
covered by the filter. The output is a feature map containing the most prominent features.
Pooling layers minimize the dimensions of feature maps, thereby reducing the number of
learning parameters and the amount of computation performed in the network.

Fig. 2 Layers in the CNN-GRU architecture



Batch Normalization: Batch normalization is a technique for training deep neural
networks that standardizes the inputs to a layer for each mini-batch. This reduces the
number of training epochs required to train deep networks, reducing training time. It
also enables larger learning rates, which reduces internal covariate shift and shortens
the time to convergence.
Leaky ReLU: In some cases, ReLU neurons die for all inputs and remain inactive. If
no gradient flows and a large number of dead neurons are present, the neural network
performance is affected. This can be rectified by Leaky ReLU, where the slope is
changed left of x = 0, which extends the range of ReLU by causing a leak.

5 Results and Performance Analysis

The model was trained with the complete VoxCeleb2 dataset for 8 epochs. VoxCeleb1
was used for validating and testing, after which an equal error rate (EER) of 0.04 was observed.
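For reference, a minimal sketch of how an EER can be computed from pairwise trial scores and ground-truth labels (using scikit-learn's ROC utilities; this is illustrative, not the exact evaluation script used here):

import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores):
    # EER is the operating point where the false-acceptance rate (FPR)
    # equals the false-rejection rate (1 - TPR).
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    i = np.nanargmin(np.abs(fnr - fpr))
    return (fpr[i] + fnr[i]) / 2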

5.1 Testing with Short Utterance Dataset

We used the complete VoxCeleb2 dataset for training and the short dataset generated
from VoxCeleb1 as testing data. We obtained an error rate of 12%, which is
greater than the error of long-utterance speaker verification. Comparing the two settings
in Table 4, we find that the longer dataset performs better in the given situations.
Short Utterance Improvement: We trained both models for 4 epochs and
included less than 10% of the dataset. From the comparison shown in Table 5, we
can infer that training the model with short data and using time-augmented evaluation in
testing improves the accuracy of the short utterance speaker verification system.
The reasons we infer for this change in performance are:

Table 4 Comparison of baseline and short utterance testing


Training data Testing data Test audio duration (s) No. speakers Error (%)
VoxCeleb2 VoxCeleb1 4–8 9000 4
VoxCeleb2 VoxCeleb1 2 9000 12

Table 5 Comparative analysis after time augmentation (4 epochs)


Training dataset Testing dataset Audio duration (s) No. speakers Error (%)
VoxCeleb2 VoxCeleb1 4–8 100 27
VoxCeleb2 Short VoxCeleb1 Short 2 100 23

Training with short utterance data: The model is able to learn well from the
features of shorter utterances, and the uniform length of the utterances contributes to
better learning. Gender differences are also less pronounced in shorter utterances.
Hence, this aids in better performance.
Time-augmented evaluation: This method is used when testing the model. It
involves augmenting, i.e., appending, the short data in the testing phase to produce
longer data, which improves performance during testing.
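A minimal sketch of this augmentation in NumPy (the target length is an assumption):

import numpy as np

def time_augment(waveform, target_len):
    # Repeat the short utterance until it reaches target_len samples, so
    # the model evaluates it as if it were a long utterance.
    reps = int(np.ceil(target_len / len(waveform)))
    return np.tile(waveform, reps)[:target_len]

# e.g., stretch a 2 s clip to 6 s at a 16 kHz sampling rate
long_like = time_augment(np.zeros(32000), target_len=96000)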

6 Conclusion

Speaker verification using raw waveforms was proposed in this work. Raw
waveforms were considered because they enable end-to-end system development.
RawNet [1] was taken as the baseline, and systems for both long utterance and short
utterance data were developed. In [1], an error rate of 3% was reported when trained
for 100 epochs with files in .m4a audio format. After implementing the same for
8 epochs, we report an error rate of 4.3%. Short utterance data with a duration of 2 s
was generated from the existing VoxCeleb1 dataset. When trained on the
VoxCeleb2 dataset and tested on the VoxCeleb1 short dataset, an error of 12% was
reported. Through this analysis, we infer that long utterance data performs
better than short utterance data. Hence, to improve the performance on short utterance
data, we implemented the concept of time-augmented evaluation: we
concatenate the smaller speech segments and make the system treat the result as a
long utterance, which improved the accuracy of the system. Another way used to
improve performance on short utterance data is the removal of noisy and
silent parts of the dataset for better learning.

References

1. J.W. Jung, S.B. Kim, J.H. Kim, H.J. Shim, H.J. Yu, Improved RawNet with feature map scaling
for text-independent speaker verification using raw waveforms, in Proceedings of Interspeech
2020 (2020), pp. 3583–3587
2. B. Tarakeswara Rao, R.S.M.L. Patibandla, M.R. Murty, A comparative study on effective
approaches for unsupervised statistical machine translation, in Proceedings of AISC Springer
Conference, vol. 1076 (2020), pp. 895–905
3. R. Zheng, B. Xu, S. Zhang, Text-independent speaker identification using GMM-UBM and
frame level likelihood normalization, in Proceedings of International Symposium on Chinese
Spoken Language Processing (Hong Kong, China, 2004), pp. 289–292
4. A.S.V. Praneel, T. Srinivasa Rao, M. Ramakrishna Murty, A survey on accelerating the classifier
training using various boosting schemes within cascades of boosted ensembles. Proc. Int. Conf.
Springer SIST Ser. 169, 809–825 (2019)
5. A. Poddar, M. Sahidullah, G. Saha, Speaker verification with short utterances: a review of
challenges, trends and opportunities. IET Biom. 7(2), 91–101 (2018)

6. J. Hansen, B. Pellom, R. Sarikaya, U. Yapanel, Robust speech recognition in noise: an evaluation


using the SPINE corpus, in Proceedings of Eurospeech (2001), pp. 905–911
7. N. Dave, Feature extraction methods LPC, PLP and MFCC in speech recognition. Int. J. Adv.
Res. Eng. Technol. 1(6), 2320–6802 (2013)
8. C. Kim, R.M. Stern, Power-normalized cepstral coefficients (PNCC) for robust speech
recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 24(7), 1315–1329 (2016)
9. A. Nagrani, J.S. Chung, W. Xie, A. Zisserman, VoxCeleb: large-scale speaker verification in the
wild. Comput. Speech Lang. 60, 0885–2308 (2018). https://doi.org/10.1016/j.csl.2019.101027
10. D.A. Reynolds, R.C. Rose, Robust text-independent speaker identification using gaussian
mixture speaker models. IEEE Trans. Speech Audio Process. 3(1), 72–83 (1995)
11. M. Navyasri, R. RajeswarRao, A. DaveeduRaju, M. Ramakrishnamurthy, Robust features for
emotion recognition from speech by using Gaussian mixture model classification, in Proceed-
ings of SIST Series, vol. 2 (Springer, 2017), pp. 437–444
12. J.W. Jung, H. Heo, H. Shim, I. Yang, H. Yu, A complete end-to-end speaker verification
system using deep neural networks: from raw signals to verification result, in Proceedings
of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
(2018), pp. 5349–5353
A Comprehensive Analysis on Multi-class
Imbalanced Big Data Classification

R. Madhura Prabha and S. Sasikala

Abstract In real life, most big datasets are skewed in nature. Domains like
medical diagnosis, bioinformatics, banking theft, natural disasters, network intru-
sion, oil-spill detection, instrument failure and army crimes have datasets which are
not always balanced; rather, an imbalanced dataset is the common occurrence. This
leads to biased classification when the positive class instances are very few
compared with the negative class instances. Multi-class imbalanced big data learning is
a challenging research topic in big data analytics. In this review paper, we analyse
the conventional balancing techniques of data-level and algorithm-level approaches and
the ensemble techniques which balance multi-class imbalanced datasets. Though the
existing ensemble techniques balance the dataset, they cannot be applied to streaming
data and have scalability issues. This paper analyses different techniques and their
accuracy levels with the intent of developing a novel ensemble technique to learn,
balance and pre-process big data streams for classification.

Keywords Big data · Imbalance · Balancing · Multi-class · Sampling ·


Pre-processing · Classification · Ensemble · Majority class · Minority class ·
SMOTE · Datastream · Batch-incremental · Undersampling · Oversampling ·
Misprediction

1 Introduction

In many real-world datasets, there are at least 1 million instances and 100 features,
without a single well-defined target class. Among these instances, interesting cases
occur with a frequency of less than 0.01 [1]. The impact on results ranges from slight
variation to serious challenges for standard methods of classification and regression. The
accuracy and efficiency of existing methods are significantly affected by overfitting,
which creates a bias towards the class that has more samples [2].

R. Madhura Prabha (B)


Department of Computer Science, University of Madras, Chennai, India
S. Sasikala
Department of Computer Science, IDE University of Madras, Chennai, India


When a medical dataset covers multiple diseases with imbalanced data, there is a
chance of wrong diagnosis, which can cause life-threatening wrong treatment.
Similarly, in a dataset of army personnel, misclassifying a fraudulent
person as normal can cost the safety of the country. In the same way, a biased
classification in natural disaster analysis may miss an upcoming disaster, which
would be far more costly.
Pre-processing of multi-class imbalanced big data is a thought-provoking research
topic. It paves the way for the creation of new techniques, algorithms, tools and
frameworks to learn from and pre-process big data [3]. Since classification
techniques are usually applied without considering the class distribution ratios, all class
instances should be equally distributed to create good training data.
This paper reviews the imbalanced big data, data distribution of binary and
multi-class big data. It also analyses the difficulties in balancing the data and
creating training models. Conventional balancing methods, algorithms and ensemble
techniques are analysed to resolve these difficulties.

2 Literature Review

2.1 Imbalanced Data

In big data mining, classification is the widespread task used to predict
and analyse values from the given classes. Standard classifiers such as decision
trees, K-nearest neighbors (KNN) and support vector machines (SVM) are used for
classification. Even so, many of these classifiers are not capable of handling big datasets.
Devi et al. in [4] recommended a MapReduce framework based on the Bat feature
selection method, which is adaptable to high-dimensional data and leverages the
efficiency of parallel algorithms.
However, in numerous real-world big datasets, there are many different classes where
some classes have more samples and other classes have fewer samples. Classes with
more samples are called majority classes and classes with fewer samples are called
minority classes [5]. A dataset with many minority classes and few majority classes
is called imbalanced data. Since standard classifiers assume that the entire dataset is
distributed evenly, they show biased results for imbalanced big data classification [6].
Class imbalance is a general problem across domains such as the medical, banking and
fault identification sectors, and the misprediction cost for minority classes is high
compared with majority classes in multi-class imbalanced datasets.
In supervised learning, classification depends upon class labels. But in imbal-
anced big data, classifiers automatically favour the class labels which are high in count,
under the assumption of equal spreading of all classes. This produces poor performance
in prediction and classification. So, the data skewness has to be decreased using
different techniques to maximize the classification accuracy.
Imbalanced big data can be categorized as (1) binary class or 2-class imbalanced
data (2) multi-class imbalanced data.
(1) Binary class Imbalanced data: Any dataset where all instances belong to one
of two target classes is known as binary class or 2-class imbalanced data. This
dataset contains only one minority class and one majority class.
(2) Multi-class Imbalanced data: Any dataset where all instances belong to one
of many target classes is known as multi-class imbalanced data. This dataset
contains many minority classes and many majority classes.
Different approaches are used to decrease the skewness in datasets: (i) data-
level approaches, which resample the dataset to distribute the classes evenly; (ii) algorithm-
level approaches, which modify the algorithms to improve the balance in class
distribution; and (iii) ensemble approaches, which combine different techniques.

2.2 Data Level Approaches

The data-level approach primarily focuses on shuffling and redistributing the original data
to balance the classes by sampling techniques. The main idea of sampling is to balance
the data skewness. Sampling techniques are categorized as undersampling and
oversampling [7].

2.2.1 Undersampling

Undersampling balances the dataset distribution by reducing the occurrences of the
class which has more samples. Its advantages are low computational time and low
memory requirements. Many methods are used to implement undersampling. The simplest
method is random undersampling, which randomly chooses some majority instances and
removes them from the training dataset. It has the disadvantage of possibly deleting
important instances.
Rahman et al. in [8] suggested a modified cluster-based undersampling method
which eliminates the undersampling disadvantage of losing important majority
class data. First, the dataset is partitioned into majority and minority clusters. The
majority class cluster is then subdivided into K clusters. All majority class subsets
are combined with the minority instances to make K training datasets. This technique
is suitable for datasets with unknown labels and also helps to remove the imbalance
problem.
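A rough Python sketch of this cluster-based idea (our reading of [8], using scikit-learn's KMeans; the shapes and the 0/1 labels are illustrative):

import numpy as np
from sklearn.cluster import KMeans

def cluster_undersample(X_maj, X_min, k=5):
    # Split the majority class into k clusters and pair each cluster with
    # the full minority class, yielding k training sets instead of
    # discarding majority samples at random.
    assignments = KMeans(n_clusters=k, n_init=10).fit_predict(X_maj)
    datasets = []
    for c in range(k):
        part = X_maj[assignments == c]
        X = np.vstack([part, X_min])
        y = np.concatenate([np.zeros(len(part)), np.ones(len(X_min))])
        datasets.append((X, y))
    return datasets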

2.2.2 Oversampling

Oversampling achieves balance by replicating the occurrences of the class which has
few instances. In oversampling, no data is lost; instead, the existing minority class
instances are duplicated. The disadvantage is that the dataset size increases, which
needs more memory and more computational time.
There are many ways to implement oversampling. The simplest technique is random
oversampling, which randomly chooses and duplicates some minority instances in the
training dataset. Its disadvantages are the overfitting problem and a small decision
region.
Park et al. in [9] suggested oversampling techniques to balance highway traffic
data in order to predict traffic accidents. Based on oversampling, the predicting system
analyses and pre-processes traffic big data to create a learning system. The balanced data
is classified into several groups, to which K-means cluster analysis is applied. Finally,
prediction is done by logistic regression. The results show that the target accuracy
was 42.71% while the achieved accuracy was 80.56%. These analysis steps are carried
out on the Hadoop framework. The disadvantage is that a well-organized standard still
has to be established for the oversampling technique.
Chawla et al. in [10] presented SMOTE (synthetic minority oversampling tech-
nique), which creates synthetic instances along the paths joining a minority instance to
its k nearest neighbours, rather than duplicating the real instances.
SMOTE focuses on, and creates a bias towards, the class with fewer instances. This
oversampling method creates many different minority points near existing points.
By using SMOTE, a larger area is covered by the minority class, which makes for
better prediction of unseen instances belonging to that class.
While creating artificial instances, SMOTE does not consider the neighbouring
majority classes. This can increase class overlapping and produce more outlier
instances. Class overlapping has a strong correlation with class imbalance [11];
therefore, imbalanced overlapping classes are more difficult to classify than
non-overlapping ones. SMOTE also raises minority instances near the boundary and
can cause over-fitting.
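At its core, SMOTE [10] interpolates between a minority instance and one of its k nearest minority neighbours; a minimal NumPy sketch of generating one synthetic point:

import numpy as np

def smote_point(x_i, x_nn, rng=np.random.default_rng()):
    # x_new = x_i + lam * (x_nn - x_i), with lam drawn uniformly from [0, 1],
    # so the synthetic sample lies on the segment joining the two points.
    lam = rng.random()
    return x_i + lam * (x_nn - x_i)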
Zhai et al. in [12] proposed an oversampling method, OSSLDDD-SMOTE (one-
sided selection link and dynamic distribution density-SMOTE). This method deals
with noise instances through a hierarchical filtering mechanism, and SMOTE is applied
only to the minority instances near the classification boundary, using dynamic SDDL
(sequential distribution density link). The number of new instances generated for each
borderline minority instance depends on the distribution density of that particular
instance. This approach eliminates the disadvantages of SMOTE.
However, OSSLDDD-SMOTE produces unwanted synthetic samples around the
majority samples, and finding the borderline becomes a serious issue.
Hussein et al. in [13] proposed the advanced-SMOTE (A-SMOTE) method, which
regulates the positions at which synthetic instances are created. First, synthetic instances
are created using the SMOTE method. Next, the synthetic instances nearest to the majority
instances and to the borderline are removed. In experimental results, the A-SMOTE
technique produces a clear borderline between the two classes and also eliminates noise.

Both sampling methods have disadvantages. Undersampling neglects some important
instances and accordingly reduces classification accuracy. In contrast, oversampling
creates extra instances, which raises the training time, and replicating instances can
cause over-fitting. To avoid these drawbacks, hybrid or integrated sampling techniques
are introduced by combining both.
Cao et al. in [14] presented an integrated resampling technique which performs
oversampling through the SMOTE method and undersampling through OSS (one-sided
selection). SMOTE creates artificial instances, and OSS undersampling removes
borderline and noise instances. The resulting dataset is given to a classifier and the
results are analysed. The integrated technique is feasible and more effective than
SMOTE at reducing classification overfitting.
Junsomboon et al. in [15] proposed a technique which combines both samplings
to balance imbalanced data. Undersampling removes outlier instances from the majority
class using the neighbor cleaning rule (NCL). NCL checks the three nearest neighbors
of each minority instance; in case of a mismatch, it removes those nearest neighbors
belonging to the majority class. The resulting dataset is then given to SMOTE. The
final dataset increased the recall measure, which raised the accuracy level.
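A hedged sketch of such a hybrid pipeline, assuming the imbalanced-learn library is available (NCL-style cleaning followed by SMOTE and a standard classifier; the hyperparameters are library defaults, not those of [14] or [15]):

from imblearn.pipeline import Pipeline
from imblearn.under_sampling import NeighbourhoodCleaningRule
from imblearn.over_sampling import SMOTE
from sklearn.tree import DecisionTreeClassifier

# Cleaning removes noisy majority neighbours, SMOTE oversamples the
# minority class, and the classifier is trained on the rebalanced data.
model = Pipeline([
    ("ncl", NeighbourhoodCleaningRule()),
    ("smote", SMOTE()),
    ("clf", DecisionTreeClassifier()),
])
# model.fit(X_train, y_train); model.predict(X_test)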

2.3 Algorithm-Level Approaches

Generally, sampling methods need more computational time and memory space.
Since the original datasets themselves grow over time, sampling methods are not suitable
for domains with growing datasets.
Ertekin et al. in [16] proposed an SVM-based active learning method which selects
informative instances from a randomly picked smaller pool of instances; instances which
fall inside the margin area form this small instance pool. The method does not search
the whole dataset; instead, it queries the system. Thus, this active learning approach
achieves a fast solution with competitive prediction performance and deals with
unlabelled instances.
Belarouci et al. in [17] proposed a cost-sensitive extension of the least mean square
(LMS) algorithm which addresses the imbalance issue by penalizing errors with
different weights for different instances. After balancing, different classification
techniques are applied. Experimental results show an improvement in classification
accuracy.
Hamidzadeh et al. in [18] suggested the chaotic krill herd evolutionary algo-
rithm (CKHA), which examines both class spaces for sample reduction in binary
class imbalanced datasets. All instances are scored using a combined weighted
multi-objective optimizer on the WDDS (weighted distance-based decision surface).
Using the CKH algorithm, fitness values are measured in the search space, and the
instances with the best fitness values are identified. All instances which reduce accuracy
and geometric mean (Gmean) are removed from the original dataset. This method
controls imbalance and keeps instances of the class which has fewer examples.

Febriantono et al. in [19] implemented the cost-sensitive decision tree C5.0. First,
a decision tree is created; then the MetaCost technique is applied, based on cost-sensitive
learning, to create a least-cost model. It performs better than existing algorithms.

2.4 Ensemble Techniques

2.4.1 Importance of Ensemble Techniques

Data-level and algorithm-level approaches can solve the binary class imbalance
problem; however, these approaches cannot solve the multi-class imbalance problem.
Since the relationships among classes differ and class boundaries can overlap, we need
ensemble techniques which combine both data-level and algorithm-level approaches.
Song et al. in [20] analysed the one-versus-one (OVO) decomposition scheme by
applying binary ensemble learning approaches. An m-class problem is split into
m(m − 1)/2 binary class sub-datasets. For each pair of classes, a classifier is
trained, ignoring the samples which do not belong to those two classes. Then all
binary class outputs are aggregated to produce the multi-class result. Combining the
OVO approach with SMOTE is an appreciated method for balancing multi-class
imbalanced problems. When making predictions, unlabelled samples are fed into the
models.
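For illustration, scikit-learn's one-vs-one wrapper realizes exactly this m(m − 1)/2 decomposition (a minimal sketch; the scheme in [20] additionally rebalances each pairwise sub-dataset with SMOTE before training):

from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import LinearSVC

# One binary classifier per pair of classes; the pairwise decisions are
# aggregated by voting into the final multi-class prediction.
ovo = OneVsOneClassifier(LinearSVC())
# ovo.fit(X_train, y_train); ovo.predict(X_test)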
Piri et al. in [21] proposed synthetic informative minority oversampling (SIMO),
which uses an SVM classifier. Since the instances nearer to the decision boundary
are more informative, these samples are over-sampled. Weighted SIMO (W-SIMO)
is also proposed; it differs from SIMO in that it oversamples only the wrongly
classified informative minority instances with a high degree. Both techniques
emphasize the informative minority instances which are usually mispredicted by
standard classifiers.
Alam et al. in [2] proposed a recursive technique for multi-class imbalance clas-
sification and also for regression problems. In this technique, the data imbalance
problem is transformed into multiple balanced problems. Partitioning and balancing
of the data are applied recursively; partitioning is implemented using balanced-distribution
and random partitioning methods. An ensemble classifier is modelled, and an ensemble
rule selects one class. Furthermore, it also solves data imbalance in regression. This
technique is effective and improves performance.
Hassib et al. in [22] proposed a three-phase classification framework. The first
phase is feature selection. The second phase balances the dataset using LSH-SMOTE
(locality sensitive hashing synthetic minority oversampling technique). Lastly, the resul-
tant dataset is given to the WOA + BRNN (whale optimization algorithm + bidirectional
recurrent neural network) algorithm for classification. This method increases the
classification accuracy level.
Tsai et al. in [23] presented the cluster-based instance selection (CBIS) technique,
which combines clustering and instance selection. Clustering is applied to the majority
classes to split them into many subclasses; within each subclass, instance selection
filters out the irrelevant instances of that particular subclass and gives each subclass
a label. This technique balances multi-class datasets.

2.4.2 Recent Papers on Imbalanced Big Data Pre-processing

Liu et al. in [24] proposed SMOTE-CVCF (cross-validated committees filter) for
data resampling. SMOTE produces synthetic minority instances and rebalances the
input dataset, while the CVCF technique removes the noise instances effectively.
Thereby, it overcomes the deficiency caused by SMOTE in classification.
Jegierski et al. in [25] proposed an enrichment technique which uses external
instances. There are three approaches, namely (1) arbitrary selection, (2) repeatedly
choosing instances which increase classification accuracy and (3) adding instances
which help the classifier learn the class borders. Experimental results show improvements
in the classification value of 27% in the worst case and 66% in the best case, 21% better
than other methods. This method can be used for both binary and multi-class datasets,
but its drawback is that an external dataset similar to the actual dataset must always
be available.
Koziarski et al. in [26] presented the multi-class combined cleaning and resampling
(MC-CCR) algorithm. This method identifies locations which are appropriate for
oversampling, i.e., areas free of small disjuncts and noise. An instantaneous cleaning
process is used to lessen class overlapping in the learning algorithm's output. Unlike
traditional decomposition techniques, the MC-CCR algorithm does not lose any
information about inter-class associations. The results show higher robustness to noise
than SMOTE.
Żak et al. in [27] compared the well-known binarization strategies, namely one-vs-all
(OVA), one-vs-one (OVO) and error-correcting output codes (ECOC), and their effec-
tiveness in multi-class imbalanced data classification, based on the base classifiers
and various aggregation schemes for each strategy. All three methods were compared,
and the results show that OVO binarization performs better than the others.
Wei et al. in [28] proposed SCOTE (sample-characteristic oversampling tech-
nique), which divides a multi-class problem into multiple binary problems. k-nearest
neighbours (knn) noise processing removes noisy instances from each binary imbalanced
class. Then, the minority instances are sorted by importance, and SCOTE produces
artificial points based on k* information nearest neighbours (k*inn). Thus, the multi-class
imbalance problem is solved by solving all the binary imbalance problems. The
experimental results show high performance for between-class imbalance, but this
method cannot be applied to within-class multi-class imbalance.
Gao et al. in [29] proposed the differential partition sampling ensemble (DPSE)
technique using the OVA (one-versus-all) framework. OVA is a mainstream decomposition
technique which uses many binary classifiers for multi-class problems. The sampling
upper and lower limits are set from the number of instances in the class with the most
examples and in the class with the fewest examples, respectively.

Sleeman et al. in [30] created a compound framework on Apache Spark for multi-
class datasets. The instance-level difficulties of each class are analysed to find the
learning difficulties, and this information is embedded into common resampling
algorithms, which then balance the multiple classes. A new variant of SMOTE was
applied which removes the spatial constraint in distributed datasets. This method shows
that instance-level information is most important when creating training datasets for
multi-class imbalanced big data.

2.5 Inferences from the Review Work

A multi-class dataset can be divided into multiple binary-class problems, and balancing
can be applied to each binary class. So multi-class imbalance can be solved by solving
each binary class, as summarized in Table 1.
Samples can be categorized into four types, namely safe samples, borderline
samples, rare samples and outliers, depending upon the classes of the neighboring
samples. For safe samples, random undersampling (s-random undersampling) is used;
for borderline samples, SMOTE; and for rare samples, br-SMOTE, in the training
datasets of the binary classifiers [29].
It is time-consuming to identify the instance-level information of all instances in a
big data stream, and SMOTE needs more computational power and memory for
streaming data. To address these issues, we need to create an ensemble technique
which addresses multi-class imbalanced data in a distributed environment with a
high volume of data.

Table 1 Recent methods and their results

Method | Result/advantage
SMOTE-CVCF | Removes the noise instances effectively
Enrichment technique (uses information from an external dataset) | Especially advantageous for the smallest datasets, for which existing methods failed; applies to both multi-class and binary classification tasks
MC-CCR algorithm | Lessens class overlapping
SCOTE (sample-characteristic oversampling technique) | Multi-class imbalanced problem is solved by solving all binary imbalance problems using knn and k*inn
Differential partition sampling ensemble (DPSE) | Uses the OVA (one-versus-all) framework; the diversity of binary classifiers is improved
Compound framework on Apache Spark for multi-class datasets | Removes the spatial constraint in distributed datasets

2.6 Solutions

The major issues in existing ensemble techniques are scalability, loss of important data,
memory limits, finding the borderline, learning instance-level information and the
computational time needed to create a training model.
The solution to imbalanced big data is an ensemble technique which combines
sampling with enhancements to the classification algorithms. A novel ensemble technique
should be developed to solve the scalability and memory issues by using the MapReduce
framework or Spark. The borderline-finding issue and high computational time can
be solved by learning instance-level difficulties in a distributed environment with a
high volume of data.

3 Conclusion and Future Work

Multi-class imbalanced big data leads to biased results in classification. To balance
and learn from imbalanced data, different techniques are used in the data pre-processing
phase. The balancing techniques of data-level, algorithm-level and hybrid methods
were analysed. The data-level approach focuses on shuffling and redistributing the
training data to balance the classes by sampling techniques. SMOTE is one of the
widespread oversampling techniques; it creates synthetic instances and balances the
dataset, but it has an issue in finding the borderline. The OSSLDDD-SMOTE and
A-SMOTE techniques overcome this problem of SMOTE. Since the original datasets
themselves grow over time, sampling methods are not suitable for some domains with
growing datasets. Therefore, algorithm-level approaches are used, which focus on
enhancements of different classification algorithms that in turn pre-process the
imbalanced dataset.
Data-level and algorithm-level approaches can solve the binary class imbalance
problem, whereas these approaches cannot solve the multi-class imbalance problem.
Ensemble techniques, which integrate both data-level and algorithm-level approaches,
can balance multi-class imbalanced datasets. Even though ensemble techniques balance
the dataset, they cannot be applied to streaming data and have scalability issues.
Experimental reports were analysed with brief descriptions, and their performance was
measured based on classification accuracy levels. A few ensemble techniques and their
frameworks were also analysed with their merits and demerits.
Therefore, a novel ensemble technique to balance and pre-process multi-class
imbalanced datasets can be developed to increase the classification accuracy level.
This can be implemented by batch-incremental processing of the datastream with
automated rebalancing technique selection. The balanced datastream can then be given
to a classifier to classify into the appropriate class.

References

1. Q. Yang, X. Wu, 10 challenging problems in data mining research. Int. J. Inf. Technol. Decis.
Mak. 5(04), 597–604 (2006)
2. T. Alam, C.F. Ahmed, S.A. Zahin, M.A.H. Khan, M.T. Islam, An effective recursive technique
for multi-class classification and regression for imbalanced data. IEEE Access 7, 127615–
127630 (2019)
3. H. He, E.A. Garcia, Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9),
1263–1284 (2009)
4. D.R. Devi, S. Sasikala, Feature selection and classification of big data using MapReduce
framework, in International Conference on Intelligent Computing, Information and Control
Systems (Springer, Cham, 2019), pp. 666–673
5. M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, F. Herrera, A review on ensembles for
the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Trans.
Syst. Man Cybernet. Part C Appl. Rev. 42(4), 463–484 (2011)
6. V. Ganganwar, An overview of classification algorithms for imbalanced datasets. Int. J. Emerg.
Technol. Adv. Eng. 2(4), 42–47 (2012)
7. J. Tanha, Y. Abdi, N. Samadi, N. Razzaghi, M. Asadpour, Boosting methods for multi-class
imbalanced data classification: an experimental review. J. Big Data 7(1), 1–47 (2020)
8. M.M. Rahman, D.N. Davis, Addressing the class imbalance problem in medical datasets. Int.
J. Mach. Learn. Comput. 3(2), 224 (2013)
9. S.H. Park, Y.G. Ha, Large imbalance data classification based on mapreduce for traffic accident
prediction, in 2014 Eighth International Conference on Innovative Mobile and Internet Services
in Ubiquitous Computing (IEEE, 2014), pp. 45–49
10. N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE: synthetic minority over-
sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
11. R.C. Prati, G.E. Batista, M.C. Monard, Class imbalances versus class overlapping: an analysis
of a learning system behavior, in Mexican International Conference on Artificial Intelligence
(Springer, Berlin, Heidelberg, 2004), pp. 312–321
12. Y. Zhai, N. Ma, D. Ruan, B. An, An effective over-sampling method for imbalanced data sets
classification. Chin. J. Electron. 20(3), 489–494 (2011)
13. A.S. Hussein, T. Li, C.W. Yohannese, K. Bashir, A-SMOTE: a new preprocessing approach
for highly imbalanced datasets by improving SMOTE. Int. J. Comput. Intell. Syst. 12(2),
1412–1422 (2019)
14. L. Cao, H. Shen, Imbalanced data classification based on hybrid resampling and twin support
vector machine. Comput. Sci. Inf. Syst. 14(3), 579–595 (2017)
15. N. Junsomboon, T. Phienthrakul, Combining over-sampling and under-sampling techniques for
imbalance dataset, in Proceedings of the 9th International Conference on Machine Learning
and Computing (2017), pp. 243–247
16. S. Ertekin, J. Huang, L. Bottou, L. Giles, Learning on the border: active learning in imbal-
anced data classification, in Proceedings of the Sixteenth ACM Conference on Conference on
Information and Knowledge Management (2007), pp. 127–136
17. S. Belarouci, M.A. Chikh, Medical imbalanced data classification. Adv. Sci. Technol. Eng.
Syst. J. 2(3), 116–124 (2017)
18. J. Hamidzadeh, N. Kashefi, M. Moradi, Combined weighted multi-objective optimizer for
instance reduction in two-class imbalanced data problem. Eng. Appl. Artif. Intell. 90, 103500
(2020)
19. M.A. Febriantono, S.H. Pramono, R. Rahmadwati, G. Naghdy, Classification of multiclass
imbalanced data using cost-sensitive decision tree C5.0. IAES Int. J. Artif. Intell. 9(1), 65
(2020)
20. Y. Song, J. Zhang, H. Yan, Q. Li, Multi-class imbalanced learning with one-versus-one decom-
position: an empirical study, in International Conference on Cloud Computing and Security
(Springer, Cham, 2018), pp. 617–628

21. S. Piri, D. Delen, T. Liu, A synthetic informative minority over-sampling (SIMO) algorithm
leveraging support vector machine to enhance learning from imbalanced datasets. Decis.
Support Syst. 106, 15–29 (2018)
22. E.M. Hassib, A.I. El-Desouky, L.M. Labib, E.S.M. El-Kenawy, WOA+ BRNN: an imbalanced
big data classification framework using Whale optimization and deep neural network. Soft
Comput. 24(8), 5573–5592 (2020)
23. C.F. Tsai, W.C. Lin, Y.H. Hu, G.T. Yao, Under-sampling class imbalanced datasets by
combining clustering analysis and instance selection. Inf. Sci. 477, 47–54 (2019)
24. N. Liu, X. Li, E. Qi, M. Xu, L. Li, B. Gao, A novel ensemble learning paradigm for medical
diagnosis with imbalanced data. IEEE Access 8, 171263–171280 (2020)
25. H. Jegierski, S. Saganowski, An “outside the box” solution for imbalanced data classification.
IEEE Access 8, 125191–125209 (2020)
26. M. Koziarski, M. Woźniak, B. Krawczyk, Combined cleaning and resampling algorithm for
multi-class imbalanced data with label noise. Knowl. Based Syst. 204, 106223 (2020)
27. M. Żak, M. Woźniak, Performance analysis of binarization strategies for multi-class imbalanced
data classification, in International Conference on Computational Science (Springer, Cham,
2020), pp. 141–155
28. J. Wei, H. Huang, L. Yao, Y. Hu, Q. Fan, D. Huang, New imbalanced bearing fault diagnosis
method based on sample-characteristic oversampling technique (SCOTE) and multi-class LS-
SVM. Appl. Soft Comput. 101, 107043 (2021)
29. X. Gao, Y. He, M. Zhang, X. Diao, X. Jing, B. Ren, W. Ji, A multiclass classification using
one-versus-all approach with the differential partition sampling ensemble. Eng. Appl. Artif.
Intell. 97, 104034 (2021)
30. W.C. Sleeman IV, B. Krawczyk, Multi-class imbalanced big data classification on Spark.
Knowl. Based Syst. 212, 106598 (2021)
Efficient Recommender System for Kid’s
Hobby Using Machine Learning

Sonali Sagarmal Lunawat, Abduttayyeb Rampurawala, Sneha Pujari,


Siddhi Thawal, Jui Pangare, Chetana Thorat, and Bhushan Munot

Abstract A recommendation system plays a vital role in the social media
world and is used nowadays in many applications which provide users with recom-
mendations based on their preferences. With the enormous growth in the volume of
information online these days, recommender systems are a helpful tool to overcome
the information overload faced by users. The utilization of recommender systems
increases day by day as users face many choice-based challenges. There are many
types of recommendation systems with different applications, and they have been
successfully adopted in domains including healthcare, marketing, agriculture, media,
and many more. This paper presents an application for parents in which a parent fills
in their preferences and, based on the support vector machine algorithm of machine
learning, the system recommends a hobby suitable for their kid. The system is imple-
mented using the support vector machine algorithm, where classification is done by
finding the hyperplane that differentiates the classes. We have considered the distances
between the nearest data points, which helps to decide the right hyperplane. This paper
provides an overview of recommender systems, their types and different techniques,
the proposed system, and the conclusions drawn. Different algorithms used for
recommendation systems are discussed in the paper to check their efficiency.

Keywords Recommendation systems · Machine learning · Support vector


machine · Classification · Algorithms

1 Introduction

Nowadays, many applications are built for recommendation based on user prefer-
ences. In this paper, we look at the types of recommendation systems, machine
learning, and the support vector machine algorithm.

S. S. Lunawat (B) · A. Rampurawala · S. Pujari · S. Thawal · J. Pangare · C. Thorat · B. Munot


Computer Engineering Department, Pimpri Chinchwad College of Engineering and Research,
Ravet, Pune, Maharashtra, India
e-mail: sonali.lunawat@pccoer.in


1.1 Recommendation System

Recommendation systems [1] are widely used nowadays and have been used to develop
many applications. There are three types of recommendation systems, as shown in Fig. 1.

1.1.1 Collaborative Filtering

Collaborative filtering is a method for recommender systems which uses past infor-
mation about users and items. The inputs are historical data of user interactions with
items [3]. The representation used is the matrix form shown in Table 1.

1.1.2 Content-Based Systems

Unlike collaborative filtering, the content-based method uses additional informa-
tion about the users or items to make predictions.

1.1.3 Hybrid Approach

Recommender system techniques such as collaborative filtering and content-based
filtering have their own unique strengths and limitations. A hybrid approach can
combine any of the traditional recommender techniques.

Fig. 1 Types of recommendation systems [2]: collaborative filtering, content-based and hybrid

Table 1 Collaborative filtering user–item matrix

     I1   I2   I3   … In
U1
U2
…
Un

1.2 Challenges in Recommendation Systems

1. Measuring performance under the changing demands of an organization.
2. The key analytical measure is user satisfaction.
3. It is impossible to compute user satisfaction with a single heuristic formula.
4. Achieving accuracy, scalability, and diversity.

2 Machine Learning

Machine learning (ML) [1] is a paradigm which allows computers to acquire
knowledge from the real world, improving performance by training with new
knowledge. Nowadays, ML algorithms are widely used in domains like business,
medicine, etc. Learning is the acquisition of knowledge through experience [4].
Machine learning algorithms are classified based on learning as below:

Supervised Learning and Unsupervised Learning

Supervised learning happens when the algorithm has labeled classes for classification:
the algorithm is applied to a training dataset to build a model, which is then applied
to test data to generate the classifier.
In unsupervised learning, algorithms do not have a training set; they focus on
finding hidden patterns in the data. ML has become quite popular with advance-
ments in processor speed and memory size, and a large amount of research is being
carried out, generating scientific publications and new applications. Using ML,
mathematical or statistical analyses can be learned and visualized to draw
conclusions.

3 Support Vector Machine

Support vector machine (SVM) has attracted many researchers to work on different
applications. SVM is a supervised machine learning algorithm [5]. SVM, as shown
in Fig. 2, is used for classification and sometimes also for regression. In the SVM
algorithm, each data item is plotted as a point in a space whose dimensions correspond
to the features, with the value of each feature as a coordinate. Classification is then
performed by finding the separating hyperplane.
As shown in Fig. 3, SVM relies on support vectors, which are the data points
nearest to the hyperplane. The hyperplane is a linear boundary that separates the
different classes, chosen so as to build confidence that the classification is exact.
The paper is arranged as follows. Section 1 gives an introduction to the types of
recommender systems, Sect. 2 explains machine learning, Sect. 3 covers the support
vector machine, Sect. 4 reviews related work, Sect. 5 covers the problem formulation,
Sect. 6 covers the proposed system, Sect. 7 covers experimental results, and Sect. 8
presents the conclusion.

Fig. 2 Support vector machine (SVM)

Fig. 3 SVM graphical representation [6]

4 Related Work

SVM searches for this line by measuring the distances between the extreme points
(support vectors) of the clusters of our data points that are equidistant from the candidate
line, and this distance needs to be maximized to find our hyperplane. SVM is used
because its approach to classifying categories is different: as mentioned above, the
extreme points are the least typical points of their cluster, i.e., boundary

Table 2 Comparison of all the related work

Author | Proposed system | Outcomes
Bhojne et al. [7] | Decision support system for restaurant selection by using a machine learning algorithm (Naive Bayesian classification technique) | The proposed system extracts the reviews of a restaurant using customer inputs and classifies the reviews as good and bad; the system also keeps track of the performance of restaurants
Kothari and Patel [8] | A recommendation model using a support vector machine to improve the accuracy of the recommendations and provide accurate predictions to users | They used linear SVM as the classifier, which gives accurate predictions and also improves accuracy
Fayyaz et al. [2] | Summarized various areas and applications of the recommender system | The domain for work can be selected based on different applications
Furtado and Singh [9] | Suggestion to watch a movie based on recommendation | Obtains explicit outcomes

points of that clusters, hence are nearer to both the clusters of other categories. For
e.g., the data points in the arts category will have an extreme point which could be
nearer to the cluster of sports category hence the same for other categories in the
dataset Naive Bayes treats the data independently, i.e., it takes into account more
of a probabilistic view considering the features (variables) of a particular category,
hence any future observation will get classified based on features and calculating the
marginal likelihood of the point to be classified. So the posterior probability for the
point will be calculated with respect to each of the other categories in the dataset,
and comparing these probabilities with each other resulting in classification of that
point into one of the categories (Table 2).
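To make this posterior comparison concrete, here is a minimal sketch (a toy setup with made-up data, not the paper's code) of Gaussian Naive Bayes classifying a point by comparing its posterior probability under each hobby category:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical encoded survey responses (rows) for three hobby classes.
X = np.array([[5, 1, 2], [6, 1, 1], [2, 5, 1], [1, 6, 2], [2, 1, 6], [1, 2, 5]])
y = np.array(["academics", "academics", "sports", "sports", "arts", "arts"])

nb = GaussianNB().fit(X, y)
point = np.array([[2, 4, 2]])
# Posterior probability of the point with respect to each category;
# the class with the largest posterior wins.
print(dict(zip(nb.classes_, nb.predict_proba(point)[0])))
print(nb.predict(point))  # -> ['sports']
```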
Most recommender systems must consider many factors to build up the accuracy of the system. Very few works have demonstrated high accuracy, and researchers are constantly working to improve such systems [10]. The proposed system uses SVM, a standard machine learning technique. SVM separates the data into classifier classes using a hyperplane, and the user's preferences are classified according to the training set. Finally, the model is demonstrated on different preferences to establish its high accuracy [9].

5 Problem Formulation

A major problem nowadays is that parents, comparing their children with other kids, force them to learn or take classes in subjects the kids are not interested in. As a result, without interest, the kids cannot prove themselves. To solve this problem, and to give parents a clear idea of how to select a hobby for their kids, a system is proposed that recommends accurate hobbies to parents based on the preferences given.

6 Proposed System

The aim is to build a system that takes the user's preferences, understands them, and provides an appropriate recommendation on the basis of those preferences.
The proposed system has the following key objectives:
1. To suggest suitable hobbies for kids.
2. To provide a simple graphical user interface that anyone can use.
3. To classify preferences into the categories academics, sports, and arts.
The proposed system takes preferences through general questions which help us obtain the resultant hobby.
The following steps are used:
Step 1: Log in to the website https://hobby-recommender.herokuapp.com/.
Step 2: Give answers based on preferences.
Step 3: The system feeds the answers to the SVM algorithm.
Step 4: The algorithm creates a hyperplane to classify the input.
Step 5: Based on the built model, the predicted hobby is returned to the user.
For the proposed system, we have created a website that asks generalized questions based on the dataset features in Table 3; from the responses and the chosen classification model, a hobby is recommended to the user.

Table 3 Details of features considered in the proposed system

| Feature | Meaning |
|---|---|
| Olympiad_Participation | Has your child participated in any Science/Maths Olympiad? |
| Scholarship | Has he/she received any scholarship? |
| School | Loves going to school? |
| Fav_sub | What is his/her favorite subject? |
| Projects | Has he/she done any academic projects before? |
| Grasp_pow | His/her grasping power (1–6) |
| Time_sprt | How much time does he/she spend playing outdoor/indoor games? |
| Medals | Medals won in sports? |
| Career_sprt | Wants to pursue his/her career in sports? |
| Act_sprt | Regular in his/her sports activities? |
| Fant_arts | Loves creating fantasy paintings? |
| Won_arts | Won art competitions? |
| Time_art | Time utilized in arts? |

Fig. 4 Flow graph of the proposed system

After collecting data from users through a survey form and performing data preprocessing, we had a set of observations belonging to three classes: arts, academics, and sports. The goal was to find a decision boundary (optimal line of separation) between them so that any future observation can be classified into one of these categories.
We have built a website for the proposed system. Our dataset has 14 different features based on general questions. The website takes preferences from parents or users and recommends a suitable hobby; the questions asked are simple and designed around the candidate classes. Figure 4 shows the flow graph of the proposed system: on our website, the user answers the questions asked; the preferences serve as input to the SVM algorithm, which creates a hyperplane; the hyperplane, by maximizing the margin, classifies the input into one of the hobby classes; and the output is a recommendation based on the preferences.
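A minimal sketch of this flow follows; the file name, column names, and encoding choices are hypothetical assumptions, not the paper's actual artifacts:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

df = pd.read_csv("hobby_survey.csv")        # hypothetical file of survey answers
X = df.drop(columns=["Predicted_Hobby"])    # the 14 feature columns of Table 3
y = df["Predicted_Hobby"]                   # labels: academics / sports / arts

# Encode categorical answers (e.g., Yes/No, favorite subject) as integers.
X = X.apply(lambda col: LabelEncoder().fit_transform(col.astype(str)))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
model = SVC(kernel="linear").fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))

# A new user's encoded answers are classified into a hobby class.
print(model.predict(X_te.iloc[[0]]))
```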

7 Results

Figure 5a, b shows the results of the proposed system.

Fig. 5 a Screenshot of question for giving preferences. b Screenshot of prediction class

Receiver operating characteristic (ROC) analysis of the system uses precision and recall, as shown in Fig. 6. ROC analysis aims to retain only the relevant features by removing non-relevant ones. The curve obtained in our experiment maximizes recall, also called the true positive rate, plotted against the false positive rate. ROC curves are useful for deciding whether a data item should be recommended or not.
A comparison of the two algorithms is as follows: since our dataset has inter-relations between the features, SVM was more likely to give better accuracy than Naive Bayes.

Metric Scores

| Metric | SVM | Naive Bayes |
|---|---|---|
| Accuracy score | 0.8878504672897196 | 0.8691588785046729 |
| Recall score | 0.8878504672897196 | 0.8691588785046729 |
| Precision score | 0.8878622230059366 | 0.8710215880666418 |
| F1 score | 0.887844740217096 | 0.8697022374398617 |
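For reference, the four scores above can be computed as in the following sketch, assuming held-out labels y_true and predictions y_pred from either classifier; weighted averaging is an assumption on our part for the three-class setting, though it is consistent with the table, where the weighted recall coincides with the accuracy:

```python
from sklearn.metrics import (accuracy_score, recall_score,
                             precision_score, f1_score)

def report(y_true, y_pred):
    """Compute the four scores of the table for a multi-class problem."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, average="weighted"),
        "precision": precision_score(y_true, y_pred, average="weighted"),
        "f1": f1_score(y_true, y_pred, average="weighted"),
    }

# Example usage: report(y_te, model.predict(X_te))
```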

8 Conclusion

Recommender systems (RS) have encouraged researchers to develop applications in e-commerce, social media, transportation, entertainment, and many other domains. Machine learning, a branch of Artificial Intelligence (AI), is a booming research field whose goal is to predict outcomes. However, researchers need to know the domain in which to build a recommender system. This study has proposed a strategy that uses the machine learning algorithm SVM as a classifier. The system has been found to be efficient, as the achieved results show. In recent research, recommendation systems have gained the attention of researchers in this challenging area. With this in-depth literature survey, together with the results given by our system, we show that SVM improves the accuracy of the recommendations. Furthermore, we conclude that users' preferences are very important. Our future research will combine recommender systems with trending technologies such as IoT, deep learning, blockchain, etc.

Fig. 6 ROC curve for proposed model

References

1. I. Portugal, P. Alencar, D. Cowan, The use of machine learning algorithms in recommender systems: a systematic review, in Expert Systems with Applications (Elsevier, 2018)
2. Z. Fayyaz, M. Ebrahimian, D. Nawara, A. Ibrahim, R. Kashef, Recommendation systems: algorithms, challenges, metrics, and business opportunities. Applied Sciences
3. M. Viswa Murali, T.G. Vishnu, N. Victor, A collaborative filtering based recommender system
for suggesting new trends in any domain of research, in 2019 5th International Conference on
Advanced Computing and Communication Systems (ICACCS).
4. P. Valdiviezo-Diaz, F. Ortega, E. Cobos, R. Lara-Cabrera, A collaborative filtering approach
based on Naïve Bayes classifier. IEEE Xplore
5. S. Ghosh, A. Dasgupta, A. Swetapadma, Study on support vector machine based linear and non-
linear pattern classification, in International Conference on Intelligent Sustainable Systems
(ICISS 2019). IEEE Xplore Part Number: CFP19M19-ART. ISBN: 978-1-5386-7799-5
6. https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-exa
mple-code/
7. N.G. Bhojne, S. Deore, R. Jagtap, G. Jain, C. Kalal, Collaborative approach based restaurant,
recommender system using Naive Bayes. Int. J. Adv. Res. Comput. Commun. Eng. 6(4) (2017)
8. A.A. Kothari, W.D. Patel, A novel approach towards context based recommendations using
support vector machine methodology, in 3rd International Conference on Recent Trends in
Computing 2015 (ICRTC-2015)
9. F. Furtado, A. Singh, Movie recommendation system using machine learning. Int. J. Res. Ind.
Eng. 9(1), 84–98 (2020)
10. S.P. Sahu, A. Nautiyal, M. Prasad, Machine learning algorithms for recommender system—a
comparative analysis. Int. J. Comput. Appl. Technol. Res. 6(2), 97–100 (2017). ISSN: 2319-
8656
Efficient Route Planning Supports Road
Cache

Siddi Madhusudhan Rao and M. Vijayakamal

Abstract Due to widespread access to global positioning systems and automated road mapping on many smart devices, road network navigation services have become a core application. Path planning, a basic feature of road network navigation systems, identifies a path between a given start location and a destination. Because of complex situations such as abrupt changes in moving direction, unpredictable traffic conditions, missing or unstable GPS signals, and so on, the efficiency of this route planning function is vital, and the route planning service must be delivered promptly in these situations. In this article, we suggest a system, namely path planning by caching (PPC), for answering a new route planning query in real time by caching and reusing historically queried routes. Unlike standard cache-based route planning schemes, where a cached path is used only when it fits the current request exactly, PPC uses partially matching queried paths to answer part(s) of a new query. Consequently, only the unmatched path fragments are computed, and the total workload of the system can be reduced considerably. Extensive experiments on an actual road network database demonstrate that our method outperforms state-of-the-art route planning strategies, reducing computation latency by 32% on average.

Keywords Spatial database · Path planning · Cache

1 Introduction

On-road path planning in mobile navigation services is a key feature that finds a route from a queried place to a destination. A route planning query may be issued in different situations due to unpredictable factors, for example, abrupt changes in the direction of travel, unpredictable traffic patterns, or the loss of GPS signals. Route planning must be carried out in a timely manner in these situations [1]. When a huge number of route planning requests are sent to the server, e.g., during peak time periods, the need for timeliness is even more challenging.


Fig. 1 System design

Since the response time is crucial for user satisfaction with personal navigation systems, the server must handle high-workload route planning demands effectively. To address this requirement, we propose a scheme, path planning by caching (PPC), that efficiently answers a new path planning query by caching and reusing historically queried paths (queried paths for short). Unlike standard cache-based route planning schemes, which return a stored query result only when it fully matches a new query, PPC uses partially matching queried paths in the cache to answer part(s) of a new query. As a result, only the unmatched path segments must be computed, and the system workload is greatly reduced [2]. We propose PPC in order to efficiently answer a new path planning query using cached paths, thereby avoiding time-consuming shortest path computations, as described in Fig. 1. The system architecture includes three key components, shown in rectangular boxes. In contrast with a traditional route planning scheme (without using a cache), we save up to 32% of the computation on average. We introduce the concept of a PPattern: a cached route that shares segments with other routes. PPC supports partial hits between a new query and the PPatterns; our tests show that up to 92.14% of all cache hits on average are partial hits. A new probabilistic model is proposed to detect cached paths that are highly likely to be a PPattern for the new query, based on road network consistency. Our tests show that these PPatterns save 31.69% of path-node retrievals on average, roughly tenfold more than the 3.04% saving gained by exact hits. Considering users' choices among roads of different kinds, we have also created a new cache replacement mechanism [3]. For each query, a metric is allocated that reflects both the type of road and the popularity of the query. The findings reveal that our cache replacement strategy raises the cache hit ratio by 25.02% over state-of-the-art cache replacement policies.

2 Problem Statement

Route planning must be carried out promptly. When a huge number of route planning requests are sent to the server, e.g., during peak time periods, the need for timeliness is even more challenging. Since the response time is crucial for user satisfaction with personal navigation systems, the server must handle high-workload route planning demands effectively. For structuring a vast road network model, Jung and Pramanik give the HiTi graph model. HiTi is designed to reduce the search space for computing the shortest path; it handles road weight updates efficiently and eliminates storage overhead [4]. However, in the computation of the shortest paths, its costs are greater than those of HEPV and Hub Indexing. Demiryurek et al. suggest the B-TDFP algorithm, which uses backward searches to decrease the search space for time-dependent fast routes. The plan uses a road hierarchy to balance each city and adopts an area-level partitioning system. A cached query is returned only when it completely matches a new query, so the time complexity is high. Moreover, the cache content may not be up to date with respect to recent developments in the posted queries, and the cost of cache construction is huge, since the system must compute the benefit values of all sub-paths of a complete query-result path.

3 Proposed Methodologies

We suggest a new probabilistic model in the PPattern detection component to estimate the probability that a cached queried path will be useful for the new request by examining its geospatial characteristics. We build a grid-based index for the PPattern detection module in order to quickly detect PPatterns instead of exhaustively searching all queried paths in the cache [5]. The shortest path estimation module builds candidate paths for a new query and selects the correct (shortest) path based on these detected PPatterns. If a PPattern matches the query exactly, it is returned to the user immediately; otherwise, the path segments between the PPattern and the query must be computed on the server. Since the unmatched fragments are normally only part of the initial query, the server processes only a smaller subquery, with a correspondingly reduced workload. After the approximate route is returned to the user, the cache management module is invoked to re-evaluate the cached queried paths if the cache is full. A new cache replacement strategy that takes into account the special nature of road networks is an important aspect of this module. In this paper, we thus include a new framework to reuse previously cached query results and an efficient algorithm to improve query evaluation on the server. To answer part(s) of a new query, PPC exploits partially matching queried paths in the cache [6, 7]. As a result, only the unmatched path segments must be computed, and the system workload is greatly reduced. In contrast to traditional route planning systems (without using a cache), we save up to 32% of the computation on average. We implement the PPattern concept of a cached route that shares segments with other routes; PPC supports partial hits between a new query and the PPatterns, and our tests show that up to 92.14% of cache hits on average are partial hits.
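To make the partial-reuse idea concrete, the following is a minimal sketch (not the authors' implementation): a path cache that answers a query from the longest cached sub-path between the query's endpoints, so that an exact hit returns the whole path, a partial hit reuses a PPattern's sub-path, and only otherwise does the server run a full shortest-path computation. A real PPC system additionally stitches cached segments to server-computed fragments when only part of the query matches:

```python
from collections import OrderedDict

class PathCache:
    """Illustrative PPC-style cache: serve a query from a partially
    matching cached path, so only unmatched fragments reach the server."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.paths = OrderedDict()   # (origin, destination) -> node list

    def put(self, origin, destination, path):
        self.paths[(origin, destination)] = path
        if len(self.paths) > self.capacity:
            self.paths.popitem(last=False)   # simple LRU-style eviction

    def lookup(self, origin, destination):
        """Return a reusable (sub-)path, or None if the server must compute.

        Exact hit: the query matches a cached (origin, destination) pair.
        Partial hit: both endpoints lie on some cached path (a PPattern),
        so the sub-path between them is reused without recomputation.
        """
        hit = self.paths.get((origin, destination))
        if hit is not None:
            return hit
        for path in self.paths.values():
            if origin in path and destination in path:
                i, j = path.index(origin), path.index(destination)
                if i < j:
                    return path[i:j + 1]
        return None   # full shortest-path computation is required

cache = PathCache()
cache.put("a", "e", ["a", "b", "c", "d", "e"])
print(cache.lookup("b", "d"))   # partial hit -> ['b', 'c', 'd']
```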

4 Enhanced System

The admin must log in with the correct username and password. After a successful login, the admin can perform operations such as viewing and authorizing users and adding places to the data; list all added sites and their documents, ranked using Dijkstra's algorithm, together with pictures and distances; see all cache links for all cached sites; see all transactions and the cache access time delays; view the cache link chart score; and view the rank of every place in the chart. The server displays the details of all registered users, for example username, address, e-mail ID, and mobile number, and permits them to log in. The administrator adds places with information such as the location, place title, place description, place uses, photographs, the location document, and the distance of the place from a named center point. The administrator can also see every cache connection, i.e., the keywords used more than once by users in searches, and the rank of the cache links along with the search count (the number of times each keyword is searched from the cache). On the searched sites, the current user can see all other users' comments; the comment information includes the commenter's name, the reply, and the reply date (Fig. 1).
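Since the site ranking above relies on shortest-path distances, a compact reference sketch of Dijkstra's algorithm [2] follows; this is illustrative only, and the toy graph, node names, and weights are hypothetical, not taken from the paper:

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest-path distances on a weighted graph.

    graph: dict mapping node -> list of (neighbor, edge_weight) pairs.
    Returns a dict mapping node -> shortest distance from source.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy road network: distances from a center point.
roads = {
    "center": [("mall", 2.0), ("school", 5.0)],
    "mall": [("school", 1.5), ("park", 4.0)],
    "school": [("park", 1.0)],
    "park": [],
}
print(dijkstra(roads, "center"))  # {'center': 0, 'mall': 2.0, 'school': 3.5, 'park': 4.5}
```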

5 Conclusions

In order to answer a new route planning query quickly by effectively caching and reusing historically queried routes, we suggest a scheme called path planning by caching (PPC). In contrast to traditional cache-based route planning schemes, where a queried path is used only when it exactly matches a new query, PPC uses partially matching cached queries to answer part(s) of a new query. Consequently, only the unmatched segments need to be computed, and the system's total workload is greatly decreased. Detailed experiments on a real road network database demonstrate that our method exceeds state-of-the-art route planning strategies, reducing average computation latency by 32%.

References

1. L. Zammit, M. Attard, K. Scerri, Bayesian hierarchical modelling of traffic flow—with application to Malta's road network, in International IEEE Conference on Intelligent Transportation Systems (2013), pp. 1376–1381


2. E.W. Dijkstra, A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271
(1959)
3. U. Zwick, Exact and approximate distances in graphs—a survey, in Algorithms—ESA 2001, vol.
2161 (2001), pp. 33–48
4. A.V. Goldberg, C. Silverstein, Implementations of Dijkstra’s algorithm based on multi-level
buckets, in Network Optimization
5. P. Hart, N. Nilsson, B. Raphael, A formal basis for the heuristic determination of minimum cost
paths. IEEE Trans. Syst. Sci. Cybernet. 4(2), 100–107 (1967)
6. A.V. Goldberg, H. Kaplan, R.F. Werneck, Reach for A*: efficient point-to-point shortest path
algorithms, in Workshop on Algorithm Engineering and Experiments (2006), pp. 129–143
7. S. Jung, S. Pramanik, An efficient path computation model for hierarchically structured
topographical road maps. IEEE Trans. Knowl. Data Eng. 14(5), 1029–1046 (2002)
Programming Associative Memories

Garimella Ramamurthy, Tata Jagannadha Swamy, and Yaminidhar Reddy

Abstract In this research paper, the problem of spurious memories in the retrieval of stored memories is addressed. By a proper interpretation of associative memory functionality, spurious memories are eliminated in the retrieval process. Programming or synthesis of the Hopfield associative memory based on an innovative approach is proposed.

Keywords Hopfield associative memory · Linear separability · Hadamard matrix · Hamming · Euclidean · Mahalanobis distances

1 Introduction

Biological living systems such as Homo sapiens are endowed with the capability of associating one-dimensional, two-dimensional, or three-dimensional information with stored concepts: name, face, time, etc. This associative memory capability is routinely invoked in daily life. Hopfield attempted, and succeeded in, proposing a model of associative memory for storing and retrieving one-dimensional information. Such a 'memory model' is based on the notion of linear separability [10].
One of the important problems in designing an associative memory is the so-called programming problem, i.e., to be able to store certain desired 1-D/2-D/3-D information as the 'DESIRED MEMORIES.' It was realized by Hopfield that


such an approach to store memory concepts in an associative memory is possible.


But later generations of researchers showed that programming desired memories
induces ‘SPURIOUS MEMORIES,’ and hence, the input stimulus information can
be associated with a wrong memory [8]. This research paper is an effort to overcome
the problem of spurious memory associated with the input stimulus. The approach
proposed in this paper is based on ACCURATE INTERPRETATION of the purpose
of an associative memory such as the Hopfield associative memory (HAM).
This research paper is organized as follows. In Sect. 2, relevant research literature
is reviewed. In Sect. 3, programming Hopfield associative memory (based on linear
separability) is discussed.

2 Review of Relevant Research Literature

Let us consider a Hopfield neural network (HNN)/Hopfield associative memory (HAM) with 'M' neurons connected to each other with symmetric synaptic weights. The neuron model utilized in the HAM is the McCulloch-Pitts model. Such an HNN constitutes a homogeneous nonlinear dynamical system based on an undirected, weighted graph G = (V̄, Ē) (V̄ ... set of vertices, Ē ... set of edges) with symmetric (synaptic) weight matrix W̄. The dynamics of such an HNN is captured by the following modes of operation. The state of the ith neuron (+1 or −1 value) at time 'n', i.e., v_i(n), is updated in the following manner (v_i(n+1) is the state of the ith neuron at time 'n + 1', and t_i is the threshold at the ith neuron):
$$v_i(n+1) = \operatorname{Sign}\Big\{\sum_{j=1}^{M} W_{ij}\, v_j(n) - t_i\Big\} \qquad (1)$$

In the serial mode, the above updation takes place (asynchronously) at only one
neuron (say ‘i’), whereas in the fully parallel mode, the above updation takes place
(synchronously) at all the M nodes simultaneously. In the partial parallel modes of
operation, state updation takes place at more than one node but strictly less than ‘M’
neurons. Thus, denoting by V̄ (n) vector the state of HNN (whose components are
v1 (n), v2 (n),…,vm (n) ({+1, -1} vector), in the fully parallel mode of operation, we
have V̄ (n + 1) = Sign{W̄ V̄ (n) − T̄ }, where T̄ is the threshold vector.
Definition 1 The state vector Z̄ (i.e. +1, -1) is called the ‘stable state’ if and only if

Z̄ = Sign{W̄ Z̄ − T̄ }

Definition 2 The state vectors J̄, K̄ (i.e. {+1, −1} vectors) constitute a cycle of length 2 if and only if
$$\bar{J} = \operatorname{Sign}\{\bar{W}\bar{K} - \bar{T}\}, \qquad \bar{K} = \operatorname{Sign}\{\bar{W}\bar{J} - \bar{T}\}.$$

Note: Stable state can be associated with a cycle of length 1.


The following convergence theorem explains the operation of an HNN as an associative memory.
Convergence Theorem: Given an HNN (i.e., Ḡ = (V̄, Ē)) with a symmetric synaptic weight matrix, all of whose diagonal elements are nonnegative, starting in an initial state V̄(0) (a {+1, −1} vector): (i) in the serial mode of operation, the HNN always converges to a stable state; (ii) in the fully parallel mode, convergence to a stable state takes place or a cycle of length at most 2 is reached. Thus, in the serial mode, starting in an initial state (a corner of the unit hypercube), a stable state is reached that is either 'programmed' or 'spurious.'
Hopfield proposed an outer product rule to store certain desired memories, but researchers showed that Hopfield's synthesis approach introduces exponentially many spurious memories. The author, after carefully studying Hopfield's idea, showed that the synthesis of desired stable states (and also the 'anti-stable' states introduced in [9]) is based on the corners of the unit hypercube that are eigenvectors of the synaptic weight matrix W. Specifically (after introducing the concept of anti-stable states), it is shown that an eigenvector of W which is a corner of the unit hypercube and corresponds to a positive/negative eigenvalue constitutes a stable/anti-stable state when the threshold vector is zero, i.e., with $W\bar{u} = \lambda\bar{u}$,
$$\operatorname{Sign}(W\bar{u}) = \operatorname{Sign}(\lambda\bar{u}) = \bar{u} \quad \text{if } \lambda > 0 \text{ with } \bar{T} \equiv \bar{0},$$
$$\operatorname{Sign}(W\bar{v}) = \operatorname{Sign}(\lambda\bar{v}) = -\bar{v} \quad \text{if } \lambda < 0 \text{ with } \bar{T} \equiv \bar{0}.$$
If v̄ is an anti-stable state, it leads to a cycle of length 2, since Sign{W̄v̄} = −v̄. This synthesis approach was readily generalized to the case of a nonzero threshold vector in [5]. As shown in [8], the organized synthesis procedure proposed by the author still introduces 'spurious stable/anti-stable' states. This research paper is an effort to overcome the problem of the introduction of spurious states (by the synthesis procedure based on eigenvectors of W that are corners of the hypercube). Details are discussed in the following section.

3 Programming Hopfield Associative Memory: Linear Separability

We now take a careful look at the purpose of the 'programming HAM' problem. It is clear that at most 'M' stable/anti-stable states (M is the number of neurons) can be programmed as desired memories. With such a synthesis of W, exponentially many spurious memories (stable/anti-stable states) are introduced. Thus, from some initial condition vectors (i.e., {+1, −1} vectors V̄(0)), spurious stable/anti-stable states are reached. These spurious memories do not correspond to any stored/programmed desired memories.
Main Idea: The innovative idea (to overcome the spurious memory problem) is to associate a spurious memory with the nearest (in Hamming distance) desired memory for the purpose of retrieving a memory state (starting from any initial condition). Thus, on running the HAM with a given initial condition vector, if a spurious stable state is reached, the nearest desired stable state is declared as the corresponding memory state. It is thus clear that recall/retrieval always produces a programmed memory, although the programmed stable/anti-stable state may not be the correct memory state.
We now briefly discuss the synthesis of {+1, −1} vectors that are eigenvectors of W as the desired/programmed stable/anti-stable states. The following facts readily follow. (I) Since the set of eigenvectors of a symmetric matrix must form an orthogonal basis, {+1, −1} orthogonal eigenvectors exist only when 'M' is an even number. Thus, letting the f̄_i be the normalized {+1, −1} eigenvectors, we have that
$$W = \sum_{i=1}^{M} \lambda_i \bar{f}_i \bar{f}_i^{T}$$
(with 'M' an even number). (II) The set of orthogonal {+1, −1} eigenvectors constitutes the columns of an M × M Hadamard matrix; thus, the Hamming distance between the programmed stable states is M/2. (III) In such a synthesis approach, the freedom in the choice of eigenvalues is capitalized on to program desired domains of attraction. For instance, the M real eigenvalues can be symmetrically located about the origin; for example, with M = 4, the eigenvalues can be chosen as {−8, −4, 4, 8}. Such a choice of eigenvalues ensures that the diagonal elements of W are all zero (and hence Trace(W) = 0).
• Associative Memories Based on Linear Separability (e.g., HAM): Uniqueness of our Retrieval Procedure
It should be noted that in the traditional retrieval approach in associative memories, e.g., HAM, a noisy initial condition (vector) could be retrieved as a spurious memory and not as one of the programmed desired memories. In our approach, on converging to a spurious memory (in the serial or fully parallel mode), retrieval is completed by determining the closest (in Hamming distance) programmed memory. Hence, we are guaranteed to associate the initial condition with a desired memory, although the retrieved desired memory may not be the correct one if the noise corrupting the initial vector introduced a large number of errors.
We now summarize some relevant results documented in [3–5] for completeness.
(1) When the dimension 'M' is odd, there are no orthogonal corners of the unit hypercube. Hence, when the number of neurons is odd, at most one corner of the hypercube can be programmed as an eigenvector of W (and hence as a stable/anti-stable state).
(2) The columns of a Hadamard matrix are programmed as stable/anti-stable states when the dimension is even and a multiple of 4. Hadamard conjectured that a Hadamard matrix exists in dimension N if and only if N mod 4 = 0.

4 Numerical Results

We now consider synthesizing a HAM with 4 neurons. Let
$$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$
where $H_2$ is the Hadamard matrix of dimension 2. Using the Sylvester construction, we have
$$H_4 = \begin{bmatrix} H_2 & H_2 \\ H_2 & -H_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

Let the columns of $H_4$ be chosen as the eigenvectors of W, i.e., the f̄_i are the eigenvectors of W:
$$H_4 = [\,\bar{f}_1 \;\vdots\; \bar{f}_2 \;\vdots\; \bar{f}_3 \;\vdots\; \bar{f}_4\,]$$
Since the f̄_i are of $L_2$-norm 2, the spectral representation of W is given by
$$W = \sum_{i=1}^{4} \frac{\lambda_i}{4}\, \bar{f}_i \bar{f}_i^{T}$$
where the λ_i are the eigenvalues. Choosing λ₁ = −8, λ₂ = −4, λ₃ = 4, λ₄ = 8,
$$W = -2\bar{f}_1\bar{f}_1^{T} - \bar{f}_2\bar{f}_2^{T} + \bar{f}_3\bar{f}_3^{T} + 2\bar{f}_4\bar{f}_4^{T}$$

We write V0 = V(0), V1 = V(1), V2 = V(2), and so on.
We have empirically observed that with the above synthesis of a HAM with 4 neurons, no spurious stable states are introduced. It should be noted that Sign(Wū) = ū implies Sign(W(−ū)) = −ū; thus, if ū is a stable state, −ū is also a stable state. A similar inference holds true for cycles of length 2.
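As an illustration of this synthesis, the following minimal sketch (an assumed setup, not the authors' code) builds W from the columns of H4 with the eigenvalues above and verifies which columns are stable or anti-stable states:

```python
import numpy as np

# Columns of the 4x4 Hadamard matrix serve as the eigenvectors of W.
H4 = np.array([[1, 1, 1, 1],
               [1, -1, 1, -1],
               [1, 1, -1, -1],
               [1, -1, -1, 1]], dtype=float)
lambdas = [-8, -4, 4, 8]  # eigenvalues chosen symmetric about the origin

# Spectral synthesis: W = sum_i (lambda_i / 4) f_i f_i^T (columns have L2-norm 2).
W = sum(l / 4 * np.outer(H4[:, i], H4[:, i]) for i, l in enumerate(lambdas))

sign = lambda x: np.where(x >= 0, 1.0, -1.0)
for i, l in enumerate(lambdas):
    f = H4[:, i]
    status = "stable" if np.array_equal(sign(W @ f), f) else "anti-stable"
    print(f"eigenvalue {l:+d}: column {i} is {status}")
# Columns with positive eigenvalues satisfy Sign(Wf) = f (stable states);
# those with negative eigenvalues satisfy Sign(Wf) = -f (anti-stable states).
```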
We briefly provide a generalization of the synthesis of a HAM (i.e., of the synaptic weight matrix W) when the dimension of W (i.e., the number of neurons) is $2^K$ for an integer 'K'. Let $2^K = L$. Using the Sylvester construction, the 'L' eigenvectors of W are chosen as the columns of the Hadamard matrix $H_L$. The eigenvalues (L of them) are located symmetrically about the origin (i.e., L/2 of them are positive and L/2 are negative) and are chosen to be integral multiples of L, i.e.,
$$W = \sum_{i=1}^{L} \lambda_i \bar{f}_i \bar{f}_i^{T}$$
with M = L.

Fig. 1 Dynamics of HAM with all possible initial conditions

Fig. 2 Dynamics of HAM with all possible initial conditions

It is clear that the Hamming distance between the eigenvectors f̄_i (which are corners of the hypercube) is L/2. Figures 1 and 2 represent the dynamics of the HAM for all possible initial conditions.

(A) We conjecture that all corners of the hypercube that are at a Hamming distance less than L/2 from f̄_j lie in the domain of attraction of the desired memory f̄_j.
(B) If a spurious memory is reached (starting from an initial condition), it is retrieved or decoded as the desired or programmed memory which is closest to it in Hamming distance. This step requires at most $M^2$ distance comparisons. Such decoding ensures that the retrieval result is always a programmed memory.

Since any two desired memories are at Hamming distance L/2, if the domains of attraction (like coding spheres) are disjoint, at least $\frac{1}{2}(\frac{L}{2}-1) = \frac{L-2}{4}$ errors can be corrected. We expect the domains of attraction to be disjoint with our synthesis procedure. We are currently attempting a proof of this conjecture (by capitalizing on the freedom in the choice of eigenvalues) [6].
We make the following conjecture: by capitalizing on the freedom in the choice of eigenvalues, the domains of attraction of the desired programmed memories (L of them) can always be ensured to be disjoint (thereby correcting $\frac{1}{2}(\frac{L}{2}-1) = \frac{L-2}{4}$ errors).
It is well known that the Hopfield associative memory (HAM) is based on the McCulloch-Pitts neuron (which uses linear separability as its basis). In [1, 2, 7], we proposed various interesting associative memories based on spherical separability. Essentially, in such associative memories (with the state space being the symmetric unit hypercube), the state updation at any neuron is performed in the following manner:
$$v_i(n+1) = \operatorname{Sign}\{d(\bar{V}(n), \bar{u}_i) - t_i\},$$
where $d(\cdot,\cdot)$ is a suitably chosen distance measure such as the Hamming, Euclidean, or Mahalanobis distance.
Even in such associative memories, spurious memories can be retrieved or decoded as the closest (in Hamming distance) programmed or desired memories.
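As a small illustration of this nearest-neighbor decoding rule, the sketch below (with hypothetical data) maps a converged state to the closest programmed memory in Hamming distance:

```python
import numpy as np

def decode(state, memories):
    """Map a (possibly spurious) converged state to the nearest programmed
    memory in Hamming distance. `memories` is an array of {+1,-1} row vectors."""
    hamming = np.sum(memories != state, axis=1)
    return memories[np.argmin(hamming)]

# Hypothetical example with the H4 columns above as the programmed memories.
memories = np.array([[1, 1, 1, 1], [1, -1, 1, -1],
                     [1, 1, -1, -1], [1, -1, -1, 1]])
noisy = np.array([-1, 1, 1, 1])         # one bit flipped from the first memory
print(decode(noisy, memories))           # -> [1 1 1 1]
```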

5 Conclusions

By a careful interpretation of the purpose of associative memories, spurious memories are mapped to desired memories using a nearest-neighbor approach (in Hamming distance). Thus, retrieval always ensures that a programmed memory is associated with any arbitrary initial condition.

References

1. G. Rama Murthy, G. Yaparla, R.P. Singh, Optimal spherical separability: artificial neural networks, in International Work-Conference on Artificial Neural Networks (Springer, Cham, 2017), pp. 327–338
2. Y. Ganesh, R.P. Singh, G. Rama Murthy, Pattern classification using quadratic neuron: an experimental study, in 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (IEEE, 2017), pp. 1–6
3. S.S. Haykin, Neural Networks and Learning Machines (2009)
4. G. Rama Murthy, M. Gabbouj, On the design of Hopfield neural networks: synthesis of Hopfield type associative memories, in 2015 International Joint Conference on Neural Networks (IJCNN) (IEEE, 2015), pp. 1–8
5. G. Rama Murthy, V. Devaki, Divya, Synthesis or programming of Hopfield associative memory, in Proceedings of the International Conference on Machine Learning and Data Science (ICMLDIS 2019), Dec 2019 (ACM Digital Library)
6. G. Ramamurthy, D. Praveen, Complex-valued neural associative memories on the complex hypercube, in Proceedings of the IEEE Conference on Cybernetics and Intelligent Systems (CIS 2004), Singapore, Dec 2004
7. G. Rama Murthy, S.D. Munugoti, A. Rayala, Novel Ceiling Neuronal Model: Artificial Neural Networks (2015)
8. J. Bruck, V.P. Roychowdhury, On the number of spurious memories in the Hopfield model (neural network). IEEE Trans. Inf. Theory 36(2), 393–397 (1990)
9. G. Rama Murthy, B. Nischal, Hopfield-Amari neural network: minimization of quadratic forms, in The 6th International Conference on Soft Computing and Intelligent Systems, Kobe Convention Center (Kobe Portopia Hotel), 20–24 Nov 2012, Kobe, Japan
10. J.J. Hopfield, Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 79, 2554–2558 (1982)
Novel Associative Memories Based
on Spherical Separability

Garimella Ramamurthy and Tata Jagannadha Swamy

Abstract In this research paper, based on the concept of spherical separability (proposed by the authors), novel associative memories are proposed (with the centering vectors being arbitrary corners of the unit hypercube). Convergence results are established. Also, hybrid associative memories which associate a 1-D stimulus with 2-D memory states (and a 2-D stimulus with 3-D stable states) are proposed.

Keywords Neural networks · Associative memories · Spherical separability · Hopfield networks · Euclidean and Hamming distances

1 Introduction

Biological memories are capable of associating names, faces, etc. with an input stimulus presented to the eyes, ears, etc. In fact, humans invoke associative memory capabilities in an effortless manner. Hopfield attempted, and succeeded in, innovating an artificial neural network (ANN) model of associative memory. Such a model is based on the concept of "linear separability", utilized in the McCulloch-Pitts model of the neuron.
In [1–3], the authors proposed the concept of "spherical separability" of patterns. It is reasoned that linear separability implies spherical separability, but not the other way around. Also, ANNs based on spherical separability were proposed in [4]. As a natural consequence, we innovated a model of associative memory based on spherical separability which has strong connections to clustering approaches (i.e., unsupervised learning approaches). Also, ANNs based on spherical separability are


related to radial basis function neural networks (RBFNNs). In fact, the idea of having "clustering vectors" (utilized in the RBFNN hidden layer) finds its equivalent in ANNs based on spherical separability. In this research paper, we propose a spherical separability-based associative memory which is more general than that reported in [4].
This research paper is organized as follows. In Sect. 2, the relevant research literature is reviewed. In Sect. 3, a novel spherical separability-based associative memory is proposed. In Sect. 4, simulation results confirming the convergence theorem are provided. In Sect. 5, hybrid associative memories based on linear separability are discussed. The research paper concludes in Sect. 6.

2 Review of Related Research Literature

In an effort to understand the logical motivation for the Hopfield neural network, the authors were naturally led to proposing "spherical separability" as a basis for the design of novel associative memories. In some sense, the initial effort borrowed the ideas of clustering algorithms such as the "k-means" algorithm. Also, the relationship between error-correcting codes and ANNs such as the Hopfield associative memory provided the initial logical basis for the conception of the new ideas discussed in this research paper. Furthermore, RBFNNs and their operation consolidated our proposal for spherical separability-based associative memories [5, 6]. In RBFNNs, the distance/metric chosen is the Euclidean distance (between centering vectors and input vectors) [7, 8]. In a well-defined sense, the concepts/ideas proposed in this research paper are not merely incremental but highly innovative. We hope that this humble beginning will lead to many research efforts on the theme pioneered in this research paper.
In [1, 2], the concept of spherical separability was introduced. We briefly explain the concept in the following discussion.
Definition: Patterns belonging to two classes in N-dimensional Euclidean space are said to be spherically separable if there exists an N-dimensional Euclidean hypersphere boundary which separates the two classes.
Definition: Patterns belonging to 'L' classes are spherically separable if every pair of them is spherically separable.
Note: In two dimensions, patterns belonging to two classes are spherically separable if there exists a circle which separates them.
Note: Suppose the patterns lie in a bounded region of N-dimensional Euclidean space. Then it readily follows that if the patterns are linearly separable, they are spherically separable, but not the other way around.

3 Spherical Separability: Novel Associative Memory

Consider a network of M artificial neurons whose state values are +1 or −1. These neurons correspond to the vertices of a graph G = {V, E} and are connected to each other by edges E whose weights (edge weights) are the synaptic weights. The graph is undirected, with the edge weights being symmetric between vertices; in terms of the synaptic weight matrix, we have $W_{ij} = W_{ji}$, i.e., W is a symmetric matrix. Let V̄(n) be the state of the ANN at time index 'n', i.e., V̄ = [v₁ v₂ ··· v_M] with v_i ∈ {+1, −1}. Thus, the state space of the ANN is the symmetric unit hypercube. In this associative memory model, each artificial neuron is associated with a centering vector ū_i which lies on the symmetric unit hypercube. There is no external input to the ANN, and in the simplest possible architecture [4], the state updation takes place in the following manner, based on the initial state vector V̄(0):

$$v_i(n+1) = \operatorname{Sign}\{d_H(\bar{V}(n), \bar{u}_i) - t_i\} \quad \text{for } n \geq 0 \qquad (1)$$

where $d_H(\bar{V}(n), \bar{u}_i)$ is the Hamming distance between the vectors $\bar{V}(n)$ and $\bar{u}_i$ lying on the symmetric unit hypercube. The above updation takes place at a neuronal node 'i'. The following modes of state updation are possible:
• Serial Mode: At any given time 'n + 1', the state updation described in Eq. (1) takes place at exactly one neuron 'i'.
• Fully Parallel Mode: The state updation described in Eq. (1) takes place simultaneously at all M nodes at any given time 'n + 1'.
• Partial Parallel Modes: At any given time 'n + 1', the state updation in Eq. (1) takes place at more than one neuron, but at strictly fewer than 'M' neurons.
Thus, in the fully parallel mode of operation, the state vector becomes
$$\bar{V}(n+1) = \begin{bmatrix} \operatorname{Sign}\{d_H(\bar{V}(n), \bar{u}_1) - t_1\} \\ \operatorname{Sign}\{d_H(\bar{V}(n), \bar{u}_2) - t_2\} \\ \vdots \\ \operatorname{Sign}\{d_H(\bar{V}(n), \bar{u}_M) - t_M\} \end{bmatrix} \qquad (2)$$

In the state space of the proposed associative memory (based on spherical separability), there are distinguished states called 'stable states'. Once the nonlinear dynamical system reaches a stable state, there is no further change of state. Formally, we have the following definition.
Definition: Z̄ is a stable state if and only if
$$\bar{Z} = \begin{bmatrix} \operatorname{Sign}\{d_H(\bar{Z}, \bar{u}_1) - t_1\} \\ \vdots \\ \operatorname{Sign}\{d_H(\bar{Z}, \bar{u}_M) - t_M\} \end{bmatrix} \qquad (3)$$

Note: The main difference compared to the research reported in [4] is that the centering vectors ū_i at the neurons need not be orthogonal (in [4], only the case where the ū_i are necessarily orthogonal is considered).
Note: The above associative memory is based on spherical separability, unlike the Hopfield associative memory, which is based on linear separability.
Conjecture: Patterns belonging to L classes (L > 2) which are not spherically separable in a lower-dimensional space can be rendered spherically separable in a higher-dimensional space (by a suitable projection approach).
The following theorem summarizes the dynamics of the nonlinear dynamical system defined by Eq. (1).
Theorem: The dynamical system based on Eq. (1) always converges to a stable state in the serial mode, whereas in the fully parallel mode, either convergence to a stable state occurs or at most a cycle of length 2 is reached.
Proof: The theorem follows from the argument utilized in Theorem 6 of Ref. [9]. Details are omitted for brevity.
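For concreteness, a minimal simulation sketch of the update rule in Eq. (1) follows; the centering vectors, thresholds, and initial state here are hypothetical, and Sign(0) is taken as +1 by assumption:

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two {+1,-1} vectors."""
    return int(np.sum(a != b))

def parallel_update(state, centers, thresholds):
    """One fully parallel step of Eq. (1): each neuron i compares the Hamming
    distance from the current state to its centering vector u_i with t_i."""
    return np.array([1 if hamming(state, u) - t >= 0 else -1
                     for u, t in zip(centers, thresholds)])

# Hypothetical network of M = 4 neurons.
centers = np.array([[1, 1, 1, 1], [1, -1, 1, -1],
                    [1, 1, -1, -1], [1, -1, -1, 1]])  # centering vectors
thresholds = np.array([2, 2, 2, 2])
state = np.array([1, 1, -1, 1])   # initial condition V(0)
for _ in range(10):                # per the theorem, the fully parallel mode
    nxt = parallel_update(state, centers, thresholds)
    if np.array_equal(nxt, state): # either reaches a stable state ...
        break
    state = nxt                    # ... or settles into a cycle of length 2
print(state)
```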

4 Simulation Results: H4, H8

Figures 1 and 2 illustrate the fully parallel mode of operation with 4 and 8 neurons, respectively, while Figs. 3 and 4 illustrate the serial mode of operation with 4 and 8 neurons.

5 Hybrid Associative Memories Based on Linear Separability

Biological memories can effortlessly associate a one-dimensional signal (e.g., a speech waveform) with a two-dimensional memory state (e.g., an image). The authors are motivated to emulate such biological associative memories in artificial neural networks (ANNs). In this research effort, a one-dimensional input stimulus (a {+1, −1} vector) is associated with a two-dimensional memory state (a {+1, −1} matrix). One possible ANN architecture which can accomplish such an associative memory capability is the following:
$$\tilde{V}(n+1) = \operatorname{Sign}(\bar{W}\tilde{V}(n) - \bar{T})$$
where $\tilde{V}(0) = [\bar{V}(0) \;\vdots\; \bar{V}(0) \;\vdots\; \cdots \;\vdots\; \bar{V}(0)]$, i.e., the stimulus column vector V̄(0) replicated in every column. Here $\tilde{V}(n+1)$, the state of the hybrid associative memory at time 'n + 1', is a matrix of +1's and −1's, T̄ is a matrix of threshold values, and W̄ is the synaptic weight matrix. Let us label such an associative memory as SAM-1.

Fig. 1 Illustration of fully parallel mode of operation with 4 neurons

Fig. 2 Illustration of fully parallel mode of operation with 8 neurons

In the spirit of the above idea, we now associate a two-dimensional stimulus, i.e., a two-dimensional input signal, with a three-dimensional memory state. The architecture of such an associative memory involves stacking SAM-1 ANNs, as depicted in Fig. 5.
At each level of the stack, we implement state updation in the following manner:
$$\tilde{V}_k(n+1) = \operatorname{Sign}(\bar{W}_k \tilde{V}_k(n) - \tilde{T}_k),$$
where $\bar{W}_k$ is the synaptic weight matrix at the kth level of the stack. Also, $\tilde{V}_k(0)$ is the two-dimensional stimulus signal (i.e., an initial condition matrix of +1's and −1's), and $\tilde{T}_k$ is

Fig. 3 Illustration of serial mode of operation with 4 neurons

Fig. 4 Illustration of serial mode of operation with 8 neurons

Fig. 5 SAM matrices flow representation



the threshold matrix at level 'k'. Let us label such a hybrid associative memory as SAM-2.
Note: Using the convergence theorem for the Hopfield associative memory, both SAM-1 and SAM-2 reach stable states (memories) in the serial mode.
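A minimal sketch of the SAM-1 update above follows (assumed shapes and data; per the stated equation, the 1-D stimulus is replicated into the columns of the state matrix and iterated):

```python
import numpy as np

sign = lambda X: np.where(X >= 0, 1, -1)

def sam1(W, T, v0, n_cols, max_iters=50):
    """SAM-1 sketch: replicate the 1-D stimulus v0 into n_cols columns and
    iterate V <- Sign(W V - T) until a fixed point (or the iteration cap;
    fully parallel updates may also settle into a 2-cycle)."""
    V = np.tile(v0.reshape(-1, 1), (1, n_cols))  # V(0): stimulus in every column
    for _ in range(max_iters):
        nxt = sign(W @ V - T)
        if np.array_equal(nxt, V):
            break
        V = nxt
    return V

# Hypothetical 4-neuron symmetric synaptic matrix and zero thresholds.
W = np.array([[0, 1, -1, 1], [1, 0, 1, -1],
              [-1, 1, 0, 1], [1, -1, 1, 0]])
T = np.zeros((4, 3))
print(sam1(W, T, np.array([1, -1, 1, 1]), n_cols=3))
```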

5.1 Simultaneous Retrieval of Multiple Memory States

The problem of retrieving a spurious memory arises when the initial condition is corrupted by noise (many errors), moving it from the domain of attraction of the associated desired memory state to that of a spurious memory state. In [4], one possible solution to this problem is presented, in which only the desired stable/memory states are retrieved and a spurious memory/stable state is never retrieved. We now present another solution to the problem. This solution depends on the specification of the minimum and maximum number of errors that can be allowed to occur. In this approach, multiple initial conditions are simultaneously presented to the associative memory for retrieval; the set of initial conditions is determined by the minimum and maximum number of errors allowed by the specification.
Thus, in this approach, the initial condition is chosen to be a {+1, −1} matrix, and the associated associative memory is utilized to retrieve all related/associated stable states/memory states.
Specifically, one distinguished choice of initial condition matrix is a Hadamard matrix (which necessarily requires M mod 4 = 0). The initial condition vectors are then orthogonal and are at a Hamming distance of M/2 from one another. The corresponding retrieved memories are utilized to decide the programmed memory that is "most likely" to be the correct choice.

6 Conclusions

In this research paper, based on the concept of spherical separability, an associative


memory is innovated (with arbitrary centering vectors that are not necessarily orthog-
onal). An interesting convergence theorem is discussed along with simulations. Also,
certain hybrid associative memories are discussed.

References

1. G. Ramamurthy, G. Yaparla, R.P. Singh, Optimal spherical separability: artificial neural net-
works, in International Work-Conference on Artificial Neural Networks (Springer, Cham, 2017),
pp. 327–338
2. G. Yaparla, R.P. Singh, R.M. Garimella, Pattern classification using quadratic neuron: an exper-
imental study, in 2017 8th International Conference on Computing, Communication and Net-
working Technologies (ICCCNT) (IEEE, 2017), pp. 1–6
3. S.S. Haykin, Neural Networks and Learning Machines (2009)
4. R. Garimella, J.S. Tata, Spherical Separability: Associative Memories. EasyChair Preprint No.
4990 (2021). https://easychair.org/publications/preprint/6gbR
5. G. Rama Murthy, M. Gabbouj, On the design of Hopfield neural networks: synthesis of Hopfield
type associative memories, in 2015 International Joint Conference on Neural Networks (IJCNN)
(IEEE, 2015), pp. 1–8
6. G. Ramamurthy, V. Devaki, Divya, Synthesis or programming of Hopfield associative mem-
ory, in Proceedings of International Conference on Machine Learning and Data Science
(ICMLDIS2019), Dec 2019 (ACM Digital Library)
7. G. Ramamurthy, D. Praveen, Complex-valued neural associative memories on the complex
hypercube, in Proceedings of the IEEE conference on Cybernetics and Information Systems
(CIS 2004), Singapore, 2004
8. G. Rama Murthy, S.D. Munugoti, A. Rayala, Novel Ceiling Neuronal Model Artificial Neural
Networks (2015)
9. G. Rama Murthy, Resolution of P = NP conjecture, Invited talk at 14th Polish–British Workshop,
June 2014 (Sponsored by DST, Government of India)
An Intelligent Fog-IoT-Based Disease
Diagnosis Healthcare System

Chandan Kumar Roy and Ritesh Sadiwala

Abstract Modern health services are the greatest problem, particularly in developing countries, where remote regions are not supplied with good-quality medicines and hospitals. IoT is a major player in medical treatment, providing people with better clinical services and also facilitating physicians and hospitals. In this paper, we present a novel smart healthcare system based on advanced techniques such as IoT that (i) offers a platform for fog-assisted, IoT-enabled health-related disease diagnostics; (ii) implements a server-side health diagnosis module for the patient diagnosis outcome (PDO); and (iii) monitors the severity of the disease through an alarm-generation mechanism. This system is smart enough, as a clinical decision support system, to detect and analyze patient data. It is a suitable alternative for people living in rural areas: it can determine whether they have a major health problem and guide them to approach nearby hospitals for treatment. We also developed a state-of-the-art IoT process management tool that reports operating states and facilitates improved planning and efficient use of human and physical resources in the healthcare process.

Keywords Internet of things (IoT) · Fog computing · Patient diagnosis outcome ·


Healthcare system

1 Introduction

By using the latest scientific and technological innovations, the health system improves the well-being of specific individuals and the public at large. Proactive medicine normally predicts diseases and abnormalities much earlier than the time the problem actually emerges, thereby preventing deaths and injuries. The cost of constructing an IoT device is much less than the medical and ambulance costs of later years. IoT healthcare technologies will speed up the healthcare market in the next generation, as their potential ranges from clinical surveillance and diagnostic automation to many

potential applications. In creating health information systems, the IoT-based healthcare framework plays an important role [1, 2]. Dependence on IoT is growing every day in order to increase access to treatment, improve healthcare delivery, and eventually reduce care costs. An improved medical infrastructure should cope with early indicators of critical health conditions and favor home-based treatment over expensive hospital care. The health sector is shifting to a value-based approach that provides patients with the highest value. It is about time that staff concentrated solely on patients and spent as little time in front of a monitor as practicable. Telemedicine is hard to adjust to for many doctors and patients, particularly older adults [3–5].
Doctors are very concerned about the poor management of patients. Although medical advancements have made technologies more effective, device failures still occur; there is always a possibility of error, and technology cannot always replace the human touch. Deploying telemedicine solution networks requires an enormous amount of resources. The adoption of a new method demands training, and this transition is often challenging for employees to accept. Practice administrators, clinicians, doctors, and others must understand how to leverage the method and demonstrate its advantages in practice. While telemedicine is costly at the beginning, health systems can serve more patients with fewer employees, contributing to a greater return on investment. Identification, location, sensing, and synchronization, as seen in Fig. 1, are key components of healthcare systems. A broad variety of technologies

Fig. 1 Typical components of IoT-based smart healthcare



is in use for intelligent medical services: ambulances, intelligent computing, cameras, lab-on-chip devices, interactive control, wearables, networking, and big data [6–8]. Besides medical treatment, the IoT can be used for continuous surveillance and connectivity, and to monitor patients throughout a hospital or clinic. IoT-driven non-invasive monitoring [9] can always be used for hospital patients whose physiological condition needs close attention. Sensors collect physiological information from far-flung locations, use gateways and the cloud to analyze and archive the information, and then transmit it remotely to caregivers to support research and review.
In this type of system, sensors collect the data. It eliminates the need for a medical practitioner to come at regular intervals to check the patient's status, delivering instead an ongoing automated information stream. This raises the standard of treatment through continuous attention while reducing care costs by removing the requirement for an individual to actively take part in the collection and processing of data. There are individuals worldwide whose well-being could suffer because they have no ready access to effective health monitoring. However, lightweight, efficient IoT-connected wireless solutions now enable the monitoring to reach these patients, rather than vice versa.
These technologies can securely collect patient health information from
various sensors, interpret the data using sophisticated algorithms, and then exchange
it through wireless communication with medical specialists who can make
health recommendations. Intelligent sensors combining a sensor with a microcon-
troller enable the IoT to calculate, measure, and analyze various health status indicators.
These can include essential indicators of health such as heart rate, blood pressure,
glucose levels, or blood oxygen saturation. By including intelligent sensors in
prescription bottles linked to the network, it is even possible to decide
whether a patient has taken a scheduled dose of medicine. The main objective of
the IoT-based healthcare system is to offer all people, in all parts of the world, better
healthcare that is more patient-centered and cost-effective. Thus, medical
health tracking devices, as outlined in Fig. 1, need to be improved to improve the
effectiveness of patient treatment.
The main contributions and organization of this paper are summarized as follows:
Sect. 2 describes the literature review of IoT-based healthcare systems, Sect. 3
presents the proposed work, Sect. 4 gives the results and discussion, and Sect. 5
concludes the paper.

2 Related Work

A number of studies have dealt with different concerns relating to IoT in health care,
complex IoT healthcare technologies, and IoT healthcare security problems and analysis.
In [10], the authors proposed a system for the tracking of diseases based on cloud-
centric IoT which predicts a condition together with its severity. The authors describe
their primary concepts, showing how data science can be applied to produce
user-oriented health evaluations.

For the implementation case, an architectural prototype for intelligent student health
is designed. They measured the findings particularly during treatment with health
measurements. They then combined comprehensive student health data from the
UCI repository and medical sensors to estimate various diseases among students.
Diagnostic systems are built with different state-of-the-art algorithms, and the outcomes
are evaluated in terms of accuracy, sensitivity, specificity, and F-measure.
In [11], both mental and physical health are addressed using IoT-dependent
sensors worn on or inside the body. In addition, the reactive healthcare infrastruc-
ture can be turned into constructive and preventive healthcare services by leveraging
mobile computing technologies in IoT-based health systems. A smart student m-
healthcare monitoring system based on the IoT cloud is proposed. This system
measures the magnitude of student diseases by estimating the level of the illness
from temporally segmented medical and IoT measures. They developed an
architectural model for the intelligent student health system to efficiently interpret
the student health results. In their case study, 182 students are simulated with a
health dataset to establish applicable waterborne diseases. The data is further
evaluated to verify the model using a k-fold cross-validation approach. They used
pattern-based diagnostics with different classification algorithms, and the outcomes
are calculated in terms of precision, sensitivity, specificity, and response time. The
experimental findings show that the decision tree (C4.5) and k-nearest neighbor
algorithms are more effective with respect to these parameters than other classifiers.
By providing caretakers or doctors with timely details, the suggested solution is
useful for decision-making. Finally, the representation based on temporal granules
yields effective diagnostic outcomes for the proposed scheme.
In [12], the authors propose a cloud-centered energy-efficient method for eval-
uating drought and forecasting its present state. Based on the study of data
variability using the Bartlett test, the architecture specifies the active and sleep
intervals of IoT sensors. With kernel principal component analysis (KPCA) at
the fog layer, the dimensionality of data on drought-causing elements is reduced.
Drought severity is calculated at the cloud level by Naïve Bayes classification, and
drought is estimated with SARIMA models over various periods of time. Exper-
imental and performance studies show the feasibility of the proposed method for
the evaluation and estimation of drought with enhanced drought-causing attributes.
It also shows substantial energy savings in comparison with other systems.
In [13], the authors introduce an IoMT-based diagnostic model for health care
using smart techniques. The healthcare system for cardiomyopathy prediction based
on IoMT is built on a BBO-SVM model, in which the SVM parameters are tuned
with the BBO algorithm. The Statlog Heart Disease dataset is used to validate
the proposed model. The thorough experimental review showed that the proposed
BBO-SVM model had outstanding performance, achieving a maximal precision of
88.33%, a recall of 77.60%, an accuracy of 89.26%, and an F-score of 87.96%.

In [14], the authors adopted a novel systemic method in the area of diabetes,
predicting outcomes for patients suffering from diabetes using the UCI repository
dataset together with associated patient information created by medical sensors.
They also propose a modern classification algorithm, a fuzzy rule-based neural
classifier, for diagnosing the disease and its severity. They performed tests with the
standard UCI repository dataset and actual health reports from different clinics. The
experimental results show that the proposed work improves on existing disease
prediction systems.
In [15], the authors presented a new classification algorithm based on a deep
neural network (DNN), named OGSO-DNN, for distributed healthcare systems. In
this analysis, the cluster heads (CHs) among the IoT devices are chosen with the
oppositional glowworm swarm optimization (OGSO) algorithm [15]. The chosen
CHs then forward data to cloud servers, where a DNN-based classification procedure
is performed. They carried out a simulation and modeling study using data from the
UCI repository and IoT devices, created from the student perspective, to predict the
severity of disease among students. By attaining a mean sensitivity of 96.95%,
specificity of 95.07%, precision of 95.76%, and an F-score of 96.88%, the proposed
OGSO-DNN model outperformed previous models.
There has been considerable prior effort (existing works) to build a platform for
communication between medicine and IT, in particular the IoT. Unfortunately, these
approaches do not apply strong computing concepts. Machine learning, for
instance, is a promising area: if a medical specialist is unavailable, a computing
system can help diagnose the issues of the patient, with algorithms providing this
expertise. Attractive topics applicable here include learning techniques and fuzzy
neural systems. High-level characteristics must also be extracted during signal
processing to train an expert system. The challenge that this study addresses is thus
how an expert system might be designed using machine learning methods in the IoT.
This effort is carried out to monitor, predict, and diagnose major diseases by
developing a cloud- and IoT-oriented healthcare program. We develop a disease
detection system based on IoT and cloud in this study.

3 Materials and Methods

This IoT system will also help to persistently involve and serve patients by allowing
them to spend more time in communication with their physicians. We propose a
more intelligent system for monitoring patient health through intelligent bio-sensors
that capture patient health data in real time. A heartbeat sensor, a blood pressure
sensor, and a DS18B20 temperature sensor were connected to the patient. This
allows the physician to observe the patient from anywhere, and even to attend to the
patient directly without leaving the hospital, as shown in Fig. 2. Within the healthcare
sector, the IoT exploits a long-term history of continuous measurements to identify a
disease. In a healthcare context, diagnosis requires an aggregated collection of
measures for effective outcomes that cannot be achieved with a single clinic visit. IoT

Fig. 2 An overview of IoT-based m-health architecture for disease diagnosing

devices give personal healthcare a means to a healthy and affordable environment.
Thus, a good healthcare system that promotes patient-centered care uses IoT devices.
The study provides a framework for disease diagnosis from the m-health perspective
using fog-assisted IoT. It develops a server-side health diagnostic system to compute
user diagnostic results (UDR) and, finally, reports the severity of the disease via an
alert mechanism.
Patient subsystem
We gather patient health data through a data-gathering framework that enables
smart, miniature, low-energy sensors and other medical instruments to be integrated
seamlessly. We place these sensors in, around, or on the human body to measure
the activity of the body. In our approach, all wearable and embedded sensor devices
form the human body sensor network. These sensors capture physiological
parameters from patients in structured and unstructured forms; we send them to a
coordinator, identified as a local processing unit (LPU). Because heterogeneous IoT
devices have various internal clock structures, we must synchronize them for timely
processing at the cloud layer, as shown in Fig. 2.
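As a toy illustration of this synchronization need, the sketch below normalizes readings stamped by devices with different clock offsets onto a common timeline; the device names and offsets are assumptions for illustration only, not values from the paper.

```python
# Hedged sketch: aligning timestamps from devices with different clock offsets.
from datetime import datetime, timedelta, timezone

# Assumed per-device clock offsets relative to the gateway's UTC clock.
OFFSETS = {"temp_sensor": timedelta(seconds=3),
           "pulse_sensor": timedelta(seconds=-2)}

def normalize(device, local_ts):
    """Convert a device-local timestamp to the gateway's UTC timeline."""
    return local_ts - OFFSETS[device]

ts = datetime(2021, 5, 1, 10, 0, 5, tzinfo=timezone.utc)
print(normalize("temp_sensor", ts))  # 2021-05-01 10:00:02+00:00
```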

In addition, the gateways must be scheduled to achieve temporal convergence across
multiple datasets before transmission, since from the current viewpoint time is a
significant feature. The obtained data is transmitted through wireless communication
such as 4G/GPRS mobile networks to the connected cloud storage repository.
In areas without mobile networks, the system uses a wireless transmission module
comprising LoRa, a gateway, and a server. As a mobile detector, the terminal system
incorporates multi-sensor and LoRa shields to capture and relay datasets to the LoRa
gateway, a transfer station, which transmits the datasets to the net server. With high
quality, low power usage, and long reach, this LoRa-based model transfers data from
the terminal node to the server wirelessly. We build the transmission mechanism on
a broadcast protocol that guarantees that the user (patient) connects to the gateway
wherever he is. We encrypt the channel with secure socket layer (SSL) for
authentication and security of data access during transmission.
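To illustrate the secured upload step just described, the following minimal Python sketch posts one sensor reading to a cloud endpoint over an SSL/TLS-protected channel. The endpoint URL and payload fields are hypothetical, and the widely used requests library stands in for whatever gateway firmware would actually perform the transfer.

```python
# Minimal sketch of the patient-side upload, assuming a hypothetical HTTPS
# endpoint and the third-party `requests` library (pip install requests).
import requests

reading = {
    "patient_id": "P1",        # illustrative identifiers and values
    "temperature_f": 99.4,
    "pulse_bpm": 105,
    "bp_low": 82,
    "bp_high": 125,
}

# requests verifies the server certificate by default, giving the
# SSL-encrypted channel described in the text.
resp = requests.post("https://cloud.example.org/api/readings",
                     json=reading, timeout=10)
resp.raise_for_status()  # fail loudly if the repository rejected the data
```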
IoT-based Cloud layer
In the cloud-based network, each user's balanced sensory IoT data is collected. Because
the data is sensed ubiquitously with various time units, it is first processed as a file
storage cache on a cloud-side server. The well-being measures are then passed to the
health diagnosis scheme, which assesses the person's health status through an
examination and diagnostic process. The diagnostic approach is based on pre-
defined terminology from patient documents, medical expertise, and consultants. In
comparison, the tenant database supplies users' personal details only to registered
physicians and caretakers. The information is collected as a patient diagnosis
outcome (PDO) record comprising the {potential condition, severity,
probability} relation.
In the medical industry, cloud computing is fast becoming a requirement. It can
help transform health care by enabling real-time exchange of patient information
between physicians in critical situations. For instance, a healthcare facility might use
a public cloud infrastructure to provide public access to general health information
or to leverage medical resources. Hospitals and medical clinics might store their own
medical data (not patient data) remotely in a public cloud. The public cloud can
fundamentally provide agility and cost savings to the healthcare sector. Alternatively,
a private cloud can link healthcare professionals to securely transmit electronic
records and exchange patient health information, including data for clinical use.
Alert generation in proposed methodology
User diagnostics data is used to generate alerts to clinicians and caregivers in our
health area. The input data for generating routine and emergency alerts, where DSTp
is the patient's form of disease, is PDO: (PATIENT_i, T_start, T_end) → (DSTp, Level,
Probability). Alerts depend on the state of the user's health and the probabilistic
value developed in our method for disease DSTp. T_start gives the diagnostic starting
time, and the diagnosis continues until the end time T_end. The type of disease is
defined by the DSTp attribute, the disease stage is optional, and the level attribute
determines the disease's severity. First, we extract the

PDO from the diagnostic module for the patient. If the probabilistic value for the
PDO instance is lower than the prefixed threshold, we record the health condition
of the individual as healthy. On the other hand, if the probabilistic value exceeds the
prefixed threshold, then the patient is considered not healthy.
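A minimal sketch of this threshold rule is given below, assuming a PDO represented as a (disease type, level, probability) triple; the threshold value is an assumption, since the paper does not fix a number.

```python
# Hedged sketch of the PDO-based alert rule; THRESHOLD is an assumed value.
THRESHOLD = 0.6

def generate_alert(pdo):
    disease, level, probability = pdo  # (DSTp, Level, Probability)
    if probability < THRESHOLD:
        return f"{disease}: patient recorded as healthy"
    # probability exceeds the prefixed threshold -> notify clinician/caregiver
    return f"ALERT: {disease} suspected (level={level}, p={probability:.2f})"

print(generate_alert(("fever", "high", 0.82)))
```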

4 Results and Discussion

The recorded sensor data is sent to the server. We show the results in the Arduino
application and a web browser. The accuracy of the suggested scheme is calculated
by the following formulas.
Accuracy = (Σ α(xi)) / m    (1)
The accuracy of the proposed scheme is determined by (1), where α(xi) counts
the correctly evaluated data samples in the experiment and m is the number
of tests. In this series of data, the average accuracy is 98%. The test results show
that intelligent and logical decision-making renders the sensor-dependent IoT device
effective and workable. The IoT approach increases device functionality and
performance. The percentage error is determined by formula (2), in which the
accepted value serves as the reference for the experimental readings.

Error % = ((Accepted value − Experimental value) / Total value) × 100    (2)
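For concreteness, the two measures can be computed as in the short sketch below, under the assumption that α(xi) is 1 for a correctly evaluated test case and 0 otherwise.

```python
# Sketch of Eqs. (1) and (2); alpha is a list of 0/1 correctness indicators.
def accuracy(alpha):
    return sum(alpha) / len(alpha)                    # Eq. (1)

def percent_error(accepted, experimental, total):
    return (accepted - experimental) / total * 100    # Eq. (2) as written

print(accuracy([1, 1, 1, 0, 1]))          # 0.8
print(percent_error(100.0, 98.6, 100.0))  # 1.4
```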
Table 1 shows a sample patient report produced on the server when data is
received from sensors and transmitted via smart devices. The patient's records, the

Table 1 Report of patient


Patient data
Patient name (PN) MANISH SHRIVASTAVA
Patient ID (PID) 32,154
Patient address E-7 ARERA COLONY HYDERABAD
Body sensor data
Body temperature 100 °F
Pulse rate (PR) 77 BPM
Blood pressure (BP) 95/140
Symptoms
Fever
Headache
Weight loss

Table 2 Sensor data for the experiments


PN Temperature (°F) Pulse rate (PR) (BPM) Blood pressure (BP-low) Blood pressure (BP-high)
P1 99.4 105 82 125
P2 100.5 98 102 180
P3 102.3 108 82 133
P4 102.8 75 85 138
P5 103.2 65 94 140

Table 3 Tuning of sensor data with patient diagnostic outcome module


PN Temperature PR BP Accuracy (%) Error (%)
P1 H L VH 98.6 2.5
P2 H L H 95.2 4.6
P3 N H M 88.7 13.5
P4 L H M 90.1 11.1
P5 N N L 93.5 6.7
H: high, L: low, VH: very high, N: normal, M: medium

sensor data, and the patient outcomes are the three sections of the study. The data
gathered through sensors is presented in Table 2 for 5 patients at different intervals.
First, the input data is obtained and calibrated; the second stage utilizes the
patient diagnostic outcome module to determine the state of the patient. The tuning
performance values of the input data are displayed in Table 3.
The variations in the data obtained by the temperature sensor, the pulse rate (PR)
sensor, and the blood pressure (BP) sensor are shown in Figs. 3 and 4. In particular,
we can observe that for patient P5 all the data is abnormal.
The patient diagnostic result module makes the decision, and as seen in Fig. 4, the
accuracy of the decision is calculated. Table 3 shows the accuracy of the suggested
system, ranging from 88.7 to 98.6%. This indicates that the proposed system operates
under the rules specified for patient care and management decision-making.
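The paper does not spell out the rule table behind Table 3, but a hedged sketch of the kind of mapping the diagnostic outcome module could use, from qualitative sensor levels to a patient-state decision, might look as follows; the risk weights and cut-offs are assumptions.

```python
# Illustrative rule-based decision over the qualitative levels of Table 3.
RISK = {"N": 0, "L": 1, "M": 2, "H": 2, "VH": 3}   # assumed weights

def patient_state(temp, pr, bp):
    score = RISK[temp] + RISK[pr] + RISK[bp]
    if score >= 5:
        return "critical"
    return "needs attention" if score >= 3 else "stable"

print(patient_state("H", "L", "VH"))  # patient P1 in Table 3 -> 'critical'
```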
CakePHP: This is an open-source framework for rapid web application development
in PHP. It is based on the MVC design pattern, which enables users to quickly and
easily create PHP web applications with less code. CakePHP allows one to separate
the business logic from the data and presentation layers, as shown in Fig. 5.
In this work, we use a CakePHP web server page through which the physician can
view real-time clinical data; to see a patient's historical health data, the patient's
secret ID has to be entered, as shown in Fig. 5. In addition, VMware or Microsoft
virtualization is in practice the choice of medical institutions that ultimately decide
on a private, public, or hybrid cloud solution. We usually advocate using the
Microsoft secured cloud platform, whose intelligent designs leverage Hyper-V and System Center

Fig. 3 Data discrepancy collected through sensors (legend: temperature, pulse rate (PR), BP-low, BP-high; y-axis: sensor value; x-axis: patient name, P1–P5)

Fig. 4 Results of patient diagnostic outcome (legend: accuracy, error; y-axis: performance in %; x-axis: patient name, P1–P5)

Windows Server. This scalable solution meets the demands of most expanding busi-
nesses, powering cloud applications and/or providing cloud-based services
and operations. In particular, the Microsoft Azure cloud service provides easy
on-demand access to healthcare applications and data. Microsoft offers a PaaS
environment with network, server, and storage services.

Fig. 5 Authorization user interface for access to patient health data

5 Summary/Conclusion

The diagnostic method can be rendered more accurate and efficient by bringing
medical devices into the IoT environment. We introduced smart devices, adapted
and optimized to be configured automatically for a wider elderly and disabled
community. The suggested method comprises body temperature, pulse, and
blood pressure sensors for determining the condition of the patient under examination.
The system used its information base and the patient diagnosis outcome for
smart decision-making in health treatment, surveillance, and management, to deter-
mine the potential symptoms and remedy. Our healthcare system architecture
derives its conclusions from the evaluation of patients (PDOs) based on medical
and other sensor measurements. This formal model includes the main terminology
and principles, the technique for diagnosing disease, and the mechanism for alarm
generation. Future research should develop a properly error-free tracking and
acquisition method for health care with high-quality bio-sensors; gathering such
health data may even be useful for predicting diseases in patients.

References

1. A. Sharma, A. Bhatt, Quantum Cryptography for Securing IoT-Based Healthcare Systems


(2021). https://doi.org/10.4018/978-1-7998-6677-0.ch007
2. K. Narmatha, M. Banuchitra, Privacy Enrichment Model for IoT Based Healthcare System
(2020), pp. 1–5. https://doi.org/10.1109/ICCIT-144147971.2020.9213786
3. K. Hameed, I. Bajwa, S.A. Ramzan, A. Khan, An intelligent IoT based healthcare system using
fuzzy neural networks. Sci. Progr. 1–15 (2020). https://doi.org/10.1155/2020/8836927

4. P. Parker, S. Banerjee, B. Korc-Grodzicki, Communicating with the Older Adult Cancer Patient
(2021). https://doi.org/10.1093/med/9780190097653.003.0085
5. C. Perissinotto, C. Zhang, T. Oseau, D. Balik, C. Sou, C. Burnight, K. Burnight, Feasibility of
a tablet designed for older adults to facilitate telemedicine visits. Innov. Aging 3, S975–S975
(2019). https://doi.org/10.1093/geroni/igz038.3534
6. M. Bansal, B. Gandhi, IoT & Big Data in Smart Healthcare (ECG Monitoring) (2019), pp. 390–
396. https://doi.org/10.1109/COMITCon.2019.8862197
7. C. Rajakumar, S. Radha, Smart Healthcare Use Cases and Applications (2020). https://doi.org/
10.1007/978-3-030-37526-3_8
8. A.D. Preetha, T.S. Pradeep Kumar, Leveraging fog computing for a secure and smart healthcare.
Int. J. Recent Technol. Eng. 8, 6117–6122 (2019). https://doi.org/10.35940/ijrte.B3864.078219
9. U. Ulusar, E. Turk, A. Oztas, A. Savli, G. Ogunc, M. Canpolat, IoT and Edge Computing as
a Tool for Bowel Activity Monitoring: From Hype to Reality (2019). https://doi.org/10.1007/
978-3-319-99061-3_8
10. P. Verma, S. Sood, Cloud-centric IoT based disease diagnosis healthcare framework. J. Parallel
Distrib. Comput. (2017). https://doi.org/10.1016/j.jpdc.2017.11.018
11. P. Verma, S. Sood, S. Kalra, Cloud-centric IoT based student healthcare monitoring framework.
J. Ambient Intell. Hum. Comput. 9 (2018). https://doi.org/10.1007/s12652-017-0520-6
12. S. Sood, Cloud-centric IoT-based green framework for smart drought prediction. IEEE Internet
Things J. 1111–1121 (2019). https://doi.org/10.1109/JIOT.2019.2951610
13. K. Kamarajugadda, M. Pavani, M. Raju, S. Kant, S. Thatavarti, IoMT with Cloud-Based Disease
Diagnosis Healthcare Framework for Heart Disease Prediction Using Simulated Annealing with
SVM (2021). https://doi.org/10.1007/978-3-030-52624-5_8
14. M.K. Priyan, L. Selvaraj, R. Varatharajan, G. Chandra Babu, P. Panchatcharam, Cloud and
IoT based disease prediction and diagnosis system for healthcare using Fuzzy neural classifier.
Future Gener. Comput. Syst. 86 (2018). https://doi.org/10.1016/j.future.2018.04.036
15. K. Praveen, P. Prathap, S. Dhanasekaran, I. Punithavathi, P. Duraipandy, I. Pustokhina, D.
Pustokhin, Deep learning based intelligent and sustainable smart healthcare application in
cloud-centric IoT. Comput. Mater. Contin. 66, 1987–2003 (2021). https://doi.org/10.32604/
cmc.2020.012398
Pre-processing of Linguistic Divergence
in English-Marathi Language Pair
in Machine Translation

Simran N. Maniyar, Sonali B. Kulkarni, and Pratibha R. Bhise

Abstract Machine translation (MT) is a field of natural language processing
(NLP) and computational linguistics (CL). In MT research, divergence is
one of the major issues: a complexity that occurs in specific translations, diver-
gence affects the quality of language translation. This paper discusses pre-
processing techniques such as tokenization, stopword removal, POS tagging, and
parsing in machine translation. First, we take sentences containing thematic
divergences and perform all pre-processing on them. The source sentence is
analyzed to find the set of morphemes that are used to check the type of sentence.
Using parts of speech (POS) word information, we attach POS tags to each
individual unit. These units can be used for word-order identification and further
thematic sentence processing. With the help of parsing, we observe the sentence
structure and identify the thematic divergence.

Keywords Machine translation · Tokenization · Stopwords · Parsing · POS

1 Introduction

Machine translation (MT) explores the translation of text from one natural language,
known as the source language, to another, known as the target language [1, 2].
Machine translation is not simply the substitution of words but the application of
complex linguistic knowledge, grammar, and morphology, all of which human
translators take into deliberation; the major goals of machine translation are
morphological analysis, POS tagging, chunking, parsing, and word sense
disambiguation [3]. MT uses various approaches to translate a source language
into a target language between two language pairs [4, 5]. Divergence is a major
complication in translation between two language pairs. The variations that arise
in a language with respect to grammar are known as divergence. Divergence
mainly arises when translating from a natural language as the source language to the

S. N. Maniyar (B) · S. B. Kulkarni · P. R. Bhise


Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar
Marathwada University, Aurangabad, India


target language; MT presents a high incidence of divergence at various grammatical
and extra-grammatical levels in the Marathi-English language pair [6, 7]. It is important
to identify the various types of divergences to get correct translations from Marathi
sentences into English and vice versa. Here, we present our work on
pre-processing of English and Marathi sentences. We observed that
many tools are available for tokenization of English but do not
work well for Marathi, because the morphological symbols used in Marathi
word formation are removed by these tokenization tools in Indian languages
(Marathi in our case). We therefore tokenize the
Marathi and English sentences using our own code before training. After
tokenization, we apply POS tagging and parse the sentences. Using the
parse, we identify the structure of the sentences and the divergence. Tokenization,
stopword removal, POS tagging, and parsing together form the pre-processing
stage; after completing it, we remove untranslated words in the target sentences,
noisy translations, and unwanted punctuation. After that, we will apply an artificial
neural network technique.
An artificial neural network (ANN) is a machine learning technique used to
predict output values from given input parameters based on training data, and
also for modeling non-linear problems. Pattern recognition and data classification
are applications of ANNs, and ANNs are also applied in machine translation,
where they increase translation accuracy and fluency. Neural machine translation
is based on a simple encoder-decoder network [8].
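To make the encoder-decoder idea concrete, the sketch below shows a toy recurrent encoder-decoder in PyTorch; the vocabulary sizes and dimensions are illustrative, and this is a minimal sketch of the architecture in [8], not the authors' actual model.

```python
# Minimal encoder-decoder sketch (assumes PyTorch; dims are illustrative).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab, emb=32, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)

    def forward(self, src):                    # src: (batch, src_len) token ids
        _, hidden = self.rnn(self.embed(src))  # hidden summarizes the source
        return hidden

class Decoder(nn.Module):
    def __init__(self, vocab, emb=32, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, tgt, hidden):            # tgt: previous target tokens
        output, hidden = self.rnn(self.embed(tgt), hidden)
        return self.out(output), hidden        # logits over the target vocab

enc, dec = Encoder(vocab=100), Decoder(vocab=120)
src = torch.randint(0, 100, (1, 5))            # toy source sentence
tgt = torch.randint(0, 120, (1, 4))            # toy target prefix
logits, _ = dec(tgt, enc(src))
print(logits.shape)                            # torch.Size([1, 4, 120])
```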

2 Related Work

Jisha P. Jayan et al.: The authors study divergence in the Malayalam-Tamil
language pair; the divergence is reported mostly at the structural and lexical levels
and is resolved using a bilingual dictionary and transfer grammar. The accuracy
increased to 65%, which is promising. They discuss the types of divergence
related to translation based on Dorr's classification, finding semantic and
syntactic types. They use a statistical machine translation technique and apply
creation rules. The paper studies thematic, promotional, demotional,
structural, conflational, categorical, and lexical divergences [9].
Vimal Mishra et al.: In this study of divergence, issues in MT are examined that
require accurate classification and detection. The language divergences between English
and Sanskrit can be considered as representing the divergences between subject-verb-
object and subject-object-verb classes of languages. The work focuses on divergence
related to conjunctions and particles, participles, and gerunds
using an artificial neural network approach, applying detection rules
and adaptation rules. They develop an EST system, which is an ANN- and rule-based model
[10].

Niladri Sekhar Dash et al.: This paper observes translation divergence in the English
(first language) to Bengali (target language) pair, and also investigates how different
linguistic and extra-linguistic constraints can play decisive roles in translation,
resulting in divergences and other problems. The main objective of the paper is
to examine the types of divergence problems that operate behind English to Bengali
translation, the resolution of which is a prerequisite for designing a robust
machine translation system; a rule-based machine translation system is used,
focusing on syntactic and lexical-semantic divergence [11].
Sreelekha S. et al.: In the present study, the authors describe ways to utilize
different lexical resources to improve the quality of a statistical machine translation
system. They develop the training corpus with different lexical resources such as
the IndoWordNet semantic relation set, Kridanta pairs, function words, and verb
phrases. The usage of lexical resources is mainly focused on two approaches:
augmenting with various word forms and augmenting the parallel corpus with more
vocabulary. They analyze errors for both Marathi to Hindi and Hindi to Marathi
machine translation systems and use various evaluation measures such as BLEU
score, METEOR, TER, and fluency and adequacy using subjective evaluation [12].
Pitambar Behera et al.: This study considers different types of divergence based on
the problems they arise from: grammatical, communicative, cultural, and so on.
When two languages owe their origin to different language families, grammatical
divergences emerge. The research attempts to classify various types of grammatical
divergences, lexical-semantic and syntactic, and also helps to identify and resolve the
divergent grammatical features between the Bhojpuri-English language pair. As far
as methodology is concerned, they adhere to Dorr's lexical conceptual structure for
the resolution of divergences. The study proves useful for developing efficient MT
systems if the mentioned features are incorporated, considering the inherent
structural constraints between source and target languages [13].
Ritu Nidhi et al.: The authors note that Maithili is a less-resourced language
in terms of technology development. The paper is therefore an attempt to create
a general-purpose machine translation (MT) system for this pair of languages.
Divergence detection and handling, as a pre- or post-process, are critical in automatic
translation for producing comprehensible outputs. The authors focus only on
the lexical-semantic divergences. "This paper has reported the results
of only one training and testing on the MTHub." While reporting the progress
of developing an MT system for the English-Maithili pair, they present an account of
identifying and classifying MT divergences in the English-Maithili language pair
[14].
R. Mahesh K. et al.: In the present study, they take Dorr's (1994) classification of
translation divergence, examine the implications of these divergences and the translation
patterns between Hindi and English, and locate further details. They attempt to identify
the potential topics that fall under divergence and cannot directly or indirectly be
accounted for or accommodated within the existing classification. They classify the

divergence from Hindi to English and vice versa on the basis of that recommend an
augmentation in the classification of translation divergence, this is the objective of
this study. In this paper, they have examined the barrier of classification of translation
divergence for MT between English and Hindi [15].

3 Methodology of Pre-processing

In any machine translation system, this topic is much needed, since to obtain a correct
translation it is important to resolve the nature of translational divergence. Divergence
can be seen at various levels. Here, we carry out the pre-processing steps for the ANN
technique using the Python language. First, we clean the text, then we tokenize the
sentences. After sentence tokenizing, we remove all punctuation, remove stopwords,
then apply part-of-speech tagging, and finally parse the sentences, as shown in Fig. 1.

3.1 Database

English serves as a link language for speakers of many other first languages. "The
Indian constitution has 22 'scheduled' or national languages and almost 2000 dialects.
Though, only about 5% of the world's population speaks English as a first language."
English is very widely used in media, commerce, science, technology, and education
in India. In such a situation, there is an obviously large market for translation between
English and the various Indian languages. The Marathi language is widely spoken in
Maharashtra. It
Fig. 1 Pre-processing pipeline for the ANN technique: input sentences → tokenization → stopword removal → parts-of-speech tagging → parsing

Fig. 2 Tokenization of English and Marathi sentences

is the mother tongue of the people of Maharashtra, which is why we build our
database for the Marathi-English language pair. We provide Marathi and English
sentences.

3.2 Tokenize Words

For the data supplied by the organizers, we stick to their sentence segmentation and
tokenization. For further data, we use a trainable tokenizer that can be easily
adapted to a new language simply by providing a few instances of sentence and
token breaks. Tokenization is the process of identifying tokens/topics within input
sentences, and it helps to reduce search by a significant degree [16]. An advantage
of tokenization is that it reduces the storage space required to store the tokens
identified from input sentences, making effective use of storage space.
The first step in text analysis and processing is to split the text into sentences and
words, a process called tokenization. This is the first step for machine translation, and
tokenizing a text makes further analysis easier.
Input English = “Ram likes Sita, I likes sweets, I want sweets”
Input Marathi-p = “

Here, we take an English and a Marathi sentence, tokenize them, and
show the output in Fig. 2. Tokenization breaks the sentences into words.
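A minimal sketch of this step with NLTK is shown below; the Marathi sentence is an illustrative stand-in (the original example text is not reproduced here), and it is split on whitespace because, as noted above, standard English tokenizers mishandle Marathi morphological symbols.

```python
# Tokenization sketch with NLTK (the 'punkt' model must be downloaded once;
# resource names may vary slightly across NLTK versions).
import nltk
nltk.download("punkt", quiet=True)
from nltk.tokenize import word_tokenize

english = "Ram likes Sita, I likes sweets, I want sweets"
print(word_tokenize(english))
# ['Ram', 'likes', 'Sita', ',', 'I', 'likes', 'sweets', ',', 'I', 'want', 'sweets']

marathi = "राम ला सीता आवडते"      # illustrative Marathi sentence
print(marathi.split())             # simple whitespace tokenization
```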

3.3 Stopwords Removal

Stopwords are removed from the given text so that focus can be given to those words
which characterize the meaning of the text when it is classified into different
categories, as in text classification. After removing stopwords, the time to
train the model decreases and the dataset size also decreases. Removing stopwords

Fig. 3 Stopwords removal of English and Marathi sentences

from the database can potentially benefit performance, as fewer but more meaningful
tokens are left, and it also helps to increase classification accuracy. We use the
NLTK library for removing stopwords, as shown in Fig. 3.
Input English = “Ram likes Sita, I likes sweets, I want sweets”
Input Marathi-p = “

Here, we remove the stopwords from the database. The output in Fig. 3 shows the
classified database with the meaningless words removed. We clean the database
using the NLTK library.
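A minimal sketch with NLTK's English stopword list follows; NLTK ships no Marathi stopword list, so the tiny custom set shown at the end is purely an assumption for illustration.

```python
# Stopword-removal sketch with NLTK.
import nltk
nltk.download("stopwords", quiet=True)
from nltk.corpus import stopwords

tokens = ["Ram", "likes", "Sita", "I", "likes", "sweets", "I", "want", "sweets"]
en_stops = set(stopwords.words("english"))
print([t for t in tokens if t.lower() not in en_stops])
# ['Ram', 'likes', 'Sita', 'likes', 'sweets', 'want', 'sweets']

mr_stops = {"आहे", "आणि", "ला"}    # illustrative Marathi stopwords
```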

4 Parts of Speech Tagging

Tagging a sentence, in a broad sense, means indicating the verb, noun, etc., according
to the context of the sentence. Identification of POS tags is a problematic process;
generic part-of-speech tagging is not possible manually, as some words may have
specific meanings given the structure of the sentence. Conversion of the text into the
form of a list is an important step, since in tagging each word in the list is looped over
and estimated for a particular tag, as shown in Fig. 4.
Here is an example using this text:
Input English = “Ram likes Sita, I likes sweets, I want sweets”
Input Marathi-p = “

Here, we apply POS tagging to both sentences; the output is shown in Fig. 4.
Tagging means labeling the words: part-of-speech tagging adds the part-of-speech
category to each word depending on its context in the sentence, and is also called
morpho-syntactic tagging. Tagging is essential in machine translation to conform to
the target language; this is the objective of our study. Particularly in MT,

Fig. 4 Part of speech tagging of English and Marathi sentences

only when the system understands the POS of the source-language text can it translate
into the target language without errors. POS tagging therefore plays an important role
in MT.
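The English side of this step can be sketched with NLTK's pre-trained tagger as below; Marathi would need a separately trained tagger, which NLTK does not ship, so only the English half is shown.

```python
# POS-tagging sketch with NLTK's pre-trained English tagger (resource names
# may vary slightly across NLTK versions).
import nltk
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)
from nltk import pos_tag, word_tokenize

print(pos_tag(word_tokenize("Ram likes Sita, I want sweets")))
# e.g. [('Ram', 'NNP'), ('likes', 'VBZ'), ('Sita', 'NNP'), (',', ','),
#       ('I', 'PRP'), ('want', 'VBP'), ('sweets', 'NNS')]
```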

5 Parsing

The parser is used for two main purposes in the MT system. It is used for the syntactic
analysis of the English sentence, producing the parse tree structure of the sentence
via a context-free grammar, and it is used for parts-of-speech (POS) tagging of the
English sentence, aligning English words with their corresponding POS tags.
Input English = I like sweet
Input Marathi =
Here, we show the output of parsing the sentences in Fig. 5, using the NLTK
library. The parsing process is the first basic stage of the processing engine and is
important for top-down analysis. The sub-processes of parsing are the input process,
the sentence analyzer process, the morphological analysis process, the EtranS
lexicon, and the parsing process itself.
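A minimal sketch of parsing the English example with a toy context-free grammar in NLTK is given below; the grammar is illustrative and not the paper's actual EtranS grammar.

```python
# Parsing sketch: a toy CFG for "I like sweets" and an NLTK chart parser.
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> PRP | NN
VP -> VB NP
PRP -> 'I'
VB -> 'like'
NN -> 'sweets'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("I like sweets".split()):
    tree.pretty_print()  # the structure used to spot case/thematic divergence
```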
This divergence occurs owing to differences in how the arguments of a verb are
understood. In the example, the English sentence has the nominative case on the
pronominal and the accusative case on the other NP (sweets), whereas in Marathi
the NP (pasand) has the nominative and the pronominal seems to possess the
dative case.

Fig. 5 Parsing of English and Marathi sentences using NLTK

6 Conclusion

This work explains the various standardized approaches in the field of machine
translation worldwide, and especially in the context of Indian languages. We use the
word-tokenize method to split each sentence into words; the output of tokenization
supports understanding the text in machine translation and can also be provided as
input for further text-cleaning steps like punctuation removal and numeric character
removal before POS tagging. In the results, some words have different meanings
according to the structure of the sentences. We use the parsing technique for the
morphological analysis of words in the English and Marathi sentences, to get the
morphology of English and Marathi words. The morphological information of English
is used in the morphological synthesis of the equivalent Marathi words. With the
help of parsing, we identify the structure of sentences and the divergences.

References

1. S. Muzaffar, P. Behera, G.N. Jha, Classification and Resolution of Linguistic Divergences in


English-Urdu Machine Translation (2016)
2. N. Ata, B. Jawaid, A. Kamran, Rule Based English to Urdu Machine Translation
3. W.S.N. Dilshani, S. Yashothara, R.T. Uthayasanker, S. Jayasena, Linguistic Divergence of
Sinhala and Tamil languages in Machine Translation (IEEE, 2018)
4. S.B. Kulkarni, P.D. Deshmukh, M.M. Kazi, K.V. Kale, Linguistic divergence patterns in English
to Marathi translation. Int. J. Comput. Appl. 87(4) (2014)
5. M.D. Okpor, Machine translation approaches: issues and challenges. Int. J. Comput. Sci. 11(5,
No. 2) (2014)
6. R. Mahesh, K. Sinha, A. Thakur, Translation Divergence in English-Hindi MT (2005)
7. D. Gupta, N. Chatterjee, Identification of Divergence for English to Hindi EBMT (2003)
8. M.S. Mhatre, F. Siddiqui et al., A review paper on artificial neural network: a prediction
technique. Int. J. Sci. Eng. Res. 6(12) (2015)

9. J.P. Jayan, E. Sherly, A Study on Divergence in Malayalam and Tamil Language in Machine
Translation Perceptive
10. V. Mishra, R.B. Mishra, Divergence Patterns Between English and Sanskrit Machine Transla-
tion (2014)
11. N.S. Dash, Linguistic divergences in English to Bengali translation. Int. J. Engl. Linguist. 3(1)
(2013). E-ISSN: 1923-8703
12. S. Sreelekha, R. Dabre, P. Bhattacharyya, Comparison of SMT and RBMT. The Requirement
of Hybridization for Marathi–Hindi MT (2019)
13. P. Behera, N. Maurya, V. Pandey, Dealing with linguistic divergences in English-Bhojpuri
machine translation, in Proceedings of the 6th Workshop on South and Southeast Asian Natural
Language Processing (2016)
14. R. Nidhi, T. Singh, Divergence identification and handling for English-Maithili machine.
Pramana Res. J. 9(2) (2019). ISSN: 2249-2976
15. R. Mahesh, K. Sinha, A. Thakur, Divergence Patterns in Machine Translation between Hindi
and English
16. V. Singh, B. Saini, An effective tokenization’s algorithm for information retrieval system, in
First International Conference on Data Mining (2014), pp. 109–119. https://doi.org/10.5121/
csit.2014.4910
Futuristic View of Internet of Things
and Applications with Prospective Scope

M. Prasad, T. K. Vijay, Morukurthi Sreenivasu,


and Bekele Worku Agajyelew

Abstract The IoT connects the physical and digital environments. Nowadays, one
of the defining features of the Internet is its pervasiveness. The IoT is a paradigm in
which common articles can be incorporated into collecting, acquiring, and managing
systems, allowing them to communicate over the Internet to achieve any purpose.
The IoT will generally join everything in our world under a broad framework. Each
item will have a unique identifier and will be discoverable both by itself and through
the interface and the Internet. RFID methods will form the basis of the IoT. IoT
gadgets will be ubiquitous, will be aware of the surrounding environment, and will
enable the development of ambient intelligence. The paper covers the current state of
research, and the importance of IoT is reflected in its mobility structure. The current
research surveys IoT initiatives through a systematic review of academic papers and
competent expert discussions.

Keywords Internet of things · IoT · Machine · Ubiquitous · Ambient · Internet ·


RFID · Wi-Fi · Sensors · Actuators · Distributed computing · Beautiful city ·
Wireless access points

1 Introduction

The term IoT was coined by industry analysts but has become progressively
common over time. Some believe the Internet of Things will totally change how
computer networks are used in the next 10 or 100 years, while others think that
IoT is merely hype that will not noticeably affect the everyday lives of many. The

Present Address:
M. Prasad · B. W. Agajyelew
School of C & I, CoET, Dilla University, Dilla, Ethiopia
T. K. Vijay (B)
Information Technology Department, CoET, Samara University, Semera, Ethiopia
M. Sreenivasu
Department of CSE, GIET College of Engineering, Rajamahendravaram, A.P., India
e-mail: msreenivasucse@giet.ac.in


IoT has become a steadily growing subject of discussion both inside and outside
the workplace. The idea will not just influence the way we live but also how we
work. The IoT opens the opportunity to investigate, gather, and exploit a growing
range of social information. IoT devices are expected to be installed on a wide range
of mains-powered appliances like switches, bulbs, power plugs, TVs, and so forth,
and to be able to communicate with the supply network so that energy generation
and power consumption can be better balanced. The Internet of Things is a term
originally associated with Kevin Ashton, who envisioned a worldwide sensory
system connecting the physical world to the Internet. Things, the Internet, and
connectivity are the three pillars of the IoT, which aims to close the gap between the
physical world and the digital world through self-creation and self-improvement [1, 4].

Anyone who asserts that the Internet has broadly improved society may be correct,
but a further substantial change still lies ahead of us. Several strands of work are
already active, which implies that the Internet is on the verge of another leap, as its
stories are huge and unpredictable and mirror its uniqueness as the Web. After the
triumph of the Internet of computers, where our servers and PCs were connected
into wider organizational networks, and the Internet of mobile devices, when it was
the turn of telephones and other portable components, the next period of progress is
the IoT, which will be coordinated and accessible in the physical space [2, 3]. This
change will be a significant extension of the Internet and will have a transformative
influence on every industry, as well as on our day-to-day existence.

1.1 History

1.1.1 IoT Definition

No single definition of the Internet of Things is accepted across the user community
[10]. Honestly, there are broad groups including academics, practitioners, experts,
commuters, engineers, and corporate individuals who have defined the term, although
its original use is credited to Kevin Ashton, a technology pioneer. The Internet of
Things is evolving and continues to be the latest, and still widely promoted, concept
in the IT world [5]: anytime, anyplace, connected to anything and not simply to
anyone [6]. The Web has benefited our lives more, in a much shorter time, than any
other advancement under comparable conditions. It has transformed the transmission
framework influencing our environment [8]. Each connected thing will use a single
Internet protocol address that enables transmission without human intervention.
This new status is called IoT. The IoT name is associated with the MIT Auto-ID
Center [15]. Things are active participants in business, information, and social
processes, where they are enabled to interact and communicate among themselves and

with the environment by exchanging data and information sensed from the environment
[1–4]. Experts have estimated that the IoT would reach almost 50 billion devices by
2020, a very big milestone, and the number increases on a year-on-year basis.

The IoT marks a new era of computing and communication, and its advancement
depends on dynamic improvements in a variety of significant fields, from remote
sensors to nanotechnology. One of the first Internet-connected appliances was the
Coke machine at Carnegie Mellon University in the 1980s.

1.1.2 IoT Highlights

The essential idea of the IoT was introduced in a special ITU report in 2005: real
articles connected to the Web. To be precise, real articles are associated through the
use of the Internet [7]. The ITU articulated the idea of the IoT and organized the
material into four key classes: tagging things, sensing things, thinking things, and
shrinking things. Moreover, Wikipedia also outlines characteristics of the IoT,
suggesting six categories: intelligence and architecture [6–8], complex systems, size
considerations, time considerations, and space considerations. Consequently, model-
driven strategies and effective methods will be available, as well as new ones ready
to handle the release and unusual evolution of processes. For the IoT, the meaning
of an event will not necessarily be based on a deterministic or execution model; it
may instead depend on the context of the real event.

A notable characteristic is the unpredictable structure. In open or closed loops,
over time, the IoT will be considered and treated as a complex system because of the
huge number of different links and interactions between autonomous participants,
as well as its capacity to integrate new potential actors, and because of the notion of
time. In the IoT, made up of billions of parallel and concurrent events, time will no
longer be used as a standard and precise measure but will depend on each entity
(object, process, information system, and so on).

2 IoT Use

The opportunities offered by the IoT make it possible to envision building a variety of
dependent systems, of which only a few are currently deployed [10]: Web models of
items ranging from the home to direct wear to medical care. In truth, the IoT
is becoming increasingly important in every aspect of our lives. In the future, there
will be smart applications for smart homes and workplaces, smarter transportation
agencies, smarter clinics, and smarter enterprises and production lines [14]. Not only
does the Internet of Things improve our comfort, but it also gives us more control
to improve the quality of life and personal engagement of a typical employee.
In the following paragraphs, some of the IoT applications are briefly described.

2.1 Health

The IoT is proposed to improve the quality of life of the individual through the use
of technology as part of vital human endeavors. In that sense, the burden of testing
and monitoring can be shifted from the human side to the machine side [16]. The
main use of IoT in medical services is in assisted-living conditions. Sensors can
be set up to monitor the equipment used by patients. The information collected
by these sensors is made accessible to professionals, relatives, and other stakeholders
to improve treatment and responsiveness.
Utilization of IoT
(1) Smart water use
Smart cities should monitor water availability to guarantee that there is
satisfactory access for the needs of inhabitants and organizations. Remote
sensor networks offer new ways for cities to fully assess their water-intake
structures and recognize worst-case scenarios [11–16]. Urban areas that
monitor the flow of water through sensing technology obtain better end-to-end
value from their operations. Tokyo, for instance, has managed to save $170
million yearly by detecting the effects of flooding early (LIBELIUM, 2013).
The system can report the details of the flow of water through pipelines, and
it can likewise trigger precautionary steps if water use is beyond normal
expectations. This permits the smart city to determine the location of the
pipelines and to focus remedial measures based on the size of the water
failure [20–23].
(2) Smart homes and work environments
Different electrical appliances surround us, for instance, microwaves,
refrigerators, heating, forced-air systems, fans, and lights. Actuators and
sensors can be introduced into these devices to use energy more efficiently
and add extra comfort throughout daily life. These sensors can measure
outdoor temperature and can also figure out which inhabitants are inside and
appropriately control the amount of heating, cooling, light distribution, and
so on. Doing this can help us reduce expenses and increase energy
savings [10].
(3) Improved recreation facilities
Incorporating new advancements, for example, individual exercise profiles
that can be loaded into a machine, can improve the experience of the
recreation center; everyone can be recognized by their own unique identity,
and accordingly the relevant profile will be presented [23–25].
(4) Food storage
The food we eat needs to go through various stages before it arrives in
the refrigerator; it is caught in an intricate food cycle: production,
harvesting, shipping, and delivery. With the use of suitable sensors, we

can keep food from being harmed by the environment by monitoring the temper-
ature, humidity, light, heat, and so on [10]. The sensors can measure these quantities
directly and inform the person concerned, providing the awareness of resources
needed to avoid spoilage [24].

2.2 Transport and Liaison Area

(1) Smart parking
New smart parking sensors are being introduced in parking garages to
detect the arrival and departure of vehicles. Smart parking provides a far-reaching
parking-management design that helps drivers save time and fuel
(LIBELIUM, 2013). A greater commitment to resolving congestion
comes from drivers searching for open parking spots [20]. Giving explicit
information about parking spots helps distribute traffic better, and this will
also allow the planning of parking-spot usage directly from the vehicle [16].
This will help reduce CO2 emissions and limit gridlock.
(2) Assisted driving
Vehicles such as cars, road transport, and sensor-equipped rail can
give significant information to the driver to enable better routing and safety.
With the use of assisted driving, we will be able to follow the correct route
with past insight into traffic jams and arrival times. In business planning,
information on moving vehicle fleets, together with information about the
type and status of goods, can yield significant data about delivery times,
delivery delays, and shortfalls [15].
(3) NFC-tagged guides
Travel providers have created tagged guides that permit NFC-equipped
telephones to use information about places and immediately connect to Web
servers that provide information about accommodation, restaurants,
landmarks, theaters, and the nearby area.
(4) Supply-chain integration
Creating an Internet of Things for the supply chain has numerous benefits: RFID
and NFC can be used to effectively handle all supply-chain inter-
actions, from object identification and theft prevention through production,
transport, and storage, to object supply and after-sales management. With the
help of the IoT, we can track stocks so that they are delivered on schedule for
ordinary transactions, and this will reduce the customer waiting period,
which brings consumer loyalty and leads to more agreements in trade [21–23].

3 Challenges of IoT

Providing security for this new giant is extremely troublesome, particularly because
there is no boundary or barrier confining it. In this section, we outline the potential
difficulties that the IoT can face. An assortment of wireless and remote access
networks is needed to meet diverse framework requirements [22].
Power constraints: Many IoT applications require battery lifetimes of over two
years and must keep ordinary power consumption low. Ease of development:
Building an IoT application ought to be simple for all engineers, not just experts.
Government interest: The ease of deploying IoT in a specific country depends on
whether the government permits it, which it does when it benefits from the
advancement; adoption thus depends substantially on the economy and income of
the country.
Compatibility: As devices from various manufacturers will be connected, the
matter of compatibility in reading and monitoring the gathered data arises. Although
this problem may be reduced if all manufacturers agree on a standard method,
specific issues will still persist [22, 23]. Today, we have Bluetooth-enabled devices,
and similar issues exist even in this established technology! Compatibility issues can
push people to buy equipment only from a specific manufacturer, making its
ecosystem more closed.

3.1 Context Considerations for Security

With security techniques based on contextual reasoning, it is essential to recognize
that any critical part of the setup will be prone to change [19]. For instance, if a
sensor cannot identify an image because of poor quality, security policies cannot be
applied to that image. Some features of the context must be exposed to provide the
necessary information. Similarly, preconfigured security managers may occasionally
work incorrectly in specific circumstances, mostly because they have not been able to
recognize the novel situation. Providing abstractions of context is thus a fundamental
challenge in the IoT [23].

3.2 Integration of Digital Devices with the Physical World

Recently, the measurement of diverse information and the coupling between the
physical environment and the processor have improved substantially. For instance,
a vehicle can be operated by a central computer, or patient medication can rely on
a sensor

placed in the body to report physical condition [20]. Also, it is unavoidable that when
data is obtained through the physical environment, it can undoubtedly be misused by
an intruder [25].

3.3 Identification in IoT Environment

Identifying objects and managing identities are viewed as policy difficulties in the
process of building the worldwide Internet of Things (IoT). Various identity-
management schemes are available these days, with a variety of approaches to
creating and testing IDs for better information protection. Nevertheless, it has never
been clarified which identity-based methods are deliberately appropriate or how to
apply them in an IoT environment. Moreover, the vast majority of existing identity
frameworks were made for short-term use in close proximity for clear purposes.
Consequently, a worldwide trust framework for recognizing identities is
fundamental [21].

3.4 Authentication of Devices

Huge numbers of sensor-powered devices and actuators must follow clear techniques
and rules for authentication to permit the sensors to communicate their data. So far,
simplistic schemes in this field have not given much regard to context [23]. At present,
in situations where we need to provide sensor security, we have to use resource-
intensive arrangements, which runs contrary to the IoT goal of providing lightweight
protocols [14].

3.5 Data Fusion

IoT will generate a great deal of data. Combining this data to provide greater
comprehension can create powerful new wide-ranging information strategies, up to
and including unexpected customer profiling. However, these capabilities can
jeopardize the privacy of clients by sharing their data, which can create critical
challenges in this regard [9].

3.6 Diversity in IoT

As the technology matures, the number of clients and devices grows, with a wide
scope of communication needs and consistently developing capabilities. IoT needs
to provide cooperation among an almost limitless number of things that are basic to
its organizational structure. Consequently, IoT needs components that control energy
consumption while guaranteeing the security of this enormous number of items [15].

3.7 Safe Setup and Configuration

The IoT scalability challenges should be addressed in order to achieve secure settings
and builds. A significant part of the architecture can be designed around security.
For instance, services can be arranged so that all customers can control which
specific groups of people may access their information [26]. It is therefore essential
to furnish security engineering with proper mechanisms. In other words, having
symmetric or asymmetric cryptographic certificates, according to conditions, gives
a safe structure. Coordinating this undertaking is challenging, particularly given the
enormous number of IoT devices it will confront [18-20].

3.8 CI and IoT

The effect of IoT developments on CI (critical infrastructure, for example, power,
telecom, and utilities) should be examined, because IoT technology will be deployed
in CI gadgets; a sensible example is machine-to-machine (M2M) communication.
The new security challenges that IoT brings to CI are an unavoidable consideration.
In addition, providing IoT security is very important in this regard, because IoT in
CI has to deal with important CI aspects, for example, ensuring safety against
industrial risk, or supporting administrators who are expected to maintain consistent
electrical power in clinics [20].

3.9 Controversial Market Debate

IoT will create a competitive market by providing relevant information from a variety
of sources. In this way, it will help to meet the needs of most consumers. Accordingly,
the provision of various strategies for verifying individual information will become a
legal issue when combining the associated data. This goal should be met with
lightweight protection measures, which remain an open challenge [20].

3.10 Thinking of IoT in the Growing Internet

The impact of Internet development on IoT is undeniable. How the Internet is used
and the framework for doing things online are the two main components that affect
IoT.
In any case, data security and protection play a key role in building the Internet.
Achieving security and being prepared to protect the Internet during normal operation
will create difficulties in this field [12].

3.11 The Human-IoT Trust Relationship

There should be a certain level of trust that a person can place in various parts of the
IoT. Relying on machines alongside people who actually require protection is highly
regarded by analysts. Reliability can be viewed as a level of confidence that can be
placed in an explicit service or object. In addition, trust is not restricted to humans;
it can be placed in structures or machines, for example Web pages, showing a level
of trust within the computing community. From another standpoint, trust can be seen
as a means by which we can be sure that the framework manages its work properly
and provides accurate details.

3.12 Data on Board

Another point of view concerns the way information itself is handled. Cryptographic
and protocol components are, as a rule, excellent data verification choices,
but for now we are not able to apply these techniques to small, constrained things. As
a result, we need to be strategic in how we handle information with different tools;
alternatively, if these techniques must be applied, we have to change a lot of the
current instruments [16].
Life expectancy of individual IoT items: it is often assumed that any object in the
IoT will have a short life and need not last for long years. For example, User
Datagram Protocol (UDP) services may respond with more information than was
requested [16]; such amplification is possible because the source address can be
spoofed, for the reason that UDP is connectionless. Similarly, the long lives of the
Global System for Mobile Communications (GSM), Wired Equivalent Privacy
(WEP), and various other wireless protocols have shown that this assumption is
wrong [17].

4 The Future of IoT

New knowledge about our IoT future is driving organizations to look
at the key Internet of Things building blocks (i.e., equipment, systems, and support) to give
developers the ability to deliver applications that can connect anything within the IoT.
In this paper, we introduced the IoT and summarized contextual investigations
into the IoT. With the various advances of the new Internet, the world is moving
toward connectivity anytime, anywhere, for anyone. In the current scenario, "things"
are basically those gadgets that are built and programmed for the IoT. Some
of those items will be available directly on the Internet, although some will apparently
be integrated into nearby networks behind gateways and addressing tools. New
applications and services are constantly being created, and Internet content keeps
evolving. In this area, many scientists have suggested directions for the development
of IoT. Besides, there is still much difficulty; to address these issues, we must
overcome the challenges of IoT, and future work should therefore target these
problems. Gathering the Web services required by a client as they are discovered
is an important issue in the IoT environment.

5 Conclusions

IoT is bringing a sea of changes into our day-to-day life, working
to make our lives easier and more accessible through continued progress and different
applications. There is an endless supply of IoT applications in all areas, including medicine,
manufacturing, fabrication, transportation, training, management, mining, and the
environment. Although IoT has many advantages, there are shortcomings in IoT
management and execution. The main point of this document is that, although there
is no universal definition of IoT, the most important requirements that exist
in IoT application areas continue to develop. These areas will grow and
influence human existence through remarkable practices over the next decade.

References

1. B. Mitchell, Introduction to the Internet of Things (IoT). https://www.lifewire.com/introduction-to-the-internet-of-things-817766
2. J. Soldatos, Internet of Things Tutorial: Introduction. http://www.kdnuggets.com/2016/12/int
ernet-of-things-tutorial-chapter-1-introduction.html
3. D. Giusto, A. Iera, G. Morabito, L. Atzori (eds.), The Internet of Things (Springer, 2010).
ISBN: 978-1-4419-1673-0. https://aws.amazon.com/iot/
4. S. Madakam, R. Ramaswamy, S. Tripathi, Internet of Things (IoT): a literature review. J.
Comput. Commun. 3, 164–173 (2015). http://www.scirp.org/journal/jcc
5. E.A. Kosmatos, N.D. Tselikas, A.C. Boucouvalas, Integrating RFIDs and smart objects into a
unified internet of things architecture. Adv. Internet Things: Sci. Res. 1, 5–12 (2011)

6. R. Aggarwal, M. Lal Das, RFID security in the context of “Internet of Things”, in First
International Conference on Security of Internet of Things, Kerala, 17–19 Aug 2012, pp. 51–56
7. M.-W. Ryu, J. Kim, S.-S. Lee, M.-H. Song, Survey on internet of things: toward case study.
Smart Comput. Rev. 2(3), 195–202 (2012)
8. E. Biddlecombe, UN Predicts “Internet of Things” (2009). Retrieved 6 July
9. D. Butler, 2020 computing: everything, everywhere. Nature 440, 402–405 (2006)
10. R. Parashar, A. Khan, Neha, A survey: the internet of things. Int. J. Tech. Res. Appl. 4(3),
251–257 (2016). e-ISSN: 2320-8163. www.ijtra.com
11. H. Yinghui, L. Guanyu, Descriptive models for Internet of Things, in IEEE International Confer-
ence on Intelligent Control and Information Processing, Dalian, China, 2010, pp. 483–486
12. Y. Bo, H. Guangwen, Application of RFID and Internet of Things in monitoring and anticoun-
terfeiting for products, in International Seminar on Business and Information, Wuhan, Hubei,
China, 2008, pp. 392–395
13. A. Grieco, E. Occhipinti, D. Colombini, Work postures and musculo-skeletal disorder in VDT
operators. Bollettino deOculistica Suppl. 7, 99–111 (1989)
14. K. Pahlavan, P. Krishnamurthy, A. Hatami, M. Ylianttila, J.P. Makela, R. Pichna, J. Vallstron,
Handoff in hybrid mobile data networks. Mob. Wirel. Commun. Summit 7, 43–47 (2007)
15. X.-Y. Chen, Z.-G. Jin, Research on key technology and applications for the Internet of Things.
Phys. Procedia 33, 561–566 (2012)
16. M. Chorost, The networked pill. MIT Technology Review (2008)
17. E. Zouganeli, I. Einar Svinnset, Connected objects and the internet of things—a paradigm shift,
in International Conference on Photonics in Switching, Pisa, Italy, 2009, pp. 1–4
18. Z. Tongzhu, W. Xueping, C. Jiangwei, L. Xianghai, C. Pengfei, Automotive recycling infor-
mation management based on the internet of things and RFID technology, in IEEE Inter-
national Conference on Advanced Management Science (ICAMS), Changchun, China, 2010,
pp. 620–622
19. G. Gustavo, O. Mario, K. Carlos, Early infrastructure of an Internet of Things in Spaces (2008)
20. B. Gubbi, P. Marusic, Internet of Things (IoT): a vision, architectural elements, and future
directions. Future Gener. Comput. Syst. 29, 1645–1660 (2013)
21. M. Wu, T.-J. Lu, F.-Y. Ling, J. Sun, H.-Y. Du, Research on the architecture of Internet of
Things, in 2010 3rd International Conference on Advanced Computer Theory and Engineering
(ICACTE), Chengdu, 2010, pp. V5-484–V5-487. https://doi.org/10.1109/ICACTE.2010.557
9493
22. M. Weyrich, C. Ebert, Reference architectures for the internet of things. IEEE Softw. 33(1),
112–116 (2016)
23. J. Gubbi, R. Buyya, S. Marusic, M. Palaniswami, Internet of Things (IoT): a vision, architectural
elements, and future directions. Future Gener. Comput. Syst. 29(7), 1645–1660 (2013)
24. F. Bonomi, R. Milito, P. Natarajan, J. Zhu, Fog computing: a platform for internet of things and
analytics, in Big Data and Internet of Things: A Road Map for Smart Environments, pp. 169–186
(Springer, Berlin, Germany, 2014)
25. J. Sung, T. Sanchez Lopez, D. Kim, The EPC sensor network for RFID and WSN integration
infrastructure, in: Proceedings of IEEEPerComW’07, White Plains, NY, USA, March 2007
26. G. Broll, E. Rukzio, M. Paolucci, M. Wagner, A. Schmidt, H. Hussmann, PERCI: pervasive
service interaction with the internet of things. IEEE Internet Comput. 13(6), 74–81 (2009)
Identifying and Eliminating
the Misbehavior Nodes in the Wireless
Sensor Network

Navaneethan Selvaraj, E. S. Madhan, and A. Kathirvel

Abstract In recent years, advanced research in wireless sensor networks (WSNs) has
become a trending and emerging technology. Sensors can be used to monitor physical
and environmental conditions, and they are also used in the manufacturing industry.
Battery life and security are the two most significant problems and challenges
in wireless sensor networks. Many algorithms have been developed to address
these issues in many situations, but neither issue is fully resolved, due
to a variety of factors such as unfiltered duplicate data and wasted battery
power and bandwidth. Some nodes in the network become selfish and fail to forward
packets to neighboring nodes. These nodes cause network misbehavior, rendering
the network partially inactive. Our proposed method removes misbehaving
nodes from the network and checks for message duplication in the network
before sending data; our algorithm satisfies the aforementioned requirements.

Keywords WSN · Nodes · Sensor · Monitoring

1 Introduction

WSNs are self-organizing networks that periodically monitor physical or environmental
conditions, such as temperature, sound, vibration, pressure, and motion, and
cooperatively send their information through the network to a central location or
sink, where the information can be viewed and

N. Selvaraj (B)
Research Scholar, Department of CINTEL, SRM Institute of Science and Technology,
Kattankulathur, Chennai, India
e-mail: ns2066@srmist.edu.in
E. S. Madhan
Assistant Professor, Department of CINTEL, SRM Institute of Science and Technology,
Kattankulathur, Chennai, India
e-mail: madhane2@srmist.edu.in
A. Kathirvel
Professor, Department of Computer Science and Engineering, NIOT research group, Karunya
Institute of Technology and Sciences, Coimbatore, TN, India


analyzed. The sink or base station serves as an interface: by issuing queries and
gathering results from the sink, one can retrieve the required data from the network.
A wireless sensor network almost certainly has a large number of sensor nodes,
which can communicate with one another using radio signals. Each deployed sensor
node is equipped with sensing and power components.
The individual nodes in a wireless sensor network (WSN) are generally resource
constrained: they have limited processing speed, storage capacity, and communication
bandwidth. After deployment, the sensor nodes must self-organize into a suitable
network structure, often communicating over multiple hops. The deployed sensors
then collect data and answer requests sent from a "control site", either to perform
specific instructions or to provide sensing samples. Sensor nodes can work in either a
continuous or an event-driven mode. The global positioning system (GPS) and local
positioning algorithms can be used to obtain location and positioning information.
Actuators can be added to hard-to-reach sensor devices to allow them to "act" when
certain conditions are met; such networks are then referred to as wireless sensor and
actuator networks.

Wireless sensor networks (WSNs) enable new applications and require non-standard
paradigms to meet several goals. Because of the requirement for low device
complexity together with low energy consumption, a suitable trade-off between
communication and signal/information processing capabilities must be found (e.g.,
for a long network lifetime). This has driven a massive research effort over the last
decade, producing numerous proposals in this field. Currently, a large portion
of WSN research has focused on the design of energy- and computation-efficient
algorithms and protocols, with the application domain limited to simple data-oriented
monitoring and reporting. One proposal is a cable mode transition (CMT) algorithm,
in which a small number of active sensors is selected to maintain K-coverage
of a scene as well as K-connectivity of the network. It assigns sleep times to
redundant sensors without affecting the coverage and connectivity requirements
of the network. Accurate and timely data delivery is also important in delay-sensitive
networks.
The proposed network structure aims to reduce data collection delays in hard-to-reach
sensor networks, extending the network's lifespan. Researchers have considered
relay nodes to ease the network's geometric constraints and have used
particle swarm optimization (PSO)-based computations to find the best sink location.
For energy-efficient communication, a numerical solution has also been proposed
for locating the ideal sink position to extend the network lifetime.
Traditionally, studies of wireless sensor networks have focused on homogeneous
sensor nodes. Researchers are now concentrating their efforts on heterogeneous
sensor networks, whose nodes differ from one another, for example in energy
consumption. New network structures with heterogeneous devices, and new developments
in this direction, are removing current roadblocks and expanding the range of possible
applications for WSNs, all of which are evolving rapidly.

Fig. 1 Architecture of WSN
Wireless sensor networks (WSNs) have recently attracted considerable interest and
are among the most actively explored research areas. Every day, new uses and
business opportunities emerge. The WSN market is forecast to rise from $0.45
billion in 2012 to $2 billion in 2022. Figure 1 shows the typical architecture
of a WSN.
A WSN is a network of small devices, known as sensor nodes, that are spatially
distributed and collaborate to deliver data gathered from the sensed field over wireless
links. The data collected by the various nodes is sent to a sink, which either
uses it locally or connects it to other networks, such as the Internet. WSN technology
has several advantages over traditional infrastructure-based networking strategies,
including lower cost, adaptability, reliability, accuracy, flexibility, and ease of
deployment, all of which favor its use in a variety of applications.
As technologies advance and sensors become smarter, smaller,
and cheaper, billions of wireless sensors are being deployed in various applications.

Military, climate, healthcare, and security are just a few of the potential
application areas. In the military, sensor nodes can be used to detect, locate, and track
enemy movements. In the event of destructive natural events, sensor nodes can
continuously sample the environment to identify problems early. In healthcare, sensor
nodes can assist in monitoring a patient's well-being. In security, sensors can
provide watchful surveillance and expanded awareness of potential intruder attacks.

The monitored environment plays a significant role in determining the size, topology,
and deployment strategy of the network. For example, if the monitored environment
is a vast region that is difficult for people to reach, random deployment of nodes
is preferred over a planned deployment. Furthermore, outdoor conditions require
numerous nodes to cover a vast area, whereas indoor environments require
fewer nodes to form a network in a constrained space [1, 2].
A WSN also faces several resource constraints: limited energy, a short communication
range, low-bandwidth transmission, and limited processing capacity and storage.
The goal of WSN research is to address these design and resource constraints by
introducing new design concepts, improving existing protocols, and developing new
algorithms. WSN is a promising technology with enormous potential to change our
world if the open research issues can be resolved. The WSN literature contains
several surveys on various research areas, for example routing, MAC protocols,
congestion control, data fusion, power conservation, security, and applications. We
should add that application-driven development of the technology results in a silo
approach.
A wireless sensor network (WSN) is made up of sensor nodes, or motes, which are
devices with a processor, a radio interface, an analog-to-digital converter, sensors,
memory, and a power supply. The processor processes the data and coordinates the
operation of the mote. The sensors attached to the mote can detect temperature,
humidity, light, and other quantities. Due to bandwidth and power requirements,
motes essentially support low-rate wireless links with limited computational power
and a limited sensing rate. Programs (instructions that the processor executes) and
data (raw and processed sensor readings) are stored in memory. Motes are equipped
with a low-rate (10–100 kbps), short-range (under 100 m) radio to communicate
with one another. Because radio communication consumes the vast majority of the
power, the radio should incorporate energy-saving communication techniques.

The common power source is batteries. Because motes can be deployed in far-flung
and dangerous environments, they should be low power and operate unattended
to extend the network lifetime. Motes could, for example, be equipped with
effective energy-harvesting technologies, such as solar cells, allowing them to be
left unattended for long periods of time. On the other hand, planned deployment
is helpful for limited coverage, where fewer nodes are placed at specific locations,
with the benefit of lower network maintenance and management cost.
The remainder of this paper is organized as follows: Sect. 2 reviews the related work
of previous papers. Our proposed solution is illustrated in Sect. 3. Simulation results,
obtained with the QualNet 5.02 simulator, are discussed in Sect. 4. Finally, we
conclude our work in Sect. 5.

2 Related Works

Algorithms for detecting selfish nodes in a MANET have been developed in the
literature. To encourage packet forwarding without discrepancies, a fuzzy reputation
system is used to discipline nodes that behave selfishly [2]. Deering [3] proposes
a reputation-based algorithm in which each node is expected to keep track of all
other nodes and obtain reputations from a centralized node. Ballardie and
Crowcroft [4] propose a scheme in which each node earns credits by forwarding
the packets of other nodes, which allows it to transmit its own packets. In addition,
Zhou et al. [5] propose activity-based overhearing, which uses iterative and
unambiguous probing to detect the presence of selfish nodes in MANETs. To distinguish
between trusted and selfish behavior in nodes, Jeong et al. [6] use a fuzzy-based
analyzer; the method incorporates trust and a certificate authority to combat
selfishness. In [7], the authors propose a collaborative watchdog for detecting selfish
nodes. Miner and Staddon [8] propose a two-tier acknowledgment scheme to identify
misbehaving nodes and then inform the routing protocol so that future routes avoid
misbehaving nodes. Game theory is applied as a tool to encourage
cooperation in [9], where reputation is used to study node behavior.
Ad hoc On-demand Distance Vector (AODV) [10, 11] and DSR [12] are traditional
MANET routing protocols that presume that all nodes throughout the system
cooperate and agree to forward packets, i.e., that devices are truthful in their packet
forwarding actions. According to findings published in [13, 14], nodes in a MANET
tend to become selfish over time. Selfish devices are reluctant to devote resources
such as battery power, time, and storage to the benefit of other nodes. Selfishness is
frequently linked to the onset of resource scarcity, such as low battery power, and
to the desire of nodes to save resources, such as electric power, for their own
consumption. As a consequence, a node in a MANET has a powerful incentive
to be selfish. Marti et al. [15, 16] classified selfish node behavior
into the following categories:
• Selfish nodes reduce the TTL value and drop packets, or tamper with routing
protocol and reverse-path packets.
• A selfish node may ignore hello messages: it could refuse to acknowledge hello
messages, making it impossible for nearby nodes to identify it and route packets
through it.
There is no immune reaction to foreign bacteria in the gut or to the food we eat,
although both are foreign entities. The principle behind danger theory [17, 18] stems
from the fact that not everything foreign is harmful to the cell structure. Instead of
attacking everything foreign, it is better to measure the degree of damage incurred
by the cells, based on the distress signals the cells send due to the foreign entity.
Danger theory advocates the principle of non-isolation of a foreign entity until it is
proved dangerous.

3 Proposed Method

Selfish nodes create a faulty network because there is no guarantee that they will not
delay, split, or alter packets, or deliver them out of order. To provide reliability,
the protocols that support truthful communication over such networks use a
combination of acknowledgments, retransmission of missing or broken packets, and
checksums. Each node is connected to other nodes and shares its data items, forwarding
data packets to neighboring nodes, with a cost assigned to every linked node. Dijkstra's
algorithm is used to find the shortest path from a source node to a destination node.
However, the presence of a selfish node causes a massive performance degradation
(Fig. 2): the original group of data packets forwarded from the source node cannot
reach the destination node, and some or all of the packets are lost at the destination.
Consequently, a number of retransmissions occur between these nodes.
In this paper, we use the retransmission numbers of nodes to detect a selfish
node. Each node records how many times it retransmits data packets before
successfully sending a packet (NDj, j = 1, 2, …, n) within a certain
period (recordnum). Using NDj and n, we calculate the average retransmission
number (NDi) of each node. We then find the maximum value of the
average retransmission numbers (NDmax) in the period. Finally, the retransmission
numbers of node i are tested against the decision equation: when the condition is
satisfied, the node is judged to be a selfish node; otherwise, it is a non-selfish node.
The process repeats until all nodes have been examined.
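
The following Python fragment is a minimal sketch of this computation (not the
authors' implementation); the retransmission log format, the sample values, and the
threshold handling are assumptions made for illustration, with the three categories
taken from Sect. 3.2.

# Minimal sketch of the retransmission-based detection described above.
# `retransmission_log` is a hypothetical mapping: node id -> list of
# retransmission counts NDj recorded within one period (recordnum).

def average_retransmissions(retransmission_log):
    # NDi: average retransmission number of each node in the period
    return {node: sum(counts) / len(counts)
            for node, counts in retransmission_log.items() if counts}

def classify_nodes(retransmission_log):
    nd = average_retransmissions(retransmission_log)
    nd_max = max(nd.values())                  # NDmax over the period
    labels = {}
    for node, nd_i in nd.items():
        decision = nd_i / nd_max               # decision ratio of Sect. 3.2
        if decision < 0.8:
            labels[node] = "partially selfish"
        elif decision <= 0.9:
            labels[node] = "fully selfish"
        else:
            labels[node] = "non-selfish"
    return labels

print(classify_nodes({1: [2, 3], 2: [9, 10], 3: [1, 1]}))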
We propose two different kinds of algorithms to address selfish node identification
and security enhancement, as explained in the upcoming sections. The cost matrix G
below defines the simulated network topology used as input to the shortest-path
computation (a zero entry denotes the absence of a direct link):

Fig. 2 Topology design



G = [0 2 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 4 0 9 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 1 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 1 0 0 0 6 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 2 2 3 4 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 2 7 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 5 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 5 7 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 5 6 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 5 6 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 3 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 8 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]

After finding the shortest path, we apply the decision equation to every node on the
path between the source node and the destination node to detect the presence of a
selfish node in the network. This process continues until every related node has been
checked. From the observations, we can identify a node as selfish or non-selfish by
comparing a threshold value with the ratio of the node's average retransmission
number to the maximum average retransmission number, as given in the decision
equation.
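
For illustration, the sketch below (a minimal example assuming NetworkX, not the
authors' code) builds a weighted digraph from a small stand-in cost matrix, extracts
the Dijkstra shortest path, and marks where the decision ratio of Sect. 3.2 would be
evaluated.

import networkx as nx
import numpy as np

# Small stand-in for the cost matrix G above (0 = no direct link).
G_mat = np.array([[0, 2, 0, 4],
                  [0, 0, 4, 9],
                  [0, 0, 0, 1],
                  [0, 0, 0, 0]])
graph = nx.from_numpy_array(G_mat, create_using=nx.DiGraph)  # weighted digraph
path = nx.dijkstra_path(graph, source=0, target=3)           # e.g. [0, 3]
# Each node on `path` would now be checked with decision = NDi / NDmax,
# as in the classification sketch above.
print(path)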

3.1 Algorithm for Security

Each node in the network generates a token; the token consists of two fields, the first
of which is a flag bit, i.e., a status bit.
• The status bit takes one of two values, a green flag or a red flag. The green flag
indicates a valid path.
• The red flag indicates that the path to the destination is not valid.
• After a valid path has been identified, the second token is generated, by the source
node only; it contains the address and data fields.
• The source node monitors the intermediate address of each node that the packet
traverses until it reaches the destination.
• Once the packet reaches the destination, the token is released by the source; a
minimal sketch of the token structure is given below.
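A minimal sketch of the two tokens, with illustrative field names (the paper does not
fix an encoding), might look as follows.

from dataclasses import dataclass, field
from typing import List

GREEN, RED = 0, 1        # status flag: GREEN = valid path, RED = invalid path

@dataclass
class PathToken:
    status: int          # first token: carries only the flag/status bit

@dataclass
class DataToken:         # second token: generated by the source node only
    data: bytes                                       # payload for the destination
    visited: List[int] = field(default_factory=list)  # intermediate addresses seen

def forward(token: DataToken, node_address: int) -> None:
    # The source monitors each intermediate address the packet traverses.
    token.visited.append(node_address)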

3.2 Algorithm to Identify Selfish Nodes

The threshold value lies between 0 and 1.

To detect three different types of selfishness in a network node, the threshold
range is divided into three categories. We raise the limit for sensing slightly selfish
nodes to 0.8 for a better result; the value 0.8 was chosen experimentally.
1. If the decision value is less than 0.8, the node is referred to as a partially selfish node.
2. If the decision value is greater than 0.8 and at most 0.9, the node is referred to as a fully selfish node.
3. If the decision value is greater than 0.9, the node is said to be non-selfish.
// NDmax is the maximum average number of retransmissions in a given time
period //
• for (each linked node k in G): // decision = NDk / NDmax //
• if (decision < 0.8),
• Nk is identified as a partially selfish node;
• else if (0.8 < decision ≤ 0.9),
• Nk is identified as a fully selfish node;
• otherwise,
• Nk is identified as a non-selfish node.

4 Simulation and Results

Figures 3 and 4 show implementation results.

5 Conclusion

The algorithm that we devised improves the network's detection rate. Because of
its non-cooperative behavior toward other nodes, a selfish node causes network failure
and degrades overall network performance in wireless sensor networks. Timely
detection of such self-centered nodes is a critical issue for the proper management of
network data communication, since a selfish node has a significant impact and
disrupts the network. To solve the problem of selfish nodes and their behavior in
WSNs, we must transform them into cooperative nodes that forward data packets to
other nodes.

Fig. 3 Implementation results



Fig. 4 Number of nodes versus threshold

References

1. A. Kathirvel, R. Srinivasan, ETUS: enhanced triple umpiring system for security and robustness
of wireless mobile ad hoc networks. Int. J. Commun. Netw. Distrib. Syst. 7(1/2), 153–187 (2017)
2. A. Kathirvel, R. Srinivasan, ETUS: an enhanced triple umpiring system for security and
performance improvement of mobile ad hoc networks. Int. J. Netw. Manag. 21(5), 341–359
(2018)
3. S.E. Deering, Multicast routing in internetworks and extended LANs, in Proceedings of
the ACM SIGCOMM Symposium on Communication Architecture and Protocols, Aug 2019,
pp. 55–64
4. T. Ballardie, J. Crowcroft, Multicast-specific security threats and counter-measures, in Proceed-
ings of the Second Annual Network and Distributed System Security Symposium (NDSS ‘95),
Feb 2019, pp. 2–16
5. Y. Zhou, X. Zhu, Y. Fang, MABS: multicast authentication based on batch signature. IEEE
Trans. Mob. Comput. 9(7), 982–993 (2018)
6. J. Jeong, Y. Park, Y. Cho, Efficient DoS resistant multicast authentication schemes, in Proceed-
ings of the International Conference on Computational Science and Its Applications, 2010,
pp. 353–362
7. A. Perrig, R. Canetti, J.D. Tygar, D. Song, Efficient authentication and signing of multicast
streams over lossy channels, in Proceedings of the IEEE Symposium on Security and Privacy
(SP ‘00), May 2000, pp. 56–75
8. S. Miner, J. Staddon, Graph-based authentication of digital streams, in Proceedings of the IEEE
Symposium on Security and Privacy (SP ‘01), May 2001, pp. 232–246
9. N. Koblitz, Elliptic curve cryptosystems. Math. Comput. 48, 203–209 (1987)
10. M. Kefayati, H.R. Rabiee, S.G. Miremadi, A. Khonsari, Misbehavior resilient multi-path data
transmission in mobile ad-hoc networks, in Proceedings of the Fourth ACM Workshop on Security
of Ad Hoc and Sensor Networks (SASN ’06), 2006

11. R. Mavropodi, P. Kotzanikolaou, C. Douligeris, SecMR—a secure multipath routing protocol


for ad hoc networks. Ad Hoc Netw. 5(1), 87–99 (2007)
12. B. Xiao, B. Yu, C. Gao, Chemas: identify suspect nodes in selective forwarding attacks. J.
Parallel Distrib. Comput. 67(11) (2007)
13. X. Zhang, A. Jain, A. Perrig, Packet-dropping adversary identification for data plane security,
in Proceedings of the 2008 ACM CoNEXT Conference (CoNEXT ‘08), 2008
Deep Learning Approach
for Image-Based Plant Species
Classification

E. Venkateswara Reddy, G. S. Naveen Kumar, Baggam Swathi,
and G. Siva Naga Dhipti

Abstract The main purpose of this research is to apply machine learning
to plant species identification in agricultural science. This discipline has so far
received less attention than the other image processing application domains.
Various plant species may have extensive resemblance among them, and it is time
consuming to make a distinction. Plant species identification involves pre-processing,
segmentation, feature extraction, and classification. This paper proposes plant species
identification by image classification using AI and machine learning techniques, which
take account of a large dataset in the form of binary leaf pictures and features like
dimensions, thickness, and color to identify the species of plants using various image
classification techniques/classifiers. The commonly used baseline classifiers are
linear, non-linear, bagging, and boosting methods. The algorithms proposed are random
forest, K-nearest neighbor, support-vector machine, gradient boosting, and naive Bayes
for spot checking. Linear discriminant analysis is performed, and graphs of
accuracy and loss versus classifier are plotted to improve the classification performance
for a particular species. Finally, the best model is selected for prediction after
standardization of the dataset.

Keywords Binary leaf pictures · Classifiers · Discriminant analysis ·
Support-vector machine · Random forest · K-nearest neighbor

1 Introduction

The world is home to a very large number of plant species. In ancient days, identification
of plant species was done through experienced judgments of touch and smell.
Plant identification is not exclusively the job of botanists and plant ecologists; it is
required by, or useful for, large parts of society, from professionals to the general public.
But identification of plants by conventional means is difficult and time consuming.
Automatic plant image identification is the most promising solution toward bridging
the botanical taxonomic gap, and it receives considerable attention in both the botany and

E. Venkateswara Reddy · G. S. Naveen Kumar (B) · B. Swathi · G. Siva Naga Dhipti
Department of CSE, Malla Reddy University, Hyderabad, India


computer communities. Identification of various plant species has progressed with
contemporary advances in analytical and trending technologies. As machine learning
technology advances, sophisticated models have been proposed for automatic plant
identification. Morphological imaging has been proven as a non-destructive way to
detect species [1]. The easiest and most efficient way to distinguish the various plant
species with great precision is to use the leaf of a plant [2].

2 Proposed Algorithm

The motive of the proposed work encompasses the acquisition of data, followed by
the processing of the collected data, which is called image pre-processing, with a
further step of extracting different characteristics of the leaves for identification of
the plant species [3]. Here, intelligent algorithms or classifiers like K-nearest
neighbor, random forest, support-vector machine, and gradient boosting come into
play for classifying the plant species [4]. This proposal briefly reviews the
workflow of the applied machine learning techniques and discusses the challenges of
image-based plant identification. A data flow diagram illustrates how the processing
steps stream through the end-to-end framework. Figure 1 shows the flow pattern for
plant species recognition.

Fig. 1 Flow pattern for plant species recognition

Fig. 2 Pre-processing of leaf

3 Methodology

3.1 Collecting Dataset

Until recently, there was no standard pattern of leaves for cataloging plant species
automatically. After extensive research, the Flavia dataset became available, providing
a large dataset for plant species classification.

3.2 Image Pre-processing

Each leaf photograph contains only one object, the leaf [5]. As leaves are not
perfectly flat, a photograph may contain a shadow beneath the leaf, which must be
removed before segmentation of the image [6, 7]. Initially, every color pixel of the
input picture is converted to a hue-saturation-value (HSV) representation. This step
serves as a path leading to reliable edge detection on the RGB leaf pictures, as
opposed to creating a final picture from which to pull out the characteristics. Further
treatment involves a stage of converting the actual pictures to grayscale pictures. The
pictures are then converted from grayscale to binary pictures, as shown in Fig. 2.
The binary pictures remove any irregularity within the leaf outline and exhibit
the whole leaf as a white patch. The segmentation process of the leaf is shown in Fig. 3.
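
The pipeline of Fig. 2 can be sketched with OpenCV as follows; this is a minimal
illustration assuming a single-leaf input image, with Otsu thresholding assumed for
the binary step since the paper does not name a specific method.

import cv2

img = cv2.imread("leaf.jpg")                       # placeholder path, BGR image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)         # HSV step aiding edge detection
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)       # grayscale conversion
# Binary image: leaf shown as a white patch on a black background
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)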

3.3 Attribute Identification

Every plant species retains unique attributes that distinguish it. Attributes
or characteristics are categorized as dimensions [8], color, and venation [9, 10].
Figure 4 shows the different kinds of features in a leaf image.
The geometry of the leaves captures many variations in dimensions. The ratio
of the length of a leaf to its width is defined as the aspect ratio. The area is computed
from the pixel size and the overall count of pixels contained in the leaf.
Rectangularity (R) measures how closely the shape resembles a rectangle. The color features

Fig. 3 Image segmentation process

Fig. 4 Features of leaf



include the mean and standard deviation of the pixel values. Venation patterns are
unique attributes of each plant species [11].
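
A minimal sketch of the named shape and color features, computed from the binary
mask and the original image of the pre-processing sketch above (the exact formulas
used in the paper may differ):

import cv2

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
leaf = max(contours, key=cv2.contourArea)          # largest blob is the leaf
x, y, w, h = cv2.boundingRect(leaf)
area = cv2.contourArea(leaf)                       # pixel area of the leaf region
aspect_ratio = w / h                               # length-to-width proportion
rectangularity = area / (w * h)                    # closeness to a rectangle, R
mask = binary > 0
color_mean = img[mask].mean(axis=0)                # per-channel color mean
color_std = img[mask].std(axis=0)                  # per-channel color std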

3.4 Classification

Upon completion of the leaf pattern extraction, the evidence, called feature
vectors, is subjected to two auxiliary steps, scrutiny and appraisal, before being clustered
into the precise classes. The intelligent methods suggested as classifiers
are random forest, KNN, SVC, gradient boosting, and naive Bayes for spot
checking. For image classification, support-vector machines and random forest methods
are well proven [12]. Random forest is a classifier that builds a number of decision trees
on numerous subsets of the given dataset and takes the average to improve the
predictive precision on that dataset. The large number of trees in the forest leads
to greater precision and foils the problem of overfitting. It also takes very little time
compared to other algorithms, even for an enormous dataset [13]. K-nearest neighbor
is one of the simplest machine learning algorithms; it is based on supervised learning,
holds good for regression and classification alike, but is best suited for classification
[14]. During training, it simply stores the dataset, and as soon as it receives
new data, it assigns the data to the set it most resembles. High-dimensional
datasets can be handled by the most widespread supervised learning algorithm,
the support-vector machine, which can also be used for regression problems [15]. To
map high-dimensional data, kernels are used. The support-vector machine chooses
the extreme vectors (the support vectors) that help in creating the hyperplane, which
drives the accuracy of the SVM [16, 17].
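
The spot checking described above can be sketched with scikit-learn as below. The
feature matrix X and labels y are stand-ins generated synthetically here, and the
model settings are library defaults rather than the tuned configurations behind
Table 1.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)  # stand-in for leaf features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=1)
models = {"RandomForest": RandomForestClassifier(),
          "KNN": KNeighborsClassifier(),
          "SVC": SVC(probability=True),   # probabilities are needed for log loss
          "GradientBoosting": GradientBoostingClassifier(),
          "GaussianNB": GaussianNB()}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    loss = log_loss(y_test, model.predict_proba(X_test))
    print(f"{name}: accuracy={acc:.4f}, log loss={loss:.4f}")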

4 Results and Discussion

Identification of plant species has foremost benefits for a wide range of stakeholders,
extending from pharmaceutical laboratories and botanists to forestry services and
consumers. Ten diverse machine learning classification techniques were used to assess
the identification accuracy rate. To find the best classifier among all the proposed
techniques, a series of experiments was conducted on the dataset. Table 1 shows
the comparison of test accuracy and log loss among the proposed models for
species identification, and Table 2 shows the precision, recall, and F1-score of the
proposed algorithm. The plots of classifier versus accuracy and classifier versus log
loss are shown in Figs. 5 and 6, respectively.

Table 1 Comparison for various classification techniques

Classifier Accuracy Log loss
K neighbor-classifier 91.4142 1.5678
SVC-classifier 80.8082 4.3094
NuSVC-classifier 97.9823 2.2753
Decision tree-classifier 25.7654 6.0912
Random forest-classifier 98.9763 0.7382
AdaBoost-classifier 58.0912 2.8071
Gaussian NB-classifier 57.0812 14.8312
Linear discriminant analysis 98.4749 0.9879
Quadratic discriminant analysis 1.5153 34.0157

Table 2 Precision, recall, and F1-score

Precision Recall F1-score
1 1.00 1.00 1.00
2 1.00 1.00 1.00
3 1.00 1.00 1.00
4 1.00 1.00 1.00
5 1.00 1.00 1.00
6 1.00 1.00 1.00
7 0.67 1.00 0.8
8 1.00 1.00 1.00
9 1.00 1.00 1.00
10 1.00 1.00 1.00
11 1.00 1.00 1.00
12 1.00 1.00 1.00
13 1.00 1.00 1.00
14 1.00 1.00 1.00
15 1.00 1.00 1.00
16 1.00 1.00 1.00
17 1.00 0.5 0.67
18 1.00 1.00 1.00
19 1.00 1.00 1.00
20 1.00 1.00 1.00
Avg/total 0.99 0.98 0.98

Fig. 5 Accuracy plotting for various classifiers

Fig. 6 Log loss plotting for various classifiers

5 Conclusion

The proposed work for cataloging of plant species is supported by matching various
algorithms. 98.9% of precision is shown by using random forest algorithm for cata-
loging and is the faster learning model by performing undeviating proportionality of
compound attributes with learning capability for machine learning algorithms.

References

1. J. Hossain, M.A. Amin, Leaf shape identification-based plant biometrics, in Proceedings of
the 2010 13th International Conference on Computer and Information Technology (ICCIT),
Dhaka, Bangladesh, 23–25 Dec 2010, pp. 458–463
2. A. Khmag, S.A.R. Al-Haddad, N. Kamarudin, Recognition system for leaf images based on
its leaf contour and centroid, in Proceedings of the 2017 IEEE 15th Student Conference on
Research and Development (SCOReD), Putrajaya, Malaysia, 13–14 Dec 2017, pp. 467–472
3. G.S. Naveen Kumar, V.S.K. Reddy, Video shot boundary detection and key frame extraction
for video retrieval, in Proceedings of the Second International Conference on Computational
Intelligence and Informatics (Springer, Singapore, 2018), pp. 557–567
4. S.H. Lee, C.S. Chan, S.J. Mayo, P. Remagnino, How deep learning extracts and learns leaf
features for plant classification. Pattern Recogn. 71, 1–13 (2017)
5. S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, D. Stefanovic, Deep neural networks
based recognition of plant diseases by leaf image classification. Comput. Intell. Neurosci.
2016 (2016)
6. B. Swathi, K. Murthy, An effective modeling and design of closed-loop high step-up DC–DC
boost converter, in Intelligent System Design (Springer, Singapore, 2021), pp. 303–311
7. A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional
neural networks, in Advances in Neural Information Processing Systems (2012), pp. 1097–1105
8. T. Sathwik, R. Yasaswini, R. Venkatesh, A. Gopal, Classification of selected medicinal plant
leaves using texture analysis, in 4th ICCCNT, 4–6 July 2013
9. A. Gopal, S. Prudhveeswar Reddy, V. Gayatri, Classification of selected medicinal plants leaf
using image processing, in Proceedings International Conference on Machine Vision and Image
Processing (MVIP), 2012
10. E. Venkateswara Reddy, E.S. Reddy, Image segmentation using rough set based fuzzy K-means
clustering algorithm. GJCST 13(6), 23–28 (2013)
11. S.H. Lee, C.S. Chan, P. Wilkin, P. Remagnino, Deep-plant: plant identification with convolu-
tional neural networks, in 2015 IEEE International Conference on Image Processing (ICIP),
27 Sept 2015 (IEEE, 2015), pp. 452–456
12. K.J. Willis (ed.), State of the World’s Plants 2017. Report (Royal Botanic Gardens, Kew, 2017)
13. S.G. Wu et al., A leaf recognition algorithm for plant classification using probabilistic neural
network, in IEEE International Symposium on Signal Processing and Information Technology,
2007
14. E. Adam, O. Mutanga, J. Odindi, E.M. Abdel-Rahman, Land-use/cover classification in a
heterogeneous coastal landscape using RapidEye imagery: evaluating the performance of
random forest and support vector machines classifiers. Int. J. Remote Sens.
15. M.G. Larese, R. Namías, R.M. Craviotto, M.R. Arango, C. Gallo, P.M. Granitto, Automatic
classification of legumes using leaf vein image features. Pattern Recogn. 47(1), 158–168 (2014)
16. G.S. Naveen Kumar, V.S.K. Reddy, Key frame extraction using rough set theory for video
retrieval, in Soft Computing and Signal Processing (Springer, Singapore, 2019), pp. 751–757
17. V.R. Eluri, C. Ramesh, S.N. Dhipti, D. Sujatha, Analysis of MRI based brain tumour detection
using RFCM clustering and SVM classifier, in International Conference on Soft Computing
and Signal Processing (ICSCSP-2018), Springer Series, 22 June 2018
Inventory, Storage and Routing
Optimization with Homogenous Fleet
in the Secondary Distribution Network
Using a Hybrid VRP, Clustering
and MIP Approach

Akansha Kumar and Ameya Munagekar

Abstract The paper considers the inventory routing and storage problem and
provides a satisfactory solution, finding dispatch quantities and vehicle route allocations
when the objective is to minimize transit cost, vehicle cost, and storage cost.
Specifically, the problem can be categorized as a cyclic inventory routing problem
(CIRP) with a homogeneous fleet. The approach presented here is a hybrid of a vehicle
routing problem (VRP) solver, graph-based clustering (GC), and a mixed integer
programming (MIP) model, used to find a solution when the scale is large enough to
make exact methods impractical. The VRP module is used to find the feasible
routes to the customers from the depot using a metaheuristic approach. The GC module
is further used to decompose the route network into clusters using connected graph
components, and eventually a MIP model is used to select the routes, find the daily
dispatch of gases, and thus also determine the optimal storage required both at the depot
and at the customers. The MIP formulation is designed to reduce the solution
time by converting the binary variables that are used in a traditional
formulation of the inventory routing problem into integer variables through constraint
decomposition. The approach has been tested on a simulated business case that spans two
hundred customer locations, demand fulfilment over a week, and a homogeneous fleet
with a truck carrying capacity of four cylinders. Scaling studies have been performed
on the GC module by analysing the time complexity and the optimization feasibility
with respect to the cluster size. The MIP approach is designed to solve the problem
to less than 1% MIP gap, considering the number of customer locations.

Keywords Inventory optimization · Routing optimization · Storage optimization ·
Supply chain optimization

1 Introduction

Distribution network optimization is a key component in increasing bottom-line
margins and improving the customer experience and loyalty. One of the strategic

A. Kumar (B) · A. Munagekar
Jio Platforms Ltd, Hyderabad, Telangana, India


decisions that needs to be made at the time of setting up a distribution network is the
storage capacities at the depot, the storage capacities at the customer/retail locations
and the last mile fleet for the fulfilment of the demand. The demand of the customers
can either be fixed (constant rate for the entire horizon) or it may have a fixed pattern
over the given time horizon or even for that matter variable demand in a given time
horizon. However, if the supply product is flammable or else needs special freight
for transportation that needs a homogeneous fleet, there is an additional capital cost
which is incorporated in the formulation.
This paper focuses on the aspect of strategic decision making, i.e. to decide on the
vehicle fleet required, the storage required at the depot and the customer locations by
minimizing the travel cost, vehicle cost and storage cost and satisfying the demand
in the time horizon. The flammable liquid supply chain considered is a two-stage
distribution network consisting of a primary network which delivers the material
to intermediate storage units from the manufacturing unit. The secondary network
delivers the material from the depots to the customer locations. The focus of this
paper is on the secondary distribution network where the depot and the customers
mapped to that particular depot are considered. The CIRP was solved in [1] for the
case of a fixed demand rate over an infinite time horizon.
Reference [2] provides a variable neighbourhood descent heuristic for problems
with more than 20 customers, and a MIP model for smaller instances, which together
provide a good solution to the problem.
The objective of this paper is to solve the problem when the problem size is large,
i.e., more than 50 customer locations, each location has either a fixed or a variable
demand rate over a time frame greater than a week, and the route times exceed
one day.
The challenges stem from the nature of the problem. The complex formulation
makes it a difficult use case to solve with limited computational time and resources.
In particular, due to the size of the search space, the VRP, which uses a metaheuristic
method, does not guarantee optimality, as a metaheuristic technique is not guaranteed
to reach the global optimum [3]. In this paper, we try to solve the problem keeping
in mind the scale and the limited computational capability, to reach an agreeable
solution that is useful to the business.
The VRP model is solved using [4]. Due to the number of customer locations,
solving the VRP within the desired time is not realizable; hence, we use a GC-based
method to split the problem space into several clusters and solve the clusters
independently.
We have developed a hybrid approach that solves the problem in three stages,
namely: vehicle routing problem (VRP), graph-based clustering (GC) and mixed
integer programming (MIP). The VRP module is used to find the feasible routes
of the customers from the depot using metaheuristic approach. The GC module is
further used to decompose the route network into clusters using connected graph
networks, and eventually, an MIP model is used to select the routes, find the daily
dispatch of gases and thus also find the optimal storage required both at the depot
and the customers.

2 Literature

Inventory routing problems arise in various domains and industries, such as the gas
industry [5], crude oil refineries [6], cold-storage food distribution, and food and
supermarket chains [7]. The cyclic inventory routing problem was first introduced by [8].
It was more of a strategic-level inventory routing problem whose objective was
to minimize the required fleet size over a very long period. Another example of
the application of long-term IRPs is in ship fleet sizing for a liner shipping
company with fixed long-term cargo contracts [9].
The basic inventory routing problem assumes that vehicles are available. The
objective considers a trade-off between inventory costs and transportation, without
taking fixed vehicle costs into account [10]. Integration of inventory and distribution
decisions has been formulated and approached from different directions. Some constraints
considered are different inventory policies, time window horizons, and service
restrictions. Reviews of the IRP literature over the past couple of decades summarize
these problems; some of these reviews are [11-15]. The IRP and its variants are now
well developed. The IRP may be further classified into continuous-time models
[2, 16, 17], most often with a constant demand rate over a time period, and discrete-time
models [18-20], with fixed time periods but varying demand rates. The various
graph-based clustering algorithms and machine learning methods used for clustering
are reviewed in [21, 22]; the methods described use different ensemble approaches to
optimize unsupervised learning methods. This paper combines VRP, MIP, and a
connected-graph approach to formulate a modified IRP which trades off the global
optimum but eventually provides a very good solution in a feasible amount of
computation time in cases where quick solutions are required.

3 Problem Description

The supply chain considered is a two-stage distribution network consisting of a
primary network, which delivers the material from the manufacturing unit to intermediate
storage units, and a secondary network, which delivers the material from the depots
to the customer locations. The focus of this paper is on the secondary distribution
network, where a depot and the customers mapped to that particular depot are
considered. The objective of the problem is to find the optimal storage at the depot
and at the customers, and also to find the optimal vehicle fleet required at the depot
to fulfil the demand of the customers. The customers have a fixed demand rate, and
based on it, the problem is formulated in three main steps, routing, clustering, and
MIP formulation, as shown in Fig. 1.
The process flow of the optimization:

Fig. 1 Process flow block diagram

1. Input Data Processing:


The input data required is the location coordinates of the customers and the
cyclic demand or the constant rate of demand at the customer locations for the
given time horizon.
2. Vehicle Routing Problem (VRP) formulation:
The idea of the vehicle routing problem is to build the feasible routes to
reach the customer locations from the depots. The route construction heuristic
is explained in detail in the upcoming sections.
3. Connected Graph Clustering:
The clustering approach is primarily used to reduce the complexity of the
problem. The approach used is to extract connected clusters from the feasible
routes built by the route-building heuristic.
4. Mixed Integer Programming (MIP) formulation:
The MIP formulation is used to decide the dispatch of the cylinders from
the depot on each day and on each route, the number of vehicles used, and the storage
required at the depot and the customer locations. The objective is to minimize
the transport cost, transit cost and storage cost.

4 Optimization Framework

Optimization Flow:
See Fig. 2.

4.1 Input Parameters and Data Pre-processing

The input parameters include the coordinate locations of the customers and the depot,
the demand of the customers, the vehicle capacity, vehicle fixed cost, storage cost and transit
cost. The demand at the customers is provided in terms of the tonnes of liquid fuel
required and is thus mapped to the number of cylinders required, as shown in Fig. 2.

Fig. 2 Optimization framework

4.2 Vehicle Routing Problem (VRP) Model

Distance matrix:
In the data pre-processing step, the data is prepared for the mathematical model.
The distance matrix is prepared using an open-source API for calculating the distances.
Finding the feasible routes using VRP (vehicle routing problem):
The input for the VRP is the daily demand at the customer locations and the
coordinates of the depot and the customer locations. Additionally, vehicle capacity
is also considered. The methodology used for the capacitated VRP is simulated
annealing, following [23], as shown in Fig. 3. Any other metaheuristic, MIP formulation
or heuristic can be used, since this is a standard problem with multiple approaches
available to solve it.
The heuristic shown in Fig. 4 considers the vehicle capacity as 4 cylinders;
hence, the iteration stops after i = 4. One assumption used for all the instances is
that a truck travels at most 450 km in one day. The idea here is to find the routes such
that the MIP model can select whichever route gives the best result. Any alternate

Fig. 3 VRP structure



Fig. 4 VRP flowchart

strategy to find the feasible set of routes can be used. The output of the heuristic
contains the feasible routes for the customer locations as shown in Fig. 5.
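Purely as an illustration, a minimal capacitated-VRP sketch using Google OR-Tools (which the conclusion names as the VRP tool) is shown below, with simulated annealing as the local-search metaheuristic. The distance matrix, demands and vehicle count are hypothetical toy values, not data from the paper.

from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Hypothetical data: node 0 is the depot, distances in km, demand in cylinders.
dist = [[0, 9, 7, 12], [9, 0, 4, 8], [7, 4, 0, 5], [12, 8, 5, 0]]
demand = [0, 2, 3, 1]
num_vehicles, depot, capacity = 2, 0, 4

manager = pywrapcp.RoutingIndexManager(len(dist), num_vehicles, depot)
routing = pywrapcp.RoutingModel(manager)

def distance_cb(from_idx, to_idx):
    return dist[manager.IndexToNode(from_idx)][manager.IndexToNode(to_idx)]

transit = routing.RegisterTransitCallback(distance_cb)
routing.SetArcCostEvaluatorOfAllVehicles(transit)

def demand_cb(from_idx):
    return demand[manager.IndexToNode(from_idx)]

# Capacity dimension: total demand on a route may not exceed 4 cylinders.
routing.AddDimensionWithVehicleCapacity(
    routing.RegisterUnaryTransitCallback(demand_cb),
    0, [capacity] * num_vehicles, True, "Capacity")

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = (
    routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
params.local_search_metaheuristic = (
    routing_enums_pb2.LocalSearchMetaheuristic.SIMULATED_ANNEALING)
params.time_limit.FromSeconds(30)

solution = routing.SolveWithParameters(params)
for v in range(num_vehicles):                 # print each vehicle's route
    idx, route = routing.Start(v), []
    while not routing.IsEnd(idx):
        route.append(manager.IndexToNode(idx))
        idx = solution.Value(routing.NextVar(idx))
    print("vehicle", v, "route:", route + [depot])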

4.3 Graph Partitioning Clustering

The output of the VRP is a set of feasible routes which can be represented as a graph
with connected nodes. If a customer does not have a route in which it is connected
to another customer, we can consider that particular customer independently. Thus,
the problem can be decomposed into smaller problems and these sub-problems can
be solved independently. Clustering using the connected components of an undirected
graph [24–26] helps in reducing the time complexity. A straightforward breadth-first
search strategy can be implemented for finding the connected components of
the undirected graph. Output: the feasible routes connected to each customer, as shown
in Figs. 6 and 7.
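A minimal sketch of this breadth-first clustering in Python, assuming each feasible route is given as a list of customer ids (a hypothetical data structure, not code from the paper):

from collections import deque, defaultdict

def connected_components(customers, routes):
    """Cluster customers: two customers fall in the same component if
    some feasible route links them, directly or transitively."""
    adj = defaultdict(set)
    for route in routes:            # route = list of customer ids
        for a in route:
            for b in route:
                if a != b:
                    adj[a].add(b)
    seen, clusters = set(), []
    for start in customers:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:                # breadth-first search
            node = queue.popleft()
            comp.append(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        clusters.append(comp)
    return clusters

print(connected_components([1, 2, 3, 4], [[1, 2], [2, 3]]))
# -> [[1, 2, 3], [4]]  (customer 4 appears on no shared route)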

Fig. 5 VRP sample output

Fig. 6 GC structure

Mixed Integer Model:

The MIP formulation is used to find the optimal outflow/dispatch of the liquid fuel
cylinders on each day for each route and for each customer, as shown in Fig. 8.
Input:
• A cluster of customer locations, feasible routes for the customers in the cluster,
daily demand of the customer locations, planning horizon length in days.
Output:
• Daily outflow/dispatch for each location and route, storage at both the depot and the
customer locations, and the number of vehicles required daily on each route.

Fig. 7 Sample GC output and clusters

Fig. 8 MIP structure



5 Problem Formulation

In this section, the mixed integer problem formulation is elaborated in detail. The
following notations are considered:
I: A vector containing the set of customers.
R: A vector containing the set of routes.
T: A vector containing the sequence (set) of days.
Decision Variables:

Q^{out}_{i,r,t}: the outflow quantity of cylinders to be delivered to customer i on day t via route r. Integer variable ∀ (i, r, t).
Q^{in}_{i,r,t}: the inflow quantity of cylinders for customer i on day t via route r. Integer variable ∀ (i, r, t).
Q^{inv}_{i,t}: the surplus inventory quantity at customer i on day t. Integer variable ∀ (i, t).
Q^{min_inv}_{i}: the minimum surplus inventory required at customer i. Integer variable ∀ i.
Q^{st}_{w}: the quantity to be stored at the warehouse/depot w. Integer variable ∀ w.
N^{veh}_{r,t}: the number of vehicles on route r on day t.
R_{i,r}: the binary mapping of whether route r is selected for customer i throughout the time horizon. A value of 1 means route r is selected for customer i, otherwise zero.

Parameters:
D_r: the distance of route r.
C^{st}: the fixed cost of storage, analogous to the inventory carrying cost.
C^{veh}: the fixed cost of a vehicle, including capital cost and maintenance cost.
Q^{veh_cap}: the maximum carrying capacity of a vehicle.
Q^{demand}_{i,t}: the demand quantity at customer i at time t.
N^{routes}_{i}: the number of allowable routes for customer i.
M: a big number, used as a pivot in the formulation.

Objective function:

  Minimize  Σ_r Σ_t N^{veh}_{r,t} · D_r · 60  +  Σ_i Q^{min_inv}_i · C^{st} · t  +  Q^{st}_O · C^{st} · t  +  Σ_r Σ_t N^{veh}_{r,t} · C^{veh}    (1)

Subject to:

  Q^{inv}_{i,t} = Σ_r Q^{in}_{i,r,t} − Q^{demand}_{i,t} + Q^{inv}_{i,t−1}    ∀ (i ∈ I, t ∈ T)    (2)

  Q^{inv}_{i,t} ≤ Q^{min_inv}_i    ∀ (i ∈ I, t ∈ T)    (3)

  N^{veh}_{r,t} ≤ (Σ_{i∈r} Q^{out}_{i,r,t}) / Q^{veh_cap} + 0.99    ∀ (r ∈ R, t ∈ T)    (4)

  N^{veh}_{r,t} ≥ (Σ_{i∈r} Q^{out}_{i,r,t}) / Q^{veh_cap}    ∀ (r ∈ R, t ∈ T)    (5)

  Q^{out}_{i,r,t} = Q^{in}_{i,r,t+ΔT_r}    ∀ (i ∈ I, r ∈ R, t ∈ T)    (6)

  Σ_i Σ_r Q^{out}_{i,r,t} ≤ Q^{st}_O    ∀ (t ∈ T)    (7)

  R_{i,r} ≥ (Σ_t Q^{out}_{i,r,t}) / M    ∀ (i ∈ I, r ∈ R)    (8)

  Σ_r R_{i,r} ≤ N^{routes}_i    ∀ (i ∈ I)    (9)

  Q^{out}_{i,r,t}, Q^{in}_{i,r,t}, Q^{inv}_{i,t}, Q^{st}_O, Q^{min_inv}_i, N^{veh}_{r,t} ≥ 0    ∀ (i ∈ I, r ∈ R, t ∈ T)    (10)

The MIP model tries to solve two main problems: one is the inventory problem,
considering the fixed demand rate at the customer locations, and the other is the
minimization of resources, i.e. the storage and the fleet. It considers three objectives
with equal weightage in (1): the transit cost, with the assumption that each kilometre
costs 60 units; the surplus inventory cost; and the vehicle cost. The inventory balance
is achieved using (2). The minimum inventory constraint is also added in (3). Constraints
(4) and (5) represent the mapping of the outflow to the number of vehicles, as the
number of vehicles in this case is an integer variable. The inflow and outflow balance
is achieved using (6). Constraint (7) is used to map the storage at the depot to the
outflows and thus to find the optimal storage capacity required at the depot. Constraint
(8) is used to convert the outflow from route r into a binary value in order to add
constraints on it, and (9) is used to limit the number of unique routes which the
model is allowed to select for each customer.
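The paper reports solving this model with CPLEX. Purely as an illustration, the following minimal sketch restates a reduced slice of the formulation, namely objective (1) and constraints (2), (3), (4), (5) and (7), in the open-source PuLP modeler, ignoring the transit-time shift of (6) and the route-selection constraints (8) and (9). All data (routes, distances, demand, costs) are hypothetical placeholders, not values from the paper.

from pulp import LpProblem, LpMinimize, LpVariable, lpSum

# Hypothetical toy data: which customers each route serves, route
# distances (km), and a flat daily demand (cylinders) per customer.
I, R, T = range(3), range(4), range(10)
routes = {0: [0], 1: [1], 2: [0, 1], 3: [2]}
dist = {0: 120, 1: 150, 2: 200, 3: 90}
demand = {i: {t: 1 for t in T} for i in I}
CAP, C_ST, C_VEH = 4, 10, 1000          # vehicle capacity and unit costs

prob = LpProblem("inventory_routing_slice", LpMinimize)
q_out = LpVariable.dicts("q_out", (I, R, T), lowBound=0, cat="Integer")
q_inv = LpVariable.dicts("q_inv", (I, T), lowBound=0, cat="Integer")
n_veh = LpVariable.dicts("n_veh", (R, T), lowBound=0, cat="Integer")
q_min = LpVariable.dicts("q_min", I, lowBound=0, cat="Integer")
q_depot = LpVariable("q_depot", lowBound=0, cat="Integer")

# Objective (1): transit cost (60 units/km) + customer inventory cost
# + depot storage cost + fixed vehicle cost.
prob += (lpSum(n_veh[r][t] * dist[r] * 60 for r in R for t in T)
         + lpSum(q_min[i] for i in I) * C_ST * len(T)
         + q_depot * C_ST * len(T)
         + lpSum(n_veh[r][t] for r in R for t in T) * C_VEH)

for i in I:
    for t in T:
        inflow = lpSum(q_out[i][r][t] for r in R if i in routes[r])
        prev = q_inv[i][t - 1] if t > 0 else 0
        prob += q_inv[i][t] == inflow - demand[i][t] + prev   # (2)
        prob += q_inv[i][t] <= q_min[i]                       # (3)
        for r in R:
            if i not in routes[r]:
                prob += q_out[i][r][t] == 0   # route r does not serve i

for r in R:
    for t in T:
        load = lpSum(q_out[i][r][t] for i in routes[r])
        prob += n_veh[r][t] * CAP >= load                     # (5)
        prob += n_veh[r][t] * CAP <= load + 0.99 * CAP        # (4)

for t in T:
    prob += lpSum(q_out[i][r][t] for i in I for r in R) <= q_depot   # (7)

prob.solve()   # uses the bundled CBC solver by default
print("vehicles on day 0:", [n_veh[r][0].value() for r in R])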

6 Results

The data set used consists of random locations in the state of Madhya Pradesh, India. The storage
capacity, demand rate and other parameters are also generated randomly to test the mathematical
model.
From Table 1, we can see how the model scales with the number of customer
locations considered. The horizon length considered for all the instances is 30 days.
Except for the last instance of 100 customer locations, the MIP model was able to
successfully converge to a 0.01% gap. When the number of customers was 100, there

Table 1 Instance results

Number of customers | Total time for solving MIP model (s) | Total number of vehicles required | Total storage capacity required at depot | Total storage required at customer locations
20 | 12 | 30 | 85 | 5
40 | 28 | 69 | 166 | 5
60 | 36 | 96 | 243 | 4
80 | 62 | 123 | 306 | 4
100 | 1800 | 152 | 394 | 5

Fig. 9 Box plot of distance travelled by vehicles each day on an average

was a time limit of 1800 s provided, and the CPLEX solver was able to converge to
a 0.2% gap.
Figure 9 is a box plot of the average distance travelled by the vehicles each
day. Three instances were considered, i.e. horizon lengths of 10, 20 and 30 days. The
distributions for the three instances are very similar, with medians of 297 km,
296 km and 296 km, respectively, for the 10, 20 and 30 day horizon lengths. This indi-
cates that a similar pattern is generated by the MIP model, considering that
the demand was cyclic in nature. Table 2 shows the metrics for the MIP model output

Table 2 Reference results for Fig. 9

Horizon length | Total number of vehicles required | Total storage capacity required at depot | Total storage required at customer locations
10 | 123 | 306 | 5
20 | 123 | 306 | 4
30 | 123 | 306 | 4

Fig. 10 Box plot of efficiency of vehicles each route on each day

for the horizon lengths 10, 20 and 30 days and we can see that the metrics are very
similar.
Figure 10 shows the box plot of the efficiency of vehicles on each route based on
the distance travelled (day level utilization) and the number of cylinders transported
(capacity efficiency). Day-level utilization is considered because, in most real-world
scenarios, a truck is generally rented per day irrespective of the distance it travels;
the idea is thus to minimize the idle time of the truck. The equations to find the
efficiency of the trucks on the routes are:

  Day level utilization = Round trip distance / (450 × Days)    (11)

  Capacity utilization = No. of cylinders delivered / (4 × Days)    (12)

  Efficiency % = ((Day level utilization + Capacity utilization) / 2) × 100    (13)
In (11), the denominator signifies the maximum possible distance which
could have been travelled by the truck for the number of days it was on the route.
Here, the assumption being that the truck travels 450 km at max in a day. In (12),
the denominator signifies the number of cylinders which the truck could have ideally
delivered for the number of days it was on the route. Here, the assumption
is that the truck capacity is 4 cylinders at max. Thus, the metric tries to capture the
capacity utilization and also the idle time of the truck. Figure 10 shows the percentage
efficiency of the trucks for the instances of 20, 40, 60, 80 and 100 customers. The
median efficiency ranges from 61 to 64% for the instances. The horizon length
considered for the instances is 30 days.
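As a small illustration, metrics (11) to (13) can be computed as follows; the 450 km/day and 4-cylinder limits are the assumptions stated in the paper, and the averaging in (13) is the reading consistent with the reported 61 to 64% medians.

MAX_KM_PER_DAY = 450   # paper's assumption: a truck travels at most 450 km/day
TRUCK_CAPACITY = 4     # paper's assumption: at most 4 cylinders per truck

def day_level_utilization(round_trip_km, days):
    return round_trip_km / (MAX_KM_PER_DAY * days)            # Eq. (11)

def capacity_utilization(cylinders_delivered, days):
    return cylinders_delivered / (TRUCK_CAPACITY * days)      # Eq. (12)

def efficiency_pct(round_trip_km, cylinders_delivered, days):
    """Eq. (13): average of the two utilizations, as a percentage."""
    return (day_level_utilization(round_trip_km, days)
            + capacity_utilization(cylinders_delivered, days)) / 2 * 100

print(efficiency_pct(round_trip_km=300, cylinders_delivered=3, days=1))
# -> (300/450 + 3/4)/2 * 100, i.e. about 70.8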
Figure 11 shows the average loading efficiency percentage of the vehicles. The
formula used to calculate the loading efficiency of a vehicle is:

Fig. 11 Loading efficiency percentage for different number of customer instances

Table 3 VRP time limit results

Number of customers | 300 s | 600 s | 900 s | 1200 s
20 | 58,926 | 58,926 | 58,926 | 58,926
40 | 123,509 | 123,509 | 123,413 | 123,413
60 | 201,477 | 201,477 | 201,477 | 201,477
80 | 261,803 | 261,562 | 261,392 | 261,392
100 | 323,260 | 322,262 | 322,087 | 322,087
120 | 801,562 | 801,562 | 796,699 | 794,807
140 | 1,016,512 | 1,016,512 | 1,012,730 | 1,012,053

  Loading efficiency = No. of cylinders / Vehicle capacity    (14)

In (14), the denominator signifies the capacity of the vehicles which is 4 in the
case of the instances. The idea here is to find the loading efficiency of the vehicles
at the time of dispatch.
In Table 3, the objective function results are tabulated at time intervals of 300 s.
The stopping criterion considered is 0.5% over 3 consecutive iterations, i.e. if
the objective function does not improve by more than 0.5% in any of the 3 consecutive
iterations, the model is stopped.

7 Conclusion

In this paper, we have demonstrated one of the approaches for solving inventory,
storage and routing optimization with a homogeneous fleet in the secondary distri-
bution network. The method demonstrated here uses a hybrid VRP, clustering and
MIP approach. We have demonstrated “satisfactory” optimality in the VRP stage by
showing scaling study results. In the MIP stage, we have solved the problem up to
an MIP gap of less than 1%, depending on the number of customers. We have used a
stopping criterion for the VRP based on the improvement in the results; the stopping
criterion is an improvement of less than 0.5%. In addition, we have shown several
reports related to vehicle capacity utilization and day-level utilization. The method
can be applied to distribution networks where there are sources, depots and customer
locations, and where demand, inventory and transit costs are involved. Even though we
have used a homogeneous fleet, the method can easily be scaled to a heterogeneous fleet. The VRP
formulation is solved using Google OR-Tools, the graph-based connected components
using Python, and the MIP using CPLEX.

Appendix

Fig. 12 An illustration of connected components

Figure 12 shows an example illustration of how the connected components look on
a map. The connected components are those in which there is at least one route which
is common, directly or indirectly, to the nodes. The nodes with the same symbols
are connected. The connected components are influenced by two metrics, i.e. the
demand rate and the distance. The idea of finding the connected components is to
solve these connected components individually using the MIP approach. In most
physical systems where the number of customer nodes is high, the connected
components can be solved in a distributed manner.

References

1. M. Chitsaz, A. Divsalar, P. Vansteenwegen, A two-phase algorithm for the cyclic inventory


routing problem. Eur. J. Oper. Res. (2016). https://doi.org/10.1016/j.ejor.2016.03.056
2. S. Anily, A. Federgruen, One warehouse multiple retailer systems with vehicle routing costs.
Manag. Sci. 36(1), 92–114 (1990)

3. arXiv:1704.00853v1 [cs.AI]. 4 Apr 2017


4. L. Perron, V. Furnon, OR-Tools 7.2. https://developers.google.com/optimization/
5. W.J. Bell, L.M. Dalberto, M.L. Fisher, A.J. Greenfield, R. Jaikumar, P. Kedia, Improving the
distribution of industrial gases with an on-line computerized routing and scheduling optimizer.
Interfaces 13, 4–23 (1983)
6. J.A. Persson, M. Göthe-Lundgren, Shipment planning at oil refineries using column generation
and valid inequalities. Eur. J. Oper. Res. 163, 631–652 (2005)
7. V. Gaur, M.L. Fisher, A periodic inventory routing problem at a supermarket chain. Oper. Res.
52(6), 813–822 (2004)
8. R.C. Larson, Transporting sludge to the 106-mile site: an inventory/routing model for fleet
sizing and logistics system design. Transp. Sci. 22(3), 186–198 (1988)
9. M. Christiansen, K. Fagerholt, B. Nygreen, D. Ronen, Maritime transportation. Handb. Oper.
Res. Manag. Sci. 14, 189–284 (2007)
10. L.C. Coelho, J.F. Cordeau, G. Laporte, Thirty years of inventory-routing. Transp. Sci. 48(1),
1–19 (2014). https://doi.org/10.1287/trsc.2013.0472
11. D. Adelman, A price-directed approach to stochastic inventory/routing. Oper. Res. 52(4), 499–
514 (2004). https://doi.org/10.1287/opre.1040.0114
12. H. Andersson, A. Hoff, M. Christiansen, G. Hasle, A. Løkketangen, Industrial aspects and
literature survey: combined inventory management and routing. Comput. Oper. Res. 37(9),
1515–1536 (2010). https://doi.org/10.1016/j.cor.2009.11.009
13. A.J. Kleywegt, V.S. Nori, M.W. Savelsbergh, The stochastic inventory routing problem with
direct deliveries. Transp. Sci. 36(1), 94–118 (2002). https://doi.org/10.1287/trsc.36.1.94.574
14. N.H. Moin, S. Salhi, Inventory routing problems: a logistical overview. J. Oper. Res. Soc. 58(9),
1185–1194 (2007)
15. V. Schmid, K.F. Doerner, G. Laporte, Rich routing problems arising in supply chain
management. Eur. J. Oper. Res. 224(3), 435–448 (2013). https://doi.org/10.1016/j.ejor.2012.
08.014
16. E.-H. Aghezzaf, Y. Zhong, B. Raa, M. Mateo, Analysis of the single-vehicle cyclic inventory
routing problem. Int. J. Syst. Sci. 43(11), 2040–2049 (2012). https://doi.org/10.1080/00207721.
2011.564321
17. G. Gallego, D. Simchi-Levi, On the effectiveness of direct shipping strategy for the one
warehouse multi-retailer R-systems. Manag. Sci. 36(2), 240–243 (1990)
18. L. Bertazzi, G. Paletta, M.G. Speranza, Deterministic order-up-to level policies in an inventory
routing problem. Transp. Sci. 36(1), 119–132 (2002)
19. A.M. Campbell, M.W.P. Savelsbergh, A decomposition approach for the inventory routing
problem. Transp. Sci. 38(4), 488–502 (2004). https://doi.org/10.1287/trsc.1030.0054
20. J. Li, H. Chen, F. Chu, Performance evaluation of distribution strategies for the inventory
routing problem. Eur. J. Oper. Res. 202(2), 412–419 (2010)
21. A.S.V. Praneel et al., A survey on accelerating the classifier training using various boosting
schemes within cascades of boosted ensembles, in International Conference, Springer SIST
Series, vol. 169 (2019), pp. 809–825
22. B. Tarakeswara Rao, M. Ramakrishna Murty et al., A comparative study on effective approaches
for unsupervised statistical machine translation, in International Conference, Proceedings in
AISC, vol. 1076 (Springer, 2020), pp. 895–905
23. M. Vidović et al., Mixed integer and heuristics model for the inventory routing problem in fuel
delivery. Int. J. Prod. Econ. (2013). https://doi.org/10.1016/j.ijpe.2013.04.034i
24. H. Gazit, An optimal randomized parallel algorithm for finding connected components in a
graph. SIAM J. Comput. 20(6), 1046–1067 (1991). https://doi.org/10.1137/0220066
25. I. Jonyer, D.J. Cook, L.B. Holder, Graph-based hierarchical conceptual clustering. J. Mach.
Learn. Res. (2001)
26. M. Ramakrishna Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy et al., Homogeneity separateness:
a new validity measure for clustering problems, in International Conference, Proceedings in
AISC, vol. 248 (Springer, 2014), pp. 1–10. ISBN 978-3-319-03106
Evaluation and Comparison of Various
Static and Dynamic Load Balancing
Strategies Used in Cloud Computing

Homera Durani and Nirav Bhatt

Abstract Much research has been carried out on cloud computing, and a key factor
in it is load balancing. Load balancing plays an important role in organizing work
allocation on servers, which involves cost, material, time duration, etc. A few of the
issues faced in load balancing are resource utilization, security, fault tolerance,
and many more. Many researchers have worked on different parameters of load
balancing, and similar results are found in different articles. This paper presents the
simulation of six different load balancing algorithms, covering static as well as dynamic
strategies, with three algorithms taken from each category. The static algorithms
are round robin, threshold, and randomized; the dynamic algorithms are active
clustering, honey bee, and Join-Idle-Queue. The simulation is performed in a cloud
simulator, and the parameters considered are resource utilization, response time, and
processing time, along with the overall data transfer cost. This paper presents a
comparative study between the three static algorithms and the three dynamic algorithms,
with their parameters and result outcomes. This work is still in progress; it aims to carry
out analysis and evaluation for successful load balancing in cloud computing to improve
quality approaches.

Keywords Static load balancing · Dynamic load balancing · Data center · VM load balancer

1 Introduction

In today’s information technology field, cloud computing has assumed an
emerging role in both business and academia. Cloud computing [1] offers
services to customers at any time and in any location on a pay-per-use basis
H. Durani (B)
RK University, Rajkot, Gujarat, India
MCA Department, B H Gardi College of Engineering and Technology, Rajkot, Gujarat, India
N. Bhatt
MCA Department, RK University, Rajkot, Gujarat, India
e-mail: nirav.bhatt@rku.ac.in


(Brown 2017; Buyya et al. 2010). Even in 2020, the pandemic situation gave a boost
to IT companies, which led to the wide usage of cloud environments.
Cloud computing can be classified in two ways: first, by the services offered;
second, by deployment, whether public, private, hybrid or community based. It
moves both processing and data from portable PCs and desktops to enormous
data centers. Accordingly, this environment provides a framework for affordable
access to computing resources in an on-demand manner. Cloud computing
likewise increases the availability of resources.
A major issue in the cloud environment is load balancing, which distributes
load across all the nodes in the cloud. The main problem in load balancing
is that some nodes remain overloaded, whereas a few others stay idle. The situation
to avoid is one where some nodes are overloaded while others are idle. Thus, the
working principle of load balancing increases the overall system performance
along with its resource utilization.

2 Load Balancing

Load balancing can be expressed as a process of allocating load to multiple computers
in a cluster over a network, to attain the best result where no machine is overloaded
and the best resource usage is obtained. In this method, the division is done among
servers so that data is transferred and received with less delay. The most crucial point
in load balancing is that the division of load is done dynamically. Thus, load balancing
[2] is a key factor where improvement is needed to enhance the performance of cloud
services, as shown in Fig. 1.

Fig. 1 Load balancer



2.1 Its Requirement in Environment

In a cloud environment, the key factor is the load balancer, where work is divided
equally among all nodes. The basic requirement of load balancing is to reach user
satisfaction, where no node is overloaded or under loaded, while giving the best
performance. Even resource consumption can be minimized if proper use of load
balancing [7] is made, which results in the benefits of scalability, avoidance of
disruption, and reduced time.

2.2 Purposes

The key purposes of load balancing are:

1. Improved performance
2. A backup plan in case the system fails
3. A stable system
4. Improvement of the system for future enhancement.

2.3 Types of Load Balancing

The different groupings are demonstrated in Fig. 2. Load balancing can be classified
by three factors [5], as shown in Fig. 2.
Sender Initiated: Here, the dispatcher sends the request, and the receiver is notified
that it has been assigned the workload, i.e., the initialization process is done by the sender.
Receiver Initiated: Here, the receiver signals to the sender that it is ready to share
the workload, i.e., the initiation process is done by the receiver.
Symmetric: Here, a combination of sender and receiver initiation is used, by which the
load adjustment is done. On the basis of the current status, load balancing algorithms
are separated into two parts.

Fig. 2 Types of load balancing

Static
In this algorithm, all the basic information is provided in advance, including memory
performance, processing power, and user data requirements. The main disadvantage of this
algorithm is that task allocation cannot adapt when there is a sudden failure. The best-known
example of static load balancing is the round robin algorithm. Since this algorithm had
several disadvantages, it led to the new weighted round robin algorithm, in which each
individual server is allocated a weight; the server with the highest weight is allocated
more connections, which balances the traffic.
Dynamic
This algorithm makes load balancing decisions based on the present state of the
system; that is, prior information is not needed. This overcomes the weaknesses of
the static approach. Dynamic algorithms are intricate; however, they yield
improvements over static algorithms. A few policies are utilized in dynamic load
balancing algorithms. These can be characterized by parameters such as the
transfer policy, selection policy, location policy, information policy,
load estimation policy, process migration policy, and priority assignment policy.

3 Techniques

The present techniques are classified into two parts: static and dynamic [11]. In this
research paper, three static and three dynamic load balancing algorithms are selected, and
simulation is performed according to the chosen parameters [3].
The following are the static load balancing algorithms.

3.1 Round Robin

The working mechanism is circular in style. In this algorithm, each process executes
singly for the time allocated to it. In this algorithm [8], all tasks are processed in a
group, and the process works until all tasks are completed. This algorithm [12] is used
for Web requests such as HTTP. Here, work is handled by assigning each
request to a virtual machine in circular order, as sketched below.
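A minimal sketch of this circular assignment, assuming a fixed pool of VM identifiers (the VM names are hypothetical):

from itertools import cycle

class RoundRobinBalancer:
    """Hand each incoming request to the next VM in circular order."""
    def __init__(self, vms):
        self._ring = cycle(vms)

    def assign(self, request):
        return next(self._ring)   # the request content does not matter here

lb = RoundRobinBalancer(["vm-0", "vm-1", "vm-2"])
print([lb.assign(f"req-{i}") for i in range(5)])
# -> ['vm-0', 'vm-1', 'vm-2', 'vm-0', 'vm-1']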

3.2 Randomized

Without knowing any data from the current phase or the previous phase, nodes are
selected randomly. In this algorithm, each node keeps its own record of load, and to
adjust the load, nodes are chosen randomly at processing time. The calculation is

maintained in such a way that, first, the size of the cycle is checked, and then the nodes
that are moved one after another into the VM are tested. This record is maintained in a stack
for further processing.

3.3 Threshold

In this algorithm, measurement is done with the help of the nodes. Here, the load of a
node is maintained at three different levels: under loaded, medium, and overloaded.
Two parameters, t_under and t_upper, are taken, which give the conditions below:

Under loaded: load < t_under
Medium: t_under ≤ load ≤ t_upper
Overloaded: load > t_upper

Adjustment of the load is done by setting threshold limits. If we set the threshold
parameter 30% above the average value, a node will be considered heavily loaded,
while setting the threshold parameter 70% above the average value makes it lightly
loaded. If the processor state is not overloaded, the process is allocated locally;
otherwise, the load balancer distributes a portion of its work to the VM having the
least work, so that the VMs are equally loaded. A minimal sketch of this policy follows.
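This sketch is an illustration only, assuming per-node load counters; the fallback to the least-loaded VM is one reading of the description above, not code from the paper.

def classify(load, t_under, t_upper):
    """Classify a node's load against the two threshold boundaries."""
    if load < t_under:
        return "under loaded"
    if load <= t_upper:
        return "medium"
    return "overloaded"

def allocate(process_load, nodes, t_under, t_upper):
    """Allocate locally unless the local node (nodes[0]) is overloaded;
    otherwise hand the work to the node with the least load."""
    target = nodes[0]
    if classify(target["load"], t_under, t_upper) == "overloaded":
        target = min(nodes, key=lambda n: n["load"])
    target["load"] += process_load
    return target["name"]

nodes = [{"name": "vm-0", "load": 80}, {"name": "vm-1", "load": 20}]
print(allocate(10, nodes, t_under=30, t_upper=70))   # -> 'vm-1'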
Following algorithms of dynamic load balancing.

3.4 Active Clustering

The active clustering algorithm works on the principle of grouping similar
nodes and working together on the available groups. A set of processes is iter-
atively executed by every node in the network. Initially, any node can become
an initiator and selects another node from its neighbours to be the matchmaker node,
satisfying the criterion of being of a different type than the former. The
matchmaker node then forms a connection between those of its neighbours
which are of the same type as the initiator. The matchmaker node then removes the
connection between itself and the initiator.

3.5 Honeybee Foraging Behavior

In this algorithm, the nodes behave like honey bees: a bee goes in search of food
and, after finding it, makes an announcement by passing a message through a dance
known as the waggle dance. In load balancing, this algorithm behaves the same way
on virtual machines. Here, there are clusters of servers, each with its own virtual
queue in load balancing; this mechanism occupies servers for processing.

3.6 Join-Idle-Queue

This algorithm is a load balancing calculation for Web service systems. In each
communication, load balancing is first done by checking the processors for
availability; afterwards, jobs are distributed to the machines so as to reduce the
queue length at each one. This calculation removes the load balancing work from the
critical path of request processing, which helps in effectively reducing the system
load.

4 Cloud Simulator

The tool used for simulation is CloudSim, where various models such as user base,
data center, and VM load balancer [4] can be used on a large cloud platform, as shown in Fig. 3.
The detailed configuration carried out in this paper describes the use of the VM load
balancer, which helps to allocate the proper node in the data centers. It contains mainly
three things: user base, data centers (DCs), and VM load balancer.
User Base
A user base represents a collection of different clients grouped as per their allocated
unit in their cell [6]. The main goal is to avoid traffic when starting a new cycle.

Fig. 3 Main screen of cloud analyst



Fig. 4 Loading of data center

Data Center
In Cloud Analyst, the data center (DC) is the heart that controls everything. It works
on the cloud simulator, where all entities are controlled. User base (UB) requests are
sent to the data center [14].
VM Load Balancer
In the virtual machine (VM), everything is controlled by the data center for load
balancing. Checking which virtual machine a cloudlet [9] is allotted to is done
by the user base at its destination point, as shown in Figs. 4, 5, and 6.

5 Parameters for Load Balancing

Qualitative metrics comprise a few parameters which are valuable for finding the
best algorithm among them. The distinctive qualitative metrics or parameters
that are viewed as significant for load balancing in cloud computing are discussed
as follows:

Fig. 5 Average response time

Fig. 6 Hourly average processing times DC



5.1. Throughput: the count of the number of tasks whose execution can be made faster.
5.2. Overhead: minimum overhead is expected for a successful result.
5.3. Fault tolerance: performing appropriately even if there is a failure in a node.
5.4. Migration time: for transfer from one machine to another, minimum time is desired.
5.5. Response time: minimum time taken is considered best for load balancing.
5.6. Resource utilization: maximum utilization is considered for the best result.
5.7. Scalability: a limited number of processors is taken into consideration.
5.8. Performance: all the parameters are satisfied.

6 Simulation and Results

This paper illustrates the simulation [10] of the static load balancing algorithms
RR (round robin), R (randomized), and T (threshold). The overall performance was
evaluated based on the overall data transfer cost, response time, and processing time,
as shown in Fig. 7.

Fig. 7 Home screen of 5 data center



Table 1 Simulation result of static load balancing algorithms

Algorithm | Round robin | Randomized | Threshold | Round robin | Randomized | Threshold
No. of DC | 5 | 5 | 5 | 5 | 5 | 5
No. of VM | 25 | 25 | 25 | 50 | 50 | 50
Response time | 76.77 | 75.97 | 74.7 | 94.64 | 95.88 | 95.85
Processing time | 2.96 | 2.17 | 0.9 | 0.7 | 2.09 | 2.14
Data transfer cost | 537.31 | 537.31 | 537.31 | 3,652,100.76 | 3,652,100.76 | 3,652,100.8

Table 2 Simulation result of dynamic load balancing algorithms

Algorithm | Active clustering | Honey bee | Join-Idle-Queue | Active clustering | Honey bee | Join-Idle-Queue
No. of DC | 5 | 5 | 5 | 5 | 5 | 5
No. of VM | 25 | 25 | 25 | 50 | 50 | 50
Response time | 84.56 | 83.17 | 82.1 | 97.56 | 94.28 | 95.85
Processing time | 2.86 | 2.14 | 0.82 | 1.45 | 1.84 | 2.12
Data transfer cost | 632.17 | 632.17 | 632.17 | 4,873,100.76 | 4,873,100.76 | 4,873,100.76

In Tables 1 and 2, resource allocations of 5 DCs with 25 and 50 VMs have been
taken for the overall comparison, with three algorithms each chosen from the static and
dynamic categories. While the overall data transfer cost [15] remains the same across
algorithms, there is variation in the response time and processing time.
Figure 8 indicates the processing time taken by the three static algorithms, which
include RR (round robin), R (randomized), and T (threshold).
Figure 9 indicates the processing time taken by the three dynamic algorithms, which
include AC (active clustering), HB (honey bee), and JIQ (Join-Idle-Queue).

7 Comparison

The distinctive algorithms [13] proposed by different researchers and discussed in the
previous sections are compared in Table 3.

Fig. 8 Simulation result of static load balancing algorithm

Fig. 9 Simulation result of dynamic load balancing algorithm

8 Conclusion

Simulation has been done on six different load balancing algorithms, three static and,
similarly, three dynamic. Each algorithm was observed against criteria such as
processing time, response time, and overall data transfer. In this paper, the simulation is
done for 5 DCs with 25 and 50 VMs, and the results are shown in the figures above. A future
enhancement is to develop the work on larger data and improve the overall response time and cost.

Table 3 Comparison between static and dynamic load balancing algorithms

Parameters | Response time | Processing time | Resource utilization | Overall data transfer cost
Round robin | Less | Decreases | Less | Remains same
Randomized | More | Increases | Less | Remains same
Threshold | Less | Decreases | More | Remains same
Active clustering | More | Increases | More | Remains same
Honey bee | Less | Increases | Less | Remains same
Join-Idle-Queue | Less | Increases | Less | Remains same

References

1. E. Choi, B.P. Rima, I. Lumb, A taxonomy and survey of cloud computing system, in
International Joint Conference on INC, IMS and IDC, Seoul, Korea, Aug 2009
2. I. Chana, N. Jain, Cloud load balancing techniques: a step towards green computing. IJCSI
(2012)
3. S. Kinger, S. Kaur, Review on load balancing techniques in cloud computing environment. Int.
J. Sci. Res. (2015)
4. H. Bhatt, H. Bheda, An overview of load balancing technique in cloud computing environment.
Int. J. Eng. Comput. Sci. (2019)
5. http://www.loadbalancing.org/
6. S. Gibbs, Cloud computing. Int. J. Innov. Res. Eng. Sci. (2012)
7. P.J. Patel, H.D. Patel, P.V. Patel, A survey on load balancing in cloud computing. IJERT (2012)
8. N. Pasha, A. Agarwal, R. Rastogi, Round robin approach for VM Load Balancing Algorithm
in cloud computing environment. Int. J. Adv. Res. Comput. Sci. Softw. Eng. (2014)
9. S. Wang, W. Liao, Towards a load balancing in a three level cloud computing network, in IEEE
International Conference and Computer Science and Information Technology, Sept 2016
10. B. Wickremasinghe, Cloud Analyst—a cloud sim based visual and modeler for analyzing cloud
computing environment and applications (IEEE, 2010)
11. XFZ, RXT, A load balancing strategy based on the combination of static and dynamic in
database technology and application (IEEE, 2010)
12. G.A. Chopra, Dynamic Round Robin for load balancing in cloud computing. Int. J. Comput.
Sci. Mob. Comput. (2013)
13. R. Dubey, R. Choubey, A survey on cloud computing security: challenges and threats. Int. J.
Comput. Sci. Eng. (IJCSE) (2011)
14. A. Singh, M. Korupolu, D. Mohapatra, Server storage virtualization: integration and load
balancing in data centers. J. Res. Dev. (2008)
15. S. Tayal, Task Scheduling Optimization for the cloud computing systems. IJAEST (2011)
Dielectric Resonator Antenna
with Hollow Cylinder for Wide
Bandwidth

Gaurav Kumar and Rajveer Singh Yaduvanshi

Abstract This paper presents a stacked dielectric resonator antenna with a drilled
hollow cylinder. The antenna has thirteen dielectric layers with different permittivity
(Er1 = 12 and Er2 = 4.4) on a FR4 epoxy substrate having a dielectric constant of
4.4 with 0.8 mm thickness. It uses one hollow cylinder of 0.8 mm diameter, which
is drilled at the right bottom corner in the design of the proposed antenna. The analysis
is performed on the 3D EM simulator HFSS (high-frequency structure simulator). The wide
improvement in the bandwidth of the DRA with a drilled hollow cylinder is presented
with the help of proper excitation and selection of the resonator parameters, based
on a −10 dB reflection coefficient.

Keywords Dielectric resonator antenna (DRA) · Three dimensional (3D) ·
Decibels (dB) · Electromagnetic (EM) · Radio detection and ranging (RADAR) ·
Scattering parameters (S-parameters) · Voltage standing wave ratio (VSWR)

1 Introduction

Nowadays, antennas play a significant role in our day-to-day life. Within the family of
antennas, the dielectric resonator antenna is a category of antenna which allows the
transmission of waves ranging from microwaves to millimetre waves with low
losses [1]. For numerous reasons, the dielectric resonator antenna has achieved great
importance in the field of radio frequency engineering [2]. For a desired application,
a suitable antenna geometry is applied. Features like high gain, wide bandwidth,
high radiation efficiency, and low losses draw radio frequency engineers towards
the use of DRAs [3]. Detailed studies on dielectric resonator antennas (DRAs) were
earliest done by Long et al. [4].

G. Kumar (B)
Guru Gobind Singh Indraprastha University, Delhi, India
R. S. Yaduvanshi
Netaji Subhas University of Technology, Delhi, India


After that, much study has been done on DRAs and their various geometries, such as
cylindrical, spherical or rectangular [5]. The rectangular dielectric resonator antenna
has various advantages over the spherical and cylindrical geometries [6], because
modifications can be made to the rectangular form of the dielectric resonator
antenna to obtain various advantages. For example, mode degeneration can be
avoided by properly choosing the three dimensions of the antenna; it is known that
degeneration of modes always exists in the case of a spherical DRA [7]. Higher order
modes can be achieved with the same antenna dimensions, which can make the antenna
work at the same frequency, unlike the manifestation of hybrid modes in
a cylindrical DRA [8]. Advanced forms of DRA have been proposed in order to improve
various parameters of the antenna. One of the ideas was using a stacked form
of rectangular DRA [9], i.e. instead of using the whole volume of a single material.
There are a number of methods to enhance the bandwidth of the antenna, either by using
different materials with different permittivities or by changing the shape of the antenna [10]. One
of these methods is to place rectangular slabs of materials with different permittivities
on top of each other, called the stacked form of DRA [11–15].

2 Antenna Design

The design taken here is in the form of stacked rectangular slabs of equal dimen-
sions (except the topmost slab) placed one above another, with one drilled hollow cylinder
at the right bottom corner. Two different kinds of material are placed alter-
nately, which means that slabs of two different permittivities are consecutively attached
together. The substrate is of length = 50 mm, breadth = 50 mm, and height = 1.6 mm,
made of FR4_epoxy having a permittivity of 4.4. There are 13 slabs placed on one
another. Each slab has the dimensions length = 6 mm, breadth = 6 mm, and
height = 0.8 mm, except the topmost slab, which has a height of 0.4 mm. The height
and diameter of the hollow cylinder are 10 mm and 0.8 mm, respectively.
Thus, the complete height of the antenna is 10 mm; the slab materials, FR4_epoxy
with permittivity 4.4 and TMM 13i with permittivity 13, are placed alternately
as shown in Fig. 1.
The antenna is made to work at 13 GHz. An antenna design analysis is then performed in
which two slabs are removed, i.e. one from the top and another from the bottom, with air
taking their place, and the various antenna parameters are recorded. Again, the 3rd slab
from the bottom and the 11th slab from the bottom are removed, air is introduced instead of
those slabs, and the variations in the antenna parameters are recorded again. The same is
done by removing the middle slab, i.e. the 7th slab from the bottom of the antenna.
The top view of the antenna is shown in Fig. 2, which shows the position of the
hollow cylinder drilled in the stacked dielectric antenna.

Fig. 1 Proposed antenna design with specified permittivity

Fig. 2 Top view of the proposed antenna

3 Result and Discussions

The antenna parameter return loss (S11) signifies the amount of power delivered to
the antenna, reflecting the impedance matching of the proposed antenna with respect
to the source. The plot of return loss for the proposed antenna is shown in Fig. 3. This plot
includes the S11 parameters for the corresponding slab removal states. In it, return
losses for seven different slab removal states, namely removal of slabs (1, 13), slabs
(2, 12), slabs (3, 11), slabs (4, 10), slabs (5, 9), slabs (6, 8) and slab (7), along with the
case with all slabs present, are shown.
The antenna parameter gain is defined as the ratio of the power radiated in the direction
of maximum radiation to the power radiated by a hypothetical lossless isotropic

Fig. 3 S-parameters for different states of slabs removal

Fig. 4 Total gain of proposed antenna

antenna. As shown in Fig. 4, the simulated total gain of the antenna is found to be
6.83 dB.
The radiation pattern of the stacked rectangular dielectric resonator antenna at a
frequency of 12.7 GHz is shown in Fig. 5.
In Fig. 6, rETotal shows the intensity of the electric field at 12.7 GHz in three dimensions.
Red colour shows the highest electric field intensity, whereas blue colour shows the lowest
electric field intensity.

4 Conclusion

The proposed antenna shows good results in the frequency range from 12.00 to
18.00 GHz. In this antenna design, as we replaced a number of slabs by an air medium,
together with the drilled hollow cylinder, we obtained wide impedance bandwidth and high
gain on removal of the middle (7th) slab. Thus, this antenna can potentially be used in

Fig. 5 Radiation pattern of the stacked rectangular dielectric resonator antenna

Fig. 6 3D polar plot of proposed antenna

designated X-band applications like military applications, radar applications such as


weather forecasting and remote sensing and satellite applications.

Acknowledgements We would like to thank Mr. Chandra Prakash, lab assistant of the microwave
lab, AIACTR, GGSIPU, for arranging and granting liberal access to the lab; he has been extremely
cooperative and helpful throughout the research work.

References

1. S.M. Shum, K.M. Luk, Stacked annular ring dielectric resonator antenna excited by axi-
symmetric coaxial probe. IEEE Trans. Antennas Propag. 43(8), 889–892 (1995)
2. A. Petosa, A. Ittipiboon, Y.M.M. Antar, D. Roscoe, M. Cuhaci, Recent advances in dielectric
resonator antenna technology. IEEE Antennas Propag. Mag. 40(3), 35–48 (1998)
3. A. Petosa, Dielectric Resonator Antenna Handbook (Artech House, Norwood, MA, 2007)
4. G. Kumar, M. Singh, S. Ahlawat, R.S. Yaduvanshi, Design of stacked rectangular dielectric
resonator antenna for wideband applications. Wirel. Pers. Commun. 109(3), 1661–1672 (2019)

5. A. Petosa, A. Ittipiboon, Dielectric resonator antennas: a historical review and the current state
of the art. IEEE Antennas Propag. Mag. 52(5), 91–116 (2010)
6. A.A. Kishk, B. Ahn, D. Kajfez, Broadband stacked dielectric resonator antennas. Electron.
Lett. 25(18), 1232–1233 (1989)
7. K.M. Luk, K.W. Leung (eds.), Dielectric Resonator Antennas (Research Studies Press,
Baldock, England, 2003)
8. Y.-X. Guo, Y.-F. Ruan, X.-Q. Shi, Wide-band stacked double annular-ring dielectric resonator
antenna at the end-fire mode operation. IEEE Trans. Antennas Propag. 53(10), 3394–3397
(2005)
9. A.A. Kishk, et al., Numerical analysis of stacked dielectric resonator antennas excited by a
coaxial probe for wideband applications. IEEE Trans. Antennas Propag. 51(8), 1996–2006
(2003)
10. A. Sangiovanni, J.Y. Dauvignac, Ch. Pichot, Stacked dielectric resonator antenna for multifre-
quency operation. Microw. Opt. Technol. Lett. 18(4), 303–306 (1998)
11. K.M. Luk, K.W. Leung, K.Y. Chow, Bandwidth and gain enhancement of a dielectric resonator
antenna with the use of a stacking element. Microw. Opt. Technol. Lett. 14(4), 215–217 (1997)
12. Y. Ge, K.P. Esselle, T.S. Bird, A wideband probe-fed stacked dielectric resonator antenna.
Microw. Opt. Technol. Lett. 48(8), 1630–1633 (2006)
13. R.S. Yaduvanshi, H. Parthasarathy, Rectangular DRA Theory and Design (Springer, Berlin,
2016)
14. Y.M. Pan, S.Y. Zheng, A low-profile stacked dielectric resonator antenna with high-gain and
wide bandwidth. IEEE Antennas Wirel. Propag. Lett. 15, 68–71 (2015)
15. W.J. Sun, W.W. Yang, H. Tang, P. Chu, J.X. Chen, Stacked dielectric patch resonator antenna
with wide bandwidth and flat gain. J. Eng. 2018(6), 336–338 (2018)
Recent Techniques in Image Retrieval:
A Comprehensive Survey

K. D. K. Ajay and V. Malleswara Rao

Abstract In recent days of image processing, retrieval of images (IR) is a very popular,
important, and rapidly developing area of research in multimedia technology. There
is a rapid increase in image transactions in the digital computer world: most digital
equipment generates images for various activities, which creates a massive
picture archive. In recent years, a large amount of visual content from various
fields, such as social media sites, medical images, and robotics, has been created
and shared. Searching databases for similar information, i.e., content-based image
retrieval (CBIR), is a long-established area of study, and real-time retrieval calls for
more effective and accurate methods. There are numerous methods of image retrieval.
CBIR is one of the approaches, based on obtaining low-level image characteristics; color,
shape, texture and spatial position are some of these features. We have carried out an extensive
survey to understand CBIR, various retrieval techniques, image attributes, and standard
image datasets, aimed at promoting a global view of the CBIR field.

Keywords CBIR · Image attributes · Image datasets · Retrieval techniques

1 Introduction

The problem of searching for semantically matched or similar images in a broad


image gallery by analyzing their visual content, given a query picture that defines
the needs of the user, is content-based image retrieval (CBIR). In the computer
vision and multimedia community [1, 2], CBIR has been a longstanding research
subject. With the current increasingly growing amount of image and video data, it is
highly necessary to establish suitable information systems to handle such vast image
collections effectively, with image search being one of the most important techniques
used to communicate with visual collections. There is therefore almost infinite scope
for CBIR applications, such as individual re-identification [3], remote sensing [4],

K. D. K. Ajay (B) · V. Malleswara Rao


ECE, GITAM Institute of Technology, Visakhapatnam 530045, India
K. D. K. Ajay
ECE, Malla Reddy College of Engineering and Technology, Hyderabad 500100, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 447
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_41

medical image search [5], and online marketplace shopping recommendation [6],
among many others. CBIR can broadly be grouped into instance-level retrieval and
category-level retrieval. In the instance-level image retrieval, a query image of a
specific object is given, and the goal is to be able to find images for the same object
or scene that can be captured under different viewpoints, illumination conditions,
based on occlusions [7, 8].
It may require a search of thousands, millions, or even billions of images to
find the desired image. Searching effectively is therefore as important as searching
correctly, to which continued efforts have been devoted [7, 8]. Compact but rich
feature representations are at the heart of CBIR to enable accurate and effective
retrieval of large image collections.
Two fundamental difficulties occur in content-based image retrieval: the intent gap
and the semantic gap. The intent gap refers to a user’s difficulty in accurately
communicating the desired visual content through a query at hand, such as a sketch
map or an illustration image.
The semantic gap emerges from the difficulty of describing a high-level semantic
concept using low-level visual features. Extensive attempts from both academia
and industry have been made to close these gaps. Content-based image retrieval utilizes
an image’s visual information, such as shape, color, spatial layout, and texture, for
indexing the images.
In traditional content-based image retrieval systems (Fig. 1), multidimensional
feature vectors are extracted from the database images, and their similarity metrics
are identified. The feature vectors of the database images form a feature database. In
order to retrieve images, users provide the retrieval system with sample data; the
system then translates such examples to extract their features. Using these feature
vectors, the distance between them is measured. Thereafter, indexing is done among
the images to perform the retrieval process. An indexing scheme provides an effective
way of searching the image database. Recent retrieval methods use the relevance
feedback technique to obtain efficient and improved retrieval results in a more
meaningful way.

2 Different Types of Image Retrieval

Researchers have developed many image retrieval approaches; among those, the
significant and widely used types of image retrieval methods are shown in
Fig. 2.

Fig. 1 Image retrieval system based on content of an image

Fig. 2 Different types of retrieval methods of an image



2.1 Text-Based Retrieval of Images

Image retrieval based on text is also called description-based image retrieval.
For a particular multimedia query, text-based image retrieval is used to retrieve the
XML documents that contain images, based on textual information. In TBIR, the visual
content of images is represented by manually assigned tags/keywords to
address CBIR restrictions. It lets a user present an information need as a textual
query and find the required images based on the match between the textual
query and the manual annotations of the images.

2.2 Content-Based Retrieval of Images

In content-based image retrieval, images are searched and retrieved using image
characteristics based on the similarity of their visual content to the query image. A
feature extraction module is used for extracting low-level image features from the
images in the series. Commonly extracted image features include color, texture, and
shape.

2.3 Multimodal Fusion Retrieval of Images

Data fusion and machine learning algorithms are used in multimodal fusion image
retrieval. Data fusion, also termed the merging of evidences, is a method of
integrating different evidence sources. The chorus effect, the skimming effect, and the
dark horse effect can be exploited by using several modalities [9].

2.4 Semantic-Based Retrieval of Images

Using the semantic understanding of visual data, several researchers are currently
exploring image retrieval. This kind of retrieval method is used to decrease the
semantic gap. This can be achieved using two fundamental methods: keyword
annotation of images or image segments through automated image annotation, or
the implementation of semantic Web initiatives.

2.5 Retrieval-Based on Relevance Feedback of Images

The semantic gap in CBIR systems is the difference between the information
needs of the user and the representation of the image. The intrinsic semantic gap
is basically due to the limited precision of image retrieval systems. Relevance
feedback is very useful in CBIR for reducing this gap. The funda-
mental concept behind relevance feedback is to incorporate the subjectivity of human
perception into the query process and to involve users in judging the results of the
retrieval.

3 Different Features of Image

3.1 Color Features

An image’s most common visual feature is color. The human eye is more sensitive to
color images than to gray-level images. RGB color intensities are used in the RGB
color space. There are different methods for extracting color features, such as color
moments, the auto-correlogram, and the color histogram. A color histogram is a representa-
tion of the color distribution in an image. The histogram of an image covers the gray-level
spectrum from 0 to 255; to reduce the number of values, the color spectrum is split into
discrete intervals. Color moments are measures that can be used to distinguish images
according to their color characteristics. Once measured, these moments provide a
color similarity measure between images. These similarity values can then be
compared to the values of images indexed in a database for tasks such as image retrieval. A
color correlogram is a three-dimensional table, indexed by color and distance
between pixels, that expresses how the spatial correlation of color changes with distance
in a stored image. To differentiate an image from other images in a database, a color
correlogram may be used. A sketch of two of these descriptors follows.
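As an illustration, a minimal NumPy sketch of a quantized RGB histogram and the first two color moments is given below; the choice of 8 bins per channel is an arbitrary assumption, not a value from the surveyed papers.

import numpy as np

def color_histogram(img, bins=8):
    """Quantize each RGB channel into `bins` intervals and count pixels,
    giving a normalized bins**3-dimensional color histogram."""
    img = np.asarray(img)                                  # H x W x 3, 0..255
    q = (img // (256 // bins)).reshape(-1, 3).astype(int)  # per-channel bin index
    idx = q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]
    hist = np.bincount(idx, minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def color_moments(img):
    """First two color moments (mean, standard deviation) per channel."""
    px = np.asarray(img, dtype=float).reshape(-1, 3)
    return np.concatenate([px.mean(axis=0), px.std(axis=0)])

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)  # dummy image
print(color_histogram(img).shape, color_moments(img).shape)   # (512,) (6,)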

3.2 Shape Feature

Shape features in an image are used to determine the edges. Shape features are
represented as:
1. The exterior form, from the boundary.
2. Region-based features.
To compute the categories mentioned above, Fourier descriptors and moment
invariant methods are used.

3.3 Texture Feature

The image texture feature describes the periodic repetition of a pattern over a part of
the image or a surface. The texture factor characterizes the structural arrangement of a
surface and its relationship with the adjacent regions. The texture feature can be
extracted using the gray-level co-occurrence matrix (GLCM), wavelets, the Fourier
transform, entropy, and correlation methods. The Gabor and wavelet transforms give
the statistical distribution of the image. Roughness, directionality, coarseness, regularity,
line-likeness, and contrast are the six texture properties. A GLCM-based sketch follows.
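As an illustration, a minimal sketch of GLCM-based texture features using scikit-image is given below (the function is spelled greycomatrix in scikit-image versions before 0.19); the chosen distance and angles are arbitrary assumptions.

import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_img):
    """GLCM texture descriptors (contrast, correlation, energy,
    homogeneity), averaged over four orientations at distance 1."""
    glcm = graycomatrix(gray_img, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    return {p: float(graycoprops(glcm, p).mean())
            for p in ("contrast", "correlation", "energy", "homogeneity")}

gray = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # dummy image
print(glcm_features(gray))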

4 CBIR Benchmark Dataset

Various types of datasets for interacting with imaging systems are accessible
online. This paper explores a total of eleven datasets, which are described in Table
1 [10–12].

Table 1 Dataset description

Dataset | Description
Wang dataset | Used in the extraction of features
Corel dataset | Thousands of different groups; each group contains 100 JPEG images
GHIM dataset | Only twenty categories are included; each one contains 500 JPEG images
CIFAR-10 | Includes images from 10 different groups
UW database | 18 categories, for instance “spring flowers”
ZuBuD database | Includes images of buildings at different points of day and night
FIGRIM dataset | The fine-grained image memorability dataset includes 9428 images, classified into 21 groups; each category contains 300 images of 700 × 700 pixels
MIRFLICKR-25000 | Contains 25k images from the Flickr social photography Web site
IRMA-10000 database | Consists of ten thousand radiographs, of which nine thousand are training and one thousand are test images
INRIA PERSON database | Used for recognizing standing people
Caltech pedestrian dataset | A pedestrian identification benchmark; six sets contain training data, and five sets contain tests

Table 2 Different techniques of evaluation

Technique | Description
Euclidean distance | Represents the distance between two pixels
Chi-square distance | Same as the Euclidean distance but gives a weighted distance
Geodesic distance | The minimum length of the path between two pixels on the surface
Bhattacharya distance | Gives the similarity of two probability distributions
Mahalanobis distance | Useful for measuring the distance between two groups
Precision, recall | Precision gives the percentage of results that are relevant; recall is the percentage of all relevant results that are correctly retrieved
F-measure | Combines precision and recall
Hausdorff distance | Uses two subsets of a metric space for this kind of distance
Chessboard distance | Distance among the eight-connected neighbourhood of picture elements
Chamfer distance | The distance between two points is the minimum of the associated costs over all paths from one point to the other
Manhattan distance | The distance between two points measured along axes at right angles

5 Techniques of Evaluation

A CBIR system is evaluated in terms of both utility and competency. These character-
istics mainly describe the precision and the efficiency of image retrieval. In Table 2,
various techniques of evaluation used in different CBIR systems are listed.
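For concreteness, a small NumPy sketch of a few of the measures in Table 2, applied to feature vectors or pixel coordinates, is given below; the function and argument names are illustrative assumptions.

import numpy as np

def euclidean(h1, h2):
    # straight-line distance between two feature vectors
    return np.sqrt(np.sum((h1 - h2) ** 2))

def manhattan(h1, h2):
    # distance measured along the axes at right angles
    return np.sum(np.abs(h1 - h2))

def chi_square(h1, h2, eps=1e-12):
    # Euclidean-like distance that weights each bin by its magnitude
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def chessboard(p1, p2):
    # maximum coordinate difference (8-connected pixel neighborhood)
    return np.max(np.abs(np.asarray(p1) - np.asarray(p2)))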

6 Literature Survey

Based on different characteristics, such as form, texture, color, and shape, CBIR is
the method of extracting and retrieving features. This section discusses the litera-
ture survey for different CBIR techniques. Content-based image retrieval (CBIR) is
currently a daunting problem due to the large scale of image databases. It is also
difficult to identify images, to handle big image files, and to keep the total retrieval
time low. The survey mainly focuses on newly established CBIR systems that are
useful for giving accurate image retrieval results in many different applications.
There are various types of images in each application, and each image has different
characteristics. It is therefore important to select suitable image features for a specific
application and also to calculate the similarity between images. Table 3 gives some
of the techniques for retrieval [14, 15].
In general, the content of the image is analyzed based on the texture, shape,
and color attributes. These attributes are popular and extensively practiced in image
processing. The main observations found in this review are

Table 3 Different methods of texture, color, and shape feature-based CBIR systems
Author and year Method and features Future work
Lingadalli et al. [16] GLCM for texture In the future, it can be improved by
using maximum attributes. The best
result was obtained by combining
multiple attributes
Shaker et al. [17] Principal component analysis In the future, it can be improved by
(PCA) with cloud using maximum attributes
Singh et al. [18] LBP for color images In the future, analyzing the pixel
color as a vector having m parts
and structuring a hyperplane
Qazanfari et al. [19] Color difference histogram Future work will use feature
method weighting and relevance feedback
to obtain a more efficient image
retrieval system
Du et al. [20] Pulse-coupled neural network Future work may investigate
more elaborate schemes to decide
the weight of the fusion
similarity component
Wei et al. [21] Intensity variation descriptor There are some potential
applications of the proposed IVD
method, and it can be extended to
texture recognition
Papushoy et al. [22] Earth mover’s distance (EMD) Need to improve the recognition
rate
Akram et al. [23] Region-oriented segmentation of Enhances efficiency of the
images particular system
Memon et al. [24] Integrated region matching In future, robust object discovery in
method mixed-class datasets will be pursued
Latif et al. [25] Various types of color histogram To classify the dataset with an
unsupervised learning process
Singh et al. [26] Bi-layer content-based image The proposed system will be
retrieval expanded in future work with
convolution-based image features
based on neural networks that can
further enhance Bi CBIR
performance

• The methods used to obtain color characteristics are the color correlogram, color
histogram, color coherence vector, color moment, HSV histogram, HMMD color
descriptor, etc. [13].
• For extracting texture features, the Haar wavelet transform, Gabor wavelet trans-
form, discrete wavelet transform, GLCM [13], etc., are used.
• To extract shape features, Canny edge detection, edge detection histogram, and
edge-based histogram descriptor, etc. [13], are used.

• The Euclidean and chi-square distances, wavelet decomposition, Naïve Bayes, and
K-means clustering [13] are used to test similarity.

7 Conclusion

In this paper, we studied the basic content-based IR model useful for retrieving
image features. We addressed three distinct basic features here, various types
of CBIR benchmark datasets with their characteristics, tools useful for feature extraction,
and methods used for feature extraction. Each approach aims to solve the current
challenges facing image retrieval systems. Various factors are responsible for
influencing a system's efficiency. To maximize the efficiency of the system, the
variables that positively affect it can be combined.

References

1. A.W. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-based image retrieval at
the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000)
2. M.S. Lew, N. Sebe, C. Djeraba, R. Jain, Content-based multimedia information retrieval: state
of the art and challenges. ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) 2(1),
1–19 (2006)
3. L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, Q. Tian, Scalable person re-identification: a
benchmark, in ICCV, 2015, pp. 1116–1124
4. U. Chaudhuri, B. Banerjee, A. Bhattacharya, Siamese graph convolutional network for content
based remote sensing image retrieval. Comput. Vis. Image Underst. 184, 22–30 (2019)
5. G.S. Naveen Kumar, V.S.K. Reddy, High-performance video retrieval based on spatio-temporal
features, in Microelectronics, Electromagnetics and Telecommunications (Springer, Singapore,
2018), pp. 433–441
6. Z. Liu, P. Luo, S. Qiu, X. Wang, X. Tang, Deepfashion: powering robust clothes recognition
and retrieval with rich annotations, in CVPR, 2016, pp. 1096–1104
7. A. Babenko, V. Lempitsky, Aggregating local deep features for image retrieval, in ICCV, 2015,
pp. 1269–1277
8. L. Zheng, Y. Yang, Q. Tian, SIFT meets CNN: a decade survey of instance retrieval. IEEE
Trans. Pattern Anal. Mach. Intell. 40(5), 1224–1244 (2018)
9. R.A. Alghamdi, M. Taileb, M. Ameen, A new multimodal fusion method based on association
rules mining for image retrieval, in 17th IEEE Mediterranean Electrotechnical Conference
(MELECON) (2014), pp. 493–499
10. A. Mishra, T. Kasbe, A comprehensive survey on content based image processing techniques,
in IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS) (2019),
pp. 396–401. ISBN:978-1-7281-4656-0
11. K. Shubhankar Reddy, K. Sreedhar, Image retrieval techniques: a survey. Int. J. Electron.
Commun. Eng. 9(1), 19–27 (2016)
12. A. Varma, K. Kaur, Survey on content-based image retrieval. Int. J. Eng. Technol. 7(4.5),
471–476 (2018)
13. M. Thilagam, K. Arunish, Content-based image retrieval techniques: a review, in 2018 Inter-
national Conference on Intelligent Computing and Communication for Smart World (2018),
pp. 1–13

14. G.S. Naveen Kumar, V.S.K. Reddy, Detection of shot boundaries and extraction of key frames
for video retrieval. Int. J. Knowl. Based Intell. Eng. Syst. 24(1), 11–17 (2020)
15. L.R. Nair, K. Subramaniam, G. Prasanna Venkatesan, A review on multiple approaches to
medical image retrieval system, in Intelligent Computing in Engineering, vol. 1125 (2020),
pp. 501–509
16. R.K. Lingadalli, N. Ramesh, Content based image retrieval using color shape and texture
features. IARJSET. 2(6) (2015)
17. S.H. Shaker, N.M. Khassaf, Methods of image retrieval based cloud. Int. J. Innov. Technol.
Explor. Eng. (IJITEE), 9(3), 2278–3075 (2020)
18. C. Singh, E. Walia, K. Kaur, Color texture description with novel local binary patterns for
effective image retrieval. Pattern Recogn. 76 (2018)
19. H. Qazanfari, H. Hassanpour, K. Qazanfari, Content-based image retrieval using HSV color
space features (2019)
20. A. Du, L. Wang, J. Qin, Image retrieval based on colour and improved NMI texture features.
Automatika 60, 491–499 (2019). https://doi.org/10.1080/00051144.2019.1645977
21. Z. Wei, G.H. Liu, Image retrieval using the intensity variation descriptor. Math. Probl. Eng.
(2020)
22. A. Papushoy, A.G. Bors, Content based image retrieval based on modelling human visual
attention, in Computer Analysis of Images and Patterns. CAIP 2015, Lecture Notes in Computer
Science, vol. 9256, ed. by G. Azzopardi, N. Petkov (Springer, Cham, 2015)
23. F. Akram, J.H. Kim, C.G. Lee, K.N. Choi, Segmentation of regions of interest using active
contours with SPF function. Comput. Math. Methods Med. 710326 (2015). https://doi.org/10.1155/2015/710326
24. I. Memon, Q. Ali, N. Pirzada, A novel technique for region-based features similarity for content-
based image retrieval. Mehran Univ. Res. J. Eng. Technol. 37 (2017). https://doi.org/10.22581/muet1982.1802.14
25. A. Latif, A. Rasheed, U. Sajid, A. Jameel, N. Ali, N.I. Ratyal, B. Zafar, S. Dar, M. Sajid, T.
Khalil, Content-based image retrieval and feature extraction: a comprehensive review. Math.
Probl. Eng. (2019)
26. S. Singh, S. Batra, An efficient bi-layer content based image retrieval system. Multimed. Tools
Appl. (2020)
Medical Image Fusion Based on Energy
Attribute and PA-PCNN in NSST
Domain

K. Vanitha, D. Satyanarayana, and M. N. Giri Prasad

Abstract Medical image fusion frameworks occupy quite a prominent place in the
identification of tumors, the diagnosis of diseases, and the treatment of disorders. The
acquisition of complementary data into a composite image, termed multimodal image
fusion, is an essential task. A new energy attribute-based activity measure and parameter
adaptive-PCNN for merging the medical modalities with NSST is presented. Firstly,
the NSST decomposition is used for input images, then low-pass sub-band coef-
ficients are selected using energy attribute function. The band-pass sub-bands are
selected using PA-PCNN. Finally, inverse NSST on fused coefficients provides final
fused image. This fused image is taken as reference for diagnosticians in order to
assess the disorder level of disease and planning treatment. Our proposed method
proved its robustness in finding the disorders using both quantitative and subjective
assessments.

Keywords Multimodal medical image fusion · Energy attribute · Parameter
adaptive-PCNN · NSST

1 Introduction

Multimodal medical imaging is a research field that has been gaining great attention,
especially because of its prominence in computer vision, disorder diagnosis, and
medical image analysis [1]. The medical image fusion (MIF) task plays a most
prominent role in facing the biggest challenges in this bio-medical field. The major point
to be considered is how to merge the information optimally from multiple modalities, such
as positron emission tomography (PET), computed tomography (CT), single-photon emission
computed tomography (SPECT), and T1-, T2-weighted magnetic resonance imaging
(MRI) [2]. This MIF process mainly spans several techniques and research areas,

K. Vanitha (B) · M. N. Giri Prasad


JNTU College of Engineering and Technology, Ananthapuramu, India
D. Satyanarayana
RGM College of Engineering and Technology, Nandyal, India


with the goal of developing very accurate medical diagnosis and medical decision-
making with great efficiency [3, 4]. In most cases, the pixel intensities are merged
directly into the composite image, the so-called pixel level, which has been widely
preferred for the MIF task. Multi-scale analysis transform (MSAT) methods are
regarded as the most popular and reliable. Three steps are involved in each MSAT
scheme: firstly, the source modalities are transformed into the corresponding
MST domain; then, coefficients are properly chosen using activity measures in order
to give the composite image; finally, the selected coefficients are applied with the
inverse MST to reconstruct an output image. The MST methods include the discrete wavelet
transform (DWT) [5], curvelet transform (CVT) [6], non-subsampled contourlet
transform (NSCT) [7], and the non-subsampled shearlet transform (NSST) [8]. These
tools may provide a fused modality with blocky artifacts, color distortion, and inconsisten-
cies in some regions. To overcome the above disadvantages, further
fusion measures have been preferred in MST schemes [9, 10]. These are spatial
frequency (SF), modified spatial frequency (MSF), local variance (LV), directive
contrast (DC), the energy of image gradient (EIG), weighted local energy (WLE),
and weighted sum-modified-Laplacian (WSML) [11]. However, these measures do
not give accurate results. The edge-preserving filtering-based (EPF) MSAT has been
most preferably used among MST methods. EPF-MST-based decomposition disintegrates
an imaging modality into three layers: one base layer and two scale layers. Suitable strategies are
used to merge these layers. Lastly, the output image is reconstructed using the inverse
process. Generally, EPF methods use filters such as bilateral (BF), Gaussian (GF),
curvature (CF), and co-occurrence (CoF) filters [12–14].
The basic improvements of the proposed MIF framework are listed below:
1. For an effective fusion task, a new energy attribute-based merging strategy is
presented.
2. The application of the energy attribute function to merge low-pass sub-bands
preserves most of the source modalities' energy in the fused image.
3. The application of parameter adaptive-PCNN to merge band-pass sub-bands
extracts all structural details by properly estimating the prominence of coeffi-
cients.
Experiments have been carried out on diseases such as glioma, Alzheimer’s, and
metastatic bronchogenic carcinoma. This fused image is taken as reference for diag-
nosticians in order to assess the disorder level of disease and planning treatment. Our
proposed method proved its robustness in finding the disorders using both quantita-
tive and subjective assessments. In Sect. 2, the most prominent works used for MIF
task have been given. Section 3 contains the step-wise algorithm of proposed NSST-
EA-PAPCNN. In Sect. 4, proposed has been evaluated using metrics and compared
to analyze its performance. Lastly, in Sect. 5, the basic conclusions of proposed work
have been presented.

2 Preliminaries

The NSST decomposition and the new energy attribute-based strategy for the MMIF
scheme are explained here.

2.1 NSST Decomposition

NSST is one of the MSATs, presented by Easley. It merges the
non-subsampled pyramid transform with distinct shearing filters, which produces
multi-scale and directional characteristics. Because of the superiority of its basis
functions, NSST outperforms the most popularly used MSATs and is therefore most
frequently used in MMIF. For dimension n = 2, the required shearlet function conditions are satisfied [15].
For our convenience, NSST and its inverse are represented using two functions
as follows:

{L n , Hn } = nsst_de(I ) (1)

F = nsst_re(L n , Hn ) (2)

where nsst_de( ) and nsst_re( ) indicate NSST decomposition and reconstruction,
respectively.
I and F denote input and fused images;
L n and H n indicate low-frequency and high-frequency sub-bands, respectively.

2.2 Energy Attribute (EA)

The low-pass sub-bands are merged using the energy attribute function (EA), calculated
using the steps mentioned in [11].
1. First, the mean (M) and median (med) values of each low-pass sub-band LP1, LP2
are calculated and denoted as M 1 , M 2 , med1 , med2 , respectively.
2. The intrinsic property values of corresponding bands are given as:

IP1 = M1 + med1 (3)

IP2 = M2 + med2 (4)

3. The energy attribute function E 1 and E 2 are measured as:

E1(a, b) = exp(α |LP1(a, b) − IP1|) (5)



E2(a, b) = exp(α |LP2(a, b) − IP2|) (6)

where α = modulation parameter and exp() = exponential operator.

3 Proposed Method

The steps of the proposed MMIF algorithm are listed below, and its block diagram
is shown in Fig. 1:
1. Read the two multimodal imaging modalities which have been considered for
MMIF process and are aligned, with 256 rows and columns.
2. Then, each input is decomposed using NSST, providing a set of low-pass and
band-pass sub-bands denoted as {LP1, BP1} and {LP2, BP2}.
3. The low-pass sub-bands are merged using the new energy attribute EA
given in Eqs. (5) and (6):

LPF(a, b) = [LP1(a, b) · EA1(a, b) + LP2(a, b) · EA2(a, b)] / [EA1(a, b) + EA2(a, b)] (7)

4. The band-pass sub-bands are merged using PA-PCNN; mathematically (for details, see [15, 16]):

BPF(a, b) = BP1(a, b), if N1[T] ≥ N2[T]; BP2(a, b), otherwise (8)

where T is the total iteration count, and N1, N2 are the corresponding firing times of
each band-pass sub-band.
5. Applying inverse for LPF (a, b) and BPF (a, b) gives the fused output IF(a, b).

Fig. 1 Proposed MMIF block diagram: the source images are decomposed by NSST; the low-pass
sub-bands are fused with the energy attribute (EA) rule and the band-pass sub-bands with
PA-PCNN, giving the fused image IF(a, b)
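As a concrete companion to steps 1–3 above, the following sketch implements the low-pass fusion rule of Eqs. (3)–(7) in Python/NumPy; the value of the modulation parameter α is an assumption here, since the text does not fix it.

import numpy as np

def ea_fuse_lowpass(lp1, lp2, alpha=0.5):
    # Fuse two low-pass sub-bands with the energy attribute rule
    ip1 = lp1.mean() + np.median(lp1)        # intrinsic property, Eq. (3)
    ip2 = lp2.mean() + np.median(lp2)        # intrinsic property, Eq. (4)
    ea1 = np.exp(alpha * np.abs(lp1 - ip1))  # energy attribute, Eq. (5)
    ea2 = np.exp(alpha * np.abs(lp2 - ip2))  # energy attribute, Eq. (6)
    return (lp1 * ea1 + lp2 * ea2) / (ea1 + ea2)  # weighted fusion, Eq. (7)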



4 Experimental Results

The experiments are carried out on modality pairs affected by disorders such as Alzheimer's
disease: (MRI, SPECT), (MRI, PET), (T1-, T2-weighted MRI), and (CT, MRI);
all images [17] are given in Fig. 2. Our algorithm was validated on a dataset with modali-
ties of several diseases, namely glioma, Alzheimer's, and metastatic bronchogenic
carcinoma. The five methods NSCT-SF-PCNN [18], NSCT-RPCNN [19],
NSCT-IT2FS [20], LLF-IOI [21], and CSMCA [22] are considered as comparative
methods, and the results of these schemes are shown in Figs. 3a–f, 4a–f, 5a–f and 6a–f.
Metrics for each method are shown in Tables 1, 2, 3 and 4. The values
observed to be the highest among all the methods are shown in bold. Feature-based
mutual information with no reference (NFMI) measures the transfer of features into the
composite image in an effective manner [23, 24]. Xydeas and Petrovic proposed QG to
measure edge preservation in the output image. Zhao et al. and Liu et al. proposed QP to
measure features using phase congruency. Normalized MI (QMI) measures the informa-
tion contained in the fused image that is taken from the merging of the modalities. QNCIE
gives the degree of dependence of the output data on the input modalities. QW measures
similarity by considering the structural details and the loss of details. QCB is a visual
perception-based metric that measures the retention of salient and contrast information [25]. STD
is the most familiar metric for assessing image quality; it is easy to measure, and its value
should be high for better results [26].
The above results affirm that the fused image of SF-PCNN has poor quality by reason
of low contrast. NSCT-RPCNN fails to extract enough details from the source modali-
ties, resulting in a lack of significant details. Edges of the modalities are not preserved, so
the output image is blurred and not visibly good. Noise-like artifacts are observed in
the output of LLF-IOI, due to its failure in preserving structural details. CSMCA

Fig. 2 a, e CT and MRI, b, f CT and MR-T2, c, g MRI and PET, d, h MRI and SPECT

Fig. 3 a CT, b MR-T2, c SF-PCNN, d NSCT-RPCNN, e NSCT-IT2FS, f LLF-IOI, g CSMCA, h Proposed

Fig. 4 a MR-T1, b MR-T2, c SF-PCNN, d NSCT-RPCNN, e NSCT-IT2FS, f LLF-IOI, g CSMCA, h Proposed

method normally performs ably, but intensity inconsistencies are still seen in
some regions of the output image. Our method gives accurate results by preserving
most of the input modalities' details and energy and by reducing noisy artifacts.
STD 85.628, QMI 0.7051, QG 0.6412, and QW 0.8165 hold first place for
our method; the remaining metrics are almost in second and third places with respect

Fig. 5 a MRI, b PET, c SF-PCNN, d NSCT-RPCNN, e NSCT-IT2FS, f LLF-IOI, g CSMCA, h Proposed

Fig. 6 a MRI, b SPECT, c SF-PCNN, d NSCT-RPCNN, e NSCT-IT2FS, f LLF-IOI, g CSMCA, h Proposed

to the comparative methods. These bolded metrics indicate that the transfer of edges,
information, and structural details is greater.
The results for T1-, T2-weighted MRI images are similar to those for CT with MRI.
The spatial frequency-motivated PCNN and reduced PCNN in NSCT domain methods do
not perform up to the mark and so give lower quality output. The interval type-2 fuzzy
sets-based method output is blurred and does not have enough good details. LLF-IOI gives

Table 1 Objective assessment of distinct fusion schemes for CT and MR-T2


Metrics [18] [19] [20] [21] [22] Proposed
STD 83.20 0.349 0.255 0.369 76.83 85.628
NFMI 0.882 0.871 0.878 0.851 0.876 0.8618
QNCIE 0.806 0.808 0.808 0.805 0.807 0.8063
QMI 0.687 0.673 0.653 0.563 0.680 0.7051
QG 0.286 0.059 0.314 0.167 0.599 0.6412
QP 0.357 0.280 0.232 0.179 0.494 0.299
QW 0.431 0.436 0.997 0.440 0.792 0.8165
QCB 0.559 0.272 0.161 0.531 0.628 0.6173

Table 2 Objective assessment of distinct fusion schemes for MR-T1 and MR-T2
Metrics [18] [19] [20] [21] [22] Proposed
STD 73.74 0.305 0.229 80.89 69.37 75.345
NFMI 0.861 0.862 0.861 0.851 0.865 0.849
QNCIE 0.8091 0.809 0.8081 0.8093 0.808 0.8088
QMI 0.7781 0.807 0.7065 0.8114 0.759 0.7759
QG 0.385 0.058 0.3674 0.3617 0.744 0.6939
QP 0.2568 0.312 0.3165 0.1527 0.524 0.3264
QW 0.4998 0.506 0.9979 0.547 0.825 0.8294
QCB 0.6332 0.242 0.1771 0.6204 0.696 0.6754

Table 3 Objective assessment of distinct fusion schemes for MRI and PET
Metrics [18] [19] [20] [21] [22] Proposed
STD 0.4988 8.909 0.218 0.271 0.211 68.19
NFMI 0.778 0.806 0.865 0.834 0.851 0.852
QNCIE 0.804 0.806 0.807 0.807 0.804 0.808
QMI 0.4869 0.761 0.737 0.736 0.124 0.771
QG 0.4624 0.529 0.579 0.382 0.532 0.511
QP 0.1948 0.031 0.414 0.225 0.125 0.263
QW 0.9621 0.656 0.998 0.517 0.994 0.515
QCB 0.3995 0.618 0.145 0.656 0.667 0.681

poor quality details and fails to provide noise-free output. CSMCA fails to preserve
edges and structures. Our method performs very well with respect to the preservation of
structural and edge information without any artifacts.
Only two values, STD 75.345 and QG 0.6939, hold first place for our
method. The metrics QP and QW stand in second place, and still,

Table 4 Objective assessment of distinct fusion schemes for MRI and SPECT
Metrics [18] [19] [20] [21] [22] Proposed
STD 0.4975 10.137 0.202 0.277 0.182 64.538
NFMI 0.8012 0.8094 0.870 0.823 0.856 0.8708
QNCIE 0.8047 0.8075 0.808 0.806 0.804 0.8088
QMI 0.6364 0.8109 0.733 0.705 0.102 0.804
QG 0.4906 0.4995 0.565 0.382 0.499 0.483
QP 0.284 0.0223 0.404 0.352 0.279 0.4203
QW 0.9713 0.5989 0.998 0.492 0.989 0.4915
QCB 0.431 0.6499 0.217 0.71 0.59 0.699

the remaining ones are in third rank. However, the results are objectively good and subjectively
very good, as the method achieves high robustness.
The above results affirm that the fused image of SF-PCNN produces good
details with little color distortion. In the NSCT-RPCNN output, details are not
extracted well, so the quality is the worst, and this method does not work for the
integration of color images. The NSCT-IT2FS method extracts a good amount of structural
and spatial information and functional details from MRI and PET, but some white regions
still have visual inconsistencies. LLF-IOI suffers from a color fidelity issue, so its result
is poor. CSMCA fails to avoid the artifact and color distortion problems. Our
method achieves almost the highest visual quality with respect to both color preservation
and low color distortion among the methods (Fig. 5a–h).
The values of our method are STD 68.19, QMI 0.771, QNCIE 0.808, and QCB 0.681,
indicating that the fused image has good quality based on human perception, a good amount of
color information, and fewer artifacts with good visible consistency.
Figure 6 shows the MRI and SPECT fusion results; it is observed that SF-PCNN,
NSCT-RPCNN, and NSCT-IT2FS do not preserve color fidelity well. The
LLF-IOI method loses the prominent functional information of SPECT as it
overly enhances the anatomical details of MRI. CSMCA fails to give a fused image
with color details and functional and anatomical information. Our method achieves
great visual consistency in all regions with good color preservation and almost no
artifacts or color distortion.
The metrics except QMI, QG, and QW are bolded for the proposed method, with STD 64.538,
NFMI 0.8708, QNCIE 0.8088, QP 0.4203, and QCB 0.699. These values depict that our
method takes first rank for almost all metrics. Thus, our method achieves great visual
consistency in all regions with good color preservation and almost no artifacts or color
distortion.

5 Conclusion

A new energy attribute-based activity measure and PA-PCNN for merging
medical modalities with NSST has been presented. Firstly, the NSST decomposition is
applied to the input images; then, the low-pass sub-band coefficients are selected using the
energy attribute function. The band-pass sub-bands are selected using PA-PCNN. Finally,
the inverse NSST on the fused coefficients provides the final fused image. This fused image is
taken as a reference for diagnosticians in order to assess the disorder level of a disease
and to plan treatment. Our proposed method proved its robustness in finding the
disorders by giving good performance both quantitatively and subjectively.

Ethics Approval and Consent to Participate Not applicable.

Human and Animal Rights No animals/humans were used for studies that are the basis of this
research.

Consent for Publication Not applicable.

Availability of Data and Materials The authors confirm that the data supporting the findings of
this research are available within the article.

Funding None.

Conflict of Interest The authors declare no conflict of interest, financial or otherwise.

Acknowledgements Declared none.

References

1. A.P. James, B.V. Dasarathy, Medical image fusion: a survey of the state of the art. Inf. Fusion
19(1), 4–19 (2014)
2. F. El-Zahraa, A. El-Gamal, Current trends in medical image registration and fusion.
Egypt. Inform. J. 17(1), 99–124 (2016)
3. S. Li, X. Kang, L. Fang, Pixel-level image fusion: a survey of the state of the art. Inf. Fusion
33(1), 100–112 (2017)
4. T. Tirupal, B. Chandra Mohan, S. Srinivas Kumar, Multimodal medical image fusion techniques—
a review. Curr. Signal Transduction Ther. (2020)
5. R. Vijayarajan, S. Muttan, Discrete wavelet transform based principal component averaging
fusion for medical images. AEU 69(6), 896–902 (2015)
6. R. Srivastava, O. Prakash, A. Khare, Local energy-based multimodal medical image fusion in
curvelet domain. IET Comput. Vis. 10(6), 513–527 (2016)
7. G. Bhatnagar, Q.M.J. Wu, Z. Liu, Directive contrast based multimodal medical image fusion
in NSCT domain. IEEE Trans. Multimedia 15(5), 1014–1024 (2013)
8. G. Guorong, X. Luping, F. Dongzhu, Multi-focus image fusion based on non-subsampled
Shearlet transform. IET Image Process. 7(6), 633–639 (2013)
9. V. Bhateja, H. Patel, A. Krishna, A. Sahu, A. Lay-Ekuakille, Multimodal medical image sensor
fusion framework using cascade of wavelet and contourlet transform domains. IEEE Sens. J.
15(12), 6783–6790 (2015)

10. K. Vanitha, D. Satyanarayana, M.N. Giri Prasad, A new hybrid approach for multi-modal medical
image fusion. JARDCS 12(3), 221–230 (2018)
11. W. Huang, Z. Jing, Evaluation of focus measures in multi-focus image fusion. Pattern Recogn.
Lett. 28(4), 493–500 (2007)
12. D.P. Bavirisetti, R. Dhuli, Fusion of MRI and CT images using guided image filter and image
statistics. Int. J. Imaging Syst. Technol. 27(3), 227–237 (2017)
13. W. Tan, P. Xiang, J. Zhang, H. Zhou, H. Qin, Remote sensing image fusion via boundary
measured dual-channel pcnn in multiscale morphological gradient domain. IEEE Access 8,
42540–42549 (2020)
14. Z. Zhou, B. Wang, S. Li, M. Dong, Perceptual fusion of infrared and visible images through
a hybrid multi-scale decomposition with Gaussian and bilateral filters. Inf. Fusion 30, 15–26
(2016)
15. K. Vanitha, D. Satyanarayana, M.N. Giri Prasad, Medical image fusion algorithm based on
weighted local energy motivated PAPCNN in NSST domain. JARDCS 12(3), 960–967 (2020)
16. M. Yin, X. Liu, Y. Liu, X. Chen, Medical image fusion with parameter-adaptive pulse coupled
neural network in non subsampled shearlet transform domain. IEEE Trans. Instrum. Meas.
68(1), 49–64 (2018)
17. www.med.harvard.edu/AANLIB/
18. X.B. Qu, J.W. Yan, H.Z. Xiao, Z.Q. Zhu, Image fusion algorithm based on spatial frequency-
motivated pulse coupled neural networks in non subsampled contourlet transform domain. Acta
Autom. Sin. 34(12), 1508–1514 (2008)
19. S. Das, M.K. Kundu, A neuro-fuzzy approach for medical image fusion. IEEE Trans. Biomed.
Eng. 60(12), 3347–3353 (2013)
20. Y. Yang, Y. Que, S. Huang, P. Lin, Multimodal sensor medical image fusion based on type-2
fuzzy logic in NSCT domain. IEEE Sens. J. 16(10), 3735–3745 (2016)
21. J. Du, W. Li, B. Xiao, Anatomical-functional image fusion by information of interest in local
Laplacian filtering domain. IEEE Trans. Image Process. 26(12), 5855–5866 (2017)
22. Y. Liu et al., Medical image fusion via convolutional sparsity based morphological component
analysis. IEEE Signal Process. Lett. 26(3), 485–489 (2019)
23. C.S. Xydeas, V. Petrovic, Objective image fusion performance measure. Electron. Lett. 36(4),
308–309 (2000)
24. M.B.A. Haghighat, A. Aghagolzadeh, H. Seyedarabi, A non-reference image fusion metric
based on mutual information of image features. Comput. Elect. Eng. 37(5), 744–756 (2011)
25. Z. Liu et al., Objective assessment of multi-resolution image fusion algorithms for context
enhancement in night vision: a comparative study. IEEE Trans. Pattern Anal. Mach. Intell.
34(1), 94–107 (2012)
26. P. Jagalingam, A.K. Hegde, A review of quality metrics for fused image. Aquat. Procedia 4,
133–142 (2015)
Electrical Shift and Linear Trend
Artifacts Removal from Single Channel
EEG Using SWT-GSTV Model

Sayedu Khasim Noorbasha and Gnanou Florence Sudha

Abstract Electrical shift and linear trend (ESLT) artifact is often present in recorded
electroencephalogram (EEG) due to electrode fluctuations or a sudden drop in skin
touch, which adversely affects the exact estimation of cerebrum activities in brain-
computer interfacing (BCI) applications. In this work, a novel model was proposed
by combining stationary wavelet transform (SWT) with group sparsity total variation
(GSTV) filter, denoted SWT-GSTV to remove the ESLT artifacts. SWT method was
used to decompose the interfered single channel EEG into several frequency bands.
The contaminated sub-band signal is applied to GSTV filter to estimate the artifact
signal. This estimated artifact signal is subtracted from the contaminated sub-band
signal, and it gives the filtered sub-band signal. Then, the filtered sub-band signal was
added back to the remaining decomposed components of SWT, which produce the
final denoised EEG signal. Matlab simulations demonstrated that the proposed
method outperformed the existing methods by exhibiting a high CC, a low RRMSE,
and the least MAE in the α band.

Keywords EEG · ESLT artifact · Group sparsity total variation

1 Introduction

EEG signals are commonly used to analyze neurological diseases like sleep disor-
ders and epilepsy, and in applications of BCI [1]. Either physiological artifacts (motion,
electrooculogram (EOG), electrocardiogram (ECG), etc.) or non-physiological
artifacts (surrounding high-frequency noise, power line interference, etc.) are
often involved in the recorded EEG signals [2]. The involvement of these artifacts
adversely affects the exact estimation of cerebrum activities [3]. Several algorithms

S. K. Noorbasha (B) · G. F. Sudha


Department of Electronics and Communication Engineering, Pondicherry Engineering College,
Puducherry 605014, India
e-mail: sayedukhasim@pec.edu
G. F. Sudha
e-mail: gfsudha@pec.edu


[4–7] are proposed in the literature to suppress physiological artifacts from the inter-
fered EEG. However, there are not many focused algorithms in the literature to remove
non-physiological artifacts, specifically the electrical shift and linear trend (ESLT) [8].
The causes of ESLT artifacts might be a sudden drop in electrode skin contact, fluc-
tuations in the electrode, or transient current drifts due to triggering effects, etc. [9–11].
In [9], FastICA, Infomax, and second-order blind identification (SOBI) were imple-
mented for EEG denoising with satisfactory ESLT elimination. Following that, two
more fully automated methods for ESLT artifact removal were presented: automatic
wavelet-ICA (AWICA) [10] and enhanced AWICA (EAWICA) [11].
Recently, stationary wavelet transform (SWT)—kurtosis-based method for ESLT
artifact removal has been proposed in [12]. In this, SWT with thresholding is used to
remove the ESLT artifacts, and a kurtosis-based strategy is used to select the optimal
decomposition level of SWT to reach the artifact components. The limitation is that
thresholding leads to some loss of wanted EEG in the reconstruction of the signal due
to its non-stationary behavior [4], which is not acceptable for biomedical applications
like BCI.
In this work, a novel model was proposed by combining stationary wavelet trans-
form (SWT) with group sparsity total variation (GSTV) filter, denoted SWT-GSTV
to remove the ESLT artifacts. SWT method was used to decompose the interfered
single channel EEG into several frequency bands. The contaminated sub-band signal
is applied to GSTV filter to estimate the artifact signal. This estimated artifact signal
is subtracted from the contaminated sub-band signal, and it gives the filtered sub-
band signal. Then, the filtered sub-band signal was added back to the remaining
decomposed components of SWT, which produce the final denoised EEG signal.
The orientation of the paper shall be described as follows. A short description of the
methods and the databases used was discussed in Sect. 2. The experimental results
are discussed in Sect. 3. The conclusion of the paper is provided in Sect. 4.

2 Methods and Data for Experimentation

2.1 SWT

The orthogonality property of the wavelet transform makes it an important method
for performing filtering. The regular discrete wavelet transform (DWT) fails to
be translation-invariant; hence, SWT is used instead of DWT [13]. The functioning
of SWT is the same as that of the regular DWT, except that SWT does not perform the
decimation process. The SWT coefficients are evaluated as in [14]


Ci, j = Σ_{k=1}^{N} y(k) ψi, j(k) (1)

where ψi, j(k) is the SWT basis function, indicated as

ψi, j(k) = 2^(−i/2) ψ0,0(2^(−i)(k − j)) (2)

The approximation, Ai, j(k), and detail, Di, j(k), coefficients are

Ai, j(k) = [↑2^(i−1) L1] ∗ Ai−1, j (3)

Di, j(k) = [↑2^(i−1) H1] ∗ Di−1, j (4)

where ↑2^(i−1) L1 = Li(k) and ↑2^(i−1) H1 = Hi(k) denote the upsampled versions of the low-
pass filter coefficients L^(i−1)(k) and the high-pass filter coefficients H^(i−1)(k),
respectively.

2.2 GSTV

The total variation (TV) filter is an effective tool used for several applications
like decomposition, deconvolution, and denoising [15]. However, the drawback of
the TV filter is that the filtered signals contain staircase components. To avoid this
drawback, the group sparsity (GS) technique is combined with the TV filter to yield the
GSTV filter. From Fig. 1, let us consider the approximation component of the sixth
level, A6(k) = g(k) + h(k), where g(k) is the unknown artifact signal and h(k) is the
wanted signal. The estimation of the unknown artifact signal by the GSTV filter is given as
in [16],

Fig. 1 Flowchart representation of proposed SWT-GSTV model


 
GSTV(Xfilter) = Min { (1/2) ‖A6(k) − g(k)‖₂² + λ Φ(Dg(k)) } (5)

where λ and D indicate the regularization parameter and the first-order difference
matrix, whose size is (N − 1) * N, respectively. Let us consider the group size of
sparsity as N-point and its vector, R is represented as in [16],

R K ,N = [R(K ), . . . , R(K + N − 1)] (6)

where K is the index of the sparsity group. The sparsity function Φ, in terms
of the N-point vector, is indicated [16] as below.

Φ(R) = Σ_K [ Σ_{n=0}^{N−1} |R(n + K)|² ]^(1/2) (7)

2.3 Proposed Model

The flowchart representation of the proposed model is shown in Fig. 1.

In this model, the SWT method is used to decompose the interfered single channel EEG,
y(k), into several frequency bands. The contaminated approximation sub-band signal,
A6, is applied to the GSTV filter to estimate the artifact signal, Xfilter(k). This estimated
artifact signal, Xfilter(k), is subtracted from the contaminated approximation sub-band
signal, A6, to give the filtered sub-band signal, X(k), as follows:

X (k) = A6 − X filter (k) (8)

Then, the filtered sub-band signal, X (k) is added back to the remaining decom-
posed components of SWT, which produces the final denoised EEG signal, x̃(k)
as,

x̃(k) = D1 + D2 + D3 + D4 + D5 + D6 + X (k) (9)

where D1 , D2 , D3 , D4 , D5 , and D6 are the detailed coefficients of SWT.
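To make the flow of Fig. 1 concrete, a compact Python sketch using PyWavelets is given below. The mother wavelet is an assumption (the text does not name one), and the GSTV step is stood in for by plain 1-D TV denoising solved with iterative clipping, which is the special case of GSTV with group size N = 1; λ = 1.5 follows the setting used later in the experiments.

import numpy as np
import pywt  # PyWavelets

def tv_denoise(y, lam, n_iter=100):
    # Solves min_g (1/2)||y - g||^2 + lam*||Dg||_1 by iterative clipping;
    # stands in for the full GSTV filter (group size N = 1)
    z = np.zeros(len(y) - 1)
    for _ in range(n_iter):
        g = y - np.concatenate(([-z[0]], -np.diff(z), [z[-1]]))  # y - D'z
        z = np.clip(z + np.diff(g) / 4.0, -lam, lam)  # step 1/4 <= 1/eig_max(DD')
    return g  # estimated smooth ESLT trend, i.e., X_filter(k)

def swt_gstv(eeg, wavelet="db4", level=6, lam=1.5):
    # len(eeg) must be a multiple of 2**level for pywt.swt
    coeffs = pywt.swt(eeg, wavelet, level=level)  # [(cA6, cD6), ..., (cA1, cD1)]
    a6 = coeffs[0][0]                  # contaminated approximation sub-band A6
    x = a6 - tv_denoise(a6, lam)       # filtered sub-band, Eq. (8)
    return sum(cd for _, cd in coeffs) + x  # reconstruction as written in Eq. (9)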

2.4 Database Used

For the purposes of experimentation, both simulated and real-time databases are
considered, from CHB-MIT scalp EEG [17] and EEG-LAB [18], respectively. The
sampling frequencies of these databases are 256 Hz and 128 Hz, respectively.

2.5 Performance Measures

To measure the effectiveness of the proposed SWT-GSTV method, the following
performance metrics are used:
1. The relative root mean square error (RRMSE) [5, 6] is defined as

RRMSE = RMS(a − x̃) / RMS(a) (10)

where a and x̃ are clean EEG and denoised EEG signals, respectively.
2. Mean absolute error (MAE) is given as [5, 6]
MAE = ( Σ_{k=n}^{m} |pc(k) − pe(k)| ) / (n − m) (11)

where pc(k) indicates the power spectrum of the denoised EEG; pe(k) represents the
power spectrum of the interfered EEG, and n − m indicates the range of frequencies.
This parameter should be as small as possible. To identify the capability
of restoring the original EEG in the denoised signal, another metric, the correlation
coefficient (CC), is also computed between the clean EEG, a(k), and the denoised EEG,
x̃(k).
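A minimal NumPy rendering of these measures (array names are illustrative) reads:

import numpy as np

def rrmse(a, x_tilde):
    # Eq. (10): RMS of the residual relative to the RMS of the clean EEG
    rms = lambda s: np.sqrt(np.mean(s ** 2))
    return rms(a - x_tilde) / rms(a)

def cc(a, x_tilde):
    # correlation coefficient between clean and denoised EEG
    return np.corrcoef(a, x_tilde)[0, 1]

def mae(pc, pe, band):
    # Eq. (11): mean absolute error between two power spectra, restricted
    # to the frequency bins of the band of interest (e.g., the alpha band)
    return np.mean(np.abs(pc[band] - pe[band]))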

3 Results

3.1 Simulated Data Results

The pure EEG, the ESLT artifact, and their summation producing the interfered EEG are
shown in Fig. 2a–c, respectively.

Fig. 2 a Pure EEG, b ESLT artifact, and c interfered EEG



Simulations were carried out in Matlab. The interfered EEG is applied to the
SWT stage of the proposed method with six-level decomposition. In this paper, the
number of SWT decomposition levels is set to six, because most of the
ESLT artifact appears in the sixth-level approximation sub-band, A6, compared
to the other decomposition levels. For the GSTV filter, a group sparsity size of
N = 3 and a regularization parameter λ = 1.5 are chosen for effective
performance.
The SWT decomposed detailed and approximation sub-bands are shown in Fig. 3.
The approximation sub-band, A6, is applied to the GSTV filter to extract the ESLT
artifact. The ESLT artifact extracted by the GSTV filter is shown in Fig. 4b. This artifact
is subtracted from the approximation sub-band, A6, which produces the residue of the
wanted EEG component as shown in Fig. 4c. Then, this residue of EEG is added

Fig. 3 SWT decomposed detailed and approximation sub-bands. a D1, b D2, c D3, d D4, e D5, f D6, and g A6, respectively

Fig. 4 a Approximation sub-band, A6 , b estimated ESLT artifact by GSTV filter, c residue of EEG,
and d denoised EEG

Fig. 5 Averaged performance measures with respect to SNR (dB). a RRMSE, b CC

back to the remaining detailed sub-bands of SWT, which yields the final denoised
EEG as shown in Fig. 4d.
Both the proposed and existing methods are applied to ninety-five records of ESLT-
artifact-interfered EEG with varying SNR values. The performance measures
RRMSE and CC are calculated, and the averaged measures with their
standard deviations are plotted as shown in Fig. 5.

3.2 Real-Time Data Results

Five real-time records of five seconds duration are shown in Fig. 6.
Similar to the simulated data results, the real-time data records are applied to both the
proposed and existing methods, and denoised signals are obtained. To evaluate the
performance of these methods, power spectral density (PSD) plots are drawn for the

Fig. 6 Real-time EEG signals. a Record 1, b Record 2, c Record 3, d Record 4, and e Record 5

Fig. 7 PSD plots of a interfered EEG (blue), b denoised EEG by EAWICA (black), c denoised
EEG by SWT-kurtosis (magenta), and d denoised EEG by proposed method (green)

Table 1 MAE of proposed and existing methods in the α band (8–12 Hz)

Record number EAWICA [11] SWT-kurtosis [12] Proposed SWT-GSTV
1 0.1035 0.0717 0.0112
2 0.1986 0.1517 0.0491
3 0.0654 0.0455 0.0191
4 0.3194 0.1768 0.0527
5 0.0827 0.0538 0.0091
Average 0.1539 0.0999 0.0282

denoised signals, and the mean absolute error (MAE) in the α band (8–12 Hz) is
calculated. The PSD plots of the interfered EEG and the denoised EEGs of the proposed
and existing methods with respect to Record 1 are shown in Fig. 7.
From Table 1, it is noticed that the averaged MAE of the proposed method is the least
compared to the existing methods. This means the recovery of the α band (8–12 Hz)
component in the denoised EEG by the proposed method is satisfactory compared
to the existing methods, which is very crucial for BCI applications.
In the existing EAWICA [11] process, the DWT is first applied to decompose the
interfered signal, and then, these decomposed signals are fed into ICA for artifact
removal. The disadvantage of this approach is that it needs certain predefined artifact
markers to distinguish the correct artifact. In the method [12], SWT with thresholding
is used to remove the ESLT artifacts, and a kurtosis-based strategy is used to select the
optimal decomposition level of SWT to reach the artifact components. The limitation
is that the thresholding leads to some loss of wanted EEG in the reconstruction of
the signal due to its non-stationary behavior [4]. The proposed approach overcomes
the drawbacks of the existing algorithms, EAWICA [11] and SWT-kurtosis [12],
and has an overall enhanced MAE performance measure, with decrements of 0.1257
and 0.0717, respectively.

4 Conclusion

In this work, a novel SWT-GSTV model has been proposed to remove the ESLT
artifact from the interfered EEG. In this model, the SWT method was used to decompose the
interfered single channel EEG into several frequency bands. The contaminated sub-
band signal is applied to GSTV filter to estimate the artifact signal. This estimated
artifact signal is subtracted from the contaminated sub-band signal, to give the filtered
sub-band signal. Then, the filtered sub-band signal was added back to the remaining
decomposed components of SWT, which produces the final denoised EEG signal.
Simulation results demonstrate that the proposed method outperforms the existing
methods with low RRMSE, MAE, and high CC.

References

1. M. Gorgoni, S. Scarpelli, F. Reda, L. De Gennaro, Sleep EEG oscillations in neurodevelop-


mental disorders without intellectual disabilities. Sleep Med. Rev. 49 (2020)
2. N. Alharbi, A novel approach for noise removal and distinction of EEG recordings. Biomed.
Signal Process. Control 39, 23–33 (2018)
3. S.K. Noorbasha, G.F. Sudha, Removal of EOG artifacts and separation of different cerebral
activity components from single channel EEG—an efficient approach combining SSA-ICA
with wavelet thresholding for BCI applications. Biomed. Signal Process. Control 63 (2021)
4. S.K. Noorbasha, G.F. Sudha, Removal of motion artifacts from EEG records by overlap
segmentation SSA with modified grouping criteria for portable or wearable applications, in
Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, ed. by
V.S. Reddy, V.K. Prasad, J. Wang, K.T.V. Reddy, vol. 1325 (Springer, Singapore, 2021). http://
doi.org/10.1007/978-981-33-6912-2_36
5. S.K. Noorbasha, G.F. Sudha, Removal of EOG artifacts from single channel EEG—an efficient
model combining overlap segmented ASSA and ANC. Biomed. Signal Process. Control 60
(2020)
6. S.K. Noorbasha, G.F. Sudha, Joint singular spectrum analysis and generalized Moreau envelope
total variation for motion artifact removal from single channel EEG signals. Biomed. Signal
Process. Control 68 (2021)
7. X. Chen, A. Liu, J. Chiang, Z.J. Wang, M.J. McKeown, R.K. Ward, Removing muscle artifacts
from EEG data: multichannel or single-channel techniques? IEEE Sens. J. 16(7), 1986–1997
(2016)
8. N. Bajaj, J.R. Carrión, F. Bellotti, R. Berta, A.D. Gloria, Automatic and tunable algorithm for
EEG artifact removal using wavelet decomposition with applications in predictive modeling
during auditory tasks. Biomed. Signal Process. Control 55, 1–13 (2020)
9. A. Delorme, T. Sejnowski, S. Makei, Enhanced detection of artifacts in EEG data using higher-
order statistics and independent component analysis. Neuroimage 34, 1443–1449 (2007)
10. N. Mammone, F.L. Foresta, F.C. Morabito, Automatic artifact rejection from multichannel
scalp EEG by wavelet ICA. IEEE Sens. J. 12, 533–542 (2012)
11. N. Mammone, F.C. Morabito, Enhanced automatic wavelet independent component analysis
for electroencephalographic artifact removal. Entropy 16, 6553–6572 (2014)
12. M. Shahbakhti et al., SWT-kurtosis based algorithm for elimination of electrical shift and linear
trend from EEG signals. Biomed. Signal Process. Control 64 (2021)
13. R.R. Coifman, D.L. Donoho, Translation invariant denoising, in Lecture Notes in Statistics,
vol. 101 (1995), pp. 125–150

14. M. Merah, T.A. Abdelmalik, B.H. Larbi, R-peaks detection based on stationary wavelet
transform. Comput. Methods Programs Biomed. 121(3), 149–160 (2015)
15. A. Chambolle, An algorithm for total variation minimization and applications. J. Math. Imaging
Vis. 20, 89–97 (2004)
16. I.W. Selesnick, P.-Y. Chen, Total variation denoising with overlapping group sparsity, in IEEE
ICASSP, May 26–31, 2013, Vancouver, Canada
17. A. Shoeb, Application of machine learning to epileptic seizure onset detection and treatment,
Ph.D. Thesis (2009)
18. A. Delorme, S. Makeig, EEGLAB: an open-source toolbox for analysis of single-trial EEG
dynamics including independent component analysis. J. Neurosci. Methods 134, 9–21 (2004).
Available: http://sccn.ucsd.edu/eeglab/
Forecasting Hourly Electrical Energy
Output of a Power Plant Using
Parametric Models

Ch. V. Raghavendran, G. Naga Satish, Vempati Krishna, and R. V. S. Lalitha

Abstract Parametric models such as linear regression, polynomial regression, and the linear
support vector machine are used to model numerous systems that have a number of
features. The aim of this paper is to apply these models to a combined cycle
power plant to predict the electrical power output. This paper explains the stochastic
behavior of the parametric models and also validates the models with the tenfold
cross validation method. For the linear regression model, we made the assump-
tions of normality, linearity, homoscedasticity, independence, and multicollinearity
to validate the results of the model.

Keywords Regression · Linear regression · Polynomial regression · Metrics ·
Support Vector Machine

1 Introduction

In a combined cycle power plant, electricity is generated by gas and steam
turbines. This kind of plant generates more than 50% more power than a traditional power plant
[1, 2]. The electricity generated by the power plant oscillates due to a number of reasons,
including environmental conditions. Traditional mathematical models require a high
number of parameters to predict the actual system output [3, 4]. Instead of mathe-
matical models, machine learning (ML) models can be used for better predictions
even with few parameters [5].
The concept of ML is to make computers learn by themselves by adopting a
model instead of acting according to a program written by a programmer [6]. The
arrival of new data makes customized ML models understand, modify, and
improve themselves.

Ch. V. Raghavendran (B) · R. V. S. Lalitha


Aditya College of Engineering & Technology, Surampalem, Andhra Pradesh, India
G. Naga Satish
BVRIT HYDERABAD College of Engineering for Women, Hyderabad, Telangana, India
V. Krishna
TKR College of Engineering and Technology, Hyderabad, Telangana, India


In ML, supervised learning is a method in which the machine learns through
labeled data. Regression is a supervised learning method, and it is modeled to predict
a continuous feature. A multivariate regression problem is a special case of the regression
problem with more than one input. In this paper, we have implemented three para-
metric models—linear regression, polynomial regression, and the linear support vector
machine—on the power plant dataset. The models' accuracy is verified with tenfold
cross validation techniques to test whether they are overfitted or underfitted. In the
literature, we find a few research papers that implemented ANN models to forecast
the electrical energy of power plants [7, 8].
This paper has six sections. Parametric models are discussed in Sect. 2,
and the dataset and visualizations are presented in Sect. 3. ML models are applied in
Sect. 4, and the results are examined in Sect. 5. Section 6 concludes the paper.

2 Parametric Models

A learning model that summarizes data with a set of parameters of fixed size is termed a
parametric model. Irrespective of the volume of data given to a parametric model,
it will not change the number of parameters it requires. Examples of
parametric learners include linear models like linear regression, logistic regression,
and the linear support vector machine (LSVM).

2.1 Linear Regression

Linear regression determines a plane that minimizes the sum of squared errors (SSE)
between the observed and predicted responses [9]. Now, we model linear regression to
predict the target variable PE. We will start with simple linear regression, which is
used for forecasting a continuous result.
• Simple linear regression works on only one independent variable
• Multiple linear regression works on more than one independent variable.
The following equation is used by linear regression to calculate the target feature:

y = β0 + β1 x1 + β2 x2 + · · · + βn xn (1)

where y is the response; β0 is the intercept; β1 is the coefficient for x1; and βn is the coefficient for
xn.

2.2 Polynomial Regression

A variation of linear regression that builds regression models on poly-
nomial equations is called polynomial regression. In this regression, a polynomial is
fitted based on the association between the input features and the predicted variable.
The following equation is a kth-order polynomial model in one variable:

y = β0 + β1 x + β2 x 2 + · · · + βk x k + ε (2)

If we set xj = x^j, j = 1, 2, 3, …, k, the model becomes a multiple linear regression model in
the k explanatory variables x1, x2, x3, …, xk. From this, we can understand that the linear
regression model y = Xβ + ε also contains the polynomial regression model.
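With scikit-learn, this reduction can be exercised directly: expanding the inputs with PolynomialFeatures and fitting an ordinary linear model realizes Eq. (2). The degree of 2 is an illustrative assumption, and the train/test splits are the ones prepared in Sect. 4.1.

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# polynomial expansion followed by an ordinary least-squares fit
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(x_train, y_train)
print(poly_model.score(x_test, y_test))  # R^2 on the held-out data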

2.3 Support Vector Machine (SVM)

This is a discriminative algorithm formally defined by a separating hyperplane. In
SVM, a hyperplane in a high-dimensional space is built for classification and regres-
sion. The objective of the SVM algorithm is to generate the best line or decision
boundary that can separate an n-dimensional space into classes so that we can easily
put novel data points in the correct class in the future. This boundary
is called a hyperplane. SVM selects the extreme points that help in generating the
hyperplane. These extreme points are termed support vectors, and because of this,
the algorithm is named the support vector machine.
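For regression on this dataset, the linear SVM takes the form of support vector regression; a minimal scikit-learn sketch follows, in which the hyperparameter values (epsilon, C) are illustrative assumptions.

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# feature scaling matters for SVMs; epsilon sets the insensitive-tube width
svr_model = make_pipeline(StandardScaler(), LinearSVR(epsilon=0.5, C=1.0))
svr_model.fit(x_train, y_train)
print(svr_model.score(x_test, y_test))  # R^2 on the held-out data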

2.4 Metrics

Evaluation metrics for classification problems, such as accuracy, are not useful for
regression problems, which need metrics for comparing continuous values. The
three common evaluation metrics for regression problems are as follows:
Mean Square Error—the mean of the squared errors; the formula is

MSE = (1/n) Σ (yactual − ypredicted)² (3)
Root Mean Square Error—the square root of the MSE; this tells us how
concentrated the data are around the line of best fit.

 
RMSE = √( Σ (yactual − ypredicted)² / n ) (4)

R2 Score—the proportion of the variance in the dependent variable that is predictable
from the independent variables; the formula is

R² = 1 − Σ (yactual − ypredicted)² / Σ (yactual − ymean)² (5)
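A direct NumPy transcription of Eqs. (3)–(5) is given below (the function name is illustrative):

import numpy as np

def regression_metrics(y_actual, y_predicted):
    # Return MSE, RMSE and the R^2 score of Eqs. (3)-(5)
    err = np.asarray(y_actual) - np.asarray(y_predicted)
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((np.asarray(y_actual) - np.mean(y_actual)) ** 2)
    return mse, rmse, 1.0 - ss_res / ss_tot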

3 Dataset and Visualizations

3.1 Understanding Dataset

The dataset for this analysis is gathered from a combined cycle power plant. It
is a collection of 6 years of power plant data with 9568 records. The dataset has four
independent features and one target feature. The features are the hourly
average temperature (AT), ambient pressure (AP), relative humidity (RH), exhaust
vacuum (V), and the net hourly electrical energy output of the power plant (PE).
The descriptive statistics of the dataset are shown in Table 1. From Table 1,
it is evident that all the features are continuous, but their ranges vary. The data are
to be normalized to 0–1 before applying any machine learning model to overcome
the variations in mean and standard deviation.

Table 1 Statistics of the dataset


AT V AP RH PE
Count 9568.000000 9568.0000000 9568.0000000 9568.0000000 9568.0000000
Mean 19.651231 54.305804 1013.259078 73.308978 454.365009
Std 7.452473 12.707893 5.938784 14.600269 17.066995
Min 1.810000 25.360000 992.890000 25.560000 420.260000
25% 13.510000 41.740000 1009.100000 63.327500 439.750000
50% 20.345000 52.080000 1012.940000 74.975000 451.550000
75% 25.720000 66.540000 1017.260000 84.830000 468.430000
Max 37.110000 81.560000 1033.300000 100.160000 495.760000

Fig. 1 Boxplots of all features

3.2 Visualization

Data visualization plays a vital role in analyzing datasets and understanding the insights
of the collected dataset. Visualization makes the data easier to understand through
various plots—boxplot, scatter plot, distribution plot, heat map, correlation chart,
pair plot, etc.
A boxplot is a standard way of presenting the spread of data based on a five-
number summary. The boxplots of all the continuous features are presented in
Fig. 1.
The distribution plot is the most convenient way to present the univariate distribution
of a feature; it shows the data as a histogram with a fitted kernel density estimate (KDE).
A histogram represents the distribution of data in the form of bins along
the range of the data. The KDE is useful for plotting the shape of a distribution; the
bell-like curve in the plot shows the density, which is a smoothed form of the histogram.
The y-axis is in terms of density, and the histogram is normalized by default so that
it has the same y-scale as the density plot. The distribution plots of the continuous
features in comparison with the target feature are shown in Fig. 2.
A scatter plot presents the relationship between two continuous features; it illustrates
how one feature varies with the other across the dataset.
The scatter plots of all the features are presented in Fig. 3.

3.3 Overview

From the plots, it is evident that the data in all the features are almost equally
distributed except for the feature AP. For this feature, it is observed that it has some

Fig. 2 Distribution plots of all features with KDE

Fig. 3 Regression plots for all features with target feature (PE)

outliers on the left side of the plot in Fig. 1. Observing the fourth plot in Fig. 1, it is clear
that the data for the RH feature are skewed to the left. In the implementation part, we
have considered these observations.

4 Implementation

4.1 Dataset Partition

The original dataset with dimension (9568, 5) is partitioned randomly into a train set
and a test set in the 75:25 ratio, so the train and test set dimensions become (7176,
5) and (2392, 5), respectively. The following commands partition the dataset
into four parts: x_train, y_train, x_test, and y_test. x_train and x_test contain
all the independent features of the train and test sets; y_train and y_test contain only
the dependent feature of the train and test sets.

from sklearn.model_selection import train_test_split  # required import

x = data[['AT', 'V', 'AP', 'RH']]  # independent features
y = data['PE']                     # target feature
# the default test_size of 0.25 gives the 75:25 split described above
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1)

4.2 Linear Regression

According to Eq. 1, the predicted feature for the dataset is calculated using the
following equation:

y = β0 + β1 × AT + β2 × V + β3 × AP + β4 × RH (6)

The β values are termed the model coefficients. These values are “learned” through
the model fitting phase by means of the “least squares” measure [10–12]. Then, the
fitted model can be used to create predictions. The intercept and coefficient values of
the fitted model are as follows:
Intercept (β0) = 447.06297098687327
Coefficients (slopes β1–β4) = [−1.97376045, −0.23229086, 0.0693515, −0.15806957]
The results in Table 2 show that the model gives approximately 93% accuracy
for the train and test data and also in 10-fold cross-validation.
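For reference, the following is a minimal scikit-learn sketch of how such a model can be fit and scored; it reuses the x, y, x_train, etc., partitions defined above, and is an assumption rather than code reproduced from the paper:

# Hypothetical sketch: fit the linear model of Eq. (6) and compute Table 2-style metrics.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score

lr = LinearRegression().fit(x_train, y_train)
print('Intercept:', lr.intercept_)        # beta_0
print('Coefficients:', lr.coef_)          # beta_1 .. beta_4

y_pred = lr.predict(x_test)
print('MSE :', mean_squared_error(y_test, y_pred))
print('RMSE:', np.sqrt(mean_squared_error(y_test, y_pred)))
print('R2  :', r2_score(y_test, y_pred))
print('10-fold CV mean R2:', cross_val_score(lr, x, y, cv=10, scoring='r2').mean())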

Table 2 Linear regression results

                                  Train data (75%)   Test data (25%)
Mean squared error (MSE)          20.9997            20.0804
Root mean squared error (RMSE)    4.5825             4.4811
r2 score                          0.9276             0.9317
Cross validation mean value       0.9275

Fig. 4 Residuals of linear regression

Linear regression is a parametric model, i.e., a model that
makes certain assumptions about the data for the purpose of modeling. We need to
check the assumptions of linear regression to obtain good results with the data. The
assumptions of linear regression are as follows:
1. Normality
2. Linearity
3. Homoscedasticity
4. Independence
5. Multicollinearity.

4.2.1 Normality

In a linear regression model, the errors or residuals should be normally distributed. If the
resulting probability plot is linear, then the residuals are normally distributed. The plot in Fig. 4
is linear, which shows that the residuals are distributed normally.

4.2.2 Linearity

This assumption requires that the association between the independent variables and the
response (target) variable is linear. The linearity hypothesis can be verified using scatter plots. The plots in Fig. 5
show that the data in all the features are almost linear with respect to the target feature PE.

4.2.3 Homoscedasticity

This is to check whether the variability of the response variable is the same at all levels of the
explanatory variables. The residuals should be constant, i.e., have identical variance.

Fig. 5 Scatter plots for dependent features to target feature

Fig. 6 Homoscedasticity plot

If the variance is not constant across the error terms, then there is a chance of heteroscedas-
ticity. Non-constant variance across the error terms occurs because of the existence of
outliers. A funnel-shaped distribution indicates non-constant
variance, i.e., heteroscedasticity. Figure 6 shows the homoscedasticity plot between
the predicted values and residuals. Since there is no funnel-shaped distribution in the plot,
this is an indication of homoscedasticity.

4.2.4 Independence

For linear regression, it is required that the residuals have very slight or no auto-
correlation. Auto-correlation happens when the residuals are dependent on each
other, i.e., the error(i + 1) term depends on the error(i) term. This indicates
that the current residual value is dependent on the previous residual value. Presence

Fig. 7 Auto-correlation function (ACF)

of auto-correlation considerably decreases the R-square value and increases the error
of the model. We check auto-correlation using the ACF (auto-correlation function) plot
shown in Fig. 7.
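A minimal sketch of how the ACF of the residuals can be plotted, assuming the fitted lr model from Sect. 4.2 (the paper does not show its plotting code):

# Hypothetical sketch: ACF of the residuals to check the independence assumption.
from statsmodels.graphics.tsaplots import plot_acf
import matplotlib.pyplot as plt

residuals = y_test - lr.predict(x_test)   # residuals of the fitted model
plot_acf(residuals, lags=40)              # bars inside the band => little auto-correlation
plt.show()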

4.2.5 Multicollinearity

Collinearity indicates whether two features are highly correlated and carry
related information about the variance in a dataset. A correlation matrix is used to
identify collinearity among features, but multicollinearity is more difficult to detect
because it emerges when three or more features in the dataset are highly correlated.
So, this check is used to verify whether the independent features are highly correlated with
each other. The variance inflation factor (VIF) is a metric of collinearity among
predictor features in multiple regression. Figure 8 shows the initial VIF
of all the predictor features and the VIF after removing features with high VIF. Finally,
all four features are to be included in the model.
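A minimal sketch of the VIF computation using statsmodels, assuming the data dataframe from Sect. 3.1 (the exact code behind Fig. 8 is not shown in the paper):

# Hypothetical sketch: VIF for each predictor feature.
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

X = add_constant(data[['AT', 'V', 'AP', 'RH']])   # constant term for a proper VIF
vif = pd.Series([variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
                index=X.columns)
print(vif)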

Fig. 8 a VIF of all features. b VIF values after removing V. c VIF values after removing AP

Table 3 Polynomial regression and support vector machine results

                Polynomial regression        Support vector machine
                Train data    Test data      Train data    Test data
MSE             15.3973       16.0625        21.1649       20.1994
RMSE            3.9239        4.0078         4.6005        4.4944
r2 score        0.9469        0.9454         0.9271        0.9313
CV mean value   0.9397                       0.9269

4.3 Polynomial Regression

Polynomial regression is considered a special case of linear regression. For the
dataset, the polynomial regression model is implemented with a degree of nine, and
the results for the train and test data are presented in Table 3 along with the cross-validation
mean. The measures considered for linear regression are applicable here
as well.
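A minimal sketch of how a degree-nine polynomial regression can be built as a scikit-learn pipeline, reusing the train/test partitions from Sect. 4.1 (an assumption; the paper does not show its implementation):

# Hypothetical sketch: degree-9 polynomial regression as a pipeline.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

poly_model = make_pipeline(PolynomialFeatures(degree=9), LinearRegression())
poly_model.fit(x_train, y_train)
print('Train R2:', r2_score(y_train, poly_model.predict(x_train)))
print('Test  R2:', r2_score(y_test, poly_model.predict(x_test)))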

4.4 Linear Support Vector Machine (SVM)

Support vector machine (SVM) is a tool used for both regression and classification.
SVM uses machine learning theory to maximize predictive accuracy while, at the same time,
automatically avoiding overfitting the data. Table 3 above shows the values of the
metrics.
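The paper does not state which SVM implementation was used; as one plausible sketch, scikit-learn's LinearSVR can be fit on the normalized features:

# Hypothetical sketch: linear support vector regression on the train/test split.
from sklearn.svm import LinearSVR

svm_model = LinearSVR(max_iter=10000)     # regression variant of the linear SVM
svm_model.fit(x_train, y_train)
print('Test R2:', svm_model.score(x_test, y_test))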

5 Result Analysis

The values of the metrics obtained for the three models—linear regression, polynomial
regression, and linear support vector machine—are analyzed. The analysis indicates that
polynomial regression results in lower values of MSE and RMSE compared with
the other two models. For the R2 score, however, the three models give almost the same
accuracy. The R2 score is verified with a 10-fold cross-validation test and
almost matches the train and test R2 score values. So, the models are neither
overfit nor underfit.

6 Conclusion

In this paper, we have considered the combined cycle power plant dataset to predict
the hourly electrical energy output of a power plant using three parametric models—
linear regression, polynomial regression, and linear support vector machine—with
tenfold cross-validation. All three models have given approximately 92–94%
accuracy; polynomial regression has given the highest accuracy for both train and test
data. We have also validated the linear regression with five assumptions—normality,
linearity, homoscedasticity, independence, and multicollinearity. This work can be
further extended by applying other parametric models and also non-parametric
models such as decision tree, random forest, and KNN.

References

1. L.X. Niu, X.J. Liu, Multivariable generalized predictive scheme for gas turbine control in
combined cycle power plant, in 2008 IEEE Conference on Cybernetics and Intelligent Systems
(2008), pp. 791–796. http://doi.org/10.1109/ICCIS.2008.4670947
2. V. Ramireddy, An overview of combined cycle power plant (2015). http://electricalengineer
ing-portal.com/an-overview-of-combined-cycle-power-plant
3. U. Kesgin, H. Heperkan, Simulation of thermodynamic systems using soft computing
techniques. Int. J. Energy Res. 29, 581–611 (2005)
4. A. Samani, Combined cycle power plant with indirect dry cooling tower forecasting using
artificial neural network. Decis. Sci. Lett. 7(2), 131–142 (2018)
5. S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach. Manuf. Eng. 74, 111–113 (1995). http://
doi.org/10.1049/me:19950308
6. B. Lakshmi Sucharitha, C.V. Raghavendran, B. Venkataramana, Predicting the cost of pre-
owned cars using classification techniques in machine learning, in Advances in Computational
Intelligence and Informatics. ICACII 2019. Lecture Notes in Networks and Systems, vol. 119,
ed. by R. Chillarige, S. Distefano, S. Rawat (Springer, Berlin, 2020)
7. H. Moayedi, D. JahedArmaghani, Optimizing an ANN model with ICA for estimating bearing
capacity of driven pile in cohesionless soil. Eng. Comput. 34, 347–356 (2018). https://doi.org/
10.1007/s00366-017-0545-7
8. M. Khandelwal, A. Marto, S.A. Fatemi et al., Implementing an ANN model optimized by
genetic algorithm for estimating cohesion of limestone samples. Eng. Comput. 34, 307–317
(2018). https://doi.org/10.1007/s00366-017-0541-y
9. G. Naga Satish, Ch.V. Raghavendran, M.D. Sugnana Rao, Ch. Srinivasulu, House price predic-
tion using machine learning. Int. J. Innov. Technol. Exploring Eng. 8(9), 717–722 (2019). http://
doi.org/10.35940/ijitee.I7849.078919
10. C.V. Raghavendran, G.N. Satish, V. Krishna, S.M. Basha, Predicting rise and spread of COVID-
19 epidemic using time series forecasting models in machine learning. Int. J. Emerg. Technol.
11(4), 56–61 (2020)
11. Ch.V. Raghavendran, G. Naga Satish, T. Rama Reddy, B. Annapurna, Building time series
prognostic models to analyze the spread of COVID-19 pandemic. Int. J. Adv. Sci. Technol.
29(3), 13258 (2020). Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/
31524
12. K. Helini, K. Prathyusha, K. Sandhya Rani, Ch.V. Raghavendran, Predicting coronary heart
disease: a comparison between machine learning models. Int. J. Adv. Sci. Technol. 29(3),
12635–12643 (2020). Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/
30385
SOI FinFET-Based 6T SRAM Design

V. Vijayalakshmi and B. Mohan Kumar Naik

Abstract This paper describes the design and implementation of a 6T SRAM cell
considering a sub-20 nm FinFET model; the circuit performance—read 0/1,
write 0/1, and leakage power dissipation—is evaluated along with transistor
sizing for device stability. Due to their exceptional characteristics, such as enhanced
channel controllability, high I ON /I OFF , and diminished short-channel effects, fully
depleted SOI FinFET devices are introduced as a promising nanoscale replacement
for traditional bulk CMOS devices. The read and write operations of the 6T SRAM are
confirmed by H-Spice simulation.

Keywords FinFET · CMOS · Inverter · H-Spice · 6T SRAM

1 Introduction

The continuous downscaling of bulk CMOS gives rise to major leakage due to limitations
in process technology and the primary essential material. The contamination
in the semiconducting channel is the key impediment to CMOS-based design [1–3].
The device's best output can be accomplished by reducing the threshold voltage and
scaling down the supply voltage to improve leakage [4, 5]. The leading hindrance to
the downscaling of CMOS devices to 20 nm and lower nodes is second-order effects like
subthreshold leakage and short-channel effects, which result in low throughput [6].
According to the International Technology Roadmap for Semiconductors (ITRS),
multi-gate MOS devices will serve to minimize leakage at short channel lengths [7].
CMOS IC technologies have been steadily downscaled into the nanometer
region over the last three decades. The classical device structures are reaching
their scaling limits, and "end-of-roadmap" substitute devices are considered for
study. Among the device alternatives, multi-gate MOSFETs are extensively being
studied in recent work [8, 9], such as double-gate MOSFETs, tri-gate MOSFETs
(also called FinFETs), and gate-all-around (surrounding-gate) MOSFETs [10, 11].
V. Vijayalakshmi (B) · B. Mohan Kumar Naik


New Horizon College of Engineering, Bengaluru, India


The ITRS has identified the significance of these devices and
called them out as advanced CMOS devices.
FinFET-based transistors have become a popular option and viable alternative
to CMOS design technology at downscaled device dimensions in recent times
[12–14]. Short-channel effects can be regulated in these device
structures by restricting off-state leakage currents. Furthermore, FinFETs are
superior to classical devices for limiting short-channel effects, and they have greater
control over leakage current and higher yield, among many other benefits that
help address scaling challenges [15]. When the applied supply potential exceeds the
cut-off voltage (V t ), all the controlling gates of the device set electrons
in motion from the source region to the drain region. The potential applied
from all three gates influences the channel potential and diminishes drain-induced
barrier lowering (DIBL), providing improved swing for FinFET-based designs.
This FinFET transistor has a better power-to-delay ratio. Memory requirements
have risen dramatically in many VLSI designs, from industrial applications to
consumer products. It emphasizes the importance of using nanometer technology
to improve memories on a single chip [16, 17]. Nanotechnology, especially SRAM
cells, has a wide range of applications and has improved integrated memories. SRAM
cells are evolving as a critical circuit component in very large-scale integrated (VLSI)
circuits such as FPGAs and microprocessors, whereas SRAM-dependent memories
(also known as caches) influence the processor’s space, timing, control, and schedule
yield. As a result, SRAM is expected to occupy >90% of the die's surface
area.
This research paper is organized as follows. In Sect. 2, a short depiction
of the device description and characteristics of FinFET is delivered. Section 3
describes the circuit performance of the developed FinFET model, a basic inverter
circuit is modeled using the look-up table-based FinFETs device characteristics.
Section 4 states the operation of the conventional FinFET-based 6T SRAM cell,
illustrating significant design constraints and read/write functioning of a 6T SRAM
cell. Section 5 explains the inferences recorded during simulation followed by the
conclusion of the research work in Sect. 6, respectively.

2 Device Model

Figure 1a, b shows the 3D structure and cross-sectional view of FinFETs. The FinFET
model structure consists of the following device parameters channel length (Lg), fin
height (H Fin ), fin width (W Fin ), gate oxide thickness (t ox ) and source (N s ), drain (N d ),
and channel (N c ) doping concentrations which are mentioned in Table 1. FinFET-
based model—The FinFET was originally known as a folded channel MOSFET
because of the wafer’s short vertical fins. In FinFETs, the gate width is normally
twice that of the fin height. FinFETs are the most cost-effective instruments to use
instead of CMOS in less than 20 nm technology due to their low processing costs.
To reduce IOFF leakage, the geometric core parameters of FinFETs are important.

Fig. 1 a 3D view of tri-gate FinFET. b 2 dimensional view of FinFET

Table 1 Device parameters for FinFET

S. No.  Parameter                    Symbol   Value
1       Fin length                   L Fin    20 nm
2       Fin width                    W Fin    30 nm
3       Fin height                   H Fin    20 nm
4       Oxide thickness              t ox     1 nm
5       Buried oxide thickness       t box    20 nm
6       Gate work function           ϕG       4.7 eV
7       Source/drain work function   ϕSD      4.1 eV
8       Source/drain doping          NS/D     10^19 cm^-3
9       Channel doping               NC       5 × 10^17 cm^-3

Based on the model parameters explained above, the primary functioning of the
FinFET device is identified in terms of its transfer characteristics. Figure 2 compares the
transfer characteristics of bulk FinFETs and SOI-based FinFETs, demonstrating
that SOI-based devices work better for future circuit applications, with lower off
current and a higher ION–IOFF ratio. Another strength of FinFETs is the ability to
control the threshold voltage (V t ) through the high-k/metal gate stack. The current
flowing from the drain to the source is largely determined by the operation, temperature,
and voltage: as the voltage or temperature rises, the carrier mobility and
threshold voltage fall, leading to a reduction in drain current.

Fig. 2 Comparison plot for the subthreshold current of n channel and p channel for bulk and SOI FinFET devices

3 Inverter Design

3.1 LUT-Based H-SPICE Design Flow

In this section, an inverter circuit is implemented using the proposed FinFET model.
The circuit-level performance of the device is examined by evaluating
2D lookup tables of I DS and gm as functions of V GS and V DS , built from the
raw data acquired with the Silvaco TCAD ATLAS tool. Using these 2D lookup tables,
the circuit performance is investigated with the Synopsys H-Spice tool;
a Verilog-A model is used to define a netlist for the FinFET-based inverter as part
of designing semiconductor memories. The flowchart for device-circuit simulation
using H-Spice is represented pictorially in Fig. 3.
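To make the lookup-table idea concrete, the following Python sketch shows how a 2D I DS table on a (V GS , V DS ) grid could be interpolated at arbitrary bias points; the grid, file name, and values are assumptions, and the actual flow in this work uses Verilog-A inside H-Spice rather than Python:

# Hypothetical sketch: bilinear interpolation of a 2D I_DS lookup table.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

vgs = np.linspace(0.0, 1.8, 19)                  # assumed V_GS grid (V)
vds = np.linspace(0.0, 1.8, 19)                  # assumed V_DS grid (V)
ids_table = np.loadtxt('ids_lut.txt').reshape(len(vgs), len(vds))  # assumed file

ids_fn = RegularGridInterpolator((vgs, vds), ids_table)
print(ids_fn([[0.9, 1.2]]))                      # I_DS at V_GS = 0.9 V, V_DS = 1.2 V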
The simulation of the inverter circuit is investigated using H-Spice by Synopsys.
It is an optimized device-circuit simulator, utilized to simulate the electrical
parameters of VLSI circuits in the transient/DC/AC domains. H-Spice is well suited
for fast, accurate analysis of the performance of any VLSI circuit.
Finite element method-based numerical simulations and TCAD simulation tools
are beneficial for the design exploration of model-dependent circuits. The FinFET
device in this work is implemented in 20 nm technology.
To design any digital circuit, a thorough understanding of simple CMOS inverter
operation and properties is required. Understanding the operation and properties of
any inverter circuit will aid in the development of digital logic and semiconductor
memories. A simple inverter circuit is the most crucial consideration to acknowledge
while evaluating any VLSI circuit; it is the fundamental circuit that is best suited to
examining the device’s circuit efficiency for the specific technology node.
Figure 4 depicts a basic inverter circuit. The structure consists of a pMOS FinFET at the top, with its source connected to V DD , and an

Fig. 3 Simulation flow of device to circuit-level implementation

Fig. 4 A FinFET-based inverter circuit

nMOS FinFET at the bottom, with its source connected to ground. The gate terminals
of both transistors are connected to V in , and the drains of both transistors are
connected to the V out terminal.

3.2 VTC Curve

The voltage transfer characteristic (VTC) of the digital inverter is an inverting
function that shows the exact switching between the on and off states: a low input
voltage results in a high output voltage. The slope of the transition region indicates
switching quality; steeper slopes produce more accurate switching.
Figure 5a shows the voltage transfer curve, i.e., the DC
characteristics, of the FinFET-based inverter. Figure 5b represents the input and output waveforms
of the proposed 20 nm FinFET-based inverter circuit; for a low input value, it reads
a high output value. The delay of the circuit is characterized by the rise time and fall
time of the pulse.

4 SRAM Design

A memory cell's primary function is to store a single bit of data using a pair
of cross-coupled inverters and a pair of access transistors. Since the inverter
is a fundamental component of every circuit simulation, spice modeling is used to
construct the SRAM cell, which is then simulated using the H-Spice tool.
SRAM design methodology—The SRAM cell is a promising and reliable application
of FinFET-based architecture. Figure 6 shows the logic cell of the 6-transistor
SRAM architecture based on the developed FinFET model, where each gate
in the FinFET device is controlled independently. The 6T SRAM cell is composed of
two cross-coupled inverters (P1, D1 and P2, D2), each of which has its output fed
into the other; the loop is employed to maintain the states of the respective inverters.
The access transistors (A1 and A2) and the word line (WL) and bit lines (BL)
are utilized to operate the write and read cycles of each cell. By keeping the
word line low in the hold mode, the access transistors are turned off, and the inverter
outputs remain complementary.
In a given stored state, the left inverter's PMOS transistor is turned on, holding its node high,
while the second inverter's PMOS is turned off. The word line drives the
controlling gates of the access transistors that link the cell to the bit lines. The SRAM
cell is detached from the bit lines by keeping the word line low. It is important to
perform appropriate transistor sizing of each cell with respect to the specifications for better cell
activity.
This section presents the architecture and operation of the FinFET-based 6T SRAM cell. The basic operation
and implementation of the static RAM, as well as transistor sizing for system
stability, are discussed. Simulation is used to study and validate the read and write
operations.
Figure 6 shows the devices connected to form the cross-coupled inverters,
implementing a static RAM. It consists of two pull-up pMOS transistors P1 and P2
and two pull-down transistors D1 and D2 (also called drive transistors) forming the
cross-coupled inverters, followed by two access transistors A1 and A2, connected to

Fig. 5 a VTC curve for inverter. b Input and output voltages for an inverter

Fig. 6 Logic cell of a 6T SRAM

Table 2 Operation of 6T SRAM

Read operation                       Write operation
1. Charge bit and bit_b high         1. Charge bit high
2. Both bit and bit_b float          2. Let the bit float
3. Pull up the word line             3. Pull down bit_b to ground
Bit will contain the data value      4. Pull up the word line
                                     Q holds a high value

the nodes which contain the stored bit and its complement to the bit lines. Table 2
shows the operation of 6T SRAM.

4.1 Device Constraints

The relative sizing of the transistors is a significant aspect of the 6T SRAM. To achieve
the desired functionality, the transistor sizes must be maintained correctly during
read and write operations.
In the first case (the read function), the bit line linked to the low Q-node
will be drawn to the ground. When this takes place, the charge stored in the bit line is
transferred through the access transistor (A) and the drive transistor (D). The current
induces a small rise in the Q-node's voltage, which could reach an excessively high
level if the rise is substantial enough. To prevent an unwanted flip, D should be
stronger than A (the access transistor should be narrower or longer than the driver). In
the subsequent case (the write function), the bit line drawn to the ground should
be able to reduce the voltage of the adjoining high Q-node to the point where the
node can be shifted to low. The flip is countered by the opposite P transistor, which

Table 3 Device geometry constraints for all the transistors

S. No.  Transistor               Device parameter (W/L)
1       D1 (n-MOS transistor)    80 nm/20 nm
2       D2 (n-MOS transistor)
3       P1 (p-MOS transistor)    30 nm/30 nm
4       P2 (p-MOS transistor)
5       A1 (access transistor)   40 nm/20 nm
6       A2 (access transistor)

connects the Q-node to V dd . As a result, P must be weaker than A for the write to be
successful.
These two conditions (D > A and A > P) give the essential relationships
between the transistor pairs for correct operation: the D transistors must be the strongest and
the P transistors the weakest, with the access transistors in between. Sizes of D-8/2, A-4/2, and P-3/3
are used, as tabulated in Table 3.

4.2 Read/Write Operation of SRAM Cell

When the word line is high, the NMOS access transistors conduct, connecting the
inverter nodes to the two vertical bit lines. During a read operation, the
inverters drive the value stored within the memory cell onto the bit line
and the inverted value onto the complementary bit line; these data produce the SRAM cell's
output value.
During write operations, input drivers strongly drive the bit lines to write data into the
memory. Depending on the current stored value, a momentary short-circuit condition may occur,
and thus in the SRAM the value is overwritten.

5 Results

The Synopsys H-Spice tool is used to evaluate the circuit performance of the 6T SRAM
cell using the previously modeled 20 nm LUT (lookup table)-based FinFET device
model. The model consists of parameter values for the 20 nm technology, with
a 20 nm gate length, to synthesize the results for the 6T SRAM cell. The
transistor sizes for the SRAM cell have also been measured and defined in Table 3.
Figure 7 shows that the write HIGH function is achieved by shifting q from LOW to
HIGH, flipping the bit. It is also shown that around 0.3 ns after the word line stops
rising, q reaches a voltage within 10% of HIGH.
Similarly, Fig. 8 shows that as the write LOW function succeeds, the bit is flipped
and q changes from HIGH to LOW. Figure 8 is virtually the mirror of Fig. 7;
during the write LOW, q drops to 0 V before the word line completes its rise.

Fig. 7 Write high operation in 6T SRAM

Fig. 8 Write low operation in 6T SRAM

Although it may seem that writing a LOW is quicker than writing a HIGH, that is
not the case: looking at qb, it takes about 3 ns after the word line completes rising
for qb to reach within 10% of 1 V. Whatever data value is being recorded, the cell should
be stable within the same amount of time.
To judge the completion of the read functions, it is necessary to examine the
bit lines and the q-nodes. For the functioning shown in Fig. 9, since q is high, the
read should keep the bit high. In particular, the bit remains HIGH while bit_b drops
to LOW, as anticipated. Another crucial aspect for a good read is that the operation
does not result in the bit being flipped in an unintended manner. In this case, qb rises
marginally in voltage but does not exceed 0.3 V. The read was effective because the
bit stayed constant when the value was read.

Fig. 9 Read high operation in 6T SRAM

Figure 10 shows the case where the bit line and q-nodes are reversed: q is LOW, and the
read should result in dropping the bit to LOW, i.e., to 0 V, while the bit stays stable
for a successful read operation.
It is important to note that the falling bit line for the read function might not
fall within 10% of 0 V until around 3 ns after the word line has finished rising.
The FinFET-based 6T SRAM shows a leakage power consumption of 219 pW in standby
mode, lower than that of a conventional CMOS-based SRAM design. Using FinFETs
clearly outperforms the CMOS counterpart for the SRAM cell in terms of reliability,
power consumption, and robustness.

Fig. 10 Read low operation in 6T SRAM



6 Conclusion

SOI-based FinFETs are being extensively used in VLSI circuits and semiconductor
memories. The basic device is modeled using the Silvaco TCAD Atlas tool for various
performance characteristics. Based on the evaluated lookup table, circuit characteristics are
studied using an H-Spice-based Verilog-A implementation. A fully depleted SOI FinFET-based
6-transistor static RAM cell is designed and evaluated. This research work
demonstrates the necessary operations—read 0, read 1, write 0, and write 1—and the
leakage power is evaluated using H-Spice. The improved data stability and low leakage
power show that FinFET-based SRAMs are promising candidates for the implementation
of semiconductor memories in microprocessors and other semiconductor applications.

Acknowledgements The authors wish to thank New Horizon College of Engineering, Bengaluru
for supporting this work.

References

1. T. Skotnicki, J.A. Hutchby, T.-J. King, F. Boeuf, The end of CMOS scaling toward the intro-
duction of new materials and structural changes to improve MOSFET performance. IEEE Circ.
Devices Mag. 21(1), 16–26 (2005)
2. A. Chin, S.R. McAlister, The power of functional scaling: beyond the power consumption
challenge and scaling roadmap. IEEE Circ. Devices Mag. 21(1), 27–35 (2005)
3. S.E. Thompson, R.S. Chau, T. Ghani, K. Mistry, S. Tyagi, M.T. Bohr, In search of “forever,”
continued transistor scaling one new material at a time. IEEE Trans. Semicond. Manuf. 26–35
(2005)
4. Y. Taur, CMOS scaling beyond 0.1 [mu] m: how far can it go?, in Proceedings of 1999
International Symposium on VLSI Technology, Systems, Applications (1999), pp. 6–9
5. D.J. Frank, R.H. Dennard, E. Nowak, P.M. Solomon, Y. Taur, W.H.-S. Philip, Device scaling
limits of Si MOSFET’s and their application dependencies, in Proceedings of IEEE, vol. 89
(2001), pp. 259–288
6. A. Keshavarzi et al., Leakage and process variation effects in current testing on future CMOS
circuits. IEEE Des. Test Comput. 19(5), 33 (2002)
7. International Technology Roadmap for Semiconductors (ITRS)
8. T. Mizuno, N. Sugiyama, T. Tezuka, T. Numata, S. Takagi, High performance CMOS operation
of strained-SOI MOSFET’s using thin film SiGe-on-insulator substrate, in 2002 Symposium
on VLSI Technology. Digest of Technical Papers (2002), pp. 106–107
9. R. Vaddi, R.P. Agarwal, S. Dasgupta, Compact modeling of a generic double-gate MOSFET
with gate S/D underlap for subthreshold operation. IEEE Trans. Electron Dev. 59(10) (2012)
10. S. Jha, S.K. Choudhary, Impact of device parameters on the threshold voltage of double-gate,
tri-gate and gate-all-around MOSFETs, in 2018 IEEE Electron Devices Kolkata Conference
(EDKCON)
11. R. Ramamurthy, N. Islam, et al., The tri-gate MOSFET: a new vertical power transistor in
4H-SiC. IEEE Electron Dev. Lett. 42(1)
12. E.J. Nowak et al., A functional FinFET-DGCMOS SRAM cell, in IEDM Technical Digest
(2002), pp. 411–414
13. S.S. Rathod, A.K. Saxena, S. Dasgupta, A proposed DG-FinFET based SRAM cell design with
RadHard capabilities. Microelectron. Reliab. 50(8), 1181–1188 (2010)

14. R.V. Joshi et al., FinFET SRAM for high-performance low-power applications, in ESSCIRC
(2004), pp. 211–214
15. M. Ishida et al., A novel 6T-SRAM cell technology designed with rectangular patterns scalable
beyond 0.18μm generation and desirable for ultra high speed operation, in IEDM Technical
Digest (1998), pp. 201–214
16. F. Moradi, G. Panagopoulos, G. Karakonstantis, D. Wisland, H. Mahmoodi, J.K. Madsen, K.
Roy, Multi-level word line driver for low power SRAMs in nano-scale CMOS technology, in
IEEE 29th International Conference on Computer Design (ICCD), 9–12 Oct 2011, pp. 326,
331
17. A.B. Sachid, C. Hu, Denser and more stable SRAM using FinFETs with multiple fin heights.
IEEE Trans. Electron Dev. 59, 2037–2041 (2012)
Cataract Detection Using Deep
Convolutional Neural Networks

Aida Jones, K. Abisheek, R. Dinesh Kumar, and M. Madesh

Abstract A cataract is one of the most important causes of blindness globally, resulting
in more than 50% of blindness. Early revelation and medication of cataracts can minimize
the risk of blindness. We propose a cataract detection model using deep convolutional
neural networks (DCNN) based on the GoogLeNet architecture, i.e., the inception
module (award-winning architecture of ILSVRC 2014); it uses a 22-layer deep network,
which makes it a highly reliable architecture. To bring high accuracy while training,
we have used the deeper GoogLeNet architecture, which comes under the category of
CNNs. It uses a convolutional layer, activation layer, fully connected layer, SoftMax
layer, inception layer, and finally a max pooling layer for high accuracy and
efficient training. Our method has achieved 86.9 as overall training
accuracy and 35.8 as validation accuracy. We have trained this model using 66
images of three different categories—normal, severe, and mild—preprocessed into
1452 image samples. This method has the feasibility to be applied to the detection
of many diseases.

Keywords Cataract · Blindness · Neural networks · Inception · ILSVRC 2014

1 Introduction

A cataract is a cloudy area in the lens of the eye that results in a
decrease of vision. Cataracts develop slowly and can affect both eyes. Symptoms
of cataracts are murky colors, blurred vision, difficulty in facing luminous objects, and
difficulty in seeing at night. This may result in inconvenience with driving, seeing,
and perceiving faces. Poor vision caused by cataracts may also result in an increased
risk of depression. Cataracts cause a large share of blindness and visual disability
worldwide. A cataract develops slowly and finally intrudes on the vision. We may end up
with cataracts in both eyes, but the cataracts won't form at the same time. They are
common in older people.

A. Jones (B) · K. Abisheek · R. Dinesh Kumar · M. Madesh


Department of ECE, KCG College of Technology, Chennai, India


1.1 Causes and Effects of Cataract

Cataracts mostly occur in old people and result in a foggy view. They may also form
due to genetic disorders, and they can be caused by past eye surgeries,
diabetes, etc. The following is an enumeration of some short-term and long-term
effects of cataracts. Short-term effects of cataracts are diplopia or multiple
images in the beginning stage; a halo-like effect may develop around lights;
near-vision blurring; and sensitivity to bright lights. Some long-term effects are: near
vision becomes much poorer after a preliminary improvement; vision becomes very
blurry or unclear, which can affect activities of daily living such as driving and reading; colors
appear much more illuminated than before; and, in very rare cases, untreated cataracts can
cause glaucoma or blindness.

2 GoogLeNet Architecture

We have trained this model using the GoogLeNet architecture (award-winning
architecture of the ImageNet Large-Scale Visual Recognition Challenge 2014) [1]. This architecture
pipelines the output of the previous stage as the input of the next stage, and
finally, 3 × 3 max pooling is performed. Compared with other award-winning
architectures, GoogLeNet uses 1 × 1, 3 × 3, and 5 × 5 convolutions at the same
time, which compresses the image details and makes training easier [2]. Red,
green, and blue represent the RGB layers of an image. In each step, three
convolution operations are performed, and the outputs of each layer are combined
and given as input to the next layer. In each layer, the respective convolution, Soft-
Max, and inception operations are performed. The GoogLeNet architecture comes
under the category of convolutional neural networks, which provide efficiency in
training and validation. It analyzes the correlation statistics of the previous layer
and clusters units into groups with high correlation [1]. These clusters form
the units of a layer and are connected to the units of the subsequent layer.
Each unit from the previous layer corresponds to some region of the input image,
and these units are grouped into filter banks. Hence, we might end up with many
groups focused on one region. The GoogLeNet architecture shown in Fig. 1
was designed with computational efficiency in mind. It can be run on a range of devices with
different computational capabilities, irrespective of the type and version of the machine.

3 Training Methodology

In this training process, we first train the model by performing image augmentation.
Image augmentation helps to train the model effectively by
preprocessing all the images present in the dataset and converting each image into

Fig. 1 GoogLeNet architecture. Source Google images

multiple images of different categories [3]. This process is carried out before implementing
the GoogLeNet architecture for training. The various steps involved in our
training are image-to-array preprocessing, aspect-aware preprocessing (AAP), rotation, flipping, etc.
After performing all these preprocessing operations, the dataset is ready to be trained,
and the deep GoogLeNet architecture is implemented. The GoogLeNet architecture
contains various layers such as the convolutional layer, activation layer, max pooling
layer, fully connected layer, and SoftMax layer, as given below [4]. The preprocessed
images pass through each layer according to its predefined operation. The layers are
interconnected, and the output of each layer is given as the input to the next [3].
Finally, after the training process, the model is tested using sample testing images
of the three categories—normal, severe, and mild—and the prediction output is noted
from the graph obtained.

3.1 Importing Necessary Libraries

We have imported the libraries required for training the model, such as
NumPy, Matplotlib, OS, Adam, and SGD, and the process is continued
[2]. Google Colaboratory is a user-friendly platform for developers; rather
than downloading packages like NumPy, pandas, and Matplotlib separately, Google
Colab provides them pre-installed.

3.2 Performing Image Augmentation

Here, we are converting a single image into multiple images of different categories
by rotating, flipping, contrast adjustment, inverting [5], etc.; a sketch of this step is given after the list below.

1. Rotating: Rotating the image to different angles of 20°, 40°, 60°, 80°, 100°,
   120°, 140°, 160°, 180°, 200°, etc. Rotation is required to train the model to
   the extent that it detects the exact category of an image despite different
   angles and positions
2. Sharpness: The image is sharpened using Python code, and the edges are given a
   high concentration
3. Brightness: The hue and background contrast of the image are adjusted, and
   the brightness is increased
4. Flipping: Flipping the images in two different ways, horizontal flipping
   and vertical flipping
5. Finally, max pooling, activation, and SoftMax layers are included and clustered
   together to produce the output.
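As a sketch of this augmentation step, Keras's ImageDataGenerator can produce rotated, flipped, and brightness-adjusted variants on the fly; the parameter values and folder layout below are assumptions, not the authors' exact pipeline:

# Hypothetical sketch: Keras-based augmentation of the training images.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=200,           # random rotations up to 200 degrees, as in the list
    horizontal_flip=True,
    vertical_flip=True,
    brightness_range=(0.8, 1.2),  # assumed brightness span
    rescale=1.0 / 255,            # 0:255 -> 0:1, see Sect. 3.3
)
train_gen = augmenter.flow_from_directory('dataset/train',  # assumed folder layout
                                          target_size=(224, 224),
                                          class_mode='categorical')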

3.2.1 Aspect Aware Preprocessing

During the preprocessing and training of images, the cataract region in an image
should not be affected by operations like cropping, flipping, rotating, or
resizing. So, to maintain the original features of an image, we perform aspect-aware
preprocessing. This preprocessing methodology is implemented with the OpenCV
(computer vision) library in Python, which is most
commonly used for image processing operations and for developing image
editing applications [6]. Using this preprocessing technique, an image's original
features can be maintained accurately without losing the required details [6]. It is
advisable to use this technique along with the image augmentation operations, because
its ultimate aim is to protect the images from losing important
details under the various augmentations.

3.3 Performing Image to Array Preprocessing

The pixels of each image contain 8-bit data in the 0:255 range. If we give
these large values for training, the model will not be trained well. So, to avoid this
accuracy problem, we reduce the range from 0:255 to 0:1 by importing the
image-to-array function from Python's TensorFlow library, so that training is
more efficient; 0:1 in the sense that pixel intensities become fractional values such
as 0.42 or 0.35 [2].
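A minimal sketch of this rescaling step using TensorFlow's Keras utilities (the file name and target size are assumed):

# Hypothetical sketch: load an eye image and rescale its pixels from 0:255 to 0:1.
from tensorflow.keras.preprocessing.image import load_img, img_to_array

img = load_img('sample_eye.jpg', target_size=(224, 224))  # assumed size
arr = img_to_array(img)   # float array, values in 0..255
arr = arr / 255.0         # values now in 0..1, as described above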

3.4 Using Deeper GoogLeNet Architecture

It is one of the architectures of convolutional neural networks. It contains a convolution
module and an inception module, with 1 × 1 convolutions in the middle
of the architecture. Global average pooling is used at the end of the architecture
instead of fully connected (FC) layers. The idea of implementing these two
techniques is from the paper "Network in Network" [7].

3.5 Convolution Module or Convolution Layer

Convolution means to filter. A single image is a combination of many pixels; here,
we take a convolutional filter of size 3 × 3 or 5 × 5, slide it over all the areas
of the image [8], and extract the features of each existing layer.

3.6 Activation Layer

The activation layer applies higher-order (nonlinear) functions to the features taken
from the convolution layer to increase the complexity the model can represent. An activation
layer is simply the output of a function [8]: input is fed to an activation function,
which gives one output, and this operation is called a layer—just as in mathematics,
where a function is fed some matrix or values and gives an output.

3.7 Max Pooling Layer

The max pooling layer summarizes the characteristics present in each region of the feature
map generated by a convolution layer, so that additional operations are performed on
the summarized characteristics instead of the precisely located characteristics produced by
the convolution layer [9]. This makes the model more resilient to variations in the position
of the characters in the input image. It calculates the maximum value for each patch of the feature map.

3.8 Fully Connected Layer and SoftMax Layer

From the overall features obtained, these layers cluster the images into their respective
classes—normal, mild, and severe [5]. When an image is given, the network automatically
detects the category of the image and matches it with the appropriate cluster.

Fig. 2 Naive version inception module

3.9 Inception Module

The inception module with 1 × 1 convolution is shown in Fig. 2.
Here, 1 × 1 convolution, 3 × 3 convolution, 5 × 5 convolution, and 3 × 3 max
pooling are performed at a single layer, and their outputs are stacked together into
the cluster that forms the output. Max pooling compares each layer and keeps only
the necessary features, discarding the unnecessary ones [9].
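A minimal Keras sketch of this naive inception block (the filter counts are assumptions; GoogLeNet's actual counts vary per stage):

# Hypothetical sketch: a naive inception block with parallel branches.
from tensorflow.keras import layers

def inception_block(x, f1=64, f3=128, f5=32):
    b1 = layers.Conv2D(f1, (1, 1), padding='same', activation='relu')(x)
    b3 = layers.Conv2D(f3, (3, 3), padding='same', activation='relu')(x)
    b5 = layers.Conv2D(f5, (5, 5), padding='same', activation='relu')(x)
    bp = layers.MaxPooling2D((3, 3), strides=(1, 1), padding='same')(x)
    return layers.concatenate([b1, b3, b5, bp])  # stack branch outputs channel-wise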

3.10 Global Average Pooling

Global average pooling was designed to substitute fully connected layers in convolutional
neural networks [10]. It reduces the number of parameters in a model, controlling
overfitting; since there are no parameters to optimize in global average pooling,
overfitting is avoided here, as shown in Fig. 3.

Fig. 3 Global average pooling



Setting the number of weights to zero by moving from the fully connected layers
(FC) to average pooling improves the accuracy by about 0.6%. This idea is from Network
in Network [7] and is less prone to overfitting.

3.11 Training Classifiers

There are some intermediate SoftMax branches in the middle; they are used for
training only. These branches are auxiliary classifiers, as detailed in Tables 1 and 2.

Table 1 Training classifier


5 × 5 average pooling
1 × 1 convolution containing 128 filters
1024 fully connected layers (FC)
1000 fully connected layers (FC)
SoftMax layer

Table 2 Training classifier

5 × 5 average pooling — Calculates the average value for each patch of the feature map by down-sampling each 5 × 5 square of the feature map to its average value. It comes under the Keras library of Python.
1 × 1 convolution — 1 × 1 convolution is often used to reduce the depth of the image's feature map, i.e., the number of channels [10]. It maps the input pixels across channels to output pixels and is mainly used for dimension reduction.
1024 fully connected layers (FC) — The fully connected layer obtains its input from the flattened output of the convolutional layer. Every neuron from the previous layer is connected to the activation units of the next layer [9]; 1024 represents the number of neurons present in this layer.
1000 fully connected layers (FC) — The concept is the same; in this case, 1000 neurons from the previous layer are connected to the current layer.
SoftMax layer — The SoftMax layer is implemented just before the output layer and has the same number of nodes as the output layer [9]. It is an activation function that normalizes the scores so that the probabilities of all classes sum to one, acting as the final classifier.

3.12 Label Binarizer

The label binarizer converts the label names—normal, severe, and mild—into
binary format so that the machine can understand them and we can get accurate output.
Since the machine cannot understand human language, it is advisable to convert the
labels to binary [11].
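A minimal sketch of this step using scikit-learn's LabelBinarizer (the example labels are illustrative):

# Hypothetical sketch: encoding the three class names as one-hot binary vectors.
from sklearn.preprocessing import LabelBinarizer

lb = LabelBinarizer()
labels = lb.fit_transform(['normal', 'severe', 'mild', 'normal'])
print(lb.classes_)   # ['mild' 'normal' 'severe']
print(labels)        # one row of 0/1 values per image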

3.13 Training the Model Using Transfer Learning

First, an Adam optimizer [12] is used, and the model is trained for up to 80 epochs. Additionally,
an SGD optimizer stage is introduced with an adjusted learning rate and added
as a second level to obtain an accuracy boost. The two levels of fine tuning are shown in
Fig. 4.
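A minimal sketch of this two-level fine tuning in Keras, assuming the model and the train_gen generator from the augmentation sketch above; the learning rates and epoch count beyond the stated 80 are assumptions:

# Hypothetical sketch: train with Adam first, then fine-tune with SGD.
from tensorflow.keras.optimizers import Adam, SGD

model.compile(optimizer=Adam(learning_rate=1e-3),   # assumed rate
              loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_gen, epochs=80)

model.compile(optimizer=SGD(learning_rate=1e-4),    # assumed lower rate
              loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_gen, epochs=20)                     # assumed extra epochs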
On reading the trained model and giving an input image, the machine
identifies the exact category of the image, matches it with the cluster obtained, and
produces the corresponding output.

4 Related Works

This section describes the works and findings made so far related to machine
learning modules and algorithms and convolutional neural network-based architectures
for cataract detection [13], using Google Colaboratory (Google's online
development platform) and Python as the programming
language. This paper contains research on the various types of cataracts in the human
eye, the causes of cataracts, and the collection of datasets from various sources. The
classification of cataracts mainly comprises four parts, namely preprocessing, feature
extraction, feature selection, and the classifier or model [14]. The primary purpose of
any classifier is achieved based on the input provided to it. The input for cataract
classification is a dataset of eye images of three different classes called normal, mild
cataract, and severe cataract. For many years, research on fundus image analysis
Fig. 4 Levels of fine tuning



has been conducted. Fundus images are obtained by a fundus camera, which clearly
distinguishes cataracts from the normal eye [1]. The five convolution layers in deep
learning are used to separate the features in fundus images [15]. Features of eye
images such as texture, sketch, color, wavelet, acoustical, and
spectral parameters have been extracted [1]. Another method for cataract classification
and detection is the GoogLeNet model. The collected dataset is preprocessed to
remove noise in the images. The training model is built using a convolutional neural
network. Transfer learning is used, where a model developed here is used as the input
for another model [7]. In a few other models, image preprocessing is done with the
help of the maximum entropy method. The features are collected automatically using
Caffe. The extracted features must be identified and compared; in this case, SSVM
is used for classification. For the collected dataset, four different classifications are
done, but SoftMax gives better accuracy [15].

5 Implementation Methodology

The implementation methodology of our paper is shown in Fig. 5. The required
datasets are collected; the model is created, starting with preprocessing followed
by training; and classification is made using the trained model into three different

Fig. 5 Implementation methodology

categories as normal, mild, and severe. Finally, the model is tested by giving input
images of three different categories.

6 Proposed Methodology

This proposed methodology gets the input from the user directly, detects the category of the input
image using the trained model, and produces the corresponding output.
When an image is given as input, the trained model performs 2D convolution,
3 × 3 max pooling, flattening of pixels, aspect-aware preprocessing, etc., and produces
the result; the convolution operations are performed at the respective layers, and
finally, 3 × 3 max pooling is performed [11]. The pixels in each image are flattened
to reduce the size of the image, and dense operations are performed (Fig. 6). Finally,
the output is obtained. The parameters used for evaluation are as follows.

6.1 Accuracy

Accuracy is the proportion of correctly predicted observations to the total observations.
We may think our model is better when the accuracy is high, but accuracy
is a good measure only when the dataset contains symmetric data, where the false-positive
(FP) and false-negative (FN) counts are similar.

Accuracy = (TP + TN)/(TP + TN + FP + FN)

Fig. 6 Proposed methodology



6.2 Macro Average

The macro average takes the average of the precision and recall of the model over the
various classes. This can be used when we need to analyze the overall system performance
on the given dataset; it is useful when the classes vary in size.

6.3 Weighted Average (or) Micro Average

The weighted average and micro average summarize the individual true positives
(TP), false positives (FP), and false negatives (FN) for the different classes in the dataset
and combine them to obtain the statistics.

Micro Average precision = (TP1 + TP2)/(TP1 + FP1 + TP2 + FP2)


Micro Average Recall = (TP1 + TP2)/(TP1 + FN1 + TP2 + FN2)

6.4 Precision

Precision is the ratio of correctly predicted positive observations to all predicted
positive observations. The formula for calculating precision is given as

Precision = TP/(TP + FP)

6.5 Recall

Recall is defined as the ratio of correctly predicted positive observations to all observations in
the actual class. The formula is represented as

Recall = TP/(TP + FN)

6.6 F1 Score

The F1 score can be represented as the harmonic mean of precision and recall;
the F1 score reaches its best value at 1 and its worst value at 0. The formula of the F1 score is

F1 = 2 × [Precision × Recall]/[Precision + Recall]

Table 3 Comparison of proposed methodology with existing methodology

Existing methodology:
• Many of the existing methods use lower versions of architectures such as AlexNet and ZF-Net rather than the GoogLeNet architecture
• Many medical organizations still use external machines and medical equipment for cataract detection
• Many existing methods use Anaconda Notebook or MS Visual Studio, which require separate installation of libraries; this occupies space in the device's external memory and reduces the overall performance speed

Proposed methodology:
• This methodology uses the GoogLeNet architecture for training, which increases the accuracy of the output
• It provides instant output as normal, mild, or severe without any external machinery or medical equipment
• It uses Google Colaboratory for development, which does not require separate installation of packages and libraries, and provides an inbuilt GPU for fast runtime with internal storage in Drive

6.7 Support

The support denotes the actual occurrences of each class in the given dataset. Structural
weakness in the results may be due to imbalanced support; the support values
indicate whether rebalancing or resampling is needed. Table 3 shows the comparison of the
proposed methodology with the existing methodology.
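As a sketch, all of the Sect. 6 metrics can be obtained at once from scikit-learn's classification_report; the labels below are illustrative, not the paper's data:

# Hypothetical sketch: computing the Sect. 6 metrics from test predictions.
from sklearn.metrics import classification_report

y_true = ['normal', 'severe', 'mild', 'normal', 'mild']
y_pred = ['normal', 'severe', 'normal', 'normal', 'mild']
print(classification_report(y_true, y_pred))
# the report lists precision, recall, F1-score, and support per class,
# plus accuracy, macro average, and weighted average rows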

7 Result

Hence, we have obtained the following results on running our program in Google
Colaboratory. Our trained model gives an accuracy of around 89% as training
accuracy and 86% as validation accuracy (val_acc).
Figure 7 shows the training curves, which contain four categories of training
results—training loss (train_loss), training accuracy (train_acc), validation loss
(val_loss), and validation accuracy (val_acc)—represented by four different colors.
Table 4 shows the weighted average, accuracy, and macro average, and Table 5 shows
the expected output versus the obtained output.

Fig. 7 Training loss and accuracy plot

Table 4 Fine tuning after evaluation


Category Precision Recall F1-score Support
Accuracy 0.98 89
Macro average 0.96 0.97 0.97 89
Weighted average 0.98 0.98 0.98 89

Table 5 Expected output versus obtained output


Input images Expected output Obtained output

Normal Normal

Normal Normal

Severe Severe

Mild Mild

Mild Normal

8 Conclusion

Hence, this paper provides a simple diagnostic mechanism and reduces the usage
of high-end machinery. The detection work of ophthalmologists is made easier, and
even with the limited dataset, high accuracy and good training results are obtained. This
proposed methodology paves an easier path for cataract detection.

References

1. Szegedy et al., Going deeper with convolutions, in 2015 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), Boston, MA, USA (2015), pp. 1–9. http://doi.org/10.1109/
CVPR.2015.7298594
2. H. Li et al., Computerized systems for cataract grading, in 2009 2nd International Conference
on Biomedical Engineering and Informatics, Tianjin, China (2009), pp. 1–4. http://doi.org/10.
1109/BMEI.2009.5304895
3. N. Sokolova, M. Taschwer, S. Sarny, D. Putzgruber-Adamitsch, K. Schoeffmann, Pixel-based
iris and pupil segmentation in cataract surgery videos using mask R-CNN, in 2020 IEEE 17th
International Symposium on Biomedical Imaging Workshops (ISBI Workshops), Iowa City, IA,
USA (2020), pp. 1–4. http://doi.org/10.1109/ISBIWorkshops50223.2020.9153367
4. S. Kasiviswanathan, T.B. Vijayan, L. Simone, G. Dimauro, Semantic segmentation of conjunc-
tiva region for non-invasive anemia detection applications. Electronics 9, 1309 (2020). http://
doi.org/10.3390/electronics9081309
5. S. Kasiviswanathan, T.B. Vijayan, S. John, Ridge regression algorithm based noninvasive
anaemia screening using conjunctiva images. J. Ambient. Intell. Humaniz. Comput. (2020).
https://doi.org/10.1007/s12652-020-02618-3
6. S. Hu et al., Unified diagnosis framework for automated nuclear cataract grading based on
smartphone slit-lamp images. IEEE Access 8, 174169–174178 (2020). https://doi.org/10.1109/
ACCESS.2020.3025346
7. M. Lin, Q. Chen, S. Yan, Network in network. arXiv: 1312.4400 v3 [cs.NE] (2014)
8. M.K. Behera, S. Chakravarty, A. Gourav, S. Dash, Detection of nuclear cataract in retinal
fundus image using radial basis function based SVM, in 2020, Sixth International Conference
on Parallel, Distributed and Grid Computing (PDGC), Waknaghat, India (2020), pp. 278–281.
http://doi.org/10.1109/PDGC50313.2020.9315834
9. Y. Dong, Q. Zhang, Z. Qiao, J. Yang, Classification of cataract fundus image based on deep
learning, in 2017 IEEE International Conference on Imaging Systems and Techniques (IST),
Beijing, China (2017), pp. 1–5. http://doi.org/10.1109/IST.2017.8261463
10. M.T. Islam, S.A. Imran, A. Arefeen, M. Hasan, C. Shahnaz, Source and camera independent
ophthalmic disease recognition from fundus image using neural network, in 2019 IEEE Inter-
national Conference on Signal Processing, Information, Communication & Systems (SPIC-
SCON), Dhaka, Bangladesh (2019), pp. 59–63. http://doi.org/10.1109/SPICSCON48833.2019.
9065162
11. S. Sadasivam, S. Karthick Ramanathan, Effective watermarking of digital audio and image
using Matlab technique, in 2009 Second International Conference on Machine Vision. IEEE
(2009)
12. A.S.V. Ptraneel, T. Srinivasa Rao, M. Ramakrishna Murthy, A survey on accelerating the
classifier training using various boosting schemes within cascades of boosted ensembles, in
International Conference with Springer SIST Series, vol. 169 (2019), pp. 809–825
13. L. Zhang, J. Li, H. Han, B. Liu, J. Yang, Q. Wang, Automatic cataract detection and grading
using deep convolutional neural network, in 2017 IEEE 14th International Conference on

Networking, Sensing and Control (ICNSC), Calabria, Italy (2017), pp. 60–65. http://doi.org/
10.1109/ICNSC.2017.8000068
14. N. Hnoohom, A. Jitpattanakul, Comparison of ensemble learning algorithms for cataract detec-
tion from fundus images, in 2017 21st International Computer Science and Engineering Confer-
ence (ICSEC), Bangkok, Thailand (2017), pp. 1–5. http://doi.org/10.1109/ICSEC.2017.844
3900
15. S. Bhat, S. Mosalagi, T. Balerao, P. Katkar, R. Pitale, Cataract eye prediction using machine
learning. Int. J. Comput. Appl. 176(35) (2020). 0975-8887
Comparative Analysis of Body Biasing
Techniques for Digital Integrated
Circuits

G. Srinivas Reddy, D. Khalandar Basha, U. Somanaidu, and Rollakanti Raju

Abstract In VLSI, sequential circuits depend upon the clock. High speed and low power consumption are the two major goals in every circuit design. Different biasing techniques are applied to shift registers and analyzed by calculating the power consumed and the delay of the circuit. In this paper, a 4-bit shift register designed using multiplexers has been extended into an 8-bit register. Four biasing techniques have been applied, namely standard biasing, VDD/2 biasing, 3VDD/4 and VDD/4 biasing, and gate level body biasing, to study their characteristics. The entire design and analysis of the shift register have been done using the Cadence Virtuoso tool in 180 nm technology. The circuit is designed to obtain 1.8 V (full swing voltage).

Keywords Body biasing · D flip-flop · Delay · Multiplexer · Power · Shift register

1 Introduction

This paper is broadly divided into two parts: the first part deals with the design of the 4-bit and 8-bit registers, and the second part deals with the biasing techniques applied to them. The Cadence Virtuoso tool has been used for the design and for the power and delay calculations. The circuits have been designed in 180 nm technology with 1.8 V applied as the VDD. The basic blocks are the inverter, 2 × 1 multiplexer, 4 × 1 multiplexer, 8 × 1 multiplexer, and D flip-flop. The multiplexer is designed with transmission gates [1], and the D flip-flop is in turn designed with multiplexers [2]. In shift registers, the data can be shifted or rotated by the required number of bits [3]. The difference between the proposed shift register and a barrel shifter is that the proposed circuit depends upon the clock. In this work, four types of biasing techniques have been applied. Dynamically changing the transistor threshold voltage is called body biasing.

G. Srinivas Reddy
Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India
D. Khalandar Basha (B) · U. Somanaidu
Institute of Aeronautical Engineering, Hyderabad, Telangana, India
R. Raju
St. Peters Engineering College, Hyderabad, Telangana, India


This threshold voltage directly affects the power consumption and the delay of the circuit, so body biasing alters characteristics such as power consumption and delay and greatly helps in achieving better circuit characteristics. The body biasing techniques applied are standard biasing, gate level body biasing, VDD/2 biasing, and 3VDD/4 and VDD/4 biasing.

2 Literature Survey

In this section, the functionality of the multiplexer, the D flip-flop design using multiplexers, and the biasing techniques are discussed.

2.1 Multiplexer

A multiplexer is a combinational circuit which has 2^n inputs, n select lines, and a single output. A transmission gate-based multiplexer is shown in Fig. 1, and its truth table is given in Table 1.
D flip-flop using 2 × 1 Mux
Figure 2 shows the D flip-flop built from 2 × 1 multiplexers. When the clock goes low, it reads the input D through the first multiplexer and places it at the input of the second multiplexer. When the clock goes high, the second multiplexer transmits this value to the output Q.
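To make this master-slave action concrete, the following minimal Python behavioral model (an idealized sketch with delay-free gates, not the transmission-gate circuit itself) mimics the mux-based D flip-flop of Fig. 2:

```python
# Behavioral sketch of the mux-based D flip-flop of Fig. 2 (idealized, delay-free gates).
def mux2(s, a, b):
    """2 x 1 mux: output = a when s = 0, b when s = 1 (cf. Table 1)."""
    return b if s else a

class DFlipFlop:
    def __init__(self):
        self.master = 0  # first-mux output: transparent while clk = 0
        self.q = 0       # second-mux output: updates while clk = 1

    def evaluate(self, clk, d):
        # First mux reads D while the clock is low and holds it otherwise.
        self.master = mux2(clk, d, self.master)
        # Second mux transmits the held value to Q while the clock is high.
        self.q = mux2(clk, self.q, self.master)
        return self.q

dff = DFlipFlop()
dff.evaluate(0, 1)         # clock low: D = 1 is captured by the first mux
print(dff.evaluate(1, 0))  # clock high: Q becomes 1 even though D has changed
```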

Fig. 1 Multiplexer

Table 1 Truth table of multiplexer

S | M1  | M2  | M3  | M4  | Y
0 | Off | Off | On  | On  | A
1 | On  | On  | Off | Off | B

Fig. 2 D flip-flop using 2 × 1 mux

Fig. 3 Standard biasing for NMOS, PMOS

2.2 Body Biasing

Every CMOS transistor has four terminals: source, drain, gate, and body. The voltage difference between the source and the body affects the threshold voltage of the transistor, and this threshold voltage determines the amount of power the transistor consumes. Adjusting the threshold voltage dynamically to obtain better results is called body biasing. In this work, we apply several biasing techniques and calculate the power and speed [4].
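For context, the mechanism described here is captured by the standard long-channel body-effect relation (a textbook formula, not stated in the paper):

VT = VT0 + γ(√(2φF + VSB) − √(2φF))

where VT0 is the zero-bias threshold voltage, γ the body-effect coefficient, φF the Fermi potential, and VSB the source-to-body voltage. Reducing the reverse body bias (or forward-biasing the body, as GLBB does dynamically) lowers VT, which speeds up switching at the cost of increased leakage.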

2.3 Standard Biasing

The most common biasing technique is standard biasing. In this, the body terminal of the PMOS is connected to VDD and that of the NMOS to ground, as in Fig. 3.

2.4 VDD/2 Biasing

Instead of standard biasing, the substrates of the NMOS and PMOS are connected to VDD/2, as shown in Fig. 4.

Fig. 4 VDD/2 biasing

Fig. 5 3VDD/4, VDD/4 biasing techniques

2.5 3VDD/4 and VDD/4 Biasing Technique

In this biasing technique, the body terminals of the NMOS and PMOS are connected to VDD/4 and 3VDD/4, respectively, as shown in Fig. 5.

2.6 GLBB

GLBB stands for gate level body biasing [4]. The gate level biasing for a CMOS circuit is shown in Fig. 6.
It is a simple dynamic body biasing technique that results in high speed. The technique is mainly aimed at overcoming the disadvantages of DTMOS technology. It is fast, energy efficient in both the subthreshold and near-threshold regions, and maintains robustness against temperature and process variations [5]. The body bias generator (BBG) circuit manages the body voltage of the circuit. It is a push-pull amplifier which acts as a voltage follower. This voltage follower circuit helps in decoupling

Fig. 6 Block diagram of GLBB

the large body capacitances at the output node. When Vout is equal to 0 V, the BBG transfers a low voltage to VB, thus preparing the pull-up section for faster logic switching. When Vout is equal to VDD, the BBG transfers a high voltage to VB, thus preparing the pull-down network for faster logic switching [6]. A static current also flows through the BBG, which causes static power consumption [7].

3 Biasing Shift Register

3.1 4-Bit Register

The main aim of this circuit is to develop a shift register that shifts or rotates the bits in a single cycle depending upon the clock pulse. For a 4-bit register, four 4 × 1 multiplexers and four D flip-flops are used, as shown in Fig. 7; Table 2 provides the functionality.

Fig. 7 Block diagram of 4-bit register

Table 2 Truth table of 4-bit register

S  | Operation   | Q3 | Q2 | Q1 | Q0
00 | Reset       | 0  | 0  | 0  | 0
01 | Load        | A3 | A2 | A1 | A0
10 | Left shift  | Q2 | Q1 | Q0 | 0
11 | Right shift | 0  | Q3 | Q2 | Q1
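As a quick functional reference (a behavioral Python sketch of Table 2, not the transistor-level design), the four operations can be modeled as:

```python
# Behavioral model of the 4-bit register operations defined in Table 2.
def register_4bit(s, q, a):
    """s: 2-bit select code from Table 2; q: state [Q3, Q2, Q1, Q0]; a: load inputs [A3..A0]."""
    if s == "00":
        return [0, 0, 0, 0]            # reset
    if s == "01":
        return list(a)                 # parallel load
    if s == "10":
        return [q[1], q[2], q[3], 0]   # left shift, 0 enters at Q0
    return [0, q[0], q[1], q[2]]       # "11": right shift, 0 enters at Q3

state = register_4bit("01", [0, 0, 0, 0], [0, 1, 0, 1])  # load  -> [0, 1, 0, 1]
state = register_4bit("10", state, [0, 0, 0, 0])         # shift -> [1, 0, 1, 0]
print(state)  # matches the waveform discussion in Sect. 4
```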

Table 3 Truth table of 8-bit register

S0 | S1 | S2 | Operation
0  | 0  | 0  | Reset
0  | 0  | 1  | Load
0  | 1  | 0  | Left shift by 1-bit
0  | 1  | 1  | Left shift by 2-bits
1  | 0  | 0  | Right shift by 1-bit
1  | 0  | 1  | Right shift by 2-bits
1  | 1  | 0  | Rotate right by 1-bit
1  | 1  | 1  | Rotate left by 1-bit

3.2 8-Bit Register

The 4-bit shift register circuit is extended to an 8-bit shift register, with three select lines used to perform eight operations. Table 3 provides the functionality of the 8-bit register shown in Fig. 8.
The 4-bit and 8-bit registers are developed from a few basic blocks. The two primary blocks are the 2 × 1 multiplexer and the inverter. With the help of the symbols created for these two blocks, every other circuit in this work has been designed, so any change in these two circuits is reflected in every circuit.
The schematics of the inverter with standard biasing and with VDD/2 biasing are shown in Fig. 9. The inverter with 3VDD/4 and VDD/4 biasing and with GLBB biasing is shown in Fig. 10.

Fig. 8 Block diagram of 8-bit register



Fig. 9 a Inverter schematic with standard biasing. b Schematic of inverter with VDD/2 biasing

Fig. 10 a 3VDD/4, VDD/4 biased inverter schematic. b GLBB biased inverter schematic

The schematic of the 2 × 1 multiplexer using VDD/2 biasing, standard biasing, 3VDD/4 and VDD/4 biasing, and GLBB biasing is shown in Figs. 11, 12, 13, and 16, respectively. The realization of the 4 × 1 multiplexer using 2 × 1 multiplexers is shown in Fig. 17, and the realization of the 8 × 1 multiplexer in Fig. 14. Schematics of the 1-bit, 4-bit, and 8-bit registers are shown in Figs. 15, 18, and 19.

Fig. 11 2 × 1 mux schematic using VDD/2 biasing

Fig. 12 2 × 1 mux using standard biasing

Fig. 13 2 × 1 mux using 3VDD/4 and VDD/4 biasing

4 Results and Discussion

The proposed techniques are implemented using the gpdk180 technology kit. The simulation was performed using the Cadence Spectre tool with a supply voltage of 1.8 V in 180 nm technology. The proposed 4-bit and 8-bit registers are tested, and their functionality is verified.

Fig. 14 Realization of 8 × 1
multiplexer

Fig. 15 Schematic of 1-bit register

Fig. 16 2 × 1 mux schematic using GLBB bias

Different biasing techniques were then applied, and the power consumed and the delay were tabulated.
For a sequential circuit, the output changes only at the positive edge of the clock pulse. In the output waveform at 10 ns, the input to S0 S1 is 0 0, so the reset action takes place and all the output states Q3, Q2, Q1, and Q0 are 0. At 30 ns, the input to S0 S1 is 0 1, so the load action takes place and the inputs are read, making Q3, Q2, Q1, and Q0 equal to 0, 1, 0, and 1, respectively. At 60 ns, the input to S0 S1 is 0

Fig. 17 Realization of 4 × 1 mux using 2 × 1 mux

Fig. 18 Schematic of 4-bit register

Fig. 19 Schematic of 8-bit register

1, so the left shift operation takes place and the output states Q3, Q2, Q1, and Q0 are 1, 0, 1, and 0, respectively.
At 80 ns, the input to S0 S1 is 1 1, so the right shift operation takes place and the output states Q3, Q2, Q1, and Q0 are 0, 1, 1, and 0. At 70 ns, another left shift operation takes place, so the states of Q3, Q2, Q1, and Q0 just before 80 ns are 0, 1, 0, and 0. Similar behavior is observed for the 8-bit register under its various operations. The simulation results are

Fig. 20 Simulation results of 4-bit shift register

shown in Figs. 20 and 21. The power and delay calculations by applying different
types of biasing techniques are shown in Tables 4 and 5.

5 Conclusions

In this paper, a 4-bit register and an 8-bit register which can perform n-bit shifting and rotating operations depending upon the clock pulse have been proposed, and the results were simulated with the Cadence Spectre tool in 180 nm technology. Four biasing techniques were then applied to them, and the power consumed and delay were calculated. From the tabulated results, it is observed that different biasing techniques yield different circuit characteristics.

Fig. 21 Simulation results of 8-bit shift register

Table 4 Comparison of power consumed

Biasing technique  | 4-bit register | 8-bit register
Standard           | 4.99 mW        | 8.415 mW
VDD/2              | 12.95 µW       | 43.75 µW
3VDD/4 and VDD/4   | 2.214 mW       | 6.99 mW
GLBB               | 2.328 µW       | 7.87 µW

Table 5 Comparison of delay

Biasing technique  | 4-bit register | 8-bit register
Standard           | 60.35 ns       | 70.48 ns
VDD/2              | 60.25 ns       | 70.42 ns
3VDD/4 and VDD/4   | 10.26 ns       | 70.45 ns
GLBB               | 60.32 ns       | 30.39 µs

References

1. X. Chen, N.A. Touba, Fundamentals of CMOS Design, Electronic Design Automation (Morgan
Kaufmann, Burlington, 2009), pp. 39–95. ISBN 9780123743640
2. Shivali, S. Sharma, A. Dev, Energy efficient D flip-flop using MTCMOS technique with static
body biasing. Int. J. Recent Technol. Eng. (IJRTE) 8(1) (2019). ISSN: 2277-3878
3. M. Morris Mano, M.D. Ciletti, Digital Design, 6th edn. (Pearson, Los Angeles, 2018)

4. R. Taco, M. Lanuzza, D. Albano, Ultra-low-voltage self body biasing scheme and its application to basic arithmetic circuits. VLSI Design 2015, Article ID 540482, 10 pp. (2015). https://doi.org/10.1155/2015/540482
5. D. Khalandar Basha, A. Pulla Reddy, R. Raju, G. Srinivas Reddy, Gated body biased full adder.
Mater. Today Proc. 5(1), Part 1, pp. 673–679 (2018)
6. D. Khalandar Basha, S. Reddy, K. Aruna Manjusha, 2D symmetric 16*8 SRAM with reset. J.
Eng. Appl. Sci. 13(1), 58–63 (2018)
7. D. Khalandar Basha, B. Naresh, S. Rambabu, D. Nagaraju, Body biased high speed full adder,
in LNCS/LNAI/LNBI Proceedings (2017)
Optical Mark Recognition with Facial
Recognition System

Ronak Shah, Aryak Bodkhe, Sudhanshu Gupta, and Vinayak Gaikwad

Abstract Attendance check is vital as it indicates students' participation in class. Taking attendance by calling out names or passing around an attendance sheet is time-consuming and tiring for teachers and students. This traditional system of attendance is also prone to fraud. During examinations too, teachers need to check students' attendance, which can be time-consuming and distracting while conducting exams. Checking the answer sheets manually is likewise time-consuming and requires considerable effort from the teacher's end. Combining these two problems, teachers can alternatively use the optical mark recognition and facial recognition system (OMRFRS), which provides a photo-detected attendance system and can also check students' answer sheets via a scanner. This is a time-saving and easy method for teachers and students and also avoids the chances of fraud. This paper aims to showcase the technology behind the OMRFRS. The proposed system was tested at the Mukesh Patel School of Technology and Management Engineering, NMIMS University, and the results obtained were very satisfactory.

Keywords Attendance management system · Facial recognition · Image processing · Low cost · Optical mark recognition · Signal processing

R. Shah (B) · A. Bodkhe · S. Gupta · V. Gaikwad


Electronics Telecommunication Engineering, NMIMS Deemed to be University, Mumbai,
Maharashtra, India
e-mail: ronak.shah31@nmims.edu.in
A. Bodkhe
e-mail: aryak.bodkhe06@nmims.edu.in
S. Gupta
e-mail: sudhanshu.gupta65@nmims.edu.in
V. Gaikwad
e-mail: vinayak.gaikwad@nmims.edu


1 Introduction

Many educational institutions are concerned with student engagement, because student collaboration in class leads to more successful understanding, learning, and higher achievement [1]. Likewise, a greater level of interest in the curriculum is a major element for teachers in creating a favorable atmosphere for seriously participating students [2]. Recording attendance in a course on a daily basis is the most common tool for increasing participation. There are two common strategies for recording attendance: some teachers call out students' names and mark their presence or absence, while others circulate an attendance sheet to be signed. After gathering the attendance information through either of these strategies, teachers manually enter it into the existing system. However, these manual methods are tedious and prone to both error and misrepresentation. The aim of this paper is to suggest an attendance model that integrates an innovative framework with certain upgrades. An attendance management method based on facial recognition and optical mark recognition, running on a smart device, has been developed. Face detection in this system uses the histogram of oriented gradients (HOG) algorithm. Face recognition is then applied to perform continuous monitoring, data tracking, and reporting. The information is stored on a cloud server and is available at any time from anywhere. On the other hand, the proposed OMR method concentrates on evaluating multiple-choice questions (MCQs) using a new methodology, opening the way for future studies into OMR that is more effective in terms of speed and accuracy. The structure of the paper is as follows. Section 2 contains a short review of the literature. The proposed system is introduced in Sect. 3, and the implementation and results are discussed in Sect. 4. The main conclusions are given in the final section.

2 Literature Survey

Crime investigators now use face recognition as a valuable and routine forensic technique. Compared with automatic face recognition, forensic face recognition is more difficult: it must account for facial images captured in less-than-ideal environments, and it carries a high degree of accountability for adhering to legal procedures. The effect of recent developments in automated face recognition on the forensic face recognition community has been discussed; improvements in forensic face recognition will address facial aging, facial marks, forensic sketch detection, face recognition in video, near-infrared face recognition, and the use of soft biometrics [3]. In strategies based on similarity identification, the representational capacity of a face database is measured by how typical image data are chosen to allow for potential model differences and by how many typical photographs or their local features are usable [4]. A collection of models captures spatial and audio features in
recognition datasets, capture spatial features. It shows that using strong industry-
level face recognition networks improves emotion recognition accuracy [5]. The
new technique can handle thin papers and answer sheets with low printing accu-
racy. The image scan, tilt correction, scanning error correction, regional deformation
correction, and mark recognition are among the system’s key tools and implementa-
tions. By analyzing the results of a large number of questionnaires, this approach has
proven to be reliable and efficient [6]. A mobile phone-based optical mark recog-
nition (OMR) system checks user response sheets automatically. It makes use of
previous knowledge of the OMR sheet layout, which aids in achieving high speed
and accuracy [7].
A new approach for gender classification and facial expression recognition is based on the two expressions of anger and joy, using geometric and appearance features. Human–computer interaction, driving safety, and others are among the applications. The most common algorithms for detecting facial expression and gender are principal component analysis, linear discriminant analysis, and local binary pattern algorithms [8]. A pose-invariant three-dimensional (3D) facial expression recognition technique uses distance vectors retrieved from 3D distributions of facial feature points to characterize facial expressions. A probabilistic neural network architecture is utilized as a classifier to recognize the expressions from the distance vectors obtained from the 3D facial feature locations. Pain, disappointment, surprise, excitement, disgust, anxiety, and neutral facial expressions are effectively recognized [9].

3 Proposed System

The proposed system has two parts: the first is the facial recognition system, and the second is the optical mark recognition system. The project is mainly based on Python and OpenCV.

3.1 Facial Recognition System

Referring to Fig. 1, face detection is the first step in our process. Here, we locate the faces in a photograph. Most of this step is done by encoding a picture using the histogram of oriented gradients (HOG) algorithm to create a simplified version of the image. We then select the portion of the image that most closely resembles a generic HOG encoding of a face. This algorithm helps in capturing the simple structure of the face [10].

Fig. 1 Facial recognition system flowchart

The second step in our pipeline is face alignment. We use a method called face landmark estimation to do this: the pose of the face is determined by finding the main landmarks in the face. Once the landmarks are found, the algorithm uses them to warp the image so that the eyes and mouth are aligned.
The third step in our pipeline is feature extraction. Here, a deep convolutional neural network is trained to generate a set of measurements (an encoding) for each face in an image. Passing the aligned face image from the previous step through this neural network extracts the features of the face, and these measurements are used for matching in the next step.
The final step is feature matching. Here, we match the image against known features. This is done using a machine learning classification algorithm, an SVM classifier, which uses the database of known people to find the closest measurement to our test image [11].
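A condensed sketch of this four-stage pipeline is given below. It assumes the open-source face_recognition library (which wraps dlib's HOG detector, landmark alignment, and 128-D embedding network) plus scikit-learn's SVM; the image paths and labels are placeholders, not files from this project.

```python
# Sketch of the detect -> align -> embed -> match pipeline (assumed tooling:
# face_recognition for steps 1-3, scikit-learn's SVM for step 4).
import face_recognition
from sklearn.svm import SVC

known = {"alice": "alice.jpg", "bob": "bob.jpg"}  # placeholder training images

labels, encodings = [], []
for name, path in known.items():
    image = face_recognition.load_image_file(path)
    boxes = face_recognition.face_locations(image, model="hog")  # step 1: HOG detection
    # Steps 2-3: landmark alignment and the 128-D embedding happen inside face_encodings().
    encodings.append(face_recognition.face_encodings(image, boxes)[0])
    labels.append(name)

clf = SVC(kernel="linear")          # step 4: SVM matcher over the embeddings
clf.fit(encodings, labels)

test = face_recognition.load_image_file("frame.jpg")  # placeholder test frame
for enc in face_recognition.face_encodings(test):
    print(clf.predict([enc])[0])    # nearest known identity for each detected face
```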

3.2 Optical Mark Recognition

Figure 2 shows the flow of steps required for the OMR. First, the answer sheets are scanned in optical form. We then check whether the QR code location has been scanned properly; if not, the image is rotated so that it is scanned correctly.

Fig. 2 Optical mark recognition flowchart

After rotating the image, the answer area is located using a machine learning algorithm. The bubble areas are then found and sorted, and each bubble is evaluated against the correct answer to compute the results; the data are then saved and the result is displayed.
Compiling the two parts of the project, we obtain as output the student's attendance report as well as the marks obtained by him/her in the examination. This is a faster, cheaper, and more accurate way to solve two problems at the same time [12].
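The bubble-evaluation step described above can be sketched with plain OpenCV as follows (an illustrative fragment under simplifying assumptions: the sheet is already deskewed, the only bubble-sized contours are the answer bubbles, one row of four options per question, and the file name and answer key are placeholders):

```python
# Sketch of bubble detection and grading with OpenCV (assumes a deskewed scan).
import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread("answer_sheet.png"), cv2.COLOR_BGR2GRAY)
# Otsu threshold, inverted so pencil marks become white foreground pixels.
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
bubbles = []
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if 20 <= w <= 40 and 0.8 <= w / float(h) <= 1.2:  # keep roughly circular, bubble-sized blobs
        bubbles.append((y, x, c))
bubbles.sort(key=lambda b: (b[0], b[1]))              # top-to-bottom, then left-to-right

answer_key = {0: 1, 1: 3}                             # placeholder: question index -> option index
options = 4
for q in range(len(bubbles) // options):
    row = sorted(bubbles[q * options:(q + 1) * options], key=lambda b: b[1])
    filled = []
    for _, _, c in row:                               # count marked pixels inside each bubble
        mask = cv2.drawContours(np.zeros_like(thresh), [c], -1, 255, -1)
        filled.append(cv2.countNonZero(cv2.bitwise_and(thresh, mask)))
    marked = int(np.argmax(filled))                   # most heavily filled bubble = chosen answer
    print(f"Q{q + 1}:", "correct" if marked == answer_key.get(q) else "wrong")
```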

4 Implementation and Results

The testing process follows the successful and logical completion of coding. For the attendance system, the testing phase can be divided into three segments: face identification, face training, and face recognition; similarly, for the OMR part: scanned copy identification, checking the bubble answers, and comparing them with the original answers [13].
Dataset used: we collected images of our friends and classmates with aligned faces and used those images as our dataset, in JPEG format. Figure 3 shows the sample dataset used.

Fig. 3 Sample dataset

Using the trained files and OpenCV with a txt label file, the machine is ready to identify a person's frontal face, as seen in Fig. 4. It identifies a face appearing in the camera and names it if the machine knows it; otherwise, it states "unknown." A positive encoded output is shown in Fig. 5.
After finishing the facial recognition, the system checks the answer sheet by scanning it. The output is similar to that shown in Fig. 6.
In total, we tested 100 samples and obtained different results when verifying the images at different distances. First, we scanned the image at a distance of 1 ft, and facial detection was very quick; as the distance increased, there was a time delay in the output. Beyond a distance of 5 ft, the system was unable to generate output data. Table 1 shows the data collected from the various experiments performed. The system also depends on the quality of the camera used: we tried an Apple iPhone X and a MacBook Pro; the mobile phone could generate data at small distances, while the laptop, with a more efficient camera, generated data up to a maximum distance of 5 ft.

Fig. 4 Train image of data



Fig. 5 Positive encoded output

Fig. 6 Sample of answer sheet output after OMR

Table 1 Distance required for scanning image

Dataset | Distance from camera/scanner (ft) | Output accuracy (%)
10      | 1                                 | 100
45      | 1                                 | 86.66
10      | 5                                 | 90
90      | 4                                 | 85.5

5 Conclusion

The purpose of this paper is to analyze a faster alternative to the traditional systems used for attendance marking and examination checking. The paper has also covered vital topics such as the advantages, disadvantages, drawbacks, and solutions for the OMRFRS when used in different environments, and has shown how two different technologies, optical mark recognition and facial recognition, can be combined and used in a modified way. Nothing exists without flaws, and this system too may have a few drawbacks, but the outputs obtained were found to resolve most of the issues in the traditional approach to attendance and answer checking and were very satisfactory.

References

1. L. Stanca, The effects of attendance on academic performance: panel data evidence for
introductory microeconomics. J. Econ. Educ. 37(3), 251–266 (2006)
2. P.K. Pani, P. Kishore, Absenteeism and performance in a quantitative module A quantile
regression analysis. J. Appl. Res. High. Educ. 8(3), 376–389 (2016)
3. A.K. Jain, B. Klare, U. Park, Face recognition: some challenges in forensics, in 2011 IEEE
International Conference on Automatic Face & Gesture Recognition (FG), Santa Barbara, CA,
USA (2011), pp. 726–733. http://doi.org/10.1109/FG.2011.5771338
4. S.Z. Li, J. Lu, Generalizing capacity of face database for face recognition, in Proceedings
Third IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan
(1998), pp. 402–406. http://doi.org/10.1109/AFGR.1998.670982
5. B. Knyazev, R. Shvetsov, N. Efremova, A. Kuharenko, Leveraging large face recognition data
for emotion classification, in 2018 13th IEEE International Conference on Automatic Face &
Gesture Recognition (FG 2018), Xi’an, China (2018), pp. 692–696. http://doi.org/10.1109/FG.
2018.00109
6. H. Deng, F. Wang, B. Liang, A low-cost OMR solution for educational applications, in 2008
IEEE International Symposium on Parallel and Distributed Processing with Applications,
Sydney, NSW, Australia (2008), pp. 967–970. http://doi.org/10.1109/ISPA.2008.130
7. R. Patel, S. Sanghavi, D. Gupta, M.S. Raval, CheckIt—a low cost mobile OMR system, in
TENCON 2015—2015 IEEE Region 10 Conference, Macao, China (2015), pp. 1–5. http://doi.
org/10.1109/TENCON.2015.7372983
8. A.V. Anusha, J.K. Jayasree, A. Bhaskar, R.P. Aneesh, Facial expression recognition and
gender classification using facial patches, in 2016 International Conference on Communi-
cation Systems and Networks (ComNet), Thiruvananthapuram (2016), pp. 200–204. http://doi.
org/10.1109/CSN.2016.7824014
9. H. Soyel, H. Demirel, 3D facial expression recognition with geometrically localized facial
features, in 2008 23rd International Symposium on Computer and Information Sciences,
Istanbul, Turkey (2008), pp. 1–4. http://doi.org/10.1109/ISCIS.2008.4717898
10. P.N. Maraskolhe, A.S. Bhalchandra, Analysis of facial expression recognition using histogram
of oriented gradient (HOG), in 2019 3rd International conference on Electronics, Communi-
cation and Aerospace Technology (ICECA). http://doi.org/10.1109/ICECA.2019.8821814
11. M. Pantic, I. Patras, Dynamics of facial expression: recognition of facial actions and their
temporal segments from face profile image sequences. IEEE Trans. Syst. Man Cybern. Part B
(Cybernetics) 36(2), 433–449 (2006). http://doi.org/10.1109/TSMCB.2005.859075

12. K. Verma, A. Khunteta, Facial expression recognition using Gabor filter and multi-layer artifi-
cial neural network, in 2017 International Conference on Information, Communication, Instru-
mentation and Control (ICICIC), Indore (2017), pp. 1–5. http://doi.org/10.1109/ICOMICON.
2017.8279123
13. R. Samet, M. Tanriverdi, Face recognition-based mobile automatic classroom attendance
management system, in 2017 International Conference on Cyberworlds (CW), Chester (2017),
pp. 253–256. http://doi.org/10.1109/CW.2017.34
Evaluation of Antenna Control System
for Tracking Remote Sensing Satellites

A. N. Satyanarayana, Bandi Suman, and G. Uma Devi

Abstract Remote sensing is important for obtaining information associated with the earth's resources and its environment. The tracking of LEO satellites is increasing rapidly. At the ground station, we track multiple remote sensing satellites daily in different tracking modes and acquire data from them. This paper brings out the existing techniques for tracking LEO satellites at S- and X-bands and presents the evaluation of the response of the closed-loop servo system, treating it as a second-order system with the autotrack error voltages as a step input to the antenna control system. The earth station antenna has been established for acquiring payload data from low earth orbit satellites in X-band (8.2–8.4 GHz) [9, 10]. The tracking mode considered is X-autotracking.

Keywords X-band · Autotrack step response · Low earth orbit (LEO) · Autotrack
error voltages

1 Introduction

The closed-loop servo system is evaluated as a second-order system with a step input of amplitude +0.14° and −0.14° (the 3 dB points off the peak/target) along both the AZ and EL axes for X-band. It is important to measure the step response regularly for every ground station antenna in order to verify the second-order system's time-domain specifications [1–4].

A. N. Satyanarayana (B) · B. Suman · G. Uma Devi


Department of Space, National Remote Sensing Centre, Indian Space Research Organization,
Government of India, Hyderabad, India
e-mail: satyanarayana_an@nrsc.gov.in
B. Suman
e-mail: suman_b@nrsc.gov.in
G. Uma Devi
e-mail: umadevi_g@nrsc.gov.in


Fig. 1 Ground station block diagram

To avoid sluggish behavior of the antenna servo system, frequent evaluation of the system's step response is necessary.
A typical ground station block diagram is shown in Fig. 1, and a typical ground station setup used to evaluate the autotrack step error voltage response is shown in Fig. 2.

2 Description

The autotrack step response is the time needed for the autotrack error voltages to drive the antenna control system back to the initial angles (peak/target/boresight) from which the step angle was initially applied.
The transient response specifications of a second-order system for a unit step input are the delay time (Td), rise time (Tr), peak time (Tp), maximum overshoot (Mp), and settling time (Ts) [5–8]. In our work, the ground station emphasis is on the rise time, maximum overshoot, and bandwidth.
These specifications are shown graphically in Fig. 3.
In the Indian remote sensing satellite (IRS) ground station antenna servo control
system, we evaluate the rise time (tr), percentage overshoot, and settling time (ts).

Fig. 2 Typical ground station diagram

The ranges of the above parameters are:
• Rise time: 0.3 to 0.6 s
• Settling time: 1.2 to 2.0 s
• Percentage overshoot: ≤ 30%
Transient Response: The transient is the part of the output of a second-order system before it reaches its final value; it vanishes after the settling time. Figure 3 shows a sample output for a step input of unit amplitude.

Maximum percent overshoot = [(C(tp) − C(∞))/C(∞)] × 100% (1)
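As a worked illustration (not the station's measurement software), the following Python fragment derives these specifications numerically from the unit-step response of an assumed second-order model; the damping ratio and natural frequency are placeholder values, chosen only so the results fall near the ranges quoted above:

```python
# Time-domain specs from an assumed second-order servo model (illustrative values).
import numpy as np
from scipy import signal

zeta, wn = 0.5, 5.0                                   # placeholder damping ratio, natural freq (rad/s)
sys = signal.TransferFunction([wn**2], [1, 2*zeta*wn, wn**2])
t = np.linspace(0, 5, 5000)
t, y = signal.step(sys, T=t)

final = y[-1]
rise = t[np.argmax(y >= 0.9*final)] - t[np.argmax(y >= 0.1*final)]  # 10-90% rise time
overshoot = (y.max() - final) / final * 100                          # Eq. (1), in percent
outside = np.where(np.abs(y - final) > 0.02 * final)[0]              # 2% settling band
settling = t[outside[-1] + 1] if outside.size else 0.0

print(f"rise time {rise:.2f} s, overshoot {overshoot:.1f} %, settling time {settling:.2f} s")
```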

3 Evaluation Procedure

Tracking Gradient: The gradient gives the measure of the autotrack error applied to the antenna system per degree of offset of the receiving antenna.
Adjustment of tracking gradients: A satellite at higher altitude has a lower azimuth velocity at peak elevation than a satellite at low altitude. From this, we can conclude that satellites at lower altitudes need a higher gradient value to get a fast system response. Each time a gradient value is set, the system response is checked; it should satisfy a rise time of less than 0.5 s and an overshoot of less than 30%.

Fig. 3 Transient responses [7]

X-autotrack Mode: This is one of the antenna's closed-loop tracking modes. At the IRS ground station, we use X-autotrack as the primary mode of tracking the satellite owing to its higher accuracy.
The autotrack step response is measured by pointing the ground station antenna at a known source (the boresight antenna tower) and applying step angles of +0.14° and −0.14° along the azimuth axis and +0.14° and −0.14° along the elevation axis for X-band. The responses for inputs of absolute magnitude 0.14° along both axes are plotted in the following graphs, where the X-axis is time (seconds) and the Y-axis is error voltage (volts).
The steps involved are:
• Send a test/desired signal to the known source (boresight antenna tower).
• Configure the ground station antenna to the desired signal frequency.
• Point the ground station antenna at the known source (boresight antenna tower).
• After acquiring the signal, a step angle of +0.14° (the 3 dB point from peak/target) along the azimuth axis is given as an input to the ground station antenna, and the error voltage (the output/response) generated is shown in Fig. 4.
• After generating the error voltage, the ground station antenna mode is immediately changed to X-autotrack to correct the error voltage generated.
• After acquiring the signal, a step angle of −0.14° (the 3 dB point from peak/target) along the azimuth axis is given as an input to the ground station antenna, and the error voltage (the output/response) generated is shown in Fig. 5.
• After generating the error voltage, the ground station antenna mode is immediately changed to X-autotrack to correct the error voltage generated.

Fig. 4 Azimuth step input of +0.14°

• After acquiring the signal, a step angle of +0.14° (the 3 dB point from peak/target) along the elevation axis is given as an input to the ground station antenna, and the error voltage (the output/response) generated is shown in Fig. 6.
• After generating the error voltage, the ground station antenna mode is immediately changed to X-autotrack to correct the error voltage generated.
• After acquiring the signal, a step angle of −0.14° (the 3 dB point from peak/target) along the elevation axis is given as an input to the ground station antenna, and the error voltage (the output/response) generated is shown in Fig. 7.
After generating the error voltage, the ground station antenna mode is immediately changed to X-autotrack to correct the error voltage generated. The autotrack error voltage generated inside the closed-loop control system when moving away from the target/boresight tower then approaches zero (i.e., the ground station antenna points back to the target/peak) after the antenna mode is changed to X-autotrack.

Fig. 5 Azimuth step input of −0.14°

The step response results (shown in Figs. 4, 5, 6, and 7) for X-band along the azimuth and elevation axes should meet the desired time-domain specifications; otherwise, the servo system behaves sluggishly.
Factors affecting tracking gradients: Aging of components, variation in temperature, and rain disturb the phasing, and this leads to changes in the gradients.

4 Results

The graphs plot time (seconds) versus error voltage (volts). They were recorded after the antenna mode was changed to X-autotrack: the error voltage generated inside the system has characteristics similar to the second-order system response to a unit step input, and the error is corrected and reaches a steady state (approaches zero) in X-autotrack mode.

Fig. 6 Elevation step input of +0.14°

5 Conclusions

The antenna servo system specifications are met. Evaluating the autotrack step response of the closed-loop servo control system is necessary to avoid sluggish behavior of the servo system; otherwise, tracking accuracy degrades and real-time data may be lost.

Fig. 7 Elevation step input of −0.14°

Acknowledgements It is indeed a great pleasure to express my sincere gratitude to the Centre Director, Dr. Rajkumar, NRSC, and the Deputy Director, Ms. G. Uma Devi, for their encouragement in producing this paper.

References

1. K. Ogata, Modern Control Engineering, 4th edn.
2. Ka-band Capabilities. Via Sat Inc., Satellite Ground Systems, and Norcross
3. Preliminary Design Review Information on GISAT Payload Applications, vol. 1
4. Satellite Applications for Polar and Geo Synchronous Satellites. EE4367 Telecommunication
Switching & Transmission
5. B. Kuo, Automatic Control Systems, 9th edn.
6. N. Nise, Control System Engineering, 6th edn.
7. Course material, on Satellite Communications, by ISRO
8. H.E. Hudson, Communication Satellites: Their Development and Impact
9. D.D. Smith, Communication via Satellite: A Vision in Retrospect
10. Lewis, Communications Services via Satellite
Face Recognition Using Cascading
of HOG and LBP Feature Extraction

M. Chandrakala and P. Durga Devi

Abstract Face recognition is widely used in computer vision, pattern recognition, biometrics, law enforcement, surveillance, criminal investigations, and missing person detection. However, face recognition is still a challenging task under varying illumination, pose, expression, occlusion, and background. In this paper, we propose a cascading of histogram of oriented gradients (HOG) and local binary pattern (LBP) feature extraction on the ORL face dataset, which can improve recognition rates while also addressing pose, scale, expression, and illumination issues. The facial recognition system was trained using various classifiers, namely KNN, SVM, and random forest (RF). The results show that the accuracy of the combined LBP and HOG feature extraction method is better than that of the individual LBP or HOG features for ORL database face recognition.

Keywords Face recognition · Support vector machine (SVM) · K-nearest neighbor (KNN)

1 Introduction

Face recognition performance can be affected by changes in illumination, occlusion, pose variation, and background. The feature extraction and classification stages are critical in improving the recognition rate. Several advanced studies have been conducted on facial expression and on feature extraction mechanisms such as HOG, LBP, and principal component analysis with SVM, KNN, and MLP classifiers [1]. Mittal and Agarwal worked on multiple face recognition based on a convolutional neural network [2].
Aviral Joshi et al. [3] presented face image classification algorithms such as
cloud forest, RF, and KNN on the ORL facial database and achieved recognition

M. Chandrakala (B) · P. Durga Devi


Department of ECE, Mahatma Gandhi Institute of Technology, Hyderabad, India
e-mail: mchandrakala_ece@mgit.ac.in
P. Durga Devi
e-mail: pdurgadevi_ece@mgit.ac.in


rates of 77.5%, 84.17%, and 63.89%, respectively, in an unconstrained environment. Everyone who enters the camera frame is detected and recognized using the KNN and ANN algorithms [4]. The immediate objective of an expression-invariant 3D face recognition system is to recognize different expressions using an SVM classifier [5]. The KNN classifier is used alone for color face recognition, and then it is combined with PCA to reduce the number of features [6].
Ghorbani and Targhi proposed HOG features extracted from a regular grid; integrating HOG features at different scales with LBP features allows the detection of important features for face recognition [7]. K-SVD is used to extract the sparse approximation of HOG and LBP features for pedestrian detection [8]. Optimization techniques are effective for reducing the number of features while boosting the recognition system; accordingly, a face recognition system using SVM and particle swarm optimization was created [9]. Wei et al. [10] demonstrated HOG-based face image classification by extracting multiscale feature vectors across multiple dimensions; experiments on the ORL face database gave a recognition accuracy of 92%. Julina and Sharmila [11] described a system to classify ORL database face images using a KNN classifier based on HOG features and achieved a recognition accuracy of 90%.
Bah and Ming [12] illustrated a face recognition algorithm for a real-time attendance system that incorporates LBP features with techniques such as the bilateral filter, contrast adjustment, image blending, and histogram equalization. Ahonen et al. [13] analyzed a face recognition system on the FERET and ORL databases using PCA based on LBP with various window sizes and achieved 79% recognition accuracy on the ORL face image database. Chandrakala and Durga Devi [14] proposed a two-stage face recognition classifier based on HOG features and achieved a recognition rate of 95.2%.
Jayaraman et al. [15] presented challenges with face recognition systems such as aging, multiple faces in group photos, facial disguises, occlusion, and cluttered backgrounds, and suggested that an infrared face recognition system could be used to overcome these challenges.
Face recognition issues such as varying illumination, pose, and expression cannot be addressed by a system based on a single feature extraction method, whereas our proposed classifiers based on the cascading of LBP and HOG feature extraction can improve recognition rates while also addressing pose, expression, and illumination issues.
The rest of this paper is structured as follows. Section 2 discusses preprocessing, feature extraction, and the proposed face recognition system using the cascading of LBP and HOG features. Section 3 discusses the experimental results of the proposed face recognition algorithm. Section 4 concludes the paper.

Fig. 1 ORL dataset for two subjects

2 Recognition System

Face recognition algorithms are employed to distinguish a unique person's face and compare it to the images stored within the ORL database [16]. The primary emphasis of this paper is on correctly recognizing test images given the database training images.

2.1 Preprocessing

To achieve high recognition accuracy, the facial images in the ORL dataset were normalized from 92 × 112 pixels to 50 × 50 pixels. As shown in Fig. 1, the ORL dataset facial images vary in illumination conditions, expressions, and facial details.

2.2 Approach to Facial Feature Extraction

Face detection and recognition depend primarily on feature extraction. The most
pertinent features were extracted from every face image.

2.2.1 Extraction of HOG Features

In HOG feature extraction, the face image is divided into connected grids called cells [17]. Each cell contains pixels, from which the gradient magnitude and angle are computed. Four cells are grouped to form a block with 50% overlap.
The gradients of a cell are calculated by applying the 1D derivative mask filters [−1, 0, 1] and [−1, 0, 1]T at the pixel located at coordinates (r, s). The luminance value at (r, s) is denoted V(r, s). The gradients in the x-direction, Vx(r, s), and y-direction, Vy(r, s), are calculated as

Vx (r, s) = V (r + 1, s) − V (r − 1, s) (1)

Vy (r, s) = V (r, s + 1) − V (r, s − 1) (2)

The gradient magnitude is then expressed as

M(r, s) = √(Vx(r, s)² + Vy(r, s)²) (3)

The gradient orientation at the same pixel is

θ(r, s) = arctan(Vy(r, s) / Vx(r, s)) (4)

The gradient angle θ(r, s) and magnitude M(r, s) are used to generate the histogram. The orientations, evenly distributed over (0, π), are mapped into one of nine bins, and the gradient magnitudes are accumulated into their angular bins.
For the current work, every face image in the ORL database is resized to 50 × 50 pixels. Each image is divided into cells of 10 × 10 pixels, and a block is formed by grouping 2 × 2 cells. A feature vector is computed from each cell. With 50% overlapping, there are four such blocks in each row and column of the image. The feature vector length is computed as

Feature vector length = {no. of blocks per row} × {no. of blocks per column} × {no. of cells per block} × {no. of bins} = 4 × 4 × (2 × 2) × 9 = 576 features.
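The same 576-dimensional descriptor can be reproduced with scikit-image's HOG implementation (a sketch under the assumption that scikit-image is used; the paper does not name its toolchain):

```python
# HOG descriptor for one 50 x 50 face, matching the 576-feature computation above.
import numpy as np
from skimage.feature import hog

face = np.random.rand(50, 50)             # stand-in for one preprocessed ORL face
features = hog(face,
               orientations=9,            # nine angular bins over (0, pi)
               pixels_per_cell=(10, 10),  # 5 x 5 grid of cells on a 50 x 50 image
               cells_per_block=(2, 2))    # 2 x 2 cells per block, 50% overlap -> 4 x 4 blocks
print(features.shape)                     # (576,) = 4 x 4 x (2 x 2) x 9
```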

2.2.2 Local Binary Pattern

The image is divided into blocks by the LBP feature extractor [18]. Each block is 3
× 3 pixels in size. The local binary pattern of the central pixel is then calculated as


LBP_{S,R} = Σ_{s=0}^{S−1} P(Rs − RC) · 2^s (5)

P(Rs − RC) = 1 if (Rs − RC) ≥ 0; 0 if (Rs − RC) < 0 (6)

Here, RC, the value of the central pixel, is used as the threshold, and RS is the gray value of a neighboring pixel. The value of each neighboring pixel is compared to the value of the center pixel: if (RS − RC) ≥ 0, the corresponding pixel is represented as "1," otherwise as "0." A histogram can be built based on these binary values, and the LBP code of the central pixel is computed by converting the binary pattern to a decimal value. The rotation-invariant extension of the original LBP reduces the size of the feature vector.
Patterns are classified into two types: uniform and non-uniform. A uniform pattern is an LBP that contains at most two bitwise transitions (from 1 to 0 or from 0 to 1); a non-uniform pattern contains more than two bitwise transitions. In the histogram representation, each uniform pattern has its own bin, while all non-uniform patterns are combined into a single bin. With the (8, R) neighborhood there are 256 patterns, of which 58 are uniform; together with the single non-uniform bin, this results in 59 distinct bins. For the current work, each face image in the ORL database therefore has an LBP feature vector of length 59.
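A matching sketch for the 59-bin uniform LBP histogram, again assuming scikit-image (whose "nri_uniform" method yields exactly P(P − 1) + 3 = 59 labels for P = 8 neighbors):

```python
# 59-bin uniform LBP histogram for one 50 x 50 face image.
import numpy as np
from skimage.feature import local_binary_pattern

face = (np.random.rand(50, 50) * 255).astype(np.uint8)  # stand-in ORL face image
lbp = local_binary_pattern(face, P=8, R=1, method="nri_uniform")
hist, _ = np.histogram(lbp, bins=59, range=(0, 59))     # 58 uniform bins + 1 non-uniform bin
hist = hist / hist.sum()                                # normalized 59-D LBP feature vector
print(hist.shape)                                       # (59,)
```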

2.3 Proposed Algorithm Using Cascading of LBP and HOG Features

Figure 2 depicts the proposed algorithm using the cascading of LBP and HOG features. All ORL database face image samples are preprocessed by size normalization to 50 × 50 pixels, and the LBP and HOG features are extracted and cascaded. After feature
Fig. 2 Proposed algorithm using cascading of LBP and HOG features



extraction, each face image is represented as a 1D feature vector. These feature vectors are then split into training and testing sets, and the recognition algorithm is used to classify the test face images. We evaluated various classifiers, namely KNN, SVM, and RF, by training them on the face image database. The database includes facial images of 40 different individuals, with ten images per person; the model is trained on eight images per person and tested on the remaining two.
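Putting the pieces together, the train/test protocol described above can be sketched with scikit-learn as follows (random arrays stand in for the 400 preprocessed ORL faces; the classifier settings are those reported in Sect. 3):

```python
# Sketch of the cascaded LBP + HOG recognition pipeline with an 8/2 split per subject.
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

def cascade_features(face):
    """Concatenate the 59-bin LBP histogram and the 576-D HOG descriptor (635-D total)."""
    lbp = local_binary_pattern(face, P=8, R=1, method="nri_uniform")
    lbp_hist, _ = np.histogram(lbp, bins=59, range=(0, 59), density=True)
    hog_vec = hog(face, orientations=9, pixels_per_cell=(10, 10), cells_per_block=(2, 2))
    return np.concatenate([lbp_hist, hog_vec])

rng = np.random.default_rng(0)                        # placeholder data: 40 subjects x 10 faces
faces = (rng.random((400, 50, 50)) * 255).astype(np.uint8)
labels = np.repeat(np.arange(40), 10)

X = np.array([cascade_features(f) for f in faces])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels)        # 8 training / 2 test images per subject

for clf in (KNeighborsClassifier(n_neighbors=1),      # k = 1, linear SVM, 30-tree RF (Sect. 3)
            SVC(kernel="linear"),
            RandomForestClassifier(n_estimators=30)):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, clf.score(X_test, y_test))
```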

3 Experimental Results of Variant Classifiers

The experimental face recognition results were evaluated using the ORL face image database. We utilized different classifiers, namely KNN, SVM, and RF, with different feature extraction methods: LBP, HOG, and the combination of LBP and HOG.

3.1 Classifier Results Based on HOG Feature Extraction

In this experiment, the extracted features were tested using the KNN, SVM, and RF classifiers. Grid search cross-validation is used to pick the best hyperparameters for the highest accuracy rate. Based on HOG feature extraction, we experimentally observed that KNN with k = 1, SVM with a linear kernel, and RF with 30 estimators achieved the maximum recognition rate on the test set. The accuracy of the different classifiers based on HOG features is compared with the training set size, as shown

Fig. 3 Accuracy of different classifiers based on HOG features versus training set size

Fig. 4 Accuracy of different classifiers based on LBP features versus training set size

in Fig. 3. Based on HOG feature extraction, the best recognition rate, 92.58%, was achieved by the SVM classifier.

3.2 Classifier Results Based on LBP Feature Extraction

For the test set, based on LBP feature extraction, we found that KNN with k = 1, SVM with an RBF kernel, and RF with 30 estimators performed best. As shown in Fig. 4, the accuracy of the different classifiers with LBP features is compared across varying training set sizes. The best recognition rate based on LBP feature extraction, 86.76%, was achieved by the KNN classifier.

3.3 Classifier Results Based on Cascading of HOG and LBP


Feature Extraction

The accuracy of the various classifiers using the combination of LBP and HOG features is compared against the training set size in Fig. 5. KNN and SVM achieved the highest recognition rates, 93.75% and 93.76%, respectively. As shown in Figs. 3, 4, and 5, increasing the training sample size improves recognition accuracy.
The classification performance of the KNN, SVM, and RF classifiers for the various feature extraction methods is shown in Table 1. We conclude that the cascading of HOG and LBP feature extraction with KNN and SVM gives face recognition rates significantly higher than the HOG and LBP feature extraction methods individually.

Fig. 5 Accuracy of different classifiers based on cascading of LBP and HOG features versus
training set size

Table 1 Face recognition system based on variant feature extraction methods (recognition accuracy, %)

Feature extraction method | KNN   | SVM   | RF
HOG                       | 92.57 | 92.58 | 86.25
LBP                       | 86.76 | 86.75 | 66.25
HOG + LBP                 | 93.75 | 93.76 | 91.25

Table 2 Face recognition methods comparison on the ORL dataset

Feature extraction + classifier | Recognition accuracy (%)
HOG + KNN [11]                  | 90
LBP + PCA [13]                  | 79
Proposed method based on KNN    | 93.75

Table 2 compares the proposed algorithm's recognition accuracy on the ORL dataset with two existing methods. The recognition rates of the classifier based on cascaded HOG and LBP feature extraction are significantly higher than those of the HOG and LBP feature extraction methods alone.

4 Conclusion

For face recognition, we used LBP, HOG, and a cascading of the LBP and HOG feature extraction methods, together with KNN, SVM, and RF classifiers. According to the experimental results, the recognition rate of the cascaded HOG and LBP feature extraction with KNN and SVM is better than that of the HOG and LBP feature extraction methods used individually. We also found experimentally that recognition accuracy improves as the size of the training set grows. The proposed classifiers improve recognition rates while also addressing issues with pose, scale, expression, and varying illumination. According to recent research, hyperspectral or multispectral imaging systems may be the future of human face recognition.

References

1. H.I. Dino, Facial expression classification based on SVM, KNN and MLP classifiers, in 2019
International Conference on Advanced Science and Engineering (2019), pp. 70–75
2. Mittal, S. Agarwal, M.J. Nigam, Real-time multiple face recognition: a deep learning approach,
in ACM International Conference Proceeding Series (2018), pp. 70–76
3. V.A. Aviral Joshi, H.M. Surana, H. Garg, K.N. Balasubramanya Murthy, S. Natarajan, Uncon-
strained face recognition using ASURF and cloud-forest classifier optimized with VLAD.
Procedia Comput. Sci. 143, 570–578 (2018)
4. C. Panjaitan, A. Silaban, M. Napitupulu, J.W. Simatupang, Comparison K-nearest neighbors
(K-NN) and artificial neural network (ANN) in real-time entrants recognition, in 2018 Inter-
national Seminar on Research of Information Technology and Intelligent Systems ISRITI 2018
(2018), pp. 1–4
5. M.J. Leo, S. Suchitra, SVM based expression-invariant 3D face recognition system. Procedia
Comput. Sci. 143, 619–625 (2018)
6. C. Eyupoglu, Implementation of color face recognition using PCA and k-NN classifier, in
Proceedings of 2016 IEEE North West Russia Section Young Researches in Electrical and
Electronic Engineering Conference EIConRusNW 2016 (2016), pp. 199–202
7. M. Ghorbani, A.T. Targhi, M.M. Dehshibi, HOG and LBP: towards a robust face recognition
system, in 10th International Conference on Digital Information Management ICDIM 2015,
no. Icdim (2016), pp. 138–141. http://doi.org/10.1109/ICDIM.2015.7381860
8. W.J. Pei, Y.L. Zhang, Y. Zhang, C.H. Zheng, Pedestrian detection based on HOG and LBP, in
Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), vol. 8588. LNCS (2014), pp. 715–720
9. J. Wei, Z. Jian-Qi, Z. Xiang, Face recognition method based on support vector machine and
particle swarm optimization. Expert Syst. Appl. 38(4), 4390–4393 (2011). https://doi.org/10.
1016/j.eswa.2010.09.108
10. X. Wei, G. Guo, H. Wang, H. Wan, A multiscale method for HOG-based face, vol. 1 (2015),
pp. 535–545. http://doi.org/10.1007/978-3-319-22879-2
11. K.J. Julina, S. Sharmila, Facial recognition using histogram of gradients and support vector
machines, in ICCCSP 2017
12. S.M. Bah, F. Ming, An improved face recognition algorithm and its application in attendance
management system. Array 5, 100014 (2020)
13. T. Ahonen, A. Hadid, M. Pietikäinen, Face recognition with local binary patterns, in LNCS
3021 (Springer, Berlin, 2004), pp. 469–481
14. M. Chandrakala, P. Durga Devi, Two-stage classifier for face recognition using HOG features.
Mater. Today Proc. (2021)
15. U. Jayaraman, P. Gupta, S. Gupta, Recent development in face recognition. Neurocomputing 408, 231–235 (2020). http://doi.org/10.1016/j.neucom.2019.08.110
16. http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

17. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in Proceedings
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR
2005, vol. I (2005), pp. 886–893. http://doi.org/10.1109/CVPR.2005.177
18. T. Ojala, M. Pietikäinen, T. Mäenpää, Gray scale and rotation invariant texture classification
with local binary patterns, in Lecture Notes in Computer Science (including Subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1842 (2000), pp. 404–
420
Design of Wideband Metamaterial
and Dielectric Resonator-Inspired Patch
Antenna

Ch. Manohar Kumar, V. A. Sankar Ponnapalli, T. Vinay Simha Reddy,


N. Swathi, and Undrakonda Jyothsna

Abstract A 30 × 30 × 1.6 mm³ circular patch antenna surrounded by a square ring for wideband applications is examined in this paper. The %BW of the proposed antenna is 40.1%; from 9.86 to 14.82 GHz, it covers the upper part of the X-band and the lower region of the Ku-band with S11 < −10 dB. Due to the insertion of a metamaterial at the bottom of the patch, a maximum gain of 4 dB is obtained at 11.405 GHz. Innovatively, a cylindrical resonator is placed above the antenna, giving good impedance bandwidth from 9.24 to 10.01 GHz and from 10.93 to 13.74 GHz, with gain greater than 4 dB at all resonating frequencies. The proposed antenna was verified experimentally.

Keywords Negative permittivity · Negative permeability · MTM · Radiation · SRR

1 Introduction

In the present scenario, the usage of wideband antennas is increasing continuously in radars, military systems [1], tracking and imaging [2], and high-data-rate applications [3, 4]. Researchers achieve wideband behavior using different techniques such as fractals for WPAN [5], defected ground structures [6], metamaterials [7, 8], and slots [9], and some researchers miniaturize antennas by using artificial elements such as metamaterials. These metamaterials have the unique properties of negative ε and negative μ. Using such structures, researchers have increased gain [10], miniaturized antennas [11], and obtained isotropic radiation patterns [12, 13].

Ch. M. Kumar (B) · N. Swathi


Gayatri Vidya Parishad College for Degree and PG Courses (A), Visakhapatnam, India
V. A. S. Ponnapalli
Sreyas Institute of Engineering and Technology, Hyderabad, India
T. V. S. Reddy
Malla Reddy University, Hyderabad, India
U. Jyothsna
GITAM Institute of Technology, Visakhapatnam, India


In this paper, initially, a circular patch with radius r1 = 3 mm and an outer square ring
with L = 16.5 mm, W = 20 mm is designed; it generates a triple band covering the C,
X, and Ku bands. After that, a new metamaterial structure containing joined split-type square
and circular rings is placed at the bottom of an FR4 epoxy substrate of t = 1.6 mm. Due to
the insertion of this innovative structure into the antenna, we obtain a narrow band in the S
region and a wide band in the X region and the lower part of the Ku band.
Finally, a cylindrical-shape resonator made of alumina ceramic with εr = 9.9 and
tan δ = 0.0001, with radius r = 7 mm or r = 9 mm, is inserted on top of the patch. Due to
this arrangement, the resonant bands shift slightly.

1.1 Antenna Structure

We propose three types of antennas. In Type 1, a square ring is joined to a circular
patch, as shown in Fig. 1a. In Type 2, an artificial −ε, −μ cell is placed at the bottom of the plane,
as represented in Fig. 1b. In Type 3, the metamaterial is added in the ground plane and a CDR
is added on the upper part of the patch, as shown in Fig. 1d, e.

Fig. 1 a Ant: 1 b Ant: 2 Top view c Ant: 2. Bottom view d E field of Ant: 3 e Dielectric resonator
antenna (Ant: 3) f Unit cell

Here, we used FR4 as the substrate. The dimensions of the unit cell are 16 × 16 mm, as
shown in Fig. 1f. Here, S1 = 4 mm, S2 = 2 mm, S3 = 2 mm, R1 = 6 mm, R2 = 5 mm,
R3 = 3 mm, R4 = 2 mm, L1 = 25 mm, L2 = 25 mm, L3 = 16 mm, L4 = 16 mm.

1.2 Results and Discussions

Ant: 1 Results
Ant: 1 has good S11 over three bands ranging from 5.65–7.55 GHz, 9.80–14.16 GHz, and
15.34–17.46 GHz. The lowest S11 values of −28.18 dB, −41.5 dB, and −25.24 dB occur at 6.51,
12.06, and 15.77 GHz, as shown in Fig. 2a. Ant: 1 has a VSWR of less than 1.68 over all resonating
bands, especially from 5.74–7.38 GHz, 9.98–14.03 GHz, and 15.45–16.17 GHz, showing
good VSWR as in Fig. 2b.
From Ant: 1, we obtain an isotropic radiation pattern with a gain of 0.3 dB at 6.415 GHz and
greater than 3.5 dB gain at 12.067 and 15.772 GHz, as shown in Fig. 3.
Fabricated Antenna Analysis (Ant: 2): The fabricated design has a very small size, as
shown in Fig. 4.
From the HFSS simulation tool, we obtain a wide band at 9.86–14.82 GHz. Experimentally,
we obtain a wide band from 10.72 to 14.20 GHz, with a slight difference from the simulation results,
as shown in Fig. 5.

Fig. 2 a S11 of Ant: 1. b VSWR of Ant: 1

Fig. 3 Radiation plot of Ant: 1



Fig. 4 Fabricated antenna

Fig. 5 Practical and simulated results of Ant: 2

From Fig. 6, Ant: 2 has gains of 2.4, 4.2, and 4 dB at 3.035, 6.68, and 11.405 GHz.

Fig. 6 Ant: 2 (RP)



Fig. 7 S11 of DRA (Ant: 3)

Dielectric Resonator Antenna
At a 7 mm diameter of the DRA, Ant: 3 resonates at 9.24–10.01 GHz and 10.93–13.74 GHz.
At a 9 mm diameter of the DRA, Ant: 3 has resonating regions at 8.78–9.60 GHz, 10.56–11.29 GHz,
and 12.45–14.73 GHz, as shown in Fig. 7.
After adding the DRA, we obtain a maximum gain of 5.2 dB at 14.79 GHz, 3.9 dB at
11.34 GHz, and 4.5 dB at 13.23 GHz, as shown in Fig. 8.

1.3 Unit Cell Analysis

The characteristics of the metamaterial are analyzed using the following equations [14]:

n = \frac{1}{k_0 d}\left\{\left[\ln\left(e^{ink_0 d}\right)\right]'' + 2m\pi - i\left[\ln\left(e^{ink_0 d}\right)\right]'\right\}   (1)

z = \pm\sqrt{\frac{(1 + S_{11})^2 - S_{21}^2}{(1 - S_{11})^2 - S_{21}^2}}   (2)

\mu = n z   (3)

\varepsilon = n / z   (4)

Here, n is the refractive index, z the impedance, μ the permeability, and ε the permittivity.
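As an illustrative aid (ours, not part of the original paper), the following Python sketch evaluates these retrieval equations numerically; the S-parameter arrays, the thickness d, and the branch index m are placeholder assumptions, and the intermediate expression for e^{ink_0 d} follows the standard transmission model.

import numpy as np

# Hypothetical retrieval sketch for Eqs. (1)-(4); all input values are placeholders.
c0 = 299_792_458.0                      # speed of light (m/s)
d = 1.6e-3                              # slab thickness, taken as the substrate height
f = np.array([10e9, 11e9, 12e9])        # example frequencies (Hz)
s11 = np.array([0.30 + 0.20j, 0.25 + 0.10j, 0.20 + 0.05j])  # placeholder S11
s21 = np.array([0.80 - 0.10j, 0.85 - 0.05j, 0.90 - 0.02j])  # placeholder S21

k0 = 2 * np.pi * f / c0                 # free-space wavenumber
z = np.sqrt(((1 + s11) ** 2 - s21 ** 2) / ((1 - s11) ** 2 - s21 ** 2))   # Eq. (2)
gamma = (z - 1) / (z + 1)               # slab reflection coefficient
e_inkd = s21 / (1 - s11 * gamma)        # exp(i n k0 d), standard transmission model
m = 0                                   # branch index of Eq. (1), thin-slab assumption
n = (np.angle(e_inkd) + 2 * m * np.pi - 1j * np.log(np.abs(e_inkd))) / (k0 * d)  # Eq. (1)
mu, eps = n * z, n / z                  # Eqs. (3) and (4)
print(mu, eps)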


The MTM exhibits −ε in the S, C, and Ku bands and −μ in the S, C, and Ku bands.
As shown in Table 1, the fabricated antenna has a wide band and a maximum gain of 4.2 dB.

Fig. 8 Radiation plot of Ant: 3

Table 1 Comparison table

References   Dimensions (W × L) mm²   Operating bands (GHz)   Technique         Max. gain (dBi)
[4]          40 × 20                  3–11                    T-shaped ground   5
[7]          39 × 25                  2.5–2.69, 3.3–3.69      ZOR               2.35
[8]          16 × 30                  2.2–4.6                 Metamaterial      2.39
Proposed     30 × 30                  9.86–14.82              Metamaterial      4.2

2 Conclusion

The main purpose of the design was to obtain a wideband antenna. Here, the wide bandwidth
is obtained by using ring-type structures. The fabricated antenna resonates in the X band and
the lower part of the Ku band. We obtained peak gains of 2.4, 4.2, and 4 dB at 3.035, 6.68, and
11.405 GHz. The average efficiency of the antenna was >90%. The structure was fabricated,
and the measurements were compared with the simulation results. In addition, the antenna performance
is investigated by adding a dielectric resonator, and we obtained a maximum gain of 5.2 dB at
14.79 GHz.

References

1. A. Lewis, Wideband antenna technology in military applications, in International Workshop


on Antenna Technology (iWAT ) (Lisbon, Portugal, 2010), pp. 1–4
2. M. Ammann, M. John, G. Ruvio, Ultra-wideband antennas, in Handbook of Antenna
Technologies (2010)
3. N. Prombutr, P. Kirawanich, P. Akkaraekthalin, Bandwidth enhancement of UWB microstrip
antenna with a modified ground plane. Int. J. Microw. Sci. Technol. 2(7) (2009)
4. H. Yang, X. Xi, Y. Zhao, Y. Tan, Y. Yuan, L. Wang, Compact slot antenna with enhanced band-
edge selectivity and switchable band-notched functions for UWB applications. IET Microw.
Antennas Propag. 13, 982–990 (2019)
5. S. Mohandoss, R.R. Thipparaju, B.N.B. Reddy, Fractal based ultra-wideband antenna develop-
ment for wireless personal area communication applications. AEU Int. J. Electron. Commun.
93, 95–102 (2018)
6. N. Kishore, A.P. Vijay, S. Tripathi, A reconfigurable ultra-wide band antenna with defected
ground structure for ITS application. AEU Int. J. Electron. Commun. 72, 201–215 (2017)
7. B.D. Bala, M.K.A. Rahim, N.A. Murad, Dual mode metamaterial antenna for wideband
applications. Microw. Opt. Technol. Lett. 56, 1846–1850 (2014)
8. A.Y. Iliyasu, Wideband metamaterial antenna design, in 2018 IEEE Asia-Pacific Conference
on Antennas and Propagation (APCAP) (Auckland, New Zealand, 2018), pp. 1–2
9. L. Han, C. Wang, W. Zhang, R. Ma, Q. Zeng, Design of frequency- and pattern-reconfigurable
wideband slot antenna. Int. J. Antennas Propag. 7 (2018)
10. M. Saravanan, V. Beslin Geo, S.M. Umarani, Gain enhancement of patch antenna integrated
with metamaterial inspired superstrate. J. Electr. Syst. Inf. Technol. 5(3), 263–270 (2018)
11. M.I. Singh, V.S. Tripathi, S. Tiwari, Dual-band microstrip patch antenna miniaturization using
metamaterial. J. Eng. 5 (2013)
12. M.Z.M. Zani, M.H. Jusoh, A.A. Sulaiman, N.H. Baba, R.A. Awang, M.F. Ain, Circular patch
antenna on metamaterial, in 2010 International Conference on Electronic Devices, Systems
and Applications (2010), pp. 313–316
13. C.M. Kumar, N. Kumar Muvvala, Multi band metamaterial inspired L type slot patch antenna, in
2020 International Conference on Smart Technologies in Computing, Electrical and Electronics
(ICSTCEE) (2020), pp. 34–38
14. C.M. Kumar, N. Kumar Muvvala, A compact ultra-wide band rhombus shaped fractal antenna
with metamaterial in the ground plane. Int. J. Eng. Adv. Technol. (IJEAT) 8(6) (2019). ISSN:
2249-8958
Basic Framework of Different
Steganography Techniques for Security
Applications

R. Chinna Rao, P. V. Y. Jayasree, S. Srinivasa Rao, G. Srinivasa Yeshwanth,


K. R. S. Megana, K. Shreya, and K. Suprasen

Abstract The main aim of this article is to develop a basic framework for different
steganography techniques in real-time applications. Today, as the utilization of the
Internet expands, security is gaining momentum. Steganography is a strategy for
concealing confidential data behind an innocent cover file so that its presence is
not suspected. The most effective method to secure data is to utilize the
idea of steganography, which is separated into a few types. In this
study report, we examine the main sorts of steganography; taking the
cover file as the initial step common to all classifications, we discuss text, image,
speech, and video steganography. Text steganography is a process that utilizes a
text cover; we cover line shifting, word shifting, the syntactic strategy, and the
selective hiding technique. Coming to speech steganography, it contains LSB
coding, phase coding, spread spectrum, echo hiding techniques, and speech
steganography utilizing fast Fourier transforms. The idea of video steganography is
utilized to hide information behind the frames of videos, and the combination of steganography
and cryptography security principles is described; the LSB technique is used to
embed the information. Video steganography can hide a large amount of information
in a simple and effective manner, and we consider a wide range of steganographic
techniques. The experimental evidence demonstrates the adequacy of
the hiding technique in speech steganography; thus, we are able to recover
confidential data with only slight quality degradation utilizing the steganographic
decoder concept.

Keywords Steganography · Text steganography · Image steganography · Speech
steganography · Video steganography · Spatial domain · LSB · JPEG · DFT ·
DCT · Line shifting · Word shifting · Echo hiding · Phase coding · Wavelet and
FFT using DWT · Spread spectrum and fast Fourier transforms

R. C. Rao · S. S. Rao · G. S. Yeshwanth (B) · K. R. S. Megana · K. Shreya · K. Suprasen


Department of Electronics and Communication Engineering, Malla Reddy College of Engineering
and Technology, Hyderabad, Telangana, India
P. V. Y. Jayasree
Department of EECE, GITAM University, Visakhapatnam, Andhra Pradesh, India
e-mail: jpappu@gitam.edu


1 Introduction

The name steganography is derived from the Greek words "stegos," meaning cover, and
"graphia," meaning writing, and is defined as "covered writing" [1]. Steganography is
elucidated as the art and science of embedding a secret message in a cover message, in such
a way that none other than the sender and the intended recipient suspects the secret
message in the cover data. There are many types of steganographic techniques. In this
paper, we will look closely at a few different types of steganographic techniques;
a simple classification representing the different techniques involved
in steganography is given in Fig. 1. In recent years, steganography has been playing a vital
role in popular social network communication channels such as Facebook, WhatsApp, and Twitter,
and it has larger significance in terms of security, privacy, and embedding capacity
parameters [2]. Its applications have also been extended to sharing medical
data, banking information, broadcasting, and military intelligence.
Steganography is a widespread hiding technique at present due to its distortion
handling capacity, imperceptibility, and enormous base to embed hidden
information. Steganography is classified into various types as shown in Fig. 1.

Fig. 1 The classification of steganography techniques

Fig. 2 Basic frameworks for steganography



2 Basic Frameworks for Steganography

All the steganographic approaches perform the same operation, as shown in Fig. 2.
The sender and receiver should follow some basic communication protocols before
performing the secret communication, which broadly relate to:
• Cover sources
• Embedding and extraction algorithms
• Stego key sources to drive the embedding/extraction algorithms
• Messages sources
• Selection of channel to exchange the information.
The cover sources are usually digital speech files with attributes such as speech
format, resolution, content type, and size. Secret communication is not possible
between sender and receiver if the cover object is completely formulated in advance [3]. Nowadays,
steganography makes some fundamental assumptions about the cover source
that facilitate conventional analysis, for instance, considering the cover source as a random
variable, which enables information-theoretic analysis. Here, we shall describe in
detail how steganography hides a message and how to recover the secret
message; the following flowchart depicts the system of steganographic techniques.
The flowchart describes, in an easy-to-understand way, the concept of
steganography, which hides a secret message in cover data. In the pictorial representation,
the functions f (x), f (m), and f (k) are merged into a single function f (x, m, k),
where f (x) denotes the cover file, f (m) signifies the secret message to
be embedded, and f (k) denotes the optional key, addressed as the stego-key,
which is used as a password to hide and un-hide the message. Presently, we shall
examine each step in detail, so that the reader gets a complete
overview of the idea of steganography and its strategies. Consider a cover file c(x)
without any secret data embedded into it. Now, we embed a secret message
s(m) into c(x) using a steganographic encoder, producing the "stego object"
f (x, m, k). The resulting file remains similar to the cover file c(x) used at the
initial step, and no changes are visible. To recover the code, the stego object is
sent to a "steganographic decoder," which extracts the secret data with only slight
degradation in quality.

3 Literature Survey

This section deals with study and implementation of various steganography


approaches.

3.1 Text Steganography

Text steganography is a technique which uses text as the cover. We will study some
basic steganographic techniques, namely the line shifting method, the word shifting method,
the syntactic method, and the selective hiding method.

3.1.1 Line Shifting Method

In this method, the data are hidden by using an alternate text layout in which a
text line is shifted slightly upward or downward to conceal the message [1, 2]. The
specific pattern used to create the cover text is produced by shifting lines. Bits such as
0, 1, and −1 can indicate unmoved, raised, and lowered lines [3, 4]. Deciding whether a
line is shifted up or down is done by measuring the centroid distance between the marked
line and its control lines [5]. If the text is re-typed or if optical character recognition
(OCR) is applied, the embedded information will be corrupted. Additionally, the
distances can be identified using special distance-measuring tools [6]. The
main advantage is that the method applies when the text is printed, and only the
receiver can extract the secret information. The disadvantage of this method is that
when OCR is applied, the hidden data are lost.

3.1.2 Word Shifting Method

In this method, the information is hidden by shifting the words horizontally, e.g., to the left
or right to represent 0 or 1, respectively. For this purpose, the document is coded
accordingly. This method is only applicable to texts with variable spaces
between adjacent words [6]. Flexible spaces in text documents are often used to
distribute white space when justifying text. To determine the message, we must have
the original message frame; it can be recovered using a bitmap [7]. The advantage is
that it cannot be recognized easily, because a change in the gap
between words looks natural, since extra spacing is used to fill out the line.
The disadvantage is that if anyone knows this method or the word shifting
algorithm, then the message is not safe; also, retyping or using OCR destroys the
stego message [8].

3.1.3 Selective Hiding Method

This method hides the characters in the first (or a particular position) character of the words.
Connecting those characters helps extract the content [9, 10]. The main advantage
of this method is that it features security and capacity, the needed aspects of
steganography that make it useful in the hidden exchange of information through
text documents and in establishing secret messages.

Capacity—Capacity is characterized as the ability of a cover medium to conceal secret
information. The capacity ratio is computed by dividing the amount of hidden data
by the size of the cover text in bytes:

\text{Capacity Ratio} = \frac{\text{Amount of hidden data (in bytes)}}{\text{Size of cover text (in bytes)}}   (1)

3.2 Image Steganography

This secret messaging technique is called steganography. The name is derived from
the Greek word "steganos," meaning "covered or secret writing." In modern
times, steganography can be viewed as the study of the art and science of communicating
in a way that conceals the existence of the communication in order to protect
information. Images are an excellent way to hide information because they offer a high
degree of redundancy, meaning that there are many bits available that provide greater
accuracy than is needed for the use (or display) of the object [11, 12]. The basic
framework of image steganography is presented in Fig. 3. Here, based on the embedding
approach, the steganographic strategies are divided into various domains as follows.

3.2.1 Compression

When working with high-resolution images with high color depth, the raw file size
can be large and hard to communicate over a standard Internet connection. To
address this, compressed image formats have been developed which, as you would guess,
compress the pixel details and keep file sizes extremely small, making transfer
possible [12].

Fig. 3 Image steganography techniques

3.2.2 Spatial Domain Technique

Spatial domain strategies embed the private message or payload directly in the pixel
intensities; that is, they update pixel data by replacing or inserting bits. Lossless
images are suited to these methods, as compression will not change the embedded
data. These methods need to know the image format so that the concealment leaves
no pointless evidence [13, 14].

3.2.3 LSB Substitution

This method converts the private message or payload into a bit stream and
inserts it into the LSB (8th bit) of some or all bytes within the image. Each change
alters the value of the least significant bit by at most ±1, which is hard for the human eye
to detect. When using a 24-bit image, a bit of each of the red, green, and blue components
can be replaced. With an intensity range of about 256 for each primary color, changing the LSB
of a pixel brings only a slight change in color intensity [15].
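As an illustrative sketch only (assuming an 8-bit NumPy image array; the function names are ours, not the paper's), LSB substitution can be written as follows:

import numpy as np

def embed_lsb(cover: np.ndarray, message_bits: list[int]) -> np.ndarray:
    """Embed a bit list into the least significant bits of a flattened image."""
    stego = cover.copy().ravel()
    if len(message_bits) > stego.size:
        raise ValueError("message does not fit in cover image")
    for i, bit in enumerate(message_bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set the message bit
    return stego.reshape(cover.shape)

def extract_lsb(stego: np.ndarray, n_bits: int) -> list[int]:
    """Read back the first n_bits least significant bits."""
    return [int(v & 1) for v in stego.ravel()[:n_bits]]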

3.2.4 JPEG Steganography

To understand how steganography works for JPEG files, we shall first understand
the concepts behind the question: "how are the raw data compressed by JPEG,
and how could we then hide data in it?"

3.2.5 JPEG Compression Using Image Steganography

According to research, the human eye is more sensitive to changes in the brightness
(luminance) of a pixel than to a change in color; we perceive light and color
relative to the surrounding areas. The compression phase uses this understanding
and converts the image from RGB color to the YCbCr representation, in which light is separated
from color. In the YCbCr representation, the Y component corresponds to luminance
(black to white), while the Cb (blue–yellow) and Cr (red–green) components carry the color. Now,
some color data are discarded by halving the sampling in both the horizontal and vertical directions,
thus directly reducing the data size by a factor of two.
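A compact sketch of this conversion and 2 × 2 chroma subsampling (our illustration, not from the paper, using the common BT.601 coefficients):

import numpy as np

def rgb_to_ycbcr(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 float RGB image to YCbCr (BT.601, full range)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)

def subsample_420(chroma: np.ndarray) -> np.ndarray:
    """Halve a chroma plane in both directions by averaging 2 x 2 blocks."""
    h, w = chroma.shape[0] // 2 * 2, chroma.shape[1] // 2 * 2
    c = chroma[:h, :w]
    return (c[0::2, 0::2] + c[0::2, 1::2] + c[1::2, 0::2] + c[1::2, 1::2]) / 4.0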

3.2.6 Discrete Fourier Transform

This transform is viewed as the main transformation used to perform Fourier analysis
in many working systems. The samples can be the pixel values along a row
or a column of a raster image in image processing [16].

3.2.7 Discrete Cosine Transform (DCT)

This transformation expresses a finite sequence of data points as a sum of cosine
functions oscillating at different frequencies. DCTs are significant in a variety of
engineering and scientific applications, for example, lossy audio compression such as MP3
files and images such as JPEG files, where small components can be discarded. In
fact, the use of cosine rather than sine functions is important for compression,
because only a small number of cosine terms are needed to approximate a typical signal [17].
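To make this concrete, a small sketch using SciPy's DCT routines (the 8 × 8 block and the coefficient-dropping rule below are our example assumptions, not the paper's method):

import numpy as np
from scipy.fft import dctn, idctn

# Assumed example: an 8 x 8 image block with values in [0, 255].
block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)

coeffs = dctn(block - 128.0, norm="ortho")    # forward 2-D DCT (level-shifted, JPEG-style)
coeffs[4:, :] = 0.0                           # crude "lossy" step: drop high frequencies
coeffs[:, 4:] = 0.0
approx = idctn(coeffs, norm="ortho") + 128.0  # inverse DCT gives a close approximation
print(np.abs(approx - block).mean())          # small error despite discarding most coefficients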

3.3 Speech Steganography

In speech steganography, a secret message is embedded in the speech signal, which
converts the corresponding speech document into a stego file that is still
classified as an ordinary speech file, i.e., the cover file c(x), as shown in Fig. 4.
Hiding private messages within a digital audio signal is a very complex process
compared to others such as image steganography. A few methods included
within speech steganography are:
1. Least significant bit encoding (LSB encoding)
2. Echo hiding
3. Wavelet and FFT using discrete wavelet transforms
4. Speech steganography using the SS (spread spectrum) technique.

Fig. 4 Basic speech steganography



Fig. 5 LSB encoding, step 1

Fig. 6 LSB encoding process

3.3.1 LSB Encoding

The LSB algorithm is an easy way to insert data into a digital audio file [12]. Sampling
followed by quantization converts the analog audio signal into a digital binary sequence;
for each computer-generated audio file, the binary equivalent of the message replaces the
LSBs. We insert one bit of the binary message into the LSB of each sample point [18]. The
great advantage of using the LSB algorithm is that it allows a large amount of data
to be embedded in a message file; the transfer rate is about 1 kbps per 1 kHz of
sampling rate. In other words, we can simply describe the LSB algorithm as the classic
stego method used to hide the presence of private information within a public cover
(or message) file. For example, the decimal number 493 is represented in binary as
111101101. We read the bits starting from the right and going toward the left; as shown
in Fig. 5, the LSB in this case is 1. In this concept, the LSB algorithm replaces the LSB
of each byte in the "carrier" data with a bit from the "secret" message.
The sender likewise performs the process of embedding the secret message into the
carrier file byte by byte. The receiver performs the extraction procedure by
reading the LSB of each received data byte. So, how does a receiver recover
the secret message? The flow diagrams in Figs. 6 and 7 deliver a simple way to
comprehend the embedding and extraction of the secret message.
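A minimal embedding/extraction sketch (ours, not the paper's; it assumes 16-bit PCM samples in a NumPy int16 array):

import numpy as np

def text_to_bits(text: str) -> list[int]:
    """Turn a message string into a flat list of bits."""
    return [int(b) for ch in text.encode() for b in f"{ch:08b}"]

def embed(samples: np.ndarray, message: str) -> np.ndarray:
    """Sender side: write one message bit into the LSB of each sample."""
    bits = text_to_bits(message)
    stego = samples.copy()
    stego[:len(bits)] = (stego[:len(bits)] & ~1) | np.array(bits, dtype=samples.dtype)
    return stego

def extract(stego: np.ndarray, n_chars: int) -> str:
    """Receiver side: read the LSBs back and regroup them into bytes."""
    bits = stego[:8 * n_chars] & 1
    data = bytes(int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8))
    return data.decode()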

3.3.2 Echo Hiding

Fig. 7 LSB extraction process

In this concept, the confidential message is inserted into the baseband signal as an
"echo." We consider three parameters of the echo relative to the baseband signal,
namely amplitude, decay rate, and offset from the original signal. These parameters are varied
to represent the encoded secret binary message and are set below the threshold of the human
auditory system (HAS), so that the echo cannot easily be resolved. The original cover
video consists of frames represented by C_k(m, n), where N is the total number of
frames and m, n are the row and column indices of the pixels. The binary secret message
denoted by M_k(m, n) is embedded into the cover video by modulating it into a
signal. The stego-video signal is represented as

S(m, n) = C_k(m, n) + a_k(m, n)\, M_k(m, n), \quad k = 1, 2, 3, \ldots, N   (2)

Here, a_k(m, n) is a scaling factor. For simplicity, a_k(m, n) can be considered
constant over all pixels and frames, and the equation becomes

S(m, n) = C_k(m, n) + a\, M_k(m, n), \quad k = 1, 2, 3, \ldots, N   (3)
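A one-line sketch of this modulation step (our illustration; the frame arrays and the constant scaling factor a are assumptions):

import numpy as np

def embed_frames(cover: np.ndarray, message: np.ndarray, a: float = 4.0) -> np.ndarray:
    """Eq. (3) as code: S = C_k + a * M_k over an N x H x W stack of frames.

    cover   -- frames C_k(m, n)
    message -- binary message frames M_k(m, n) with values in {0, 1}
    """
    return cover.astype(float) + a * message.astype(float)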

3.3.3 Wavelet and FFT Using Discrete Wavelet Transformations

"Wavelet transformation can be considered as converting the signal from the time domain to the
wavelet domain." This new domain contains more complex basis functions called wavelets,
mother wavelets, or analyzing wavelets [5]. Wavelet analysis permits
the division of the message signal into two corresponding components (or symbols). This process
is called "decomposition." Components at the edges of the signal are mostly contained in the high-frequency
section, and the message signal is passed through a series of high-pass
filters to analyze these high frequencies; filters with various cutoffs are used to
analyze the signal at different resolutions [6]. The DWT process involves choosing
positions and scales based on powers of two, called dyadic scales and
positions. The mother wavelet is rescaled by powers of two and translated by integer values.
Specifically, for a square-integrable function f(t) ∈ L²(R), the function ψ(t)
is known as the mother wavelet, while the function φ(t) is known as the scaling function.
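A short decomposition example using the PyWavelets library (the library choice and the Daubechies-4 wavelet are our assumptions; the paper does not name a tool):

import numpy as np
import pywt

signal = np.random.default_rng(1).normal(size=1024)  # stand-in for a speech frame

# One level of DWT splits the signal into approximation (low-pass)
# and detail (high-pass) coefficients -- the "decomposition" step above.
approx, detail = pywt.dwt(signal, "db4")

reconstructed = pywt.idwt(approx, detail, "db4")     # inverse DWT rebuilds the signal
print(np.allclose(signal, reconstructed))            # True: perfect reconstruction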

Speech Steganography Using the Spread Spectrum Technique—Spread spectrum
techniques are widely utilized in electronic communications such as CDMA mobile
communication. The first patent was filed by Hedy Lamarr and George Antheil
in the year 1941 for providing secret communication for military purposes. Spread
spectrum techniques are strategies by which energy generated in particular bandwidths
is deliberately spread in the frequency domain, resulting in a signal
with a wider bandwidth [19].

3.4 Video Steganography

The concept of video steganography is employed to hide information behind the
frames of videos. First, the information is encrypted using a cryptographic algorithm;
next, the encrypted data are embedded into the frames of the video. The technique used
to embed the information is LSB coding. The video steganography technique can
hide a great deal of data in a simple and efficient way, as shown in Fig. 8.
Video steganography uses the combination of both cryptography and steganography for
hiding the secret data behind the video clips; therefore, this is a double security
system. The secret data are first encrypted, and then the encrypted data are hidden behind
the frames of the video. Using a key which is known only to the sender and receiver, the
sender encrypts the information and sends it to the destination, where the receiver
decrypts the information using the identical key. Nobody can easily detect the secret
information hidden behind the video. The change in size between the original video
and the encrypted video indicates that information is hidden behind the video [20].

Fig. 8 Basic video steganography
Here, we have the hash-based least significant bit technique for video steganography,
which inserts the bits of a text file into the least significant bit
positions of the RGB pixels of the video as per a hash function. In this way, it includes
encoding and decoding processes for hiding the message and extracting the message, respectively.
First of all, the text is embedded within the video by using the steganographic tool; this
stego video file is then applied to the steganographic tool again to decode the embedded data
[21].
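An illustrative sketch of deriving embedding positions from a keyed hash (the scheme below is our simplification, not necessarily the exact hash function of [21]):

import hashlib
import numpy as np

def hash_positions(key: str, frame_shape: tuple, n_bits: int) -> np.ndarray:
    """Derive pseudo-random pixel indices for LSB embedding from a keyed hash."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    total = int(np.prod(frame_shape))
    return rng.choice(total, size=n_bits, replace=False)

def embed_in_frame(frame: np.ndarray, bits: list[int], key: str) -> np.ndarray:
    """Write message bits into hash-selected LSB positions of one RGB frame."""
    stego = frame.copy().ravel()
    for pos, bit in zip(hash_positions(key, frame.shape, len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return stego.reshape(frame.shape)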

3.4.1 Video Steganography Methods

There are various steganographic methods proposed in the literature. A secured
hash-based LSB technique for image steganography has been implemented [22]. The
basic requirement of hiding data in a cover file is explained in [23]. A technique
of data hiding for high-resolution video is proposed in [24]. Hiding data using the motion
vector technique for moving objects is introduced in [25]; here, compressed
video is used for the data transmission, since it can hold a large volume of data. A
stego machine to develop a steganographic application that hides data containing text in
a computer video file and retrieves the hidden information is designed in [26]. A robust
method of imperceptible audio, video, text, and image hiding is proposed in [27]. The
most secure and robust algorithms are introduced in [28]. An improved least significant
bit (LSB)-based steganography technique for images imparting better information
security is presented in [29]. A new compressed video steganographic scheme, in which the data
are hidden in the horizontal and vertical components of the motion vectors, is
proposed in [30]. There is a system for data hiding that uses AES encryption for generating
the secret hash function or key [31]. On this basis, a hash-based least significant bit (LSB)
technique has been proposed.

4 Conclusion

We have successfully developed a basic framework of different steganography techniques,
and the paper presented the different techniques, namely text, image, speech,
and video steganography, and their real-time implementations. We have
given four types of study information in steganography: text, image, speech, and
video steganography. This concept provides an additional layer of protection
and reduces the possibility of revealing secret information from a cover image.
Digital watermark technology is currently being used to track copyrights and patents
for digital content. With the advancement of technology, it is expected that more
efficient and more advanced methods will be developed in steganalysis
that will help law enforcement to better detect illegal Internet communications.

References

1. M.A. Saleh, Image steganography techniques—a review. Int. J. Adv. Res. Comput. Commun.
Eng. 7(9), 52–58 (2018). https://doi.org/10.17148/IJARCCE.2018.7910
2. S.H. Low, N.F. Maxemchuk, J.T. Brassil, L.O. Gorman, Document marking and identification
using both line and word shifting, in INFOCOM’95 Proceedings of the Fourteen Annual Joint
Conference of the IEEE Computer and Communication Societies (1995), pp. 853–860
3. S. Bhattacharyya, I. Banerjee, G. Sanyal, Data hiding through multi level steganography and
SSCE. J. Glob. Res. Comput. Sci. J. Sci. 2(2), 38–47 (2011). ISSN: 2229-371x
4. J.T. Brassil, S. Low, N.F. Maxemchuk, L. O’Gorman, Electronic marking and identification
techniques to discourage document copying. IEEE J. Sel. Areas Commun. 13(8), 1495–1504
(1995)
5. M.S. Shahreza, M.H.S. Shahreza, Text steganography in SMS, in 2007 International Confer-
ence on Convergence Information Technology (2007), pp. 2260–2265
6. W. Bender, D. Gruhl, N. Morimoto, A. Lu, Techniques for data hiding. IBM Syst. J. 35, 313–336
(1996)
7. F.A.P. Petitcolas, R.J. Anderson, M.G. Kuhn, Information hiding—a survey. Proc. IEEE 87,
1062–1078 (1999)
8. L.Y. Por, B. Delina, Information hiding—a new approach in text steganography, in 7th WSEAS
International Conference on Applied Computer and Applied Computational Science (2008),
pp. 689–695
9. L.Y. Por, T.F. Ang, B. Delina, WhiteSteg—a new scheme in information hiding using text
steganography. WSEAS Trans. Comput. 7(6), 735–745 (2008)
10. S. Changder, D. Ghosh, N.C. Debnath, Linguistic approach for text steganography through
Indian text, in 2010 2nd International Conference on Computer Technology and Development
(2010), pp. 318–322
11. S. Kurane, H. Harke, S. Kulkarni, Text and audio data hiding using LSB and DCT a review
approach, in National Conference on “Internet Things Towar a Smart Future” Recent Trends
in Electronics and Communication (2016)
12. E. Nandhini, M. Nivetha, S. Nirmala, R. Poornima, MLSB technique based 3D image
steganography using AES algorithm. J. Recent Res. Eng. Technol. 3(1), 2936 (2016)
13. J. Kour, D. Verma, Steganography techniques—a review paper. Int. J. Emerg. Res. Manag.
Technol. 9359(35), 2278–9359 (2014)
14. E.R. Harold, What is an Image (2006)
15. M. Hussain, M. Hussain, A survey of image steganography techniques. Int. J. Adv. Sci. Technol.
54, 113–124 (2013)
16. N. Hamid, R.B. Ahmad, Image Steganography Techniques: An Overview, vol. 6 (2012),
pp. 168–187
17. G.N. Kumar, V.S.K. Reddy, Extraction of key frames using rough set theory for video retrieval,
in International Conference on Soft Computing and Signal Processing (Springer, Singapore,
2019), pp. 761–768
18. C.A. PetrKlapetek, D. Nečas, Wavelet transform [Online] (2016)
19. S. Khosarvi, M.A. Dezfoli, M.H. Yektaie, A new steganography method based HIOP algorithm
and Strassen’s matrix multiplication. J. Glob. Res. Comput. Sci. 2(1) (2011)
20. G.J. Simmons, The prisoners’ problem and the subliminal channel, in Proceedings of Advances
in Cryptology (CRYPTO’83), pp. 51–67. J.F. Berglund, K.H. Hofmann, Compact Semitopo-
logicalsemigroups and Weakly Almost Periodic Functions. Lecture Notes in Mathematics, vol.
42 (Springer, Berlin, New York, 1967)
21. B. Dunbar, A Detailed look at Steganographic Techniques and Their use in an Open System
Environment (SANS Institute InfoSec Reading Room, 2002)
22. B. Lin, B. Nguyen, E.T. Olsen, in Orthogonal Wavelets and Signal Processing, ed. by P.M.
Clarkson, H. Stark. Signal Processing Methods for Audio, Images and Telecommunications
(Academic, London, 1995), pp. 1–70

23. S. Mallat, A Wavelet Tour of Signal Processing (Academic, San Diego, CA, 1998)
24. A. Spanias, T. Painter, V. Atti, Audio Signal Processing and Coding (Wiley-Interscience
Publication, USA, 2006). ISBN 978-0-471-79147-8
25. A. Kumar, R. Sharma, A secure image steganography based on RSA algorithm and hash LSB
technique. Int. J. Adv. Res. Comput. Sci. (2013)
26. K. Rabah, Steganography the art of hiding. Inf. Technol. J. 3(3), 245–269 (2004)
27. A.K. Bhaumik, M. Choi, R.J. Robles, M.O. Balitanas, Data hiding in video. Int. J. Database
Theory Appl. 2(2), 9–16 (2009)
28. P. Paulpandi, T. Meyyappan, Hiding messages using motion vector technique in video
steganography. Int. J. Eng. Trends Technol. 3(3), 361–365 (2012)
29. M. Ramalingam, Stego machine video steganography using modified LSB algorithm. World
Acad. Sci. Eng. Technol. 50, 497–500 (2011)
30. P. Bhautmage, A. Jeyakumar, A. Dahatonde, Advanced video steganography algorithm. Int. J.
Eng. Res. Appl. (IJERA) 3(1), 1641–1644 (2013)
31. G.S. Naveen Kumar, V.S.K. Reddy, An efficient approach for video retrieval by spatio-temporal
features. Int. J. Knowl. Based Intell. Eng. Syst. 23(4), 311–316 (2019)
Call Admission Control for Interactive
Multimedia Applications in 4G Networks

Kirti Keshav, Ashish Kumar Pradhan, T. Srinivas, and Pallapa Venkataram

Abstract Virtual Reality (VR) belongs to emerging interactive multimedia applica-


tions which have strict requirements from underlying 4G communication networks
such as LTE advanced networks. These applications have critical throughput and
latency needs. They need consistent throughput even at the cell edge and have strict
requirements for both uplink and downlink. As the number of users increases and
system load increases, present-day 4G Network faces a lack of resource availability.
This work proposes an adaptive call admission control scheme so that the overall
necessary resource allocation considering user feedback and history for interactive
applications can happen. The scheme uses mobile agents to gather the resource
requirement of the applications from the various 4G network elements such as VR
server, evolved nodeB, Packet Gateway, and various customer sites. The scheme
further uses this information to selectively offload non-interactive traffic to alternate
available links, whenever such a link is available, to meet the traffic requirements of
interactive applications. The developed scheme is based on an adaptive threshold and
has been tested on 4G networks with several virtual reality users along with other
background application loads. Results indicate that the system is practically implementable
and that the proposed scheme can improve the quality of service for interactive
applications over 4G networks.

Keywords Call admission control · Interactive multimedia · 4G networks

K. Keshav (B) · A. K. Pradhan


Samsung Semiconductor India Research, Bengaluru, Karnataka, India
e-mail: kirtik@samsung.com
A. K. Pradhan
e-mail: ak.pradhan@samsung.com
T. Srinivas · P. Venkataram
Indian Institute of Science, Bengaluru, Karnataka, India
e-mail: tsrinu@iisc.ac.in
P. Venkataram
e-mail: pallapa@iisc.ac.in


1 Introduction

Virtual Reality (VR) and Augmented Reality are upcoming emerging interactive
multimedia applications. These applications are useful in gaming, remote health
care applications especially during pandemic times, autonomous transport, remote
educational training, tactile internet, and industrial automation. However, these appli-
cations face challenges over 4G LTE cellular networks. This is because VR applica-
tions have significant throughput requirements for both uplink and downlink. Firstly
for both uplink and downlink, more capacity is required. However, the requirement is
asymmetric. Secondly, if network can achieve low latency, it helps for an immersive
experience. For latency, VR application’s uplink latency needs are more stringent as
compared to downlink. The third key requirement is to have a consistent experience
of full immersion everywhere, which requires consistent throughput even at the cell
edge.
Call admission control is an essential tool in wireless networks to ensure the
quality of experience (QoE) for various applications by ensuring the right amount
of load. Call admission control is usually done based on service level agreement.
It controls the incoming flow so that the network can ensure QoS for existing and new
flows. A robust call admission control scheme is sought to address the traffic demand
for interactive multimedia applications.
This research work aims to design solutions to the critical issues in 4G based
integrated multimedia networks for interactive applications such as interactive vir-
tual reality applications. To solve these issues in the cellular network, use of agent
technology is proposed. Agents can make decisions by themselves on behalf of the
user, migrate from node to node, and dynamically resolve issues occurring at vari-
ous network elements. Thus, a mobile agent-based framework provides an efficient
solution to solve multimedia network issues.
The organization of the rest of the paper is as follows. In Sect. 2, existing works for
handling call admission issues arising while handling interactive multimedia are discussed.
Further, in Sect. 3, call admission control issues in 4G networks are discussed.
In Sect. 4, we propose an agent-based call admission control scheme which effectively
takes care of the quality of service needs of interactive VR applications. In Sect. 5, the
analytical model used to model the proposed scheme is presented. The simulation environment
is described in Sect. 6, and results are discussed in Sect. 7. The conclusion is presented
in Sect. 8, and finally, future works are presented in Sect. 9.

2 Existing Work

Call admission control is an important tool to meet desired QoS for users in 4G
LTE cellular networks. There have been several important works which address call
admission control issues in LTE networks. In [12], author proposes call admission
control to improve parameters such as packet delay and packet dropping as well as

to improve call dropping probability. In [9], to take care of packet-level and call level
QoS considerations across the network, i.e., both wired and wireless part of network,
two-tier call admission control scheme for 4G networks is presented. In [5], the
author presents a Markov chain-based performance model for dynamic call admission
control scheme in cellular networks. In [2] a portion of bandwidth used by admitted
non-real-time traffic is released with the aim of reduction of call dropping probability
by reducing handover call drop probability as well as to increase bandwidth utilization
to provide high QoS to real-time traffic. In [3], considering macro and small cells,
a call admission control mechanism is proposed. In [7], to avoid the starvation of
best effort traffic, a novel CAC scheme to provide effective use of network resources
is presented; however, here author do not discuss interactive application needs. The
involving an adaptive threshold value, which adapts the network resources under
heavy traffic intensity is presented. In [10], a channel borrowing CAC scheme in
two-tier LTE/LTE-Advance networks. Autonomous mobile agents have been used in
networks to solve various issues. A mobile agent consists of code, state, and attributes
and therefore mobile agents allow all network components to become intelligent
[14]. Few techniques on how best the base stations can allocate resources in the 5G
network are investigated in [11]. In [6] an overview of the security issues related to the
mobile agent based systems is presented. In [13], the author proposes a call admission
control scheme designed primarily with minimal energy consumption in mind and to ensure an
acceptable quality of experience for application requests. In [1], call
admission control is addressed for a heterogeneous cell, which involves both small
and large cells. New handover schemes which take care of movement across the
large cell and small cell are proposed. A Markov chain technique is used to calculate
call blocking probability of various subscriber requests as they move across different
types of cells. In [8], several schemes such as complete sharing and probabilistic
threshold policy are used for call admission control across 4G/5G networks. In [4],
a Markov Model of Slice Admission Control is presented.

3 Call Admission Control Issues in 4G Networks

The 4G network provides a mechanism for call admission control access class mech-
anisms, which can be controlled using Allocation and Retention Priority (ARP)
parameters. Each bearer in the 4G network has associated Priority Level (PL). In
4G networks, fifteen levels of priority mechanism are provided. However, given the
variety of types of applications which have evolved in recent years, even 15 level
of priorities in 4G networks are not sufficient. Another crucial issue is handling call
admission control across multiple wireless access technologies like cellular and WiFi
based on its location. Not only handovers within LTE network, but also handovers
from other technologies should also be accommodated in call admission control.
Therefore, 4G networks need to adapt their call admission control mechanisms for

interactive multimedia applications. There are many approaches like guard channel,
fractional guard channel, mobility based, and price based mechanism, which guide
the application’s admission control in 4G networks. 4G networks provide means to
allow call admission control based on different types of users, however, consider-
ing the latency needs of interactive multimedia applications like VR as an essential
parameter, call admission control mechanisms are required to be redesigned.

4 Proposed Scheme

In Fig. 1, architecture of proposed agent based call admission control scheme using
mobile agents is presented. Static agents are deployed in 4G cells, 4G home subscriber
stations, and media servers. Based on need, mobile agents will be deployed on other
key components of the 4G network environment, for example, in packet gateway,
serving gateway, and UEs.
The proposed agent based call admission control scheme has three essential steps.
In the first step, the call admission policy is decided. For this static agent in eNB
dispatches the mobile agent to entities involved in the call admission control mech-
anism in the 4G network. Initially, when the system is not under high load, all types
of traffic connections like interactive VR and non-interactive VR are accepted.

Fig. 1 Architecture of agent based call admission control scheme over 4G

The policy will define the present value of the admission threshold, which should be
used. Algorithms 1, 2, and 3 provide the steps of the proposed agent-based call admission
control scheme; a simple sketch of the combined decision logic is given below, before the
formal listings. In Algorithm 1, the call admission policy for 4G networks is derived
based on the usage of network resources, the history of past application performance, and radio-level
measurements during previous sessions. Accordingly, the call admission threshold
is increased or decreased to prioritize admission control for interactive applications.
In Algorithm 2, bearer allocation and the exact admission of interactive applications are
decided, including the dropping of non-VR applications if necessary. In Algorithm 3, a
scheme is proposed to adapt the call admission control policy by changing the admission
threshold based on implicit user feedback.
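The following hypothetical Python sketch condenses the decision logic of Algorithms 1, 2, and 3; the function names, label strings, and threshold bounds are our illustrative assumptions, not part of the paper.

# Hypothetical condensation of Algorithms 1-3 (names and values are ours).
def admit(session_type: str, active_sessions: int, threshold: int,
          capacity: int, alternate_link_free: bool) -> str:
    """Decide how to handle a new session request (Algorithm 2 idea)."""
    if active_sessions >= capacity:
        return "reject"                      # no resources at all
    if session_type == "interactive_vr":
        return "admit_dedicated_bearer"      # interactive VR is prioritized
    if active_sessions < threshold:
        return "admit_default_bearer"        # light load: accept everything
    if alternate_link_free:
        return "offload_to_alternate_link"   # above threshold: divert non-VR traffic
    return "drop"                            # overload and no alternate path

def adapt_threshold(threshold: int, feedback_good: bool,
                    lo: int = 2, hi: int = 20) -> int:
    """Algorithm 1/3 idea: allow more non-interactive calls when QoE is good."""
    return min(hi, threshold + 1) if feedback_good else max(lo, threshold - 1)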

Algorithm 1 Dispatch of mobile agent and deciding on dynamic call admission


policy for VR
1: Begin
2: Input: History of previous VR requests’s performance
3: Output: Derivation of call admission policy based on history
4: Static agent at eNB dispatches a mobile agent to S-GW, P-GW, PCRF
5: Mobile agents are deployed and together agents monitors the performance
6: while New session arrives do
7: The usage of resources across network elements is checked
8: Application performance and radio level measurements during previous sessions are analyzed

9: Agents monitor user behavior like user aborting the session in between due to poor quality
10: if Quality is poor during the previous session then
11: MA will update PCRF and PCEF will be updated with new policy
12: Call admission threshold is increased to not to allow non-interactive calls
13: else if User feedback (collected in a non-intrusive way) is satisfactory then
14: Call admission threshold is adapted to allow more non-interactive calls
15: eNB static agent, mobile agents across network elements P-GW and S-GW are updated
with new policy
16: end if
17: end while
18: End

5 Analytical Model of Bandwidth Allocation and Monitoring

An analytical model with an M/M/1 system with the extra capability to accept only
interactive calls once system overload is detected has been proposed. In the beginning,
both interactive and non-interactive application requests are accepted as they
arrive. When both interactive and background applications go beyond a threshold,
the applications' delay will also be very high. Specifically, when the total incoming
request rate becomes high, a static agent in eNB, in consultation with S-GW

Algorithm 2 Call admission control for interactive VR application in 4G


1: Begin
2: Input: Monitor incoming request by application
3: Output: Decide to admit the application and allocate bearer
4: Static agent at eNB and P-GW monitor available resources in air interface and backhaul
5: Mobile agent at UE and S-GW monitor resources usage in air interface and core network
6: while Check periodically about current load do
7: User starts interactive VR application
8: if Check if available resources is sufficient for interactive VR application then
9: Admit new interactive request and dedicated bearer is allocated
10: else if Check if available bandwidth is below the threshold arrived in previous step then
11: Allocate bandwidth via alternate link for non interactive application requests
12: Accept only interactive VR across the main data path
13: else if alternate link is not available then
14: Drop non VR load
15: end if
16: end while
17: End

Algorithm 3 Periodic adaptation of threshold of call admission policy


1: Begin
2: Input: Application and key network performance data is collected by mobile agent/static
agent
3: Output: Adapt call admission policy based on history
4: Static agent dispatches mobile agents to new network elements based on user mobility and
monitors the performance
5: while Check periodically about usage do
6: Application performance and radio level measurements are collected
7: User’s feedback is collected using an automated way, for example, a user is still checking
screen
8: if User’s feedback is poor and BER performance was not good during the previous session
then
9: Policy database is updated with the new policy
10: else if User feedback is good then
11: Policy is updated, and future resources are given based on an improved policy decision
12: Threshold is adjusted to allow more non interactive applications
13: else if History based learning predicts future increased need then
14: Admission threshold is adjusted allow less non interactive applications
15: end if
16: end while
17: End

and P-GW’s mobile agent, decides on the call admission threshold. The threshold
depends on the present capacity available across these nodes and the requirement
of interactive VR applications. A mobile agent-based system can then decide about
this threshold value as per the requirement of VR applications before the actual overload
happens. So, beyond the threshold, only interactive applications are accepted on the
path across eNB, S-GW, and P-GW, and non-interactive applications are redirected

Fig. 2 Analytical queuing model for call admission control

Fig. 3 State diagram for call admission control queuing model

through other available links. Other available links could be through a device to
device link available in a 4G network or other radio access technology.
Figure 2 describes the analytical queuing model used for call admission control.
Figure 3 describes the state diagram for the proposed call admission control model. It is
assumed that each session of the same type of traffic has the same specific requirements
in terms of the number of slots occupied. So, each session of interactive and
non-interactive traffic takes one slot. The scheduling period in LTE is one subframe.
Let the maximum number of non-interactive sessions be N_N and the number of
interactive sessions be N_I. Let the total capacity be C, with N_I + N_N < C. It is assumed
that interactive application arrivals follow a Poisson process with a rate of λ_I;
similarly, non-interactive application arrivals follow a Poisson process with
a rate of λ_N.
Let N be the threshold on the total number of requests, inclusive of VR and
non-VR requests, beyond which arriving non-VR requests are offloaded
to a nearby link, such as a device-to-device link, as shown in Fig. 1. Please note that N ≤ C.
In Fig. 3, the state diagram for the proposed call admission control model is presented.
p_k defines the probability that there are k calls (data sessions) in the system.

\lambda\, p_k = \mu\, p_{k+1}, \quad 0 \le k < N   (1)

Let p be the fraction of the load of the VR application request rate as compared to the
non-VR application request rate.

p\, \lambda\, p_k = \mu\, p_{k+1}, \quad k \ge N   (2)

So, with \rho = \lambda / \mu,

p_k = \rho^k\, p_0, \quad 0 \le k < N   (3)

and

p_{N+k} = \rho^N (p\rho)^k\, p_0, \quad k \ge 0   (4)

From the normalization condition, \sum_{k=0}^{\infty} p_k = 1. Now an important measure is P_{rej}, which
defines how much of the non-VR application request rate is not accepted and is offloaded
to nearby links:

P_{rej} = \sum_{k=0}^{\infty} p_{N+k} = \frac{p_0\, \rho^N}{1 - p\rho}   (5)

Delay_I and Delay_N are the mean delays for the VR request rate and non-VR
request rate classes of traffic. Using PASTA, we can calculate the delays as below:

\mathrm{Delay}_N = p_0 \left[\frac{1 - \rho^{N+1} - (N+1)\rho^N (1-\rho)}{(1-\rho)^2}\right] \frac{1}{\mu}   (6)

\mathrm{Delay}_I = p_0 \left[\frac{1 - \rho^{N+1} - (N+1)\rho^N (1-\rho)}{(1-\rho)^2} + \frac{\rho^N N}{1 - p\rho} + \frac{\rho^N}{(1 - p\rho)^2}\right] \frac{1}{\mu}   (7)
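As a numeric illustration only (the values of λ, μ, p, and N below are assumed examples, not taken from the paper), Eqs. (1)–(7) can be evaluated directly in Python:

# Evaluate the threshold queuing model of Eqs. (1)-(7).
# lam, mu, p, N are assumed example values, not the paper's.
lam, mu, p, N = 8.0, 10.0, 0.4, 12
rho = lam / mu
assert p * rho < 1, "stability requires p * rho < 1"

# Normalization: sum rho^k for k < N plus the geometric tail rho^N * (p*rho)^k.
p0 = 1.0 / (sum(rho ** k for k in range(N)) + rho ** N / (1 - p * rho))

P_rej = p0 * rho ** N / (1 - p * rho)                     # Eq. (5)

common = (1 - rho ** (N + 1) - (N + 1) * rho ** N * (1 - rho)) / (1 - rho) ** 2
delay_N = p0 * common / mu                                # Eq. (6)
delay_I = p0 * (common + rho ** N * N / (1 - p * rho)
                + rho ** N / (1 - p * rho) ** 2) / mu     # Eq. (7)
print(f"P_rej={P_rej:.4f}  Delay_N={delay_N:.4f}  Delay_I={delay_I:.4f}")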

6 Simulation Environment

In the simulation, we have considered an LTE-Advanced network, wherein the network
spans a residential society. A virtual reality server has been installed near the
base station, acting as an edge server, and there is also a main server that
lies in the Internet domain. The VR application has been activated on the server, and it
is accessible to all 100 UE nodes, which are spread across the network. Bearer
allocation follows a definite rule: non-interactive VR applications have only a default
bearer, while VR users have a dedicated bearer and a default bearer.

7 Results

For 4G systems, it is observed that since 4G capacities are not sufficient for high-end
VR applications, the system becomes overloaded. Now, in the overloaded system, when
both interactive VR requests and non-interactive requests arrive, non-VR requests are
offloaded at an adaptive rate to the alternative links available. If the alternate link
is not available, these non-VR applications can wait for some time, and if resources
are still not available, then the non-interactive requests are dropped.

Fig. 4 Delay comparison and offload percentage in 4G
The delay performance of the interactive and non-interactive applications with
respect to increasing arrival rates is shown in Fig. 4a. Here, we see that as the number
of users increases, interactive VR applications face the least delay because of
the proposed mobile agent-based scheme; however, non-interactive VR applications experience
higher delay, as expected. Further, the amount of traffic belonging to non-interactive
applications being offloaded has been studied. As the incoming traffic load increases, a
certain portion of non-interactive traffic is offloaded, and as the arrival rate
increases, as shown in Fig. 4b, the number of offloaded non-interactive application sessions
also increases. Together, the two graphs in Fig. 4a, b show that appropriately
prioritizing interactive VR applications beyond a specific load level, and prioritizing
resources for interactive multimedia across the 4G data path (eNodeB, S-GW, and
P-GW), leads to the desired performance for interactive VR applications.

8 Conclusion

4G LTE networks are resource constrained, especially when there is a large number of
users in an area. When there is an increased number of interactive VR users and non-interactive
VR users, the system is overloaded. Whenever an overload condition exists, the
scheme of offloading non-VR traffic can allow the system to provide preferential call
admission control to VR users. The proposed scheme provides improved performance
in 4G networks, as interactive VR requests incur less delay as compared to non-VR
requests. For dynamic call admission control, the rate at which non-VR traffic waits
adapts to the increased load to maintain overall system stability.

9 Future Works

4G networks do not provide a scalable solution for meeting needs such as throughput
or low delay for latency-dependent interactive applications when there is a large
number of users; indeed, most 4G networks across the world are usually overloaded. The call
admission control mechanism needs to be extended across multiple types of networks
so that smooth VR performance is achieved when users move across a non-compatible
set of technologies. 5G is an upcoming technology which addresses these
scalability issues. However, 5G networks are still evolving and do not yet support the efficient
solution needed for call admission control for interactive multimedia. As 5G networks
support large bandwidth, an analytical model for 5G networks needs to be developed
considering an under-loaded system. In the future, the present work can be extended to 5G
networks and integrated 4G/5G networks. Further, it can be extended to integrated networks
consisting of 3GPP-based 4G/5G networks as well as IEEE WiFi networks.

AI-Based Pro Mode in Smartphone
Photography

Paras Nagpal, Ashish Chopra, Shruti Agrawal, and Anuj Jhunjhunwala

Abstract The auto mode of smartphone cameras has a very limited scope. A person can click better pictures if he or she knows how to use the manual mode. But changing settings like HDR, hybrid zoom, night mode, shutter speed, ISO, etc., in manual mode takes time and the right knowledge to capture the best image; one may miss the moment when it was actually the right time to capture the picture. So, we have come up with an idea to capture best-quality images using AI, which will not be too time-consuming and will lessen the efforts of the user. We aim to enhance the image quality based on the individual components of the image.

Keywords AI · ISO · Shutter speed · Pro mode · HDR · Hybrid zoom

1 Introduction

In recent times, smartphone photography has been giving really tough competition to simple point-and-shoot digital cameras in terms of the quality of images captured. The implementation of HDR, hybrid zoom, and night mode allows smartphones to capture excellent-quality images in almost any type of lighting condition [1, 2]. However, we still hang on to our professional cameras, as we can control a lot of parameters manually, which in turn gives us more creative freedom. Smartphones offer similar control through the pro mode, but it is mostly a hidden feature in the sense that people use the auto mode far more frequently. We propose a novel technique for automatic image enhancement in which we use an image segmentation technique to group similar images together in our database.

P. Nagpal (B) · A. Chopra · S. Agrawal · A. Jhunjhunwala


Platform R&D/Video and Gallery, Samsung Research Institute, Noida, India
e-mail: n.paras@partner.samsung.com
A. Chopra
e-mail: ashish.c2@samsung.com
S. Agrawal
e-mail: a.shruti@partner.samsung.com
A. Jhunjhunwala
e-mail: j.anuj@partner.samsung.com


We extract features from the newly captured image, find similar images in our database, and apply the features from those images to the new image for enhancement. We have tried and tested this algorithm and the associated code on a Samsung Galaxy S21 Ultra.

2 Existing Methodologies/Closest Technologies

2.1 Sequential Attention GAN for Interactive Image Editing

The authors propose a methodology to edit images using textual commands through a virtual agent. The main challenges in this sequential and interactive image production activity are as follows: (1) based on the context, maintaining consistency between the generated images and the text description provided; (2) step-wise modification of the generated image at the region level so that it looks consistent as a whole. To address these challenges, the authors propose a novel sequential attention generative adversarial network (SeqAttnGAN), which tracks and encodes the states of the previous sequential image and its textual description, and uses a GAN architecture to regenerate an enhanced version of the image that is both consistent and coherent with the preceding images and their descriptions [3].

2.2 Image Editing with Conditional GANs and Deep Feature Interpolation

The author presents a new image editing approach with convolutional networks that automatically alters the image content with a desired attribute while keeping the image photorealistic. The proposed approach effectively combines the strengths of two prominent image editing algorithms, conditional generative adversarial networks and deep feature interpolation, to be time-efficient, memory-efficient, and user-controllable [4, 5]. An inverted deep convolutional network is also presented to facilitate the approach. Although the generated image is photorealistic and of good quality, the generator produces a whole new image; in our approach, we enhance the original image itself.

3 Methodology Used

The first and foremost step is to build a database of images with appropriate NIQE values. Then, we perform image segmentation of the database images into components and analyze the contribution of each component to the whole image. We follow the same segmentation process for test images. The next step is to compare the components of the test image with the corresponding entries in the database and apply the effects of the closest-matching images to the components separately. Recursively, we add this image to the database and train the model again, as shown in Fig. 1.

Fig. 1 Flowchart for the proposed algorithm


Segmentation Pseudocode
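The segmentation pseudocode figure from the original manuscript is not reproduced here. As a stand-in, the following is a minimal sketch of the step described in Sect. 3.3.1 (RGB-to-HSV conversion followed by k-means clustering), assuming scikit-image and scikit-learn are available; all function and parameter names are illustrative, not the authors' code.

import numpy as np
from skimage import color            # assumed available for RGB -> HSV
from sklearn.cluster import KMeans   # assumed available for clustering

def segment_image(rgb_image, k=4):
    """Return an (H, W) label map assigning each pixel to one of k segments."""
    hsv = color.rgb2hsv(rgb_image)          # decorrelate the color channels
    pixels = hsv.reshape(-1, 3)             # one row of (H, S, V) per pixel
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(pixels)
    return labels.reshape(rgb_image.shape[:2])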

Algorithm Flowchart
See Fig. 1.

3.1 Abbreviations and Acronyms

Naturalness image quality evaluator (NIQE).



3.2 Modules

• Segmentation of the images that will be used to create the database to get the
different components in the image and their corresponding contribution to the
whole image [6]. Here, contribution means the percentage of the total area of the
image covered by each component.
• A database of images will be created for different components. The database will
consist of the contribution of the components and their properties [7].
• The NIQE algorithm will be used to verify the quality of the components of the images that will be used to create the database. Only images under an appropriate value (the lower the value, the better the image quality) will be chosen. The captured test image will go through the same process of segmentation, and its components will be extracted.
• These components will be classified into the various categories of the database and
they will be matched with the closest entries of the same individual components
in the database. Some components from the database which are the closest to the
components of the test image and have a good NIQE score will be selected.
• The image properties of the components chosen from the database will be applied
on the individual test components to enhance them. These properties will go
through scaling and normalization [8].
• Individual components will be enhanced and now the overall quality of the image
will be validated through NIQE score and this image will be entered into the
database again to improve the database.

3.3 Steps

3.3.1 Image Segmentation

Image segmentation means segregation of an image into various groups. This way
we will have a set of images with good image quality verified by the NIQE score and
the number of images in the database will increase for every category. The segmented
parts will be used as individual images for the database. The steps involved in image
segmentation are shown below in Fig. 2.
There are several methods to deal with this and one of the most popular methods
is k-means clustering algorithm. K-means clustering algorithm is an unsupervised
algorithm, and it is used to differentiate the region of interest from the background
of the image. It combines, or separates, the given data into k-clusters based on the
k-centroids.
To perform k-means, we will first convert the image from RGB to HSV. The red, green, and blue components of an object's color in a digital image are all correlated with the amount of light striking the object and with one another, so image descriptions in terms of these components make discrimination of objects a tedious process. Descriptions in terms of hue and lightness, along with chroma or saturation, are often more familiar.

Fig. 2 Image segmentation steps

H := ⎧ 0°,                               if MAX = MIN ⇔ R = G = B
     ⎪ 60° · [0 + (G − B)/(MAX − MIN)],  if MAX = R
     ⎨ 60° · [2 + (B − R)/(MAX − MIN)],  if MAX = G          (1)
     ⎩ 60° · [4 + (R − G)/(MAX − MIN)],  if MAX = B

If H < 0°, then H := H + 360°.

S_HSV := ⎧ 0,                 if MAX = 0 ⇔ R = G = B = 0
         ⎩ (MAX − MIN)/MAX,   otherwise                       (2)

V := MAX.
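For illustration, Eqs. (1) and (2) can be transcribed directly for a single pixel; the sketch below assumes r, g, and b are already normalized to [0, 1].

def rgb_to_hsv(r, g, b):
    """Per-pixel transcription of Eqs. (1) and (2); r, g, b in [0, 1]."""
    mx, mn = max(r, g, b), min(r, g, b)
    if mx == mn:                           # R = G = B: hue undefined, set to 0
        h = 0.0
    elif mx == r:
        h = 60.0 * (0 + (g - b) / (mx - mn))
    elif mx == g:
        h = 60.0 * (2 + (b - r) / (mx - mn))
    else:                                  # mx == b
        h = 60.0 * (4 + (r - g) / (mx - mn))
    if h < 0:
        h += 360.0                         # wrap negative hues, as in the text
    s = 0.0 if mx == 0 else (mx - mn) / mx  # Eq. (2)
    v = mx                                  # V := MAX
    return h, s, v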

3.3.2 Database

The database will have the following attributes:


• Segment_ID: This will be the primary key of the tuple. It will be generated by
extracting the individual segments from the images and classifying them into a
category. Example—building001, ocean125, etc.
• Class: This is the class that the segment belongs to. Example—ocean, sky, etc.
• Histogram: The color histogram of each segment will be stored in the tuple in
order to use it for image retrieval during the later steps [9].
• Image Characteristics: These are the properties of an image like brightness,
sharpness, contrast, hue, and saturation. These values will later be used for
enhancing the user images [10, 11]. Image characteristics are explained in detail in step 6; a sample record layout is sketched below.
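To make the layout concrete, one record of such a database might look as follows. The field names mirror the attribute list above, while the example values are placeholders, not data from the paper.

segment_record = {
    "segment_id": "building001",   # primary key: class name + counter
    "class": "building",           # category the segment belongs to
    "histogram": [0.0] * 512,      # placeholder 8x8x8 color histogram
    "characteristics": {           # properties later applied to user images
        "brightness": 0.62,
        "sharpness": 0.48,
        "contrast": 0.55,
        "hue": 210.0,
        "saturation": 0.40,
    },
}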

3.3.3 NIQE (MATLAB)

• Naturalness image quality evaluator or NIQE is a type of no-reference image


quality metric that provides an objective image quality score. It performs blind
image quality assessment (IQA) by comparing statistical characteristics of the
input image with the set of features from the image database.
• Natural scene statistic (NSS) is a simple and successful space domain model.
NIQE uses NSS as a reference model to construct a quality-aware collection of
statistical features. A collection of natural and undistorted images is used to derive
these statistical features.
• Input argument consists of a 2D image in grayscale or RGB format.
• A no-reference image quality score returns as the output value. The output
argument is a non-negative scalar.
• The perceptual quality is inversely proportional to the NIQE score (see Fig. 3).
• NSS-based features are calculated from the input image and from the natural scenes in the database on which the model was trained. These features are modeled as a multivariate normal distribution. NIQE then measures the deviation of the input image from the natural images.
• NIQE will be used to add only those segments into the database that are verified
for good quality with this algorithm.

Fig. 3 NIQE value comparison



3.3.4 Segmentation of User Image

The same procedure that was done in step 1 with the database images will be done
with the user image to get the individual components of the user image. Once we
have these individual components, they will be classified to the various categories
and then content-based image retrieval will be performed on these components to
get the closest matching images from the database. These closest matched images
will be used to apply properties on these components to enhance them. The next step
explains content-based image retrieval.

3.3.5 Content-Based Image Retrieval

• Defining the image descriptor: We use the color histogram as the image descriptor. It is a representation of the distribution of colors in an image. By utilizing it as the image descriptor, the images will be grouped into different domains based on their color distribution. It can be considered a feature extracted from each segment of the image in the dataset.
• Indexing the dataset: Now, after categorizing images based on their color distri-
bution, we need to access the similar images from the database based on the input
image. “Indexing” is a means of storing the features on persistent storage to make
easy retrieval as and when required. The images in the dataset are indexed based
on their extracted features (color histograms).
• Defining the similarity metric: The database now contains the natural images
indexed based on their color distributions. The color histogram for the input
image is computed, and it is used to extract the indexed images. The extracted
color histograms associated with the indexed images are compared with the input
image features using chi-squared distance distribution. Color histograms are
nothing but the probability distribution based on color. As chi-square compares
discrete probability distributions, this function constitutes an excellent choice.
For two histograms x = [x_1, …, x_n] and y = [y_1, …, y_n],

d(x, y) = Σ_i (x_i − y_i)² / ((x_i + y_i)/2)   (3)

(a code sketch of this comparison is given after this list).

• Searching: This step takes the user image, segments it and stores histograms of
the segments (the same steps as done with the database images while storing), and
then compares these histograms with the ones in the database and returns closest
matching images.
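To make these steps concrete, here is a minimal sketch assuming NumPy, HSV values normalized to [0, 1], and a `database` of (segment_id, histogram) pairs; all names and bin counts are illustrative assumptions.

import numpy as np

def hsv_histogram(hsv_image, bins=(8, 8, 8)):
    """Normalized 3-D color histogram used as the image descriptor."""
    hist, _ = np.histogramdd(hsv_image.reshape(-1, 3), bins=bins,
                             range=((0, 1), (0, 1), (0, 1)))
    hist = hist.ravel()
    return hist / hist.sum()

def chi2_distance(x, y, eps=1e-10):
    """Chi-squared distance of Eq. (3); eps guards against empty bins."""
    return np.sum((x - y) ** 2 / ((x + y) / 2.0 + eps))

def closest_matches(query_hist, database, top=5):
    """Rank indexed (segment_id, histogram) pairs by distance to the query."""
    ranked = sorted(database, key=lambda rec: chi2_distance(query_hist, rec[1]))
    return ranked[:top]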
Figure 4 shows steps 1 and 2, and Fig. 5 shows steps 3 and 4. Figure 6 shows the
output histogram for one of the components of the input image.
Test image from the user is broken down into segments, and the final closest
matching images based on comparing the histograms are returned from the database.

Fig. 4 Process of extracting and storing dataset images features

Fig. 5 Process of finding relevant images from dataset

Fig. 6 Creating histograms of the segments of an image

3.3.6 Enhancing the User Image

We have taken three characteristics of the image into consideration:


• Brightness: Brightness refers to the overall lightness or darkness of the image, as shown in Fig. 7.
• Saturation: In image editing, the color of the image is referred to as hue, and saturation describes the intensity (purity) of that hue. It determines how a hue will look in particular lighting conditions. A hue with high saturation appears stronger, for example, “redder” or “bluer”, as shown in Figs. 8 and 9 [12, 13].

Fig. 7 Steps involved in adjusting the brightness of user image

Fig. 8 Steps involved in adjusting the saturation of user image




Fig. 9 Example of brightness and saturation adjusted images

• Contrast: Contrast is the difference between the darkest and brightest areas or
maximum and minimum pixel intensities of the image. It will make the shadows
darker and highlights brighter.

Brightness
See Fig. 7.
Saturation
See Fig. 8.
Contrast
One of the methods to calculate the contrast of an image is by calculating the
maximum and minimum lightness values of the image [14, 15]. The following steps
will show how to make contrast enhancement for the new user image:
(1) Convert the RGB values of all the pixels of the matching image to HSL (as
shown previously).
(2) Calculate the minimum and the maximum value of lightness of the closest
matching image—L2max and L2min and of the user image—L1max and L1min .
(3) Now, we will shift all the lightness values of the user image by subtracting the
value L1min from each of them.
(4) Next, we will scale all the lightness values by multiplying them by L2max/(L1max − L1min).



Fig. 10 Example output of contrast adjusted image

Therefore,

Lnew = (Lold − L1min) × L2max/(L1max − L1min)   (4)

This means that on increasing the contrast, the lower-range lightness values decrease and the higher-range values increase, hence increasing the contrast, as shown in Fig. 10.
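A direct transcription of Eq. (4), assuming the lightness channel has already been extracted as a NumPy array; the function and argument names are illustrative.

import numpy as np

def match_contrast(l_user, l2_max):
    """Rescale the user image's lightness channel per Eq. (4).
    l_user : lightness values of the user image (the L of HSL)
    l2_max : maximum lightness of the closest matching database image"""
    l1_min, l1_max = float(l_user.min()), float(l_user.max())
    return (l_user - l1_min) * l2_max / (l1_max - l1_min)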

3.3.7 Re-entry to Database

• We calculate the NIQE score of the result image.
• If the NIQE score is within the expected range, we conclude that the image is of good quality, so it can be re-entered into the database for further training purposes and for improving the accuracy of our algorithm.
• This is done to avoid any outlier images entered by the user; such images have no value to our database, as shown in Fig. 11.

These images do not have any characteristics so they cannot be sent to the database.

Fig. 11 No characteristic image

Fig. 12 Example outputs of image enhancement through this algorithm

4 Results

Figure 12 shows a sample output of this algorithm with each component enhanced
stepwise.

5 Future Scope

There is much scope for future work given the increasing advancement in technology. In future, we may be able to generate scenery and views that are beyond the reach of the lens, including images that even ultra-wide camera lenses fail to capture. Some generative models which are coming up can help capture extravagant photos through mobile photography.

References

1. C. Guo, C. Li, J. Guo, C.C. Loy, J. Hou, S. Kwong, R. Cong, Zero-reference deep curve
estimation for low-light image enhancement, in Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (2020), pp. 1780–1789
2. W. Yang, S. Wang, Y. Fang, Y. Wang, J. Liu, From fidelity to perceptual quality: a semi-
supervised approach for low-light image enhancement, in Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition (2020), pp. 3063–3072
3. M. Wang, Z. Tian, W. Gui, X. Zhang, W. Wang, Low-light image enhancement based on
nonsubsampled shearlet transform. IEEE Access 8, 63162–63174 (2020)
4. P. Zhuang, X. Ding, Underwater image enhancement using an edge-preserving filtering Retinex
algorithm. Multimed. Tools Appl., 1–21 (2020)
5. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image
recognition, v1. arXiv preprint arXiv:1409.1556 (2014)
6. S. Kosugi, T. Yamasaki, Unpaired image enhancement featuring reinforcement-learning-
controlled image editing software. Proc. AAAI Conf. Artif. Intell. 34(07) 11296–11303
(2020)
7. R. Hummel, Image enhancement by histogram transformation. Comput. Graph. Image Process.
6(2), 184–195, 19
8. M.S. Hitam, E.A. Awalludin, W.N.J.H.W. Yussof, Z. Bachok, Mixture contrast limited adaptive
histogram equalization for underwater image enhancement, in Proceedings of the International
Conference on Computer Applications Technology (ICCAT ) (2013), pp. 1–5
9. R. Singh, M. Biswas, Adaptive histogram equalization based fusion technique for hazy
underwater image enhancement, in Proceedings of the IEEE International Conference on
Computational Intelligence and Computing Research (ICCIC), Dec 2016, pp. 1–5
10. S.B. Rana, S.B. Rana, A review of medical image enhancement techniques for image
processing. Int. J. Curr. Eng. Technol. 5(2), 1282–1286 (2015)
11. S. Mahajan, R. Dogra, A review on image enhancement techniques. Int. J. Eng. Innov. Technol.
(IJEIT) 4(11) (2015)
12. Y. Kinoshita, H. Kiya, Hue-correction scheme based on constant-hue plane for deep-learning-
based color-image enhancement. IEEE Access 8, 9540–9550 (2020)
13. G. Hou, Z. Pan, B. Huang, G. Wang, X. Luan, Hue preserving-based approach for underwater
colour image enhancement. IET Image Process. 12(2), 292–298 (2018)
14. W. Xiong, D. Liu, X. Shen, C. Fang, J. Luo, Unsupervised real-world low-light image
enhancement with decoupled networks. arXiv preprint arXiv:2005.02818 (2020)
15. B. Xiao, H. Tang, Y. Jiang, W. Li, G. Wang, Brightness and contrast controllable image
enhancement based on histogram specification. Neurocomputing 275, 2798–2809 (2018)
A ML-Based Model to Quantify Ambient
Air Pollutants

Vijay A. Kanade

Abstract Today, air pollution has become a serious problem for humanity on planet
earth. Air pollution is estimated to kill about seven million people every year globally.
Although there has been significant development in fighting air pollution in recent decades, large sections of the population do not seem to benefit from it, as the number of fatalities due to air pollution continues to pile up. Considering these implications, this research proposes a novel solution that helps in determining the air pollution in a user's vicinity just by clicking pictures of the surroundings. These images are processed by employing machine learning (ML) techniques. The method tracks air pollution by

Keywords Air pollution · Machine learning (ML) · Epochs · UV light · Image processing

1 Introduction

Today, air pollution is considered one of the world's largest environmental health threats. According to a study, the global deaths due to air pollution in 2020 stood at over 6.6 million [1]. WHO data says that 9 out of 10 people breathe polluted air containing air pollutants above threshold limits [2].
Air pollution has had a severe health and climatic impact. Along with outdoor pollution, indoor (household) air pollution has also majorly affected the global health index. Due to air pollution, there has been a spike in fatalities from heart failure, brain stroke, chronic respiratory infections, lung diseases, and many others.
According to WHO data, ambient air pollution causes nearly 4.2 million deaths
per year. This is due to the fact that about 91% of global population lives in areas
where air quality index (AQI) exceeds the limits set by WHO [2]. The source of
outdoor pollution includes vehicles, power generation units, construction industries,
and manufacturing units (i.e., factories).

V. A. Kanade (B)
Kolhapur, India


On the other hand, it has been identified that household (indoor) pollution has
also served as a key driver in causing premature deaths in developing countries.
In households, burning dung, wood, and coal in stoves produces various pollutants and emits fine particles that can damage health. These include particulate matter
(PM), methane, carbon monoxide, polyaromatic hydrocarbons (PAH), and volatile
organic compounds (VOC) [2]. All these factors contribute to respiratory illness, eye
irritations, and even cancer.
Earthly climatic conditions and ecosystems are intrinsically coupled with air quality. The drivers of air pollution have also been identified as key contributors to greenhouse gases. Hence, considering the devastating impact that air pollution can have on life on earth, strategizing policies to reduce air pollution can offer both climatic and health benefits. Solutions to tackle air pollution can potentially lower the diseases linked to it and also aid in mitigating long-term climate change.

2 Proposed Model

Electromagnetic radiation of the Sun is composed of different light spectra, i.e., the UV, infrared, and visible light spectra. Owing to their size and properties, air pollutants are not visible in broad daylight. These small-sized particles absorb a section of the UV spectrum but largely stay unaffected by visible and infrared light. Additionally, it has been identified that each air pollutant has a unique signature pattern in the UV spectrum [3].
As each pollutant possesses a signature in UV light, the combination of various
pollutants (i.e., CO, NOx , SO2 ) also leads to UV light absorption with a certain
pattern. Hence, tracking the light spectrum as it passes through the ambient envi-
ronment containing air pollutants can reveal the presence of pollutant type in the
air.
In the proposed model, we track the light spectrum in the surroundings by capturing images of the area under consideration. Here, images are captured using a handheld device such as a smartphone. Smartphone cameras today are embedded with CMOS sensors that have substantial sensitivity to UV wavelengths; hence, smartphone cameras are capable of measuring UV radiance [4]. We use this inherent property of CMOS image sensors (in smartphones) to track the UV spectrum.
In the second step, we run the machine learning techniques on captured images to
detect the severity of pollution. Hence, the light radiations observed in the captured
surrounding images are analyzed by using ML-based image processing techniques
to detect the type of air pollutants in the air.
Let’s take a closer look at the ML-based image processing algorithm—initializa-
tion (first iteration) for a dataset.

2.1 Algorithm

Step 1: Create training dataset [using captured images];


Step 2: Train model [TensorFlow.js];
Step 3: Set epochs;
Set batch size;
Set learning rate [ML];
Step 4: Execute trained model [TensorFlow.js - load real time images for analysis];
Step 5: Export model;
Loop to Step 3.
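The experiments in the paper run this algorithm in the browser with Teachable Machine and TensorFlow.js. Purely for illustration, the same transfer-learning recipe can be sketched in Python with Keras; this is an analogue of the algorithm above, not the authors' code, and the dataset directory, class count, and MobileNetV2 backbone are assumptions.

import tensorflow as tf

# Steps 1-2: reuse a pre-trained feature extractor (transfer learning)
base = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg",
                                         input_shape=(224, 224, 3))
base.trainable = False                   # keep the pre-trained weights frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(2, activation="softmax"),  # dataset 1 vs dataset 2
])

# Step 3: epochs, batch size, and learning rate (values echo Table 1, row 1)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
train = tf.keras.utils.image_dataset_from_directory(
    "pollution_images/", image_size=(224, 224), batch_size=16)

# Step 4: train, then classify new images with model.predict(...)
model.fit(train, epochs=50)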

3 Experiment

We used ‘Teachable Machine’, a freely available Web-based ML tool to verify


and validate the research proposal [5]. It uses TensorFlow.js, a library for machine
learning in Javascript, to train and run the models in a Web browser [6].
Initially, we uploaded the captured images to dataset 1 and dataset 2 on the tool. Images in dataset 1 were captured when there was less outdoor pollution, while images in dataset 2 were taken when there was slightly more ambient air pollution.
This was verified by noting the AQI for the respective days when the images were
taken in the area of study. Then, we trained the model by setting different epochs,
batch sizes, and learning rates. After training the model, we then uploaded real-time
images taken on random days to test the trained model.

3.1 Technical Details

The Teachable Machine tool uses a transfer learning technique [7]. This is a deep learning technique that focuses on reusing a pre-trained model by storing knowledge from an already solved problem and applying it to a new, related problem. One example could be applying knowledge gained from a car-recognition problem to a new problem of truck recognition. In comparison to traditional ML methods, transfer learning can train deep neural networks with comparatively little data.
Figure 1 highlights the basic difference between traditional ML and transfer
learning.
Thus, in our experiment, the teachable machine tool employs a pre-trained neural
network. Here, we created our own classes for less ambient pollution and more
outdoor pollution, respectively. As we created these classes, they became a part of
the last layer or step of the already existing neural net. Hence, in a way, the uploaded
images in respective classes were learning off from the pre-trained mobile net models

Fig. 1 Traditional ML versus transfer learning

existing in the neural network of teachable machine [8]. The very neural net helped
in segregation of real-time images in our experiment.
Location of study (captured images):
Latitude—16° 43′ 23.7″ N, Longitude—74° 14′ 06.5″ E.
Epoch*—means that each and every sample in the training dataset has been fed through the training model at least once. For example, if the epochs are set to 50, the model being trained will work through the entire training dataset 50 times.
Batch size*—the set of samples used in one iteration of training. For example, if you have 50 images and choose a batch size of 16, the data will be split into ⌈50/16⌉ = 4 batches (the last batch being smaller). Once all 4 batches have been fed through the model, exactly one epoch will be complete.
Learning rate*—defines the rate at which the model learns.

3.2 Results

The results are shown in Table 1.


Figure 2 shows the AQI when Image-1 was clicked.

3.3 Screenshot

The result for Image-1 can be seen in Fig. 3, which shows a screenshot of the tool used.

Table 1 Tracked air pollution by analyzing the captured images

S. No. | Image   | Epochs | Batch size | Learning rate | Result                       | Outcome (pollution severity)
1      | Image-1 | 50     | 16         | 0.001         | 66%—dataset 1, 34%—dataset 2 | ≈Moderately polluted
2      | Image-2 | 60     | 16         | 0.00101       | 98%—dataset 1, 02%—dataset 2 | ≈Minimal pollution
3      | Image-3 | 70     | 32         | 0.00105       | 90%—dataset 1, 10%—dataset 2 | ≈Fairly polluted

Note The AQI metrics for Image-1, 2, and 3, collected on the respective days of image capture, were as follows: 75, 45, and 56. These values were recorded to verify the outcome of the ML tool. It was identified that these AQI values matched and validated the outcome produced by the employed tool

Fig. 2 AQI when Image-1 was captured (Day 1)

3.4 Accuracy

In addition, we also analyzed the graphs to check how well the trained model worked.
Below are the graphical images that disclose the accuracy and loss per epoch for
the trained model. Figure 4 depicts accuracy per epoch for ‘Image-1’ used in our
experiment.

Fig. 3 Teachable machine (ML tool) displaying the result for uploaded Image-1

Fig. 4 Accuracy per epoch (for Image-1)

Here, accuracy is the percentage of classifications that a model gets right during training. That is, if a model correctly classifies 60 samples out of 100, its accuracy is 60/100 = 0.6. Further, if the trained model's predictions are all exact, the accuracy is one; otherwise, the accuracy is lower than one.

3.5 Loss

Here, loss is a measure for evaluating how well a model has learned to predict the right classifications for a given set of samples. If the model's predictions are perfectly accurate, the loss is zero; otherwise, the loss is greater than zero. Figure 5 depicts the loss per epoch for 'Image-1' used in our experiment.

Fig. 5 Loss per epoch (for Image-1)

Consider an example where there are two models A and B. Model A predicts a
right classification for a sample but is only 50% confident of that prediction. Model
B predicts the right classification for the same sample but is 90% confident of that
prediction. In this case, both models have the same accuracy, but model B has lower
loss value.

4 Traditional System Versus Proposed Model

Air pollutants are primarily monitored using analytical instruments, such as optical and chemical analyzers. Gas chromatographs and mass spectrometers are some other tools used for monitoring, but due to their complexity and high cost, they are not widely used. Generally, air pollutant analyzers are complex and expensive, with each instrument costing in the range of £5000 to tens of thousands of pounds. Additionally, traditional air quality monitoring systems are voluminous in nature [9].
Further, traditional air quality monitoring units are static in nature, i.e., installed in defined areas (e.g., detection stations) [9]. However, the proposed model is not static but mobile from the outset, as any handheld device with the right ML software can perform the required image processing. With traditional systems, detecting pollution in a user-defined area can pose a challenge, as stations have fixed positions. The newly proposed model, in contrast, can be carried into any geography to detect air pollution even in the remotest locations.
Besides, the data accessible to the common masses today is only a numeric AQI. AQI is an index that reports air quality and measures how air pollution affects one's health over a time period. Figure 2 shows the AQI parameter observed on iPhones today.
Having said that, the proposed research model is much more advanced than a simple AQI, as users can get an idea of the air composition around any area simply

by clicking images of the ambient surroundings and feeding them to the ML-based software tool.

5 Conclusion

The research paper discloses an effective ML-based method to detect ambient air
pollution. The process involves capturing the images of the surrounding, analyzing
the captured images by using ML techniques, and determining the severity of
pollution. The research is designed with the aim of making the solution acces-
sible to a common man possessing any handheld device that can click pictures and
process images via ML-based software. The research paves way for a future age of
image processing that can have applications in various fields like healthcare, tech,
astronomy, and many others.
The innovative model harnesses the natural properties of light and chemical prop-
erties of the air pollutants to localize them in an environment. This reduces the
dependency on any external technological domain as seen in traditional systems.

6 Future Work

In the proposed work, we have used the Web-based tool for verifying the research.
However, in future, we intend to develop an app corresponding to the tool so that the
proposed model is accessible to any layman.
Currently, we have only proposed the idea of detecting the severity of ambient air pollution based on image processing. However, in future, we intend to extend the model to identify specific air pollutants (i.e., VOCs, NOx, SO2, CO, PM, and methane) just by analyzing the captured images. Figure 6 depicts a futuristic simulation view of the proposed model.

Fig. 6 Simulation view for detected air pollutants—each color denotes a specific air pollutant (NOx, SO2, CO)

Acknowledgements I would like to extend my sincere gratitude to Dr. A. S. Kanade for his
relentless support during my research work.

Conflict of Interest The authors declare that they have no conflict of interest.

References

1. J. Davidson, Air pollution responsible for over 6.6 million deaths worldwide in 2020, study
finds, 21 Oct 2020
2. Air Pollution, WHO, https://www.who.int/health-topics/air-pollution#tab=tab_1
3. V.A. Kanade, A bio-inspired unsupervised algorithm for deploying [BoT]: symbiotic intelli-
gence, in IOT’18: Proceedings of the 8th International Conference on the Internet of Things,
Oct 2018, pp. 1–5. Article No.: 24
4. D. Igoe, A. Parisi, B. Carter, Characterization of a smartphone camera’s response to ultraviolet a
radiation, in Photochemistry and Photobiology. © 2012 The American Society of Photobiology
5. Teachable Machine, https://teachablemachine.withgoogle.com/
6. TensorFlow.js, https://www.tensorflow.org/js
7. M. Satish, P. Srinivasa Rao, M. Ramakrishna Murty, Identification of natural disaster affected
area using twitter, in International Conference and Publish the Proceedings in AISC Springer
ICETC-2019 (Osmania University, Hyderabad, 2019), pp. 792–801

8. Googlecreativelab/teachablemachine-community, https://github.com/googlecreativelab/teacha
blemachine-community/
9. Ultrasonic wind sensors and weather stations for air quality monitoring and analysis, http://www.
gillinstruments.com/applications/government-and-emergency/air-quality-monitoring.html
Multimodal Biometric System Using
Undecimated Dual-Tree Complex
Wavelet Transform

N. Harivinod and B. H. Shekar

Abstract A human identity assurance system using multimodal biometrics based on


the undecimated dual-tree complex wavelet transform (UDTCWT) is proposed in this
paper. Biometric sources from multiple modalities give better reliability and coverage than a single modality. To represent the multimodal biometric images, a UDTCWT-based descriptor is used. Histogram-based features are computed over small blockades that capture the salient features within the local region and help in obtaining invariance even though there exist intra-class variations. During the feature descriptor formulation, both local and global features of UDTCWT are used. In the experimentation, a bimodal combination of face-iris and face-palmprint is implemented to design the multimodal biometric system. Experiments are carried out in
different configurations that prove the superiority of the proposed method.

Keywords UDTCWT · Multimodal biometrics · LUPP · GUPP

1 Introduction

Biometrics technology has gained wide acceptance in society [7]. Even though unimodal biometrics is acceptable in constrained environments, it suffers from several limitations, such as lack of large population coverage and easy vulnerability to noise. Multibiometric systems use information from multiple modalities and try to overcome these limitations, providing numerous benefits [5]. Due to the combination of modalities, better performance is expected in terms of biometric accuracy. Multibiometric systems fuse information at various phases, namely sensor-based, feature-based, score-based, and decision-based. Feature-level fusion has advantages, as the features are preserved until classification. Also, there is a need for a good feature extraction technique that captures discriminating features

N. Harivinod (B)
St. Joseph Engineering College, Mangaluru, Dakshina Kannada, Karnataka, India
B. H. Shekar
Mangalore University, Mangalagangothri, Dakshina Kannada, Karnataka, India

of images from multiple modalities. The undecimated dual-tree complex wavelet transform (UDTCWT) suits this scenario well. In the literature, UDTCWT has been used for a wide variety of applications, including texture feature extraction for blurred images [1], watermark decoding [10], image denoising [6, 8], image fusion [4], and action recognition [14]. In biometrics, UDTCWT-based face recognition is reported in [11]. From the literature, we find that UDTCWT is good at preserving invariant features. Motivated by this, we propose a UDTCWT-based multibiometric system.

2 Methodology

The design of the proposed system is given in Fig. 1. In our method, histogram-based features are computed for small blockades that capture the salient features within the local region. This helps in obtaining invariant features even though there exist intra-class variations. These features are concatenated to give the global descriptor. For biometric images acquired from different traits, we retrieve the region of interest, apply UDTCWT to these images, and compute the coefficient matrices. Two types of coefficient matrices are computed: the local UDTCWT phase pattern (LUPP) and the global UDTCWT phase pattern (GUPP). Based on the LUPP and GUPP of both modalities, the feature descriptor is

Fig. 1 Framework of the proposed system



computed. The classification is performed based on these features. The computation of the feature descriptor is explained in the next section.

2.1 UDTCWT

Since its inception, the discrete wavelet transform (DWT) has been used with great success across diverse image processing applications. The undecimated DWT (UDWT) was proposed independently by a number of researchers under different names [3, 12, 15]. In UDWT, downsampling is avoided at each phase of the DWT; this makes UDWT shift invariant, since shift variance is mainly caused by downsampling. The scaling and wavelet filters of an orthonormal DWT are defined as h ∈ l²(Z) and g ∈ l²(Z), respectively.
The undecimated wavelet filter at scale s + 1 is defined recursively as
g^(s+1)[m] = (g^(s) ↑ 2)[m] = { g^(s)[m/2], if m is even;  0, if m is odd }   (1)

Selesnick et al. [13] proposed an enhancement to the DWT called the dual-tree complex wavelet transform (DTCWT), highlighting its superiority over the conventional DWT. DTCWT is shift invariant and offers improved directionality. Even though DTCWT has the advantage of redundancy reduction, its subsampled subbands have fewer coefficients, and these coefficients depend on the spatial position in the image. To overcome this, Hill et al. [6] gave an undecimated version of DTCWT, the undecimated DTCWT (UDTCWT), by removing the downsampling process from the DWT. In UDTCWT, every subband has the same size as the image itself. The UDTCWT provides image responses at different orientations and scales, analogous to Gabor wavelets. The four-level UDTCWT implementation is shown in Fig. 2.
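Purely for intuition, the undecimated property can be illustrated with PyWavelets' stationary (undecimated) 2-D DWT; this is an analogue only, since the actual UDTCWT of Hill et al. additionally uses the dual tree of complex filters, which this sketch does not reproduce.

import numpy as np
import pywt  # PyWavelets, assumed available

image = np.random.rand(256, 256)                   # stand-in for a biometric image
coeffs = pywt.swt2(image, wavelet="db2", level=4)  # undecimated: no downsampling
for approx, (horiz, vert, diag) in coeffs:
    assert approx.shape == image.shape             # every subband keeps the image size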

3 Feature Descriptor Computation

The feature descriptor formation is driven by the work of Zhang et al. [16] using
Gabor wavelets. They suggested global Gabor phase pattern (GGPP) and the local
Gabor phase pattern (LGPP) for face image representation. They computed the Gabor
wavelet coefficients at 8 orientations and 5 scales which result in 40 complex Gabor
wavelet coefficient matrices. These complex coefficients are decomposed into real
and imaginary parts. Using these coefficient matrices, they computed 80 LGPPs
and 10 GGPPs. In our work, we have designed the global UDTCWT phase pattern (GUPP) and the local UDTCWT phase pattern (LUPP). For a given image, UDTCWT coefficients are computed as described in Sect. 3.1 at 6 orientations and 4 scales to obtain 24 complex UDTCWT coefficient images. By separating the real

Fig. 2 Analysis stage of the UDTCWT. Courtesy [6]

and imaginary parts, 48 UDTCWT coefficient images are obtained. From these coefficients, the GUPP and LUPP coefficient images are computed.

3.1 Extracting Orientation-dependent Edge Information

The edge detail coefficients corresponding to different orientations can be extracted from an image using the 2D UDTCWT filter banks. These filter banks are formed by the wavelets h_ψ(x), h_ψ(y), h̄_ψ(x), and h̄_ψ(y), where h̄_ψ(y) stands for the conjugate of h_ψ(y). The wavelets h_ψ(x)·h_ψ(y) and h_ψ(x)·h̄_ψ(y) can extract edge detail coefficients corresponding to 45° and −45°, respectively, at every phase of the dual tree. The wavelets h_ϕ(x)·h_ψ(y), h_ψ(x)·h_ϕ(y), h_ϕ(x)·h̄_ψ(y), and h̄_ψ(x)·h_ϕ(y) are capable of extracting edge detail coefficients corresponding to four more angles. Thus, at each stage of the UDTCWT tree, it is possible to extract edge detail information at six different angles, i.e., 15°, 45°, 75°, −15°, −45°, and −75°.

3.2 Computation of LUPP Coefficient Image

Corresponding to each of the 48 UDTCWT coefficient images C, we compute a LUPP image. The size of the LUPP image of an N × N UDTCWT coefficient image is

Fig. 3 Illustration of computation of the LUPP (left) and GUPP (right) at pixel P in an image of
particular scale and orientation

(N − 2) × (N − 2). The computation of the LUPP at location P(x, y) of a UDTCWT coefficient image is explained here. The LUPP image value corresponding to location P(x, y) is computed using its eight neighbors: an 8-bit vector is formed from the eight neighbors of P(x, y). If a neighbor pixel P_i has a value higher than the value at P, the corresponding bit in this vector will be 1, else 0. Mathematically, it can be stated as
LUPP(P) = Σ_{i=1}^{8} sign(C(P) − C(P_i)) · 2^(i−1)   (2)

where P_i is the ith neighbor of P.


The function sign(z) is defined as

sign(z) = { 0, if z ≥ 0;  1, if z < 0 }   (3)

A detailed illustration of LUPP coefficient image formation at pixel location P(2, 2) is given in Fig. 3.
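A vectorized sketch of this computation follows; the neighbor ordering and the bit weights are assumptions where the printed notation of Eq. (2) is ambiguous.

import numpy as np

def lupp(C):
    """LUPP map for one UDTCWT coefficient image C (sketch of Eq. (2))."""
    h, w = C.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # offsets of the 8 neighbors, clockwise from the top-left
    nbrs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = C[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(nbrs):
        nb = C[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (nb > centre).astype(np.uint8) << bit   # bit = 1 when P_i > P
    return out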

3.3 Computation of GUPP Coefficient Image

Using the six different orientation coefficient images at a particular scale, a GUPP
coefficient image is computed. Thus, for a given image, eight GUPP coefficient
images are computed using real and imaginary parts in four scales. The illustration
of GUPP image computation at location P(1, 1) of a sample image of size 3 × 3 is
shown in Fig. 3.
Let C_{s,d} be the coefficient image at scale s and orientation d. Consider these coefficient matrices at the six orientations for a particular scale s, and let GUPP_s be the GUPP coefficient image at scale s. The computation of GUPP_s at location

(x, y) is performed by concatenating the bits obtained from the six orientation coefficient matrices at (x, y). The MSB and LSB are set to zero to form an 8-bit vector, and the decimal value of this byte becomes the GUPP_s image value at (x, y). This computation is extended to all pixel locations (x, y). Thus, for a given image, we compute eight GUPP images: four each for the real and imaginary coefficient matrices at the various scales. The size of the GUPP image is the same as that of the coefficient image. Mathematically, this procedure is formulated as


GUPP_s(x, y) = Σ_{d=1}^{6} u(C_{s,d}(x, y)) · 2^d,  where s = 1, 2, 3, 4   (4)

The function u(y) is defined as

u(y) = { 1, if y ≥ 0;  0, if y < 0 }   (5)
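A matching sketch of Eq. (4): the six orientation responses at one scale are thresholded with u(·) and packed into bits 1-6 of a byte, leaving the MSB and LSB zero. The function name and input layout are assumptions.

import numpy as np

def gupp(orientation_coeffs):
    """GUPP map for one scale; `orientation_coeffs` is a list of the six
    orientation coefficient images (real or imaginary parts) at that scale."""
    out = np.zeros(orientation_coeffs[0].shape, dtype=np.uint8)
    for d, C in enumerate(orientation_coeffs, start=1):   # d = 1..6
        out |= (C >= 0).astype(np.uint8) << d             # u(C) placed at bit d
    return out                                            # bits 0 and 7 stay zero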

The global descriptor for an image using the 48 LUPPs and 8 GUPPs is formed as follows: (a) compute the spatial histogram in each blockade by breaking the GUPPs and LUPPs down into blockades; (b) form the global descriptor by concatenating these histograms. UDTCWT responses for a face image at different scales and directions are shown in Fig. 4, and GUPP responses are shown in Fig. 5.
Further, we applied various classification techniques, namely logistic regression, linear discriminant analysis, kernel Fisher analysis, and K-nearest neighbor with K = 1, 3, 5. Kernel Fisher analysis was found to give the best recognition accuracy; details are reported in the following sections.

Fig. 4 (Left) UDTCWT real coefficient image responses for a face image at six directions at
a particular scale (scale = 4). The colors indicate the range of values that the image responses
correspond to. (Right) First two rows show UDTCWT real coefficient image responses at six
directions for two different scales of a face image. Last two rows show that of UDTCWT imaginary
coefficient image responses for two scales

Fig. 5 (Left) First row shows GUPP real coefficient image responses at four scales of a face image.
Second row shows that of GUPP imaginary coefficient image response. (Right) First row shows
GUPP real coefficient image responses at four scales of a face image. Second row shows that of
GUPP imaginary coefficient image response

4 Experimental Results

The implementation is carried out using MATLAB (R2018a). The MEPCO, PolyU,
and CASIA biometric datasets are used for the experiments. MEPCO biometric
database [9] gives both face and iris images from the same person. In face images, the
subject is captured with different illuminations and expressions. The PolyU palmprint
database [17] contains images of 386 individuals. In total, there are 7752 grayscale
images. Samples are collected in two sessions. CASIA face image database (CASIA-
FaceV5) [2] contains 2500 color facial images of 500 subjects. Typical intra-class
variations include illumination, pose, expression, eyeglasses, imaging distance.

4.1 Experimental Results on MEPCO Dataset

The identification results of the proposed method on the MEPCO dataset are presented in Table 1. Unimodal and bimodal results are given for the face and iris. To test the robustness of the face modality, we have used non-frontal testing samples with varying pose and images with spectacles. We have also compared the results with various classifiers, viz., linear discriminant analysis (LDA), logistic regression (LR), and K-nearest neighbors (KNN); KNN experiments are conducted with K = 1, 3, 5. It is observed that UDTCWT features give good results on recognition. The results are

Table 1 Comparison of recognition accuracy (%) of unimodal and bimodal biometrics on the MEPCO dataset

Features used      | Face  | Iris  | Face and iris
Gray values        | 20.43 | 54.35 | 90.87
Gabor coefficients | 74.78 | 69.57 | 98.57
UDTCWT features    | 90.87 | 98.37 | 100.0

Table 2 Comparison of UDTCWT features-based recognition accuracy (%) using various classifiers on the MEPCO dataset

Classifiers            | Face  | Iris  | Face and iris
LR                     | 92.75 | 98.26 | 100.0
LDA                    | 61.06 | 96.52 | 100.0
KNN with K = 1         | 90.24 | 98.26 | 100.0
KNN with K = 3         | 81.64 | 93.04 | 98.26
KNN with K = 5         | 79.80 | 88.69 | 90.43
Kernel Fisher analysis | 98.57 | 99.24 | 100.0

Table 3 Comparison of verification rate or genuine acceptance rate (GAR) of multimodal biometrics on the MEPCO dataset using UDTCWT features

Features        | GAR at 1% FAR | GAR at 0.1% FAR | GAR at 0.001% FAR
Gray features   | 32.71         | 22.17           | 17.34
Gabor features  | 80.87         | 49.13           | 37.39
UDTCWT features | 100.00        | 100.00          | 100.00

shown in Table 2. The recognition accuracy is obtained using K-fold cross-validation with K = 5. This method gives a good measure of how well our algorithm is trained on the given data and how it performs on unseen data.
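As an illustration of this evaluation protocol, a hedged sketch using scikit-learn is shown below; the random features, labels, and the KNN classifier are placeholders standing in for the UDTCWT descriptors and the classifiers compared in Table 2 (scikit-learn does not ship kernel Fisher analysis).

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 64))    # placeholder UDTCWT descriptors
labels = rng.integers(0, 10, size=100)   # placeholder subject identities

scores = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                         features, labels, cv=5)   # K = 5 folds, as in the text
print("mean recognition accuracy:", scores.mean())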
The verification experiments are also conducted. Table 3 summarizes the result.
The receiver operating characteristic curve for these bimodal experiments is given
in Fig. 6. We observe that the UDTCWT features also provide the best results for
verification.

4.2 Experimental Results on PolyU Palmprint and CASIA Face Dataset

The recognition results of the proposed method on the PolyU palmprint and CASIA face datasets are presented in Table 4. Unimodal and bimodal results are given for face and palmprint. Here, one can observe that in the bimodal experiments with gray or Gabor features, the recognition accuracy decreases. This is because Gabor wavelets represent palmprints efficiently but fail to represent faces; hence, the combination of both deteriorates the results. The results are also compared with various classifiers, viz., linear discriminant analysis (LDA), logistic regression (LR), and K-nearest neighbors (KNN); KNN experiments are conducted with K = 1, 3, 5. It is observed that UDTCWT features provide good results on identification. The results are shown in Table 5. The recognition accuracy is obtained using K-fold

Fig. 6 ROC for verification experiments on MEPCO dataset. ROC for experiments on PolyU
palmprint and CASIA face dataset

Table 4 Comparison of recognition accuracy (%) on the PolyU palmprint and CASIA face datasets

Features used      | Face  | Palmprint | Face and palmprint
Gray values        | 42.25 | 72.17     | 59.56
Gabor coefficients | 44.32 | 91.28     | 64.75
UDTCWT features    | 73.75 | 97.25     | 99.45

Table 5 Comparison of UDTCWT features-based recognition accuracy (%) using various classifiers on the PolyU palmprint and CASIA face datasets

Classifiers            | Face  | Palmprint | Palmprint and face
LR                     | 56.00 | 96.00     | 97.50
LDA                    | 62.75 | 89.25     | 92.25
KNN with K = 1         | 48.50 | 94.50     | 94.00
KNN with K = 3         | 37.50 | 92.00     | 89.25
KNN with K = 5         | 33.75 | 89.25     | 89.75
Kernel Fisher analysis | 73.75 | 97.25     | 99.65

cross-validation with K = 5. This method gives a good measure of how well our algorithm is trained on the given data and how it performs on unseen data. The verification experiments are also conducted; Table 6 summarizes the results, and the receiver operating characteristic curve for these bimodal experiments is given in Fig. 6.

Table 6 Comparison of verification rate or genuine acceptance rate (GAR) of multimodal biometrics combining the PolyU palmprint and CASIA face datasets using UDTCWT features

Features        | GAR at 1% FAR | GAR at 0.1% FAR | GAR at 0.001% FAR
Gray features   | 80.50         | 66.75           | 57.25
Gabor features  | 76.00         | 56.25           | 42.50
UDTCWT features | 100.0         | 100.0           | 99.50

5 Conclusion

The UDTCWT features for face-iris and face-palmprint multibiometrics are discussed in this paper. Experiments are presented for both recognition and verification. From the experimental results, we conclude that a combination of multimodal biometrics using the local and global features of UDTCWT gives better results.

References

1. N. Anantrasirichai, J. Burn, D.R. Bull, Robust texture features for blurred images using undecimated dual-tree complex wavelets (2014), pp. 5696–5700
2. CASIA, CASIA face dataset (2020). http://biometrics.idealtest.org/
3. P. Dutilleux, An implementation of the “algorithme à trous” to compute the wavelet transform,
in Wavelets (1990), pp. 298–304
4. A. Ellmauthaler, E.A. da Silva, C.L. Pagliari, S.R. Neves, Infrared-visible image fusion using
the undecimated wavelet transform with spectral factorization and target extraction (2012), pp.
2661–2664
5. G. Goswami, P. Mittal, A. Majumdar, M. Vatsa, R. Singh, Group sparse representation based
classification for multifeature multimodal biometrics. Inf. Fusion 32, 3–12 (2016)
6. P. Hill, A. Achim, D. Bull, The undecimated dual tree complex wavelet transform and its
application to bivariate image denoising using a cauchy model (2012), pp. 1205–1208
7. A.K. Jain, R. Bolle, S. Pankanti, Biometrics: Personal Identification in Networked Society, vol.
479 (Springer Science and Business Media, 2006)
8. S.K. Kanagala, G. Sreenivasulu, Landsat 8: Udt-cwt based denoising and yield estimation, pp.
1036–1040 (2018)
9. MEPCO: MEPCO biometric database (2021). http://biometric.mepcoeng.ac.in/mepcobiodb/
index.html
10. P. Niu, X. Shen, T. Wei, H. Yang, X. Wang, Blind image watermark decoder in UDTCWT
domain using Weibull mixtures-based vector HMT. IEEE Access 8, 46624–46641 (2020)
11. D. Rajesh, B. Shekar, Undecimated dual tree complex wavelet transform based face recognition,
pp. 720–726 (2016)
12. O. Rockinger, Pixel-level fusion of image sequences using wavelet frames, in Proceedings of
the 16th Leeds Annual Statistical Research Workshop (1996), pp. 149–154
13. I.W. Selesnick, R.G. Baraniuk, N.G. Kingsbury, The dual-tree complex wavelet transform.
IEEE Sig. Process. Mag. 22(6)

14. B.H. Shekar, P. Rathnakara Shetty, M. Sharmila Kumari, L. Mestetsky, Action recognition using
undecimated dual tree complex wavelet transform from depth motion maps/depth sequences,
in International Archives of the Photogrammetry, Remote Sensing and Spatial Information
Sciences (2019)
15. M. Unser, Texture classification and segmentation using wavelet frames. IEEE Trans. Image
Process. 4(11), 1549–1560
16. B. Zhang, S. Shan, X. Chen, W. Gao, Histogram of gabor phase patterns (HGPP): a novel
object representation approach for face recognition. IEEE Trans. Image Process. 16(1), 57–68
(2007)
17. D. Zhang, W.K. Kong, J. You, M. Wong, Online palmprint identification. IEEE Trans. Pattern
Anal. Mach. Intell. 25(9), 1041–1050 (2003)
Design of Modified Dual-Coupled Linear
Congruential Generator Method
Architecture for Pseudorandom Bit
Generation

Cherukumpalem Heera and Vadthyavath Shankar

Abstract In various cryptography applications, security during data transmission, reception, and storage plays a key role. Digital signatures and authentication are very important applications of public-key cryptography. Regular-pattern bits can be easily detected and hacked by a third party; to avoid this problem, a pseudorandom bit generator (PRBG) is used. The linear feedback shift register (LFSR), linear congruential generator (LCG), coupled LCG (CLCG), and dual-coupled LCG (dual-CLCG) are the most popular existing PRBG methods. A uniform clock rate is finally achieved using the dual-CLCG method, but it fails to attain the maximum sequence length, and parameters like high initial clock latency and excessive memory usage are drawbacks. To overcome these drawbacks, the dual-CLCG method is modified in this paper. The modified method generates pseudorandom bits at a uniform clock rate with one initial clock delay and reduced hardware complexity. The existing and proposed architectures are designed for 8 bits using the Microsemi Libero tool.

Keywords Couple LCG · Dual-coupled LCG · Area · Initial clock latency

1 Introduction

Protecting data in various applications over the internet requires security and privacy. Especially in IoT applications, a pseudorandom bit generator is used as an essential component to maintain user privacy. There are different PRBG methods, among which the linear feedback shift register (LFSR), linear congruential generator (LCG), coupled linear congruential generator (CLCG), and dual-coupled LCG are very popular. Linear feedback shift registers and linear congruential generators are mathematically well understood, and they are also low-complexity PRBGs. However, these PRBGs badly fail randomness tests and are insecure due to their linear structure [1]. Moreover, there are cryptographic limitations, such as that

C. Heera (B) · V. Shankar


Department of ECE, G. Narayanamma Institute of Technology, and Science (for Women),
Hyderabad, Telangana 500104, India


The cryptographic limitation is that the generated sequence depends on linear equations, which makes the output predictable. Two LCGs are coupled in the CLCG method, making it more secure than a single LCG [2]. The dual CLCG involves four LCGs and two inequality comparisons to generate a pseudorandom binary sequence. However, the dual CLCG method produces a one-bit random output only when the inequality conditions occur; hence, it is unable to generate a pseudorandom bit at every iteration.

2 Related Study

The following properties characterize a good pseudorandom bit generator.

• When the same seed is used, the same output sequence will be produced.
• There are some randomness tests which should be passed, yielding good statistical properties. This is called randomness.
• A fixed period and the longest possible sequence are also properties of a pseudorandom sequence that should be taken into account.
• Period, randomness, and statistical properties should be independent of the seed; this is called seed insensitiveness.
• Another necessary property is that a PRBG should be able to generate a sequence of bits that cannot be distinguished in polynomial time from a truly random distribution, even though such a sequence is deterministically generated from a short truly random seed [3, 4].
Keeping all the above basic properties in account, one should choose the random bit generator accordingly. In linear congruential generators, there is a higher chance of predicting the random sequences, as the LCG generates a small set of numbers; hence, the LCG is insecure. Therefore, two LCGs are coupled together to make the system more secure. The CLCG [5] uses the same set of equations as the LCG. The following are the two generalized equations:

$x_{i+1} = a_1 x_i + b_1 \pmod{m}$ (1)

$y_{i+1} = a_2 y_i + b_2 \pmod{m}$ (2)

The output $B_{i+1}$ of the coupled generator is evaluated as follows:

$B_{i+1} = 1$ if $x_{i+1} > y_{i+1}$, else $B_{i+1} = 0$ (3)

For more security, two CLCGs are again coupled in the dual CLCG method [6], whose outputs act as inputs to a tri-state buffer. However, there are some drawbacks in the dual CLCG method. To overcome those drawbacks, a modified method is proposed. In the modified method, we place one XOR gate instead of the tri-state buffer, controller, and memory units.
Security and privacy over the net are the most sensitive and primary objectives for safeguarding information in numerous Internet-of-Things (IoT) applications. Several devices connected to the net generate huge amounts of information, which can lead to user privacy problems [7, 8]. There are also various security problems in designing the IoT, whose purpose is to connect people to things and things to things over the net [9, 10].

3 Existing System

3.1 Existing Pseudo Random Bit Generator

The existing dual CLCG comprises two CLCG blocks (a controller-CLCG and a controlled-CLCG), a tri-state buffer, a controller, and a 1 × 128-bit memory. These blocks are integrated to create the existing dual CLCG block, as shown in Fig. 1. The outputs of the controller CLCG and controlled CLCG are given to the tri-state buffer, whose output is fed to the 1 × 128-bit memory controlled by the controller, which produces the final output bit by bit. The same results are obtained in simulations. The LCG is shown in Fig. 2.
The dual CLCG method is defined mathematically as follows:

$x_{i+1} = a_1 \times x_i + b_1 \bmod 2^n$ (4)

Fig. 1 Existing dual CLCG method block diagram



Fig. 2 Linear congruential generator block diagram

$y_{i+1} = a_2 \times y_i + b_2 \bmod 2^n$ (5)

$p_{i+1} = a_3 \times p_i + b_3 \bmod 2^n$ (6)

$q_{i+1} = a_4 \times q_i + b_4 \bmod 2^n$ (7)


$$Z_i = \begin{cases} 1, & \text{if } x_{i+1} > y_{i+1} \text{ and } p_{i+1} > q_{i+1} \\ 0, & \text{if } x_{i+1} < y_{i+1} \text{ and } p_{i+1} < q_{i+1} \end{cases} \quad (8)$$

$Z_i = B_i$ if $C_i = 0$ (9)

where

$$B_i = \begin{cases} 1, & \text{if } x_{i+1} > y_{i+1} \\ 0, & \text{else} \end{cases} \quad \text{and} \quad C_i = \begin{cases} 1, & \text{if } p_{i+1} > q_{i+1} \\ 0, & \text{else} \end{cases} \quad (10)$$

In Fig. 1, $a_1, a_2, a_3, a_4, b_1, b_2, b_3$, and $b_4$ are the constant parameters; $x_0, y_0, p_0$, and $q_0$ are the initial seeds.

3.2 Existing Dual CLCG Method Architecture Drawbacks

• Large usage of flip-flops
• High initial clock latency of $2^n$ for an n-bit architecture
• Fails to attain the maximum length sequence of $2^n$
• Unable to produce a pseudorandom bit at every iteration.

4 Proposed Pseudo Random Bit Generator

4.1 Modified Dual CLCG Method Architecture

The proposed architecture's performance depends on the efficient implementation of the binary comparator and the three-operand adder [11]. For three-operand modulo-$2^n$ (n = 8) addition, the carry-save adder is the most widely adopted and efficient adder technique. Therefore, in the proposed architecture, a carry-save adder is used to implement the three-operand modulo-$2^n$ (n = 8) addition [12]. The proposed modified dual CLCG comprises two CLCG blocks and one XOR gate. These blocks are integrated to create the proposed modified dual CLCG block, as shown in Fig. 3. The outputs of the controller CLCG and controlled CLCG are given to the XOR gate, which produces the final output bit by bit. The same results are obtained in simulations, and the method is mathematically defined as follows:

$x_{i+1} = a_1 \times x_i + b_1 \bmod 2^n$ (11)

$y_{i+1} = a_2 \times y_i + b_2 \bmod 2^n$ (12)

$p_{i+1} = a_3 \times p_i + b_3 \bmod 2^n$ (13)

$q_{i+1} = a_4 \times q_i + b_4 \bmod 2^n$ (14)

The final output $Z_i$ is obtained using the modulo-2 operation of Eq. (15):

$Z_i = (B_i + C_i) \bmod 2 = B_i \oplus C_i$ (15)

where

Fig. 3 Block diagram of the modified dual CLCG

 
$$B_i = \begin{cases} 1, & \text{if } x_{i+1} > y_{i+1} \\ 0, & \text{else} \end{cases} \quad \text{and} \quad C_i = \begin{cases} 1, & \text{if } p_{i+1} > q_{i+1} \\ 0, & \text{else} \end{cases} \quad (16)$$

The modified dual CLCG method is summarized in the following algorithm.

4.2 Modified Dual CLCG Method Algorithm

1. Input
   n (positive integer), m = $2^n$.
2. Initialization
   $b_1, b_2, b_3, b_4 < m$, such that these are relatively prime with m.
   $a_1, a_2, a_3, a_4 < m$, such that $a_1 - 1$, $a_2 - 1$, $a_3 - 1$, and $a_4 - 1$ must be divisible by 4.
   Initial seeds $x_0, y_0, p_0$, and $q_0 < m$.
3. Output $Z_i$
   (a) For i = 0 to k
   (b) Compute $x_{i+1}$, $y_{i+1}$, $p_{i+1}$, and $q_{i+1}$ using the following equations:
       • $x_{i+1} = a_1 x_i + b_1 \bmod 2^n$
       • $y_{i+1} = a_2 y_i + b_2 \bmod 2^n$
       • $p_{i+1} = a_3 p_i + b_3 \bmod 2^n$
       • $q_{i+1} = a_4 q_i + b_4 \bmod 2^n$
   (c) If $x_{i+1} > y_{i+1}$, then $B_i = 1$, else $B_i = 0$
   (d) If $p_{i+1} > q_{i+1}$, then $C_i = 1$, else $C_i = 0$
   (e) $Z_i = (B_i + C_i) \bmod 2$
   (f) Return $Z_i$.
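To make the data flow concrete, a minimal software model of this algorithm is sketched below in Python. It uses the seeds and constants later adopted in Sect. 5 of this paper, and it reproduces the first output bits reported in Sect. 5.2; it is a behavioral sketch, not the Verilog hardware description.

# Minimal behavioral sketch of the modified dual-CLCG method (Eqs. 11-16).
# Seeds/constants are the example values used in Sect. 5 of this paper.

def modified_dual_clcg(n, a, b, seeds, k):
    """Return k pseudorandom bits; a, b, seeds are 4-element lists."""
    m = 2 ** n
    x, y, p, q = seeds
    bits = []
    for _ in range(k):
        x = (a[0] * x + b[0]) % m          # Eq. (11)
        y = (a[1] * y + b[1]) % m          # Eq. (12)
        p = (a[2] * p + b[2]) % m          # Eq. (13)
        q = (a[3] * q + b[3]) % m          # Eq. (14)
        bi = 1 if x > y else 0             # Eq. (16)
        ci = 1 if p > q else 0
        bits.append(bi ^ ci)               # Eq. (15): Zi = Bi XOR Ci
    return bits

# a-1 divisible by 4, b odd, seeds < 2^n, as required by the algorithm
print(modified_dual_clcg(8, [5, 5, 9, 33], [1, 3, 141, 79], [1, 2, 14, 3], 10))
# -> [0, 1, 0, 0, 1, 1, 0, 1, 0, 0], matching Zi in Sect. 5.2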

4.3 Applications of Pseudo Random Bit Generators

• Pseudorandom bit generators are used in the following cryptographic algorithms, which require the 8-bit vector sequence:
  exhaustive search,
  time-memory trade-offs,
  counter mode of operation.
• Pseudorandom bits are used in signal integrity tests in transceivers, where the integrity of the link is verified by running tests such as loopback tests, sending random data patterns from the transmitter and receiving the same at the receiver.

5 Simulation Results

5.1 Existing Dual CLCG Result

Here, we considered initial seeds $X_0 = 1$, $Y_0 = 2$, $P_0 = 14$, $Q_0 = 3$ and constants $A_0 = 5$, $B_0 = 1$, $A_1 = 5$, $B_1 = 3$, $A_2 = 9$, $B_2 = 141$, $A_3 = 33$, $B_3 = 79$, and we obtained the LCG$_0$, LCG$_1$ outputs $X_i$ = (6, 31, 156, 13, …), $Y_i$ = (13, 68, 87, 182, …), the controlled-CLCG output $B_i$ = (0, 0, 1, 0, …), the LCG$_2$, LCG$_3$ outputs $P_i$ = (11, 240, 253, 114, …), $Q_i$ = (178, 65, 176, 255, …), the controller-CLCG output $C_i$ = (0, 1, 1, 0, …), and the final output $Z_i$ = (0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, …). The same outputs are observed in simulations, as shown in Figs. 4, 5, and 6.

Fig. 4 Simulation output of existing dual CLCG for 8-bit



Fig. 5 Zoomed view of Fig. 4

Fig. 6 Zoomed view of Fig. 4

5.2 Modified Dual CLCG Results

Here, we considered the same initial seeds $X_0 = 1$, $Y_0 = 2$, $P_0 = 14$, $Q_0 = 3$ and constants $A_0 = 5$, $B_0 = 1$, $A_1 = 5$, $B_1 = 3$, $A_2 = 9$, $B_2 = 141$, $A_3 = 33$, $B_3 = 79$, and we obtained the LCG$_0$, LCG$_1$ outputs $X_i$ = (6, 31, 156, 13, …), $Y_i$ = (13, 68, 87, 182, …), the controlled-CLCG output $B_i$ = (0, 0, 1, 0, …), the LCG$_2$, LCG$_3$ outputs $P_i$ = (11, 240, 253, 114, …), $Q_i$ = (178, 65, 176, 255, …), the controller-CLCG output $C_i$ = (0, 1, 1, 0, …), and the final output $Z_i$ = (0, 1, 0, 0, 1, 1, 0, 1, 0, 0, …). The same outputs are observed in simulations, as shown in Fig. 7.

Fig. 7 Modified dual CLCG simulation output for 8-bit

Table 1 Measured parameters for existing and modified architectures

Parameter                 | Existing dual-CLCG | Modified dual-CLCG
Number of LUTs            | 192                | 131
Number of DFFs            | 121                | 96
Initial clock latency     | 256 [in Tclk]      | 8 [in Tclk]
Output-to-output latency  | 2 [in Tclk]        | 2 [in Tclk]
Maximum frequency         | 220.11 MHz         | 220.11 MHz
Power at max frequency    | 27.501 mW          | 27.501 mW
Power/Frequency           | 0.1249 mW/MHz      | 0.1249 mW/MHz
Delay                     | 4.632 ns           | 4.363 ns


5.3 Measured Parameters for Existing and Proposed Architectures

See Table 1.

6 Conclusion

The existing dual CLCG method architecture has several drawbacks: large usage of flip-flops, a high initial clock latency of $2^n$ for an n-bit architecture, failure to attain the maximum length sequence of $2^n$, and inability to produce a pseudorandom bit at every iteration. From Table 1, we can conclude that the proposed modified dual CLCG method architecture overcomes all these drawbacks. The existing and proposed architectures are designed in Verilog-HDL using the Microsemi Libero tool.

References

1. J. Stern, Secret linear congruential generators are not cryptographically secure, in Proceedings
28th Annual Symposium on Foundations of Computer Science, Oct 1987, pp. 421–426.
2. R.S. Katti, R.G. Kavasseri, Secure pseudo-random bit sequence generation using
coupled linear congruential generators, in Proceedings IEEE International Symposium on
Circuits and Systems (ISCAS), Seattle, WA, USA, May 2008, pp. 2929–2932
3. R. Ostrovsky, Foundations of Cryptography (Lecture Notes) (UCLA, Los Angeles, CA, USA,
2010)
4. O. Goldreich, Foundations of Cryptography (Cambridge University Press, New York, NY,
USA, 2004)
5. R.S. Katti, R.G. Kavasseri, V. Sai, Pseudorandom bit generation using coupled congruential
generators. IEEE Trans. Circuits Syst. II, Exp. Briefs. 57(3), 203–207 (2010)
6. A.K. Panda, K.C. Ray, Modified dual-CLCG method and its VLSI architecture for pseudo-
random bit generation. IEEE Trans. Circuits Syst. I: Regular Pap. 66(3) (2019)
7. J. Zhou, Z. Cao, X. Dong, A.V. Vasilakos, Security and privacy for cloud-based IoT: challenges.
IEEE Commun. Mag. 55(1), 26–33 (2017)
8. Q. Zhang, L.T. Yang, Z. Chen, Privacy preserving deep computation model on cloud for big
data feature learning. IEEE Trans. Comput. 65(5), 1351–1362 (2016)
9. E. Fernandes, A. Rahmati, K. Eykholt, A. Prakash, Internet of Things security research: a
rehash of old ideas or new intellectual challenges? IEEE Secur. Privacy 15(4), 79–84 (2017)
10. M. Frustaci, P. Pace, G. Aloi, G. Fortino, Evaluating critical security issues of the IoT world:
present and future challenges. IEEE Internet Things J. 5(4), 2483–2495 (2018)
11. S.-W. Cheng, A high-speed magnitude comparator with small transistor count, in Proceedings
ICECS, vol. 3, 2003, pp. 1168–1171
12. T. Kim, W. Jao, S. Tjiang, Circuit optimization using carry-save adder cells. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 17(10), 974–984 (1998)
Performance Analysis of PAPR and BER
in FBMC-OQAM with Low-complexity
Using Modified Fast Convolution

D. Rajendra Prasad, S. Tamil, and Bharti Chourasia

Abstract The asynchronous waveform with an ultra-low side lobe, fast convolution multi-carrier (FCMC), has emerged as a promising technique for future wireless communications. Limiting computational complexity offers benefits such as lower energy usage, faster processing, and less latency. These benefits have become more significant with 5G, where the most important characteristics are minimal latency, robustness, and high speed. The fast convolution (FC) filter architecture uses
only filter values obtained in the frequency domain, since we limit the filter to the
frequency domain. This completely removes the traditional polyphase operation of
the filter. We introduce the FBMC/OQAM framework utilizing modified fast convo-
lution and test the system’s performance on a communication channel. The proposed
system is related to a traditional FBMC/OQAM polyphase system, and we have
noticed our modified fast convolution filter outperforms FBMC’s polyphase design
in terms of complementary cumulative distribution function (CCDF), computational
complexity, and BER metrics.

Keywords Computational complexity · Fast convolution · PAPR · FBMC/OQAM

1 Introduction

OFDM is a multi-carrier communication system that splits the available spectrum into multiple carriers, each of which is modulated by a low-rate data source. OFDM is similar to FDMA in that the available bandwidth is segmented into several channels that can be allocated to users, but by grouping the channels much more closely together, OFDM utilizes the bandwidth far more effectively. This is accomplished by making the carriers orthogonal, eliminating interference between the closely spaced carriers. OFDM may be considered a transmission strategy. When used in a wireless environment, the technique is referred to as OFDM; in a wired environment, it is referred to as discrete multi-tone, as in asymmetric digital subscriber lines

D. R. Prasad (B) · S. Tamil · B. Chourasia


Department of ECE, SRK University, Bhopal, India


(ADSLs). In OFDM, the carriers are orthogonal; in DMT [1], this is not necessarily the case. Despite several benefits relative to traditional OFDM systems, FBMC still has several unresolved problems that must be addressed to make it workable for realistic applications. In this study, the primary objective is to resolve some core challenges of FBMC systems for future wireless networks. FBMC-OQAM has many desirable features, such as excellent frequency localization, very low side-lobe power spectral density, and robustness to time-varying channels and carrier frequency offsets. With these features, FBMC-OQAM is much more appropriate than orthogonal frequency division multiplexing (OFDM) for future-generation 5G wireless communication systems, in particular for asynchronous applications. However, as a multi-carrier system, FBMC-OQAM suffers from a high peak-to-average power ratio (PAPR).
New strategies for minimizing PAPR are important. Notable approaches include galactic swarm optimization [2], optimal filter design based on GFDM [3], suboptimal partial transmit sequences [4], and custom conic-optimized iterative adaptive clipping and filtering [5], while several other PAPR reduction approaches for OFDM also exist. The fast convolution filtering design uses the frequency response samples of the filter, or rather, FFT samples, for filtering. The sampling factor is regulated during fast convolution by selecting the required input and output block lengths. FBMC/OQAM and FMT waveform generation using fast convolution (FC) is considered explicitly with respect to the performance of its architecture, expressed in terms of reduced spectral leakage, better filter bank stability, and computational complexity savings in the FBMC/OQAM filter bank framework. However, few authors have measured the system's overall performance in the presence of noise on communication links. This paper describes the FBMC/OQAM method with fast convolution. The BER performance of the system in an AWGN channel and the system PAPR are measured and compared to the traditional FBMC/OQAM polyphase scheme. In the case of the fast convolution method, the BER performance and PAPR are shown to be better.

2 Related Work

With their critical research, we mention here a few significant and recent papers. In
[6], authors stated that specific channel estimation systems remain an open problem
for both inter-carrier as well as inter-symbol interference in the multi-carrier filter
bank that has offset quadrature amplitude modulation (QAM). To increase the channel
estimation efficiency in preamble-based FBMC/OQAM systems, the adoption of the
commonly employed modeling for the study of analysis filter bank (AFB) response
proposes a recursive least squares multi-tap-channel estimation (RLS-MTCE) frame-
work which minimizes noise effects. On the other hand, the authors are using the new
AFB signal model and designating multi-tap channel estimation systems, which are
referred to as LS-MTCE and MMSE-MTCE, for least square and minimum mean
square error (MMSE), respectively, to correctly consider the interference in order to
increase the efficiency of the channel estimation. Compared to existing systems, the proposed counterparts typically attain a lower error at the cost of a reasonable rise in computational complexity.
In [7], the authors technically characterize the efficiency of an FBMC-OQAM massive MIMO uplink system by the average mean squared error (MSE) at the output, for three distinct forms of linear receivers, i.e., zero forcing (ZF), LMMSE, and matched filter (MF). Random matrix theory asymptotically characterizes the MSE performance of these receivers as the number of base station (BS) antennas N and the number of users K increase while keeping the ratio N/K finite. The expressions
obtained allow several inferences to be made, some of them already noted but not tech-
nically illustrated in the literature. First, because of the channel hardening effect, the
MSE becomes uniform throughout the frequency band. Second, they illustrate the benefit of good user synchronization in a massive MIMO environment. In conclusion, the various MSE contributions, such as noise, inter-user interference (IUI), and channel frequency selectivity distortion, become negligible at large N/K ratio values if the users are well synchronized. In previous research, this effect was recognized as "self-equalization."
In [8], the authors implemented a filter bank multi-carrier (FBMC) framework to address these drawbacks, with prototype pulse shaping filters designed to satisfy system specifications. Filter selection is vital for the FBMC/offset quadrature amplitude modulation (OQAM) system because of its significant impact on the achieved performance. In order to improve system performance, new prototype pulse shaping filters for FBMC/OQAM systems are suggested. Several prototypes, such as the raised cosine pulse (RCP), PHYDYAS, Hermite, and root-raised cosine (RRC) pulse shaping filters, are evaluated, particularly for their robustness to timing offset (TO). Since multiple input multiple output (MIMO) operation is a prominent FBMC-related problem, a new method using Walsh-Hadamard (WH) codes for block spreading in MIMO FBMC/OQAM is proposed, exploring how the suggested pulse shaping filters integrate with MIMO systems. The suggested filters are seen to be appropriate candidates for FBMC/OQAM systems.
In [9], the authors suggested the Bayesian compressive sensing (BCS) method for the FBMC/OQAM multiple input multiple output (MIMO) scenario to estimate the channel efficiently. They propose an iterative fast Bayesian matching pursuit algorithm for channel estimation. They provide the first statistical data for the sparse channel model
in Bayesian channel estimations. They use the BCS channel estimating technique to
predict the channel pulse response efficiently. Then, by optimizing iterative termina-
tion conditions, an updated FBMP algorithm is suggested. The simulation findings
show that the proposed method gives more than the traditional compressive sensing
technique, mean square error (MSE), and bit error rate (BER).
In [10], a major concern in 5G connectivity research regarding the filter bank multi-carrier system with offset quadrature amplitude modulation (FBMC-OQAM) is expressed. FBMC-OQAM has the intrinsic limitation of a high peak-to-average power ratio (PAPR). Because of the overlapping structure of FBMC-OQAM, the procedures suggested for OFDM, explicitly the conventional partial transmission sequence (PTS), are inadequate. They suggest an updated PTS-based method that uses phase rotation factors to optimize only the phases of the sparse PTS (S-PTS) signal. Theoretical and simulation findings show that the suggested S-PTS scheme offers substantial PAPR reduction performance with much less computational complexity.

3 Material and Methods

This section introduces the typical configuration of the FBMC framework with
the filter bank or transmultiplexer (TMUX) structure for synthesis study. Figure 1
displays the TMUX configuration of the FBMC framework. The complex input
symbols of the FBMC framework are as discussed in [11] and represented in
mathematical form as

$x_{mn} = a_{mn} + j \times b_{mn}, \quad 1 \le m \le M,\ 1 \le n \le N$ (1)

In Eq. (1), the terms $a_{mn}$ and $b_{mn}$ denote the real and imaginary components on the nth subcarrier of the mth data block of the system. These components are equally spaced in the time domain by $T/2$, where T is the symbol duration. The N parallel symbols are then transferred to N prototype filters. The FBMC-OQAM signal $s(t)$ for M data blocks may then be written as [12]

$$s(t) = \sum_{m=1}^{M} \sum_{n=1}^{N} s_{mn}(t) = \sum_{m=1}^{M} \sum_{n=1}^{N} \left[ a_{mn}\, p(t - mT) + j\, b_{mn}\, p\!\left(t - mT - \frac{T}{2}\right) \right] e^{jn\vartheta t} \quad (2)$$

In Eq. (2), the term $p(t)$ signifies a PHYDYAS prototype filter, as adopted by [13] with U = 4, and $\vartheta t$ is equivalent to $\frac{2\pi t}{T} + \frac{\pi}{2}$. The resulting impulse response of $p(t)$ is stated as

$$p(t) = 1 + 2 \sum_{i=1}^{U-1} G_i \cos\!\left(2\pi \frac{it}{UT}\right) \quad (3)$$

Fig. 1 Direct form representation of FBMC transceiver model




where $G_1 = 0.97196$, $G_2 = \sqrt{2}/2$, and $G_3 = 0.235147$.
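For concreteness, Eq. (3) can be evaluated numerically as in the short Python/NumPy sketch below; the symbol duration and the number of samples are arbitrary illustrative choices, not values prescribed by the paper.

import numpy as np

U, T = 4, 1.0                              # overlap factor and symbol duration
G = [0.97196, np.sqrt(2) / 2, 0.235147]    # coefficients G1..G3 given above
t = np.linspace(0, U * T, 256, endpoint=False)
p = 1 + 2 * sum(G[i - 1] * np.cos(2 * np.pi * i * t / (U * T))
                for i in range(1, U))      # Eq. (3) with U = 4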
As per the Nyquist theorem, the signal s(t) is sampled at the rate T/K, which results in an approximation of the true PAPR value by that of the discrete-time signal. The discrete-time signal may then be represented as

$$s(k) = \begin{cases} \sum_{m=1}^{M} \sum_{n=1}^{N} \left[ a_{mn}\, p(k - mK) + j\, b_{mn}\, p\!\left(k - mK - \frac{K}{2}\right) \right] e^{jn\left(\frac{2\pi k}{K} + \frac{\pi}{2}\right)}, & mK \le k \le mK + \frac{K}{2} + L_p \\ 0, & \text{else} \end{cases} \quad (4)$$

FBMC operates on the transmultiplexer principle. To obtain the center frequency for each channel present in the FBMC, the signal spectrum at every input channel of the system is compressed. The broadcast signals are combined to create a frequency division multiplexed signal, and both channel outputs are merged. On the receiver side, the frequency division multiplexing is undone after filtering. The orthogonality of OQAM modulation between adjoining channels is necessary. Before entering the FBMC stage, the input signal is preprocessed with OQAM. In OQAM preprocessing, the sampling frequency is doubled and each complex QAM symbol $c_{k,n}$ is segmented into real-valued symbols $d_{k,n}$ according to the even and odd splitting of its real and imaginary parts:

$$d_{k,2n} = \begin{cases} \operatorname{Re}(c_{k,n}), & k \text{ even} \\ \operatorname{Im}(c_{k,n}), & k \text{ odd} \end{cases} \quad (5)$$

$$d_{k,2n+1} = \begin{cases} \operatorname{Im}(c_{k,n}), & k \text{ even} \\ \operatorname{Re}(c_{k,n}), & k \text{ odd} \end{cases} \quad (6)$$

In Eq. (4), the term $L_p$ signifies the prototype filter's discrete-time length, and k signifies the corresponding time index. These real-valued symbols are multiplied by phase factors before filtering. OQAM post-processing extracts, from the analysis filter bank, the real component of each OQAM symbol, integrating two signals at a time; post-processing is the last phase of the FBMC-OQAM method. The analysis filters $h_k(m)$ and $g_k(m)$ are derived from the N-length prototype filter $h(m)$ [14] as

$$h_k(m) = h(m) \exp\!\left(j \frac{2\pi k}{M}\left(m - \frac{N-1}{2}\right)\right) \quad (7)$$

$$g_k(m) = h_k^{*}(N - 1 - m) \quad (8)$$
where m = 0, 1, …, N − 1.
When convolution is applied by leveraging the FFT, it is cyclic convolution. For the FFT, zeros are inserted so that the signal and filter sequences have the same duration. If the FFT of the signal x(n) is multiplied by the FFT of the filter h(n), the product is the FFT of the output y(n). That being said, the y(n) length generated by an inverse FFT is equal to the input length. This is not entirely straightforward, since the output of a length-L block filtered by an M-length filter has length L + M − 1. That implies that the output blocks must be overlapped and added, not merely concatenated. The second issue is that the overlapping stages involve non-cyclic convolution, whereas FFT convolution is cyclic. Conveniently, appending L − 1 zeros to the impulse response and M − 1 zeros to each input block makes both FFTs of length M + L − 1; there is then no aliasing, and the cyclic convolution generates the same result as the ideal non-cyclic convolution. The arithmetic savings can be significant when FIR digital filtering is implemented this way. There are two limits, though. The usage of blocks contributes a one-block delay: none of the first output block can be determined until the first input block is available. The second limit is the handling and sorting of blocks.
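As an illustration, a minimal Python/NumPy sketch of this FFT-based block convolution with overlap-add is given below; the block length, filter, and function name are illustrative choices, and the zero-padding to length L + M − 1 is what removes the cyclic aliasing just described.

import numpy as np

def overlap_add(x, h, L):
    """Filter x with h using length-L input blocks and FFT convolution."""
    M = len(h)
    nfft = L + M - 1                      # zero-pad both to length L + M - 1
    H = np.fft.fft(h, nfft)               # filter FFT computed once
    y = np.zeros(len(x) + M - 1)
    for start in range(0, len(x), L):
        blk = x[start:start + L]          # last block may be shorter than L
        seg = np.fft.ifft(np.fft.fft(blk, nfft) * H).real
        end = min(start + nfft, y.size)
        y[start:end] += seg[:end - start]  # overlap and add the block tails
    return y

x = np.random.randn(1000)
h = np.ones(16) / 16                      # simple 16-tap averaging filter
assert np.allclose(overlap_add(x, h, L=128), np.convolve(x, h))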
This problem is also mitigated by the continuous reduction in memory costs. An alternative to overlap-add may be created if the output is segmented first instead of the input. Examining the computation of one output block shows that it requires not only the corresponding input block but also part of the previous input block; for each output block, an M + L − 1 portion of the input is needed. The last part of the previous input block is therefore saved and combined with the new input block before being filtered by h(n). Figure 2a shows the overall block diagram of the fast convolution filter bank. A filter that uses the fast convolution operation applies the required weights in the frequency domain of the FFT rather than a time-domain impulse response. The FFT of the input signal block is multiplied by the FFT filter weights, and the IFFT is used to generate a time-domain output signal [15]. This study investigates opportunities for reducing the complexity of FC-based waveforms, remembering that the basic notion of FC is the efficient implementation of high-order linear filters via frequency-domain processing.

Fig. 2 Overall block diagram of filter banks a fast convolution, b polyphase synthesis
We place a particular emphasis on circumstances in which only a small portion of the bandwidth is active. By adding a set of virtual (i.e., not carrying any data) symbols to the start and the end of each packet and by intelligently selecting these symbols, we show that the FBMC-OQAM ramp-up and ramp-down tails may be reduced until they are insignificant and can therefore be discarded. This reduces the signal length of each FBMC-OQAM packet so that its bandwidth efficiency increases, i.e., the same data are delivered in a shorter time. The multi-rate configuration is obtained by choosing the input and output block lengths, and the proposed filter incorporates an embedded fast convolution operation to improve the performance of the system, as shown in Fig. 2b.
The signal $C_{ph}(m) = \exp(j 2\pi m k_o L_{s,k}/L_k)$, where m is the block index and $k_o$ is the center frequency of the kth band-pass filter, is applied to each input block of the kth channel. This is required to keep the phase of consecutive blocks continuous. Figure 3 shows the ordering of the FFT-domain weights for the FBMC/OQAM waveform of neighboring channels. The neighboring channels overlap by half of the bandwidth to accommodate the collected sub-channels, since the OQAM preprocessing introduces an up-sampling factor of two.
The FFT weights are made up only of pass-bands and transition bands. Therefore, the FC filter is a linear periodically time-varying (LPTV) device. The FFT filter weights are designed to minimize the impact of cyclic distortion, mitigating interference in the stop-band region through an optimization problem. Given the frequency-domain weights $w_k$, the corresponding impulse response h(n) is found with the N-point IFFT of the weights. In filter-based deployments, both functionalities may be combined. First, the waveforms created for transmission fit within the allowed spectrum, so that the unused sections of the spectrum reserved for dynamic and fragmented utilization need not be cleaned by further processing. Second, on the receiver side, the filter bank processing may eliminate the interference from the unused portions of the permitted spectrum. We note that the structure contains long FFT/IFFT transforms of length N, short transforms of length L, and L − 1 nontrivial complex multiplications for each subchannel, in an evaluation of the computational complexity of the FC filter bank (FCFB)-based implementation of FBMC/OQAM.

Fig. 3 Process of two adjacent channels multiplexing in FBMC system


The time-varying impulse response $\tilde h_n(\eta)$ can now be modeled as described in [16]. Taking the N-point FFT of $\tilde h_n(\eta)$ gives the frequency response $\tilde H_n(\omega)$ for each n. The total stop-band region interference is measured as

$$I_s(\omega_1) = \sum_n \left| \tilde H_n(\omega_1) \right|^2 L_s \quad \text{for } \omega_1 \in \Omega_s \quad (9)$$

However, in our proposed system, the filtering process is divided between the synthesis and analysis filter banks at the two ends of the transceiver, with analysis filters $\hat g_m[k]$. These are the complex conjugates:

$$\hat g_m[k] = g_m^{*}[k] \quad (10)$$

The filter window (Mirabbasi-Martin) function is statistically signified by Eq. (11):

$$g(n) = h_0 + 2 \sum_{l=1}^{K-1} h_l \cos\!\left(\frac{2\pi l n}{N}\right), \quad 0 \le n \le N \quad (11)$$

where K is the overlapping factor and N is the filter length, N = KM. A 128-channel FBMC/OQAM framework has been analyzed in terms of computational complexity. For 72 active subcarriers, the FC-based FBMC/OQAM complexity with architecture parameters N = 1024, L = 16, and $L_s$ = 10 is 40 multiplications per detected symbol, compared with 44 multiplications and 56 multiplications per detected symbol for polyphase FBMC/OQAM on the transmitter and receiver sides, respectively. The complexity gain of FC-based FBMC/OQAM over polyphase FBMC/OQAM grows as the number of active subcarriers decreases.

4 Results and Discussion

In this section, the performance metrics were evaluated in MATLAB. For deployment, a 64-channel FBMC-OQAM framework is considered. Choosing N = 2048 and L = 64 gives a sampling ratio R = 32. The filter FFT weights must also be configured to eliminate cyclic distortion. Figure 4a analyzes and contrasts the BER of the proposed fast convolution filter (FC-FBMC-OQAM) with the classical polyphase filter in the presence of an AWGN channel. It is evident that the lower bit error rate (BER) of the proposed fast convolution filter bank is maintained for all SNR values relative to the classical polyphase filter.

Fig. 4 Performance comparison of two filters: a BER comparison for proposed fast convolution
filter (FC-FBMC-OQAM) with classical polyphase filter, b PAPR comparison for proposed fast
convolution filter (FC-FBMC-OQAM) with classical polyphase filter

Table 1 Complexity comparison of different FBMC-OQAM schemes

Scheme                   | No. of active subcarriers | No. of real additions | No. of real multiplications
Polyphase FBMC-OQAM      | 73                        | 4320                  | 3250
FC-FBMC-OQAM (Proposed)  | 43                        | 2812                  | 1140

Figure 4b analyzes and contrasts the PAPR of the proposed fast convolution filter (FC-FBMC-OQAM) with the classical polyphase filter in the presence of an AWGN channel. It is evident that the better CCDF of the proposed fast convolution filter bank is maintained for all PAPR values relative to the classical polyphase filter. It is also shown that the PAPR of the proposed scheme extends only up to 12 dB, whereas for the polyphase FBMC/OQAM it extends to 16 dB. Table 1 reports the computational complexity of the proposed fast convolution filter (FC-FBMC-OQAM) against the classical polyphase filter scheme.
From Fig. 5, it is evident that the number of additions and multiplications required for our proposed system (FC-FBMC-OQAM) is directly proportional to the number of active subcarriers. For the proposed scheme (FC-FBMC-OQAM) with 43 active subcarriers, the computational complexity is observed to be lower than that of the classical polyphase filter with 73 active subcarriers.

Fig. 5 Graph showing computation complexity for proposed fast convolution filter (FC-FBMC-OQAM) with classical polyphase filter

5 Conclusions

In this study, given the high PAPR of the FBMC-OQAM signal, the need to investigate adequate PAPR reduction schemes is critical. To date, we have suggested several improved techniques to decrease FBMC-OQAM PAPR values in compliance with the FBMC-OQAM signal characteristics. Compared to traditional polyphase


filters, the BER performance of the FC-FBMC device would be desirable for the
FBMC/OQAM architecture that uses the process of fast convolution. The framework
may then be relied on to improve performance. The comparison of PAPR outperforms
in the case of CCDF and BER with the help of a fast convolution filter for possible
strategic participants for traditional polyphase filters. Having an energy spectrum
density comparable to the FBMC requires large-screen guard band filters between
non-synchronized users and does not permit an unsynchronized uplink, which is
the delay time plus a finite channel length. We will shortly highlight some of the
potential areas of study which may be explored as an extension of current study in
this section. Meanwhile, the effects of non-perfect channels and the effects of time-
different channels on monitoring may be evaluated. We have tried FBMC modulation
using model channels. It would be good to investigate it on real channels in future
to see the actual performance of this novel modulation.

References

1. T. Takahara, Implementation considerations of optical discrete multi-tone (2017), pp.


SpTh2D.2. https://doi.org/10.1364/SPPCOM.2017.SpTh2D.2
2. R. Chauhan, N. Sood, I. Saini, Galactic swarm optimization-PTS strategy to minimize PAPR in
WP-OFDM system (2019), pp. 805–808. https://doi.org/10.1109/ICCS45141.2019.9065395
3. C.L. Tai, B. Su, P.-C. Chen, Optimal filter design for GFDM that minimizes PAPR under
performance constraints (2018), pp. 1–6. https://doi.org/10.1109/WCNC.2018.8377147

4. J. Gao, J. Wang, B. Wang, X. Song, Minimizing PAPR of OFDM signals using suboptimal
partial transmit sequences, in Proceedings of 2012 IEEE International Conference on Infor-
mation Science and Technology, ICIST 2012 (2012). https://doi.org/10.1109/ICIST.2012.622
1753.
5. N. Vijayakumar, S. Sudhir, PAPR reduction of OFDM signal via custom conic optimized iterative adaptive clipping and filtering. Wirel. Pers. Commun. 78, 867–880 (2014). https://doi.org/10.1007/s11277-014-1788-x
6. D. Ren, J. Li, G. Zhang, G. Lu, J. Ge, Multi-tap channel estimation for preamble-based
FBMC/OQAM systems. IEEE Access 8, 176232–176240 (2020). https://doi.org/10.1109/ACC
ESS.2020.3026341
7. F. Rottenberg, X. Mestre, F. Horlin, J. Louveaux, Performance analysis of linear receivers for
uplink massive MIMO FBMC-OQAM systems. IEEE Trans. Sig. Process., 1–1 (2017). https://
doi.org/10.1109/TSP.2017.2778682
8. H.M. Abdel-Atty, W. Raslan, A. Khalil, Evaluation and analysis of FBMC/OQAM systems
based on pulse shaping filters. IEEE Access, 1–1 (2020). https://doi.org/10.1109/ACCESS.
2020.2981744
9. H. Wang, W. Du, X. Wang, G. Yu, L. Xu, Channel estimation performance analysis of
FBMC/OQAM systems with Bayesian approach for 5G-enabled IoT applications. Wirel.
Commun. Mob. Comput. (2020). https://doi.org/10.1155/2020/2389673
10. H. Deng, S. Ren, Y. Liu, C. Tang, Modified PTS-based PAPR reduction for FBMC-OQAM
systems. J. Phys.: Conf. Ser. 910, 012057 (2017). https://doi.org/10.1088/1742-6596/910/1/012057
11. P. Siohan, C. Siclet, N. Lacaille, Analysis and design of OFDM/OQAM systems based on
filter bank theory. IEEE Trans. Sig. Process. 50, 1170–1183 (2002). https://doi.org/10.1109/78.995073
12. P. Jirajaracheep, S. Sanpan, P. Boonsrimuang, P. Boonsrimuang, PAPR reduction in FBMC-
OQAM signals with half complexity of trellis-based SLM (2018), pp. 1–5. https://doi.org/10.
23919/ICACT.2018.8323624
13. H. Nam, M. Choi, S. Han, C. Kim, S. Choi, D. Hong, A new filter-bank multicarrier system
with two prototype filters for QAM symbols transmission and reception. IEEE Trans. Wirel.
Commun. 15, 1–1 (2016). https://doi.org/10.1109/TWC.2016.2575839
14. N. Varghese, J. Chunkath, V.S. Sheeba, Peak-to-average power ratio reduction in FBMC-
OQAM system, in Proceedings—2014 4th International Conference on Advances in Computing
and Communications, ICACC 2014 (2014), pp. 286–290. https://doi.org/10.1109/ICACC.201
4.74
15. M. Renfors, F. Harris, Highly adjustable multirate digital filters based on fast convolution, in
2011 20th European Conference on Circuit Theory and Design, ECCTD 2011, pp. 9–12 (2011).
https://doi.org/10.1109/ECCTD.2011.6043653
16. M. Borgerding, Turning overlap-save into a multiband, mixing, downsampling filter bank
(2007). https://doi.org/10.1002/9780470170090.ch13
Sign Language Recognition Using
Convolution Neural Network

Varshitha Sannareddy, Mounika Barlapudi, Venkata Koti Reddy Koppula,


Gali Reddy Vuduthuri, and Nagarjuna Reddy Seelam

Abstract Sign language is one of the media used to communicate with deaf and dumb people; usually, it is not known to normal people. So, it becomes a challenging task to establish communication between normal people and hearing-impaired persons. Many tools have been developed to help them, but unfortunately they do not produce accurate results. To interact with them, various finger gestures are used; the designed model then converts those gestures into words or alphabets of a specific language. The proposed model helps reduce the gap between normal people and hearing-impaired persons. In our proposed sign language recognition algorithm, we focus on a deep convolution neural network to produce better accuracy.

Keywords Sign language recognition · Convolution neural network · Support


vector machine

1 Introduction

In the world, to interact with any person, we require some language in the form of textual, vocal, or visual representation. But in the case of persons who have hearing impairments, communication is a very tedious task. Communicating with them requires a suitable visual medium, i.e., sign language [1]. It is very useful to persons who have difficulties with speaking or hearing. Sign language is a popular communication medium that uses various means like hand motions, facial expressions, and body movements to express something. Several popular sign languages already exist in the world, with various functionalities and limitations [2, 3]. Some of the popular sign languages are Indian Sign Language (ISL), Polish Sign Language, American Sign Language, etc.; like spoken languages, they vary with geographical region. Because of this, every sign language has some limitations.

V. Sannareddy (B) · M. Barlapudi · V. K. R. Koppula · G. R. Vuduthuri · N. R. Seelam


Computer Science and Engineering, Lakireddy Balireddy College of Engineering, Mylavaram,
Andhra Pradesh 521230, India


Fig. 1 Hand gestures of ISL

Many software tools and packages have also been developed to teach and understand sign language, but their usage is limited because they do not produce accurate results. To overcome these limitations, we propose an algorithm to interact with a hearing-impaired person accurately by recognizing and understanding the signs correctly. We considered Indian Sign Language (ISL) to check the efficiency of the proposed algorithm.
Figure 1 shows the hand gestures of ISL.
Many algorithms have already been proposed to understand ASL, but ISL is completely different from ASL: in ASL, only one hand is used to give the signs, whereas in ISL two hands are used.

2 Related Work

Ansari and Harit [4] have produced significant research contributions to categorize Indian sign language gestures accurately. They categorized a total of 140 different sign classes comprising alphabets, numbers, and various movements. To detect body parts such as the hand, they used a traditional unsupervised learning technique, i.e., the K-Means clustering algorithm. They also used a Gaussian distribution to extract the features required to train the data set, with the accuracy reaching 90%.
Deora and Bajaj [5] adopted the PCA algorithm to recognize sign language gestures, together with artificial neural networks for efficient recognition. They considered a very small data set, and the produced results were not satisfactory, although, compared to the neural networks, PCA produced more accurate results.

Zhang et al. [6] adopted convolution neural networks to perform sign language recognition. To perform the process efficiently, they established a two-step process: first, feature extraction, followed by the classification process. They used a CNN to extract the features and an artificial neural network to classify them. In their proposed algorithm, they used Italian sign gestures of 27 members. The CNN adopted a max pooling technique to extract the features, which were forwarded to the ANN. This model produced the highest accuracy of 91.7%.

3 Methodology

In the proposed algorithm, we adopted the approach described in Fig. 2, which contains the steps to recognize the various signs of ISL.
As the data set, we analyze a total of 4800 ISL sign images of the English alphabet, comprising 26 different class labels, as shown in Fig. 1. At the pre-processing stage, the images are resized to 640 × 480 pixels. Both feature extraction and classification are performed after normalizing the images. Figure 3 provides the workflow diagram of the proposed model, which performs the following steps:
1. Image Acquisition
2. Image pre-processing
3. Feature extraction
4. Apply CNN
5. Classification.
The convolution neural network architecture is shown in Fig. 4.

Fig. 2 Block diagram of proposed model



Fig. 3 Work flow diagram of proposed model

Fig. 4 Convolution neural network architecture

4 Implementation

4.1 Image Acquisition

Image acquisition is the process of capturing photographic images, such as of the interior structure of an object. The term is often assumed to include the compression, storage, printing, and display of such images. To acquire frames in real time, we use various built-in functions of the OpenCV library. The following Python code is used to capture the images in real time:

import cv2                 # OpenCV library
cap = cv2.VideoCapture(0)  # open the default camera
ret, img = cap.read()      # grab one frame; ret indicates success

Fig. 5 Acquiring frames in real time

Figure 5 provides an example of acquiring frames in real time.

4.2 Image Pre-processing

The main objective of image pre-processing is to remove unwanted pixels by applying various techniques such as image cropping and filtering to improve the brightness of the images and to remove unwanted noise or outliers detected in the images. The acquired images are in RGB form, but by applying these techniques they are converted to binary images. In this step, all images are resized to a specific size, and image segmentation is used to detect the image boundaries. Figure 6 shows the frames after applying the pre-processing steps.

Fig. 6 Images after applying various filtering techniques
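A minimal sketch of such a pre-processing pipeline is given below; the blur kernel and Otsu thresholding are illustrative choices, not necessarily the exact filters used here.

import cv2

def preprocess(img):
    img = cv2.resize(img, (640, 480))             # resize as stated in Sect. 3
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # RGB frame -> grayscale
    blur = cv2.GaussianBlur(gray, (5, 5), 0)      # suppress sensor noise
    # Otsu's method picks the threshold; the output is a binary hand mask
    _, binary = cv2.threshold(blur, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return binary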



4.3 Feature Extraction and Classification

This is the most important step; using it, we create a model for sign recognition. To extract relevant features from the images, we adopted the CNN technique, which contains more than one convolution layer. To extract the topological properties from an image, we adopted a feed-forward network.
On the selected features, popular classification techniques like SVM, Random Forest, and KNN are applied to design the model using the training data set. After fitting, the model predicts the values for the test data set.
Program Code
The following Python function is used to predict the hand gestures of ISL. We used OpenCV and the TensorFlow framework to design the proposed model. Initially, an integer is assigned to every alphabet.
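Since the original listing is not reproduced here, the sketch below shows what such a prediction function might look like; the model file name, the 64 × 64 input size, and the integer-to-letter mapping are illustrative assumptions, not the paper's exact code.

import cv2
import numpy as np
import tensorflow as tf

LABELS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"             # one integer per alphabet
model = tf.keras.models.load_model("isl_cnn.h5")  # hypothetical model file

def predict_gesture(frame):
    roi = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    roi = cv2.resize(roi, (64, 64)) / 255.0       # normalize pixels to [0, 1]
    roi = roi.reshape(1, 64, 64, 1)               # batch of one grayscale image
    probs = model.predict(roi, verbose=0)         # class probabilities
    return LABELS[int(np.argmax(probs))]          # integer class -> alphabet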

5 Results and Discussions

The proposed method gives alphabets as output corresponding to the captured hand gestures. Various deep learning algorithms can be used to predict this output, but the accuracy differs from one algorithm to another; here, we used a deep convolution neural network in order to predict a more accurate output. The output is the alphabet that relates to the captured hand gesture. As shown in Fig. 7, if there is no hand in front of the camera, there is no output; if a symbol that matches the data set is in front of the camera, as shown in Fig. 8, the corresponding alphabet is shown.
In our proposed work, a convolution neural network and the Adam optimizer with learning rate 0.01 and dropout 0.25 are used. The accuracy of the proposed system is shown in Fig. 9.
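A minimal Keras sketch consistent with this setup is given below; only the optimizer (Adam, learning rate 0.01), the dropout rate of 0.25, and the 26 output classes come from this paper, while the layer sizes are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),                     # dropout rate from the paper
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(26, activation="softmax"),   # 26 alphabet classes
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])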

Fig. 7 Proposed method predicted result

Fig. 8 Proposed model predicts as letter V

Fig. 9 Accuracy of proposed model

6 Conclusion and Future Work

In this paper, a sign language recognition system using the convolution neural network algorithm is proposed after researching various algorithms. Although our proposed task is to recognize only the alphabets, it is 98.74% accurate, which is higher than existing systems. The proposed system is currently suitable only for ISL alphabet signs; it is not suitable for numbers, body movements, sentences, and facial expressions. In the future, it will be extended to work with different forms of signs, and to improve the accuracy, hybrid clustering or classification techniques can be adopted.

References

1. A.K. Sahoo, G.S. Mishra, K.K. Ravulakollu, Sign language recognition: state of the art. ARPN
J. Eng. Appl. Sci. (2014)
2. J. Singha, K. Das, Indian sign language recognition using Eigen value weighted Euclidean
distance based classification technique. Int. J. Adv. Comput. Sci. Appl. 4, 2 (2013); N.V. Tavari,
A.V. Deorankar, Indian sign language recognition based on histograms of oriented gradient. Int.
J. Comput. Sci. Inf. Technol. 5(3), 3657–3660 (2014)
3. NMANIVAS. Gesture recognition system. https://github.com/nmanivas/Gesture-Recognition-
System
4. Z. A. Ansari, G. Harit, Nearest neighbour classification of Indian sign language gestures using
kinect camera. Sadhana 41(2), 161–182 (2016)
5. D. Deora, N. Bajaj, Indian sign language recognition, in 2012 1st International Conference
Emerging Technology Trends in Electronics, Communication and Networking (ET2ECN) (2012).
https://doi.org/10.1109/ET2ECN.2012.6470093
6. C. Zhang et al., Multi-modality American sign language recognition, in 2016 IEEE International
Conference on Image Processing (ICIP) (2016). https://doi.org/10.1109/ICIP.2016.7532886
Key-based Obfuscation of Digital Design
for Hardware Security

Hina Nazeem and Deepthi Amuru

Abstract This paper proposes a unique key-based dynamic obfuscation for


providing hardware security. Hardware security comprises the countermeasures adopted to protect a design from illegitimate use and adversary attacks during the fabrication
process. Obfuscation encrypts the designs by use of input keys that need to be set
to a particular correct value for correct design behavior. This key value is known
only to authorized users. Moreover, the obfuscation is done without compromising
on the proper functionality and efficiency of the device. In this paper, two different
obfuscation techniques are analyzed, i.e., fixed obfuscation and dynamic obfusca-
tion. These two techniques are implemented on the fast Fourier transform (FFT)
algorithm. The results are verified and analyzed for both fixed and dynamic obfus-
cation to show dynamic obfuscation is better when compared to fixed obfuscation in
terms of security.

Keywords Obfuscation · Hardware security · Trigger · Dynamic obfuscation

1 Introduction

The number of fabless semiconductor companies is increasing; they outsource their work to IC foundries or manufacturing plants to avoid the cost of maintaining a fabrication plant. Ensuring device security during design and manufacturing plays a vital role in the modern product development life cycle. There has been significant investigation into the security of such systems, with the prime focus on protection of the system rather than improvements in functionality. As various companies located in multiple countries are involved, this has become a daunting task. Many issues like counterfeiting, piracy, overproduction without authority, etc., have

H. Nazeem (B)
G. Narayanmma Institute of Technology and Science (for Women), Hyderabad, Telangana
500104, India
D. Amuru
CVEST, International Institute of Information Technology, Hyderabad, Telangana 500032, India
e-mail: deepthi.amuru@gnits.in; deepthi.amuru@research.iiit.ac.in


caused an enormous revenue loss for the semiconductor device manufacturing industries [1, 2]. The threats of malicious circuit insertion and reverse engineering add to the difficulty of obtaining safe and reliable circuits.
Hardware security is implemented through the obfuscation technique. Adding encryption to the system, which involves using secret keys to keep the operation and design secured, constitutes the countermeasure known as obfuscation. The aim of obfuscation is to conceal the actual functionality of the circuit. It is an encryption technique with keys embedded in the circuit design. A third party cannot access the design without the correct key, and attempts at illegal production are worthless. Only the design house has knowledge of the secret key. After the fabrication process, either a trusted party appointed by the design house or the design house itself programs the key.

2 Related Work

Many hardware obfuscation processes have been developed over the last few years for sequential and combinational circuits. Some publications call this technique obfuscation, while others address it as encryption, due to the use of a key to lock the circuit instead of the old definition of concealing the functionality. The actual idea of obfuscation is design security using keys, which prevents unauthorized usage.
A few early logical obfuscation schemes were implemented in [3, 4]. These two techniques use finite state machine (FSM) insertion, referred to as an obfuscating FSM. The obfuscating FSM takes key bits as input to select a state of the FSM; only when the correct state gets activated can the circuit function correctly. However, as per [5], one can trace and remove them from the design and copy the circuit. The limitations of logic-based encryption techniques using xnor/xor, or/and, and multiplexers have been uncovered by recent work using SAT solver-based tools [6], which, however, have their own drawback of high overheads.
The implementation presented in [7, 8] combines the obfuscating FSM with physically unclonable functions (PUFs) and creates circuits whose states depend on the PUF output. This gives each IC a unique signature, termed IC metering. After the chips are manufactured, each individual IC is tested to collect the necessary information to unlock the chip. By combining the collected information and the knowledge which the design provides, only the design house can unlock each IC.

Fig. 1 Fixed obfuscation using 2:1 mux

3 Fixed, Time-Varying, and Dynamic Key-based Obfuscation

In hardware obfuscation using multiplexers, two-input multiplexers are used in the design to insert the key bits. The correct and incorrect signals are mapped to the multiplexer inputs, and the select line is driven by a key bit. The output of the multiplexer, which may be the correct signal or an obfuscated signal depending on the key input, is given to the next stage. There are different types of obfuscation: fixed obfuscation, time-varying obfuscation, and dynamic obfuscation. To understand these, we use control signals. For example, consider C1 and C2 as the correct control signals and C1′ and C2′ as the incorrect control signals. Two key gates with select lines K[0] and K[1] are driven with these control signals as inputs; the outputs are S1 and S2.

3.1 Fixed Obfuscation

In the fixed obfuscation technique, the output is proper for the correct key input (which is fixed) and is always improper for an incorrect key input. Let us assume the correct key combination is "01." When {K[0], K[1]} = {0, 1}, we get a valid signal combination at S1 and S2, as shown in Fig. 1.

3.2 Time-Varying Obfuscation

In this approach, the output signal depends not only on the key input but also on a trigger value. The outputs are mapped to C * T, where C is the control signal and T is the trigger signal. In this technique, an incorrect key value for K[0] and K[1] will choose the obfuscated signals C1 * T1 and C2 * T2, respectively, at S1 and S2, where T1 and T2 are trigger signals. It is represented by a function G, as shown in Fig. 2. In the time-varying type of obfuscation, the trigger signal is periodic.

Fig. 2 Time-varying obfuscation using 2:1 mux

3.3 Dynamic Obfuscation

Dynamic obfuscation is similar to time-varying obfuscation, but here the trigger signal is combined with a random number, which breaks the predictability of trigger occurrence. The outputs are mapped to C * T * R, where C is the control signal, T is the trigger signal, and R is the random number.
In this technique, an incorrect key value for K[0] and K[1] will choose the signals C1 * T1 * R and C2 * T2 * R, respectively, at S1 and S2, where T1 and T2 are trigger signals. The representation Ci * Ti * R denotes a combined, obfuscated signal obtained from the correct control signal Ci, the trigger signal Ti (for i = 1 and 2), and a random number R from a random number generator. It is represented by a function G, as shown in Fig. 3.

Fig. 3 Dynamic obfuscation using multiplexers



4 Implementation of Obfuscation on Digital Design

Generally, architectures are divided into a control path and a data path to ease the optimization and testing of designs [9]. Correct operation depends on the control flow and data path, and the information derived from them is most important to the system; so, we introduce multiplexers controlled by key bits at these critical links for obfuscation. In this paper, the digital design used for obfuscation is the Fast Fourier Transform (FFT), which is a critical unit in most signal processing applications. We consider an 8-point DIT FFT and demonstrate adding obfuscation after stage 1, as the output of stage 2 depends on the output of stage 1.
4.1 Fixed Obfuscation of FFT

In this section, we demonstrate the fixed obfuscation technique on an 8-point FFT.
As shown in Fig. 4, key gates are added at the critical nodes between stage 1 and
stage 2 for obfuscation. When the correct key input is given, the output of stage 1
passes correctly to stage 2; for any other key input, the value passed from stage 1 to
stage 2 is incorrect (obfuscated), which makes the final FFT output incorrect. Security
is thus provided to the design by adding a fixed key as encryption.
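A software model can make the stage-boundary gating concrete. The sketch below (Python/numpy, illustrative only; the paper's design is in Verilog) computes an 8-point DIT FFT and corrupts the stage-1 outputs for a wrong key. The particular corruption used here, swapping butterfly pairs, is a stand-in for the design's actual obfuscated mapping; the 4-bit key value 4'b1101 follows Sect. 5.

```python
import numpy as np

def dit_fft8_with_keygates(x, key, correct_key=0b1101):
    """8-point DIT FFT with key gates modeled between stage 1 and stage 2."""
    a = np.asarray(x, dtype=complex)[[0, 4, 2, 6, 1, 5, 3, 7]]  # bit-reversed input
    for stage, half in enumerate([1, 2, 4]):                     # three butterfly stages
        w = np.exp(-2j * np.pi * np.arange(half) / (2 * half))   # twiddle factors
        a = a.reshape(-1, 2, half)
        t = w * a[:, 1, :]
        a = np.concatenate([a[:, 0, :] + t, a[:, 0, :] - t], axis=1).reshape(-1)
        if stage == 0 and key != correct_key:          # key gates after stage 1:
            a = a.reshape(-1, 2)[:, ::-1].reshape(-1)  # pass an obfuscated signal
    return a

x = [1, 1, 1, 1, 1, 1, 1, 0]
print(np.allclose(dit_fft8_with_keygates(x, 0b1101), np.fft.fft(x)))  # True
print(np.allclose(dit_fft8_with_keygates(x, 0b0010), np.fft.fft(x)))  # False
```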

Fig. 4 Fixed obfuscation of FFT



4.2 Dynamic Obfuscation of FFT

We introduce trigger circuits into the design to convert the fixed-mode obfuscation
into dynamic-mode obfuscation. The inserted trigger circuits generate signals that
trigger rarely and randomly.
Here, the control signals are obfuscated with the help of trigger signals using a
trigger combination circuit. Then, the obfuscated signals and the correct control
signal are given as inputs to the multiplexers. Figure 5 shows a multiplexer connected
to one of the control signals. To implement dynamic obfuscation on the FFT design, we
replace the key gates of the fixed obfuscation with the basic dynamic obfuscation unit.
As shown in Fig. 6, the basic dynamic obfuscation unit has three inputs: the trigger
signal from the trigger combination circuit, the actual input signal to be obfuscated,
and the key. The unit instantiates a 2:1 mux whose inputs are the actual signal and the
modified signal (generated by fusing the input with the trigger signal). Depending on
the key value, it selects the actual signal or the modified signal to drive the output.

Fig. 5 Trigger circuit combined with obfuscation

Fig. 6 Dynamic obfuscation unit
For example, when the correct key value is 1 and the incorrect key value is 0:

key = 1, trigger = 0 or trigger = 1 ⇒ output = input
key = 0, trigger = 0 ⇒ output = input
key = 0, trigger = 1 ⇒ output = ~(input)

Note that for the correct key value, the circuit gives correct functionality, while for
an incorrect key the circuit gives the correct output when the trigger is low and an
incorrect output when the trigger is high. The trigger input to the circuit goes high
rarely and randomly. This results in stronger obfuscation even with shorter keys.
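The truth behavior above can be summarized in a few lines of Python (illustrative only; the trigger probability is an assumed placeholder):

```python
import random

def dynamic_obfuscation_unit(inp, key_bit, trigger, correct_key_bit=1):
    """Basic dynamic obfuscation unit of Fig. 6, for a 1-bit signal:
    correct key -> input passes; wrong key -> inverted only while trigger is high."""
    modified = inp ^ trigger            # fusion of input and trigger signal
    return inp if key_bit == correct_key_bit else modified

def rare_trigger(p=0.01):
    """Trigger that goes high rarely and randomly (dynamic mode)."""
    return 1 if random.random() < p else 0

# Wrong key: the output flips only in the rare cycles where the trigger fires.
trace = [dynamic_obfuscation_unit(1, key_bit=0, trigger=rare_trigger()) for _ in range(20)]
```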

5 Results

The obfuscation techniques discussed above are simulated in Xilinx ISE 14.7 to
verify their logic. The hardware description language used for the design is Verilog,
and test benches were created to exercise the design logic. The simulation results
below cover the fixed and dynamic obfuscated FFT designs with both correct and
incorrect keys.

5.1 Fixed Obfuscated FFT

Only when the input sequence is x = {1, 1, 1, 1, 1, 1, 1, 0} and the key input is
key = 4'b1101 do we get the correct output {7, −0.707 − j0.707, −j, 0.707 −
j0.707, 1, 0.707 + j0.707, j, −0.707 + j0.707}, as shown in Fig. 7. For other
(incorrect) key values, the output is incorrect.

Fig. 7 Simulation waveform of fixed obfuscation FFT



Fig. 8 Simulation results of dynamic obfuscation of FFT with correct key value

5.2 Dynamic Obfuscation of FFT

With Correct Key Value


When the input sequence is x = {1, 1, 1, 1, 1, 1, 1, 0} and the key input is key = 4'b1101,
we get the correct output {7, −0.707 − j0.707, −j, 0.707 − j0.707, 1, 0.707
+ j0.707, j, −0.707 + j0.707}, as shown in Fig. 8. It can also be observed that the
trigger does not affect the output for the correct key.
With Incorrect Key Value
For incorrect key values, we get the correct output when the trigger is low, whereas
the output becomes incorrect whenever a trigger goes high, as shown in the simulation
waveform (Fig. 9). For the wrong key value 4'b0010, when T2 goes high we can see the
change in the output values in the waveform, and similarly when T3 goes high we can
see the change in the output value at that point. So, whenever a trigger goes high,
the output at that point is incorrect.

5.3 Power and Delay Values

Table 1 shows the delay and power values of fixed obfuscated FFT design, dynamic
obfuscated FFT design, and the design without obfuscation, i.e., simple FFT. It can

Fig. 9 Simulation results of dynamic obfuscation of FFT with incorrect key value

Table 1 Delay and power values of different designs

Design                     Delay (ns)   Power (W)
Simple FFT                 9.786        25.015
Fixed obfuscation FFT      10.476       26.603
Dynamic obfuscation FFT    15.617       30.297

Table 2 Logic cells utilization for each design type

Logic utilization                   Without obfuscation   Fixed obfuscation   Dynamic obfuscation
Number of slice LUTs used           486                   590                 621
Number of fully used LUT-FF pairs   486                   590                 625
Number of bonded IOBs used          94                    98                  104
Number of BUFG/BUFGCTRLs used       1                     1                   1

be seen that there is a minor increase in delay and power values with the addition of
obfuscation circuit.

5.4 Device Utilization

Table 2 shows the device utilization analysis of fixed and dynamic obfuscated designs
when compared to the design without obfuscation.

6 Conclusion

This paper explains the necessity of incorporating hardware security in digital designs
to prevent IC counterfeiting and illegal overproduction. We demonstrate the potential
of fixed and dynamic obfuscation techniques in providing hardware security to digital
designs through an 8-point Fast Fourier Transform. Dynamic obfuscation differs from
fixed obfuscation in that the obfuscating signals keep changing with time; it is
advantageous over fixed obfuscation in terms of time to attack even for shorter keys
and hence results in stronger obfuscation. The area, power, and delay overheads have
also been analyzed: dynamic obfuscation provides more security than fixed obfuscation
at the cost of a small percentage increase in delay and power.

References

1. M. Rostami, F. Koushanfar, R. Karri, A primer on hardware security: models, methods, and


metrics. Proc. IEEE 102(8), 1283–1295 (2014)
2. J. Villasenor, M. Tehranipoor, Chop shop electronics. IEEE Spectr. 50(10), 41–45 (2013)
3. R. S. Chakraborty, S. Bhunia, Hardware protection and authentication through netlist level
obfuscation, in Proceedings of the 2008 IEEE/ACM International Conference on Computer-
Aided Design (IEEE Press, 2008), pp. 674–677
4. R.S. Chakraborty, S. Bhunia, HARPOON: an obfuscation-based SoC design methodology for
hardware protection. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 28(10), 1493–1502
(2009)
5. P. Subramanyan, N. Tsiskaridze, K. Pasricha, D. Reisman, A. Susnea, S. Malik, Reverse engi-
neering digital circuits using functional analysis, in Proceedings of the Conference on Design,
Automation and Test in Europe (EDA Consortium, 2013), pp. 1277–1280
6. P. Subramanyan, S. Ray, S. Malik, Evaluating the security of logic encryption algorithms, in
IEEE International Symposium on Hardware Oriented Security and Trust (HOST) (IEEE, 2015),
pp. 137–143
7. F. Koushanfar, Provably secure active IC metering techniques for piracy avoidance and digital
rights management. IEEE Trans. Inf. Forensics Secur. 7(1), 51–63 (2012)
8. Y. Alkabani, F. Koushanfar, M. Potkonjak, Remote activation of ICs for piracy prevention and
digital right management, in Proceedings of the 2007 IEEE/ACM International Conference on
Computer-Aided Design (IEEE Press, 2007), pp. 674–677
9. S. Koteshwara, C.H. Kim, K.K. Parhi, Key-based dynamic functional obfuscation of integrated
circuits using sequentially-triggered mode-based design. IEEE Trans. Inf. Forensics Secur.
https://doi.org/10.1109/TIFS.2017.2738600
Internet of Things-based Cardless
Banking System with Fingerprint
Authentication Using Raspberry Pi

Eliyaz Mahammad, Nagarjuna Malladhi, G. Bhaskar Phani Ram, and K. Yeshwanth

Abstract For online transactions, the user needs to carry a credit or debit card at
commercial places. In the existing swipe machine system, only a personal identification
number (PIN) is used to authenticate a user. If the user forgets the PIN or enters wrong
passwords consecutively, the card is blocked and the user must visit the bank frequently.
There is also a chance that fraudsters, as an act of phishing, steal personal information
such as the user ID, credit or debit card number, CVV number, and card expiry date using
skimming devices. To overcome these problems, a system is proposed that uses biometric
authorization together with a personal identification number (PIN) to make a transaction.
The user can transact with any one of the user's bank accounts without a credit or debit
card, and the system does not require a swipe machine. The transaction information is
sent to the server over a secure network using the Internet for further processing. The
proposed system enhances the customer experience and increases security.

Keywords Biometric authorization · Personal identification number (PIN) · Swipe machine

1 Introduction

Many countries are moving completely toward digitization to reduce the amount of
cash flow by means of notes and make money exchange between people easier. Swipe

E. Mahammad (B) · N. Malladhi · G. B. P. Ram


Department of ECE, Vardhaman College of Engineering, Hyderabad, Telangana, India
e-mail: eliyazmohammed@vardhaman.org
N. Malladhi
e-mail: malladhinagarjuna@vardhaman.org
G. B. P. Ram
e-mail: bhaskarphaniram@vardhaman.org
K. Yeshwanth
Vardhaman College of Engineering, Hyderabad, Telangana, India


machines are used to enable customers to make cashless or digital transactions at


many places with the use of credit or debit cards. At shopping malls, hospitals, and
bill payment centers, the usage of swipe machines has increased enormously [1]. The
user just needs a credit or debit card along with the personal identification number
(PIN). To make a transaction, the merchant swipes the credit or debit card or places
the card in the machine; the encoded data present in the magnetic strip of the card
is decoded and the user is identified. The user is then asked to enter the PIN, and
if the PIN is verified, the transaction is accepted.
The personal identification number (PIN) is an important numeric code to access the
user account, but PIN entry alone may not be enough to protect it. There is also a
chance of copying magnetic stripe data using skimming devices [2], and if the card is
stolen, there are more chances of losing money from the user account. Biometric
features, with their technological advances, can be used to store and access user
information, which makes the user account more secure and removes the need to swipe
cards for digital transactions.
Among all the biometric features available to recognize a user, fingerprint is the
most unique identity to identify a customer. Since fingerprint of each user is unique,
it can be used as a better authentication method.
In this proposed method, the combination of fingerprint authentication and
personal identification numbers is used to improve customer experience and security.
Since it is practically impossible to replicate the fingerprint of a person, the
proposed system gives a solution to the authentication problem. While making
transactions, the user can transact through any of the user's bank accounts just by
placing a finger on the authentication system and then entering the PIN. The usage of
an IoT server establishes a highly secured connection with high data security. Proper
care and control should be taken at the transaction places to prevent data loss.

2 Existing System

In the existing system, the user's credit or debit card is swiped through the machine,
and the user is asked to enter the PIN. After successful verification of the PIN, the
entered amount is debited if it is available in the user's account. Since only a PIN is
used for authentication, the system is not very secure. Hence, fingerprint biometrics
is used to authenticate the user's identity and increase the security of electronic
payment. The flowchart of the existing swipe machine authentication system is shown in Fig. 1.

3 Proposed System

In this proposed system, biometric-based authentication is introduced [3]. Here,


fingerprint authorization and personal identification number (PIN) are used to make

Fig. 1 Flowchart of existing system

a transaction. The working procedure of proposed system is discussed below with


the help of flowchart shown in Fig. 2.

4 System Design

In the proposed system, an external power supply is given to the microcontroller and
the GSM module. The microcontroller is connected to the fingerprint module, the matrix
keypad, and the GSM module, and it can connect to a mobile or laptop through the
Internet via the Wi-Fi module of the Raspberry Pi. The details of each transaction are
updated on the IoT platform to which the laptop or mobile is connected. The detailed
architecture is shown in Fig. 3, and a sketch of the resulting transaction flow follows
the component list below.
A. Raspberry Pi: The microcontroller is a small computer on a single integrated
circuit containing a processor, memory, and programmable I/O peripherals.
Raspberry Pi consists of onboard CPU, RAM, USB ports, Wi-Fi module,
camera interface to connect camera module, etc. New models of Raspberry
Pi are released with updated features.

Fig. 2 Flowchart of proposed system

Fig. 3 Block diagram of the proposed system



B. GSM Module: GSM module is a wireless modem that works with GSM wireless
network. The GSM module requires a SIM card and is able to operate as a digital
identity to link with cellular phone network. GSM is used for sending mobile
voice and information services.
C. Fingerprint Module: The fingerprint module reads the fingerprint of the user.
It converts the fingerprint to a template. The user template is sent to the micro-
controller. It performs a comparison with existing templates and authenticates
the user [4–7].
D. Matrix Keypad: Matrix keypad has built-in push button contacts connected to
the row and column lines. The microcontroller can scan these lines in button
press mode.
E. IoT Platform: IoT platform can be used to perform data storage, data analytics,
etc. IoT platforms are secure, fast, reliable, and scalable. These platforms collect
data from different sensors and devices and perform required operations on data.
In the IoT system, the data transfer happens over a network without requiring
human to human or human to computer interaction.
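As referenced above, a hypothetical Python sketch of the resulting transaction flow is given here; the fingerprint_reader, keypad, bank_server, and gsm interfaces are illustrative stand-ins, not the driver API of any particular module.

```python
def run_transaction(fingerprint_reader, keypad, bank_server, gsm):
    """Cardless transaction flow of Fig. 2: fingerprint first, then PIN,
    then account selection and withdrawal; SMS confirmation via GSM."""
    template = fingerprint_reader.scan()            # read print, convert to template
    user = bank_server.match_template(template)     # identify the user
    if user is None:
        return "Fingerprint not recognized"
    if not bank_server.verify_pin(user, keypad.read_pin()):
        return "Invalid PIN"
    account = keypad.choose(bank_server.accounts(user))  # pick one linked account
    amount = keypad.read_amount()
    if not bank_server.withdraw(user, account, amount):  # balance checked server-side
        return "Insufficient balance"
    gsm.send_sms(user.phone, f"Debited {amount} from {account}")
    return "Transaction successful"
```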

5 Results

Figure 4 shows the hardware module. Raspberry Pi is interfaced with GSM module,
fingerprint module, and matrix keypad.
When the fingerprint of the user is scanned, the user is recognized and the name is
displayed. The user then has to enter the PIN to continue. When the PIN is verified,
all the existing bank accounts of the user are displayed, and the user selects the
bank with which to make the transaction. After selecting a bank, the amount

Fig. 4 Hardware module of project



Fig. 5 Transaction details

to be withdrawn has to be entered. If the amount is available in the user's account,
the transaction succeeds. After a successful transaction, a message about the
transaction details is sent to the user's registered number. Figure 5 shows the
detailed procedure of a transaction.
The proposed system increases security, minimizes fraud, and provides two-level
verification of users. Its benefits are that the user need not carry debit/credit
cards, the user can make transactions with any of the available accounts using
biometric authentication, and the customer experience is enhanced.

6 Future Scope

In the future, the system can be extended with a camera module that captures the
user's face and gestures for identification. A limitation of the present system is
that only registered users can make transactions, and only with their own bank accounts.

7 Conclusions

With the increase of transactions using credit or debit cards, a secure and fast
transaction system is needed. It is therefore very important to authenticate users
properly, and new digital technologies should be adopted to match the increased
demand. By using a combination of biometric and PIN authentication, without the need
for a swipe machine, our proposed model increases security and improves the customer
experience.

Acknowledgments The authors would like to thank Management and Principal of Vardhaman
College of Engineering, Shamshabad, Hyderabad, India for continuous support and encouragement.

References

1. A. Singh, S. Singh, R. Kumar, Secure swipe machine with help of biometric security, in 2016
International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)
(Chennai, 2016), pp. 1056–1061
2. L. Honnegowda, Security enhancement for magnetic data transaction in electronic payment and
healthcare systems. Int. J. Eng. Technol., 331–335 (2013). https://doi.org/10.7763/IJET.2013.V5.569
3. M. Dutta, K.K. Psyche, T. Khatun, M.A. Islam, M.A. Islam, ATM card security using bio-metric
and message authentication technology, in 2018 IEEE International Conference on Computer
and Communication Engineering Technology (CCET) (Beijing, 2018), pp. 280–285
4. S. Barman, S. Chattopadhyay, D. Samanta, S. Bag, G. Show, An efficient fingerprint matching
approach based on minutiae to minutiae distance using indexing with effectively lower time
complexity, in International Conference of Information Technology (IEEE, 2014), pp. 179–183
5. M.O. Onyesolu, I.M. Ezeani, ATM security using fingerprint bio-metric identifier: an investiga-
tive study. Int. J. Adv. Comput. Sci. Appl. 3(4), 68–72 (2012)
6. C. Ashwini, P. Shashank, S.M. Nayak, S.S. Yadav, M. Sumukh, Cardless multi-banking ATM
system services using biometrics and face recognition. Int. J. Eng. Res. Technol. (IJERT)
NCCDS—2020 8(13) (2020)
7. A.K. Jain, A. Ross, S. Prabhakar, An introduction to biometric recognition. IEEE Trans. Circuits
Syst. Video Technol. 14(1), 4–20 (2004)
Cluster Adaptive Stationary Wavelet
Diffusion

Ajay Kumar Mandava and Emma E. Regentova

Abstract A new wavelet diffusion method based on clustering is presented to


denoise the image. This is done to achieve structure-controlled diffusion. A noisy
image is segmented using wavelet energy features, which are both insightful and
robust in conveying structure information even in the presence of substantial noise.
In order to achieve the best results, the optimal number of clusters is calculated.
Each cluster is diffused until the best possible outcome is obtained. Experiments
show that the proposed approach performs well in terms of quantitative metrics and
visual consistency.

Keywords Clustering · Diffusion · Fuzzy c-means · Stationary wavelet

1 Introduction

In the last few decades, image noise reduction has received a lot of attention. A
variety of approaches has been developed. For image denoising, nonlinear anisotropic
diffusion algorithms work efficiently. Perona and Malik [1] first proposed diffusion-
based denoising in the early 1990s. Perona-Malik anisotropic diffusion (PMAD) is
the original diffusion model. The approach constructs a family of restored signals
by using a noisy signal and developing it locally according to a method specified by
the PMAD equation during the denoising process. After then, the PMAD is altered
for various objectives [2–10].
Chao and Tsai [2] have suggested a diffusion model that includes both local and
gray gradient variances in order to maintain edges and fine details while effectively
eliminating noise. The main disadvantage of this approach is that pictures with a

A. K. Mandava (B)
Department of Electrical, Electronics and Communication Engineering, GITAM School of
Technology, Bengaluru, India
e-mail: amandava@gitam.edu
E. E. Regentova
Department of Electronics and Computer Engineering, University of Nevada, Las Vegas, NV
89154, USA
e-mail: emma.regentova@unlv.edu


high noise level cannot be addressed: in such images, noisy bright pixels usually have
larger gray-level variances and gradients than the true edges and details.
Yu et al. [3] suggested a SUSAN controlled diffusion, in which the edge detector
uses local knowledge from a pseudo global perspective to find image features.
SUSAN importantly directs the diffusion process with its noise insensitivity and
structural protection characteristics. For the desired diffusiveness, many parameters
need to be tuned. However, at the expense of comprehensive image analysis, the method
can be a powerful tool for finding reasonably good results on closed contours.
The correlation between explicit one-dimensional nonlinear diffusion schemes
and discrete Haar wavelet shrinking was investigated by Mrazek et al. [4]. Weickert
et al. [4, 5] explored the correlation between discrete diffusion filtering and Haar
wavelet reduction, including an analytic four-pixel technique, but only for 1D and
isotropic 2D scenarios with scalar-valued diffusivity. When compared to Perona-
Malik diffusion [1], this allows for the enhancement of edges.
The relationship between multiwavelet denoising and nonlinear diffusion was
examined by Alkhidhr et al. [6]. According to the findings, the multiwavelet shrink-
ages of the widely used CL(2) and DGHM multiwavelets are linked to a second-order
nonlinear diffusion equation. They also came up with high-order nonlinear diffusion
equations for multiwavelet shrinkages in general. According to the experiments, the
diffusion-inspired multiwavelet shrinkage outperforms typical multiwavelet hard-
and soft-thresholding shrinkages.
Nonlinear diffusion in the stationary wavelet domain was exploited by Zhong and
Sun in [7]. They demonstrated that noise has a less effect on the partial differential
equation in the wavelet domain than on the raw noisy picture domain because noise
diminishes with size. On the finest scale, this method employs filtering based on
the minimum mean squared error followed by anisotropic diffusion in the stationary
scale-space.
Nikpour and Hassanpour [8] diffuse wavelet transform approximations when
shrinkage is applied at different levels. The decomposition employs a five level
wavelet transform with the dB10 mother wavelet. A median filter, a wavelet threshold,
anisotropic diffusion, and PDE in the fourth order were used to compare the process.
Mandava and Regentova [9] introduced a context-adaptive nonlinear diffusion
approach termed context-based diffusion in the stationary wavelet domain (SWCD)
for image denoising in the wavelet domain. The diffusivity function is applied to
the wavelet coefficients of a stationary wavelet transform as a weighting function.
The expected results in this strategy are based on iterative threshold processing
implementation.
Zhang and Feng developed a new wavelet domain denoising approach in [10].
In the dual-tree CWT domain, the algorithm is implemented. Each diffusion stage
employs the local Wiener filter. In terms of noise variance, the suitable stopping time
is stated. Multiple-step local Wiener filter (MSLWF) is the name of this technique.
MSLWF has the best performance among wavelet domain approaches, according to
the testing.
LFAD technique proposed in [11] has the best performance in the class of
advanced diffusion-based methods. The method uses superpixel segmentation for

performing the diffusion. On the other hand, the proposed approach requires extensive
computations which reduce its efficiency for online applications.
Wavelet domain diffusion can achieve superior denoising results than nonlinear
diffusion in the spatial domain. This is because wavelet transforms maintain image
details better than spatial domain diffusion methods, and more effective wavelet
coefficients can be used in the diffusion process compared to wavelet shrinkage
approaches. The major disadvantage of all diffusion-based denoising methods is that
all regions are diffused for an equal number of iterations: even after the best result
has been obtained for the texture and edge pixels, they continue to be diffused as long
as the remaining pixels of the image improve, which blurs the image. To overcome this
problem, each cluster or region is diffused for a different number of iterations until
the best result is obtained for that cluster. In addition, the incorporation of
cluster-based nonlinear diffusion in the wavelet domain is investigated. For robust
region categorization, wavelet features are employed in the approximation subband at
the second level of the stationary wavelet transform, and the detail coefficients at
the first level are then diffused. Cluster-based wavelet diffusion (CWD) is the name
of this technique. The approach is introduced in Sect. 2 after a theoretical basis;
the experimental findings are presented in Sect. 3, followed by a conclusion.

2 Region-based Wavelet Diffusion

2.1 Nonlinear Diffusion

Perona and Mallik [1] defined the first nonlinear diffusion technique. Inter-region
smoothing is inhibited by their process, which facilitates intra-region smoothing.
Perona and Mallik’s diffusion process is mathematically defined as


∂I(x, y, t)/∂t = ∇ · (c(x, y, t) ∇I)   (1)
The image is represented by I(x, y, t), t is the number of iterations, and c(x, y,
t) is the so-called diffusion function. Perona and Mallik proposed two diffusivity
functions.
   
c1(x, y, t) = exp(−(|∇I(x, y, t)| / k)²)   (2)

and

c2(x, y, t) = 1 / (1 + (|∇I(x, y, t)| / k)²)   (3)
684 A. K. Mandava and E. E. Regentova

Here, k is the diffusion constant. Depending on the diffusivity function used, Eq. (1)
can encompass a wide range of filters. The discrete diffusion scheme can be described
as follows:

I_{i,j}^{n+1} = I_{i,j}^n + (Δt) · [c_N(∇_N I_{i,j}^n) · ∇_N I_{i,j}^n + c_S(∇_S I_{i,j}^n) · ∇_S I_{i,j}^n
                + c_E(∇_E I_{i,j}^n) · ∇_E I_{i,j}^n + c_W(∇_W I_{i,j}^n) · ∇_W I_{i,j}^n]   (4)

The local gradient direction is denoted by the letters N-north, S-south, E-east, and
W-west, and the local gradient is determined using nearest neighbor differences.

∇_N I_{i,j} = I_{i−1,j} − I_{i,j};  ∇_S I_{i,j} = I_{i+1,j} − I_{i,j};
∇_E I_{i,j} = I_{i,j+1} − I_{i,j};  ∇_W I_{i,j} = I_{i,j−1} − I_{i,j}   (5)
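A minimal numpy sketch of one discrete iteration (Eqs. (3)–(5)) follows; the periodic boundaries via np.roll and the step size dt are simplifying assumptions.

```python
import numpy as np

def perona_malik_step(I, k=15.0, dt=0.14):
    """One Perona-Malik iteration with the rational diffusivity c2 of Eq. (3)."""
    gN = np.roll(I, 1, axis=0) - I      # nearest-neighbor differences, Eq. (5)
    gS = np.roll(I, -1, axis=0) - I
    gE = np.roll(I, -1, axis=1) - I
    gW = np.roll(I, 1, axis=1) - I
    c = lambda g: 1.0 / (1.0 + (np.abs(g) / k) ** 2)
    return I + dt * (c(gN) * gN + c(gS) * gS + c(gE) * gE + c(gW) * gW)
```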

2.2 Stationary Wavelet Transform and Wavelet Diffusion

The stationary wavelet transform algorithm (SWT) was used in this study to estab-
lish a new method for combining spectral and spatial data simultaneously [12]. SWT
extracts the signal frequency components using specified low- and high-pass filters
and generates four distinct sets of wavelet features, namely approximation, hori-
zontal, vertical, and diagonal coefficients. Another essential feature of the SWT
method is that several wavelet families can be used to better fit the type of signals
being investigated; for instance, the well-known Haar wavelet is the first and simplest.

2.3 Clustering

The two main types of clustering algorithms suggested in the literature are crisp
clustering methods, in which each data point belongs to just one cluster, and fuzzy
clustering methods, in which each data point belongs to every cluster with a defined
degree of membership. The key advantage of FCM over the k-means algorithm is that it
assigns each pattern a degree of membership in each cluster. Fuzzy c-means clustering
was therefore employed to separate clusters that share similar structures [13].

2.4 Cluster-based Wavelet Diffusion (CWD) Algorithm

The following steps are followed by the method:


1. Using SWT with Haar wavelet, the image is split into two layers.
Cluster Adaptive Stationary Wavelet Diffusion 685

2. The image is clustered based on wavelet energy feature calculated in 5 × 5


window of second-level subband and approximation coefficient using FCM.
3. Detail coefficients of each cluster at the first level of transform are diffused
according to the following iteration step.
    D1^(k+1) = D1^(k) · exp(−2.9183 / (|√2 · D1^(k)(p, q)| / λ)⁴)   (6)

    where D1^(0) = x(p, q); p, q = 1, 2, 3, …, n for an n × n image.


4. The inverse SWT is used to generate the image.
5. Step 3 is repeated until the best PSNR is attained for each cluster; that is, it
decreases in the subsequent iteration.
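As a condensed sketch of step 3 (assuming PyWavelets' swt2/iswt2 and image dimensions divisible by two), the clustering of steps 1–2 is summarized here by a boolean active_mask marking the pixels whose clusters have not yet reached their best PSNR:

```python
import numpy as np
import pywt

def cwd_step(img, active_mask, lam=10.0):
    """One diffusion pass over the first-level detail coefficients, Eq. (6);
    clusters already at their best PSNR (active_mask False) are frozen."""
    cA1, details = pywt.swt2(img, 'haar', level=1)[0]
    new_details = []
    for D in details:                              # cH1, cV1, cD1
        g = np.exp(-2.9183 / ((np.abs(np.sqrt(2) * D) / lam) ** 4 + 1e-12))
        new_details.append(np.where(active_mask, D * g, D))
    return pywt.iswt2([(cA1, tuple(new_details))], 'haar')   # step 4
```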

3 Experiment

To evaluate the performance of CWD, benchmark images are degraded by zero-mean
(μ = 0) additive white Gaussian noise with σ = 10, 20, 30, 40. Additional analysis
includes experimenting with different numbers of clusters. Other diffusion models,
including LVCFAB [16] and GSZFAB [17], as well as the state-of-the-art approach
BM3D [14], are compared.
The objective evaluation is based on the PSNR, which is derived using the equation
below.
PSNR = 10 log₁₀(I_max² / MSE)   (7)

where MSE stands for mean square error, and on the universal image quality index
(UIQI) given by

Q = 4 σ_xy x̄ ȳ / [(σ_x² + σ_y²)((x̄)² + (ȳ)²)]   (8)

where x̄, ȳ are the means, σ_x, σ_y are the standard deviations, and σ_xy denotes
the covariance. As mentioned in [15], the mean subjective ranks of observers
correspond with the average quality index UIQI.
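For reference, both metrics are sketched below in their global form (UIQI is often computed in sliding windows and averaged; this minimal version operates on the whole image):

```python
import numpy as np

def psnr(ref, test, imax=255.0):
    """Eq. (7)."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return 10 * np.log10(imax ** 2 / mse)

def uiqi(x, y):
    """Eq. (8), computed globally over the whole image."""
    x, y = x.astype(float).ravel(), y.astype(float).ravel()
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    return 4 * cov * mx * my / ((x.var() + y.var()) * (mx ** 2 + my ** 2))
```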
Tables 1, 2, and 3 show the PSNR results for different number of clusters (i.e.,
5, 10, and 15). From Tables 1, 2, and 3, for low noise levels such as σ = 10 and
20, one can conclude that ten clusters yield best results, and for high noise levels
such as σ = 30 and 40, fifteen clusters produce best PSNR values. Based on PSNR

Table 1 PSNR (dB) of CWD method for five clusters


Sigma 10 Sigma 20 Sigma 30 Sigma 40
Lena 31.01 28.95 27.77 26.97
House 32.91 31.11 29.63 29.04
Peppers 30.10 28.23 26.88 25.98
Synthetic 32.53 31.98 31.34 30.71

Table 2 PSNR (dB) of CWD method for ten clusters


Sigma 10 Sigma 20 Sigma 30 Sigma 40
Lena 32.49 29.82 28.35 27.41
House 34.19 31.78 30.88 29.60
Peppers 30.87 28.92 27.44 26.29
Synthetic 33.40 32.83 32.19 31.94

Table 3 PNSR (dB) of CWD method for 15 clusters


Sigma 10 Sigma 20 Sigma 30 Sigma 40
Lena 31.17 29.43 28.38 27.37
House 33.75 30.61 30.89 28.90
Peppers 30.04 28.60 27.66 26.66
Synthetic 32.67 32.09 32.23 32.04

values and the algorithm’s execution time, ten clusters represent an optimal value for
practical implementation. Table 4 provides the PNSR results of stationary wavelet
diffusion without clustering to emphasize the effect of the cluster-based diffusion.
The proposed method improves PSNR on average 0.3–3 dB compared to the SWD.
Table 5 presents UIQI values attained by CWD (ten clusters) for benchmark images
with the additive white Gaussian noise. Based on the findings from Tables 2 and 4, it
appears that cluster-based diffusion provides superior results. Table 6 shows PSNR
values for methods under comparison for “Lena” image. The diffusivity constant used
in [13] is λ = 10. Figure 1 shows clustering results of “house” and “Lena” images
based on pixel intensity values and the clustering based on wavelet approximation

Table 4 PSNR (dB) of SWD method


Sigma 10 Sigma 20 Sigma 30 Sigma 40
Lena 28.03 27.24 26.42 25.70
House 30.06 28.85 27.73 26.79
Peppers 27.02 26.39 25.73 25.12
Synthetic 29.14 29.83 31.02 30.98

Table 5 UIQI results of the proposed CWD method (ten clusters)


Sigma 10 Sigma 20 Sigma 30 Sigma 40
Lena 0.789 0.703 0.670 0.639
House 0.636 0.553 0.487 0.459
Peppers 0.791 0.714 0.672 0.641
Synthetic 0.266 0.253 0.249 0.242

Table 6 Comparison for Lena image based on PSNR

Method        Sigma 10   Sigma 20
CWD           32.49      29.82
LVCFAB [16]   31.90      26.67
GSZFAB [17]   32.49      28.29

Fig. 1 First column: Cluster map of “house” and “Lena” based on intensity values; second column:
Cluster map of “house” and “Lena” based on wavelet approximation value and energy features.
Number of clusters = 10

Fig. 2 First row: “House” with level σ = 20 and CWD result; second row: “House” with level σ
= 40 and CWD result

and energy features. Major trends and discontinuities contribute large wavelet
coefficients at the second level, while noise contributes only small coefficients, so
the features of smooth regions are captured reliably. Figures 2 and 3 show noisy
"house" and "Lena" images for σ = 20 and 40 and their denoised versions, respectively.
The proposed technique produces a higher level of visual quality in smoother regions,
which are specifically diffused to a greater extent. Figure 4 shows a synthetic
image with noise level σ = 40 and the denoised results by BM3D and CWD.
image with the noise level as σ = 40 and the denoised results by BM3D and CWD.
Although BM3D is superior PSNR-wise, we can observe better edge preservation
by CWD.

4 Conclusion

A novel cluster adaptive diffusion approach is proposed in this paper. In the developed
scheme, the image is first clustered into ten clusters using fuzzy c-means on the
wavelet energy feature calculated at the detail subbands of the second level. Then,
diffusion is performed on each cluster until the best PSNR is obtained. According to

Fig. 3 First row: “Lena” with noise σ = 20 and CWD result; second row: “Lena” with noise σ =
40 and CWD result

the experiments, the proposed approach exhibits fairly good performance in terms
of objective and visual qualities.

Fig. 4 First row: Synthetic image and synthetic with noise level σ = 40 (PSNR = 16.56, UIQI =
0.150); second row: BM3D (PSNR = 33.89, UIQI = 0.251) and CWD (PSNR = 31.94, UIQI =
0.242) result

References

1. P. Perona, J. Mallik, Scale-space and edge detection using anisotropic diffusion. IEEE Trans.
Pattern Anal. Mach. Intell. 12, 629–639 (1990)
2. S.-M. Chao, D.-M. Tsai, An improved anisotropic diffusion model for detail- and edge-
preserving smoothing. Pattern Recogn. Lett. 31(13), 2012–2023 (2010)
3. J. Yu, J. Tan, Y. Wang, Ultrasound speckle reduction by a SUSAN-controlled anisotropic
diffusion method. Pattern Recogn., 3083–3092 (2010)
4. P. Mrazek, J. Weickert, G. Steidl, Diffusion inspired shrinkage functions and stability results
for wavelet denoising. Int. J. Comput. Vis. 64, 171–186 (2005)
5. M. Welk, J. Weickert, G. Steidl, A four-pixel scheme for singular differential equations, in
Scale-Space and PDE Methods in Computer Vision, ed. by R. Kimmel, N. Sochen, J. Weickert.
Lecture Notes in Computer Science, vol 3459 (Springer, Berlin, 2005), pp. 585–597
6. H. Alkhidhr, Q. Jiang, Correspondence between multiwavelet shrinkage and non-linear
diffusion. J. Comput. Appl. Math. 382, 45–61 (2020)
7. J. Zhong, H. Sun, Wavelet-based multiscale anisotropic diffusion with adaptive statistical
analysis for image restoration. IEEE Trans. Circ. Syst. I Regul. Pap. 55(9), 2716–2725 (2008)
8. M. Nikpour, H. Hassanpour, Using diffusion equations for improving performance of wavelet-
based image denoising techniques. IET-IPR 4(6), 452–462 (2010)
9. A.K. Mandava, E.E. Regentova, Image denoising based on adaptive non-linear diffusion in
wavelet domain. J. Electron. Imaging 20(3), 033016–033016 (2011)

10. X. Zhang, X. Feng, Multiple-step local Wiener filter with proper stopping in wavelet domain.
J. Vis. Commun. Image R. 25(2), 254–262 (2014)
11. A.K. Mandava, E.E. Regentova, G. Bebis, LFAD: locally- and feature-adaptive diffusion based
image denoising. Appl. Math. Inf. Sci. 1, 1–12 (2014)
12. G.P. Nason, B.W. Silverman, The stationary wavelet transform and some statistical applications,
Lecture Notes in Statistics, vol 103 (1995), pp. 281–299
13. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms (Plenum Press,
New York, 1981)
14. K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3D transform-domain
collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)
15. Z. Wang, A.C. Bovik, A universal image quality index. IEEE Signal Process. Lett. 9(3), 81–84
(2002)
16. Y. Wang, L. Zhang, P. Li, Local variance controlled forward-and-backward diffusion for image
enhancement and noise reduction. IEEE Trans. Image Process. 16(7) (2007)
17. G. Gilboa, N. Sochen, Y.Y. Zeevi, Forward-and-backward diffusion processes for adaptive
image enhancement and denoising. IEEE Trans. Image Process. 11(7), 689–703 (2002)
Low Complexity and High Speed
Montgomery Multiplication Based
on FFT

B. Jyothi, M. Sucharitha, and Anitha Patibandla

Abstract The modular multiplication operation is the most time-consuming operation
for number-theoretic cryptographic algorithms such as RSA and Diffie-Hellman. Fast
multiplier structures minimize delay and increase throughput using parallelism and
pipelining, but such structures are large and of limited efficiency. In this work, we
incorporate an improved fast Fourier transform (FFT) technique into McLaughlin's
framework and the Montgomery algorithm to achieve high area-time efficiency. In
contrast to previous FFT-based structures, we avoid the zero-padding operation by
computing the modular multiplication steps directly using cyclic and nega-cyclic
convolutions. Moreover, supported by the number-theoretic weighted transform, the FFT
algorithm is used to provide fast convolution computation. The results show that our
work has superior efficiency compared with state-of-the-art FFT-based MMM designs for
operand sizes of 1024 bits and above.

Keywords Montgomery modular multiplication · Number-theoretic weighted transform · Fast Fourier transform (FFT)

1 Introduction

The focus of this work is equipment usage of RSA calculation [1] with bigger than
1024-piece module size. The executions of RSA calculation [2] are accepted with
512-piece modulus would be sufficient in factorization techniques expanded the
modulus length to 1024 bits. NIST suggested [3] 4096-piece for the not so distant
future so as to keep up RSA secure. Obviously, bigger key dimensions lead to
extended handling time and more equipment asset during processing because of
the RSA calculation needs the secluded exponentiation.
Montgomery multiplication is an efficient strategy to compute modular multiplication
[4]. In the Montgomery modular multiplication (MMM) algorithm, the time-consuming
trial division is replaced by multiplications and reductions modulo R, and the
reductions become trivial by choosing R to be a power

B. Jyothi · M. Sucharitha (B) · A. Patibandla


Department of ECE, Malla Reddy College of Engineering and Technology, Hyderabad, India


of 2. Existing multiplication methods fall into two groups. Methods of the first group
are performed only in the time domain, including the textbook method [5–7]. Methods of
the second group operate in both the time and spectral domains [8–10]. When the fast
Fourier transform (FFT)-based algorithm is applied to the second group, a lower
asymptotic complexity can be achieved compared with the first group. There are several
hardware implementations of these multiplication strategies: the
textbook method [11, 12], SSA [13, 14], and the Karatsuba method [15, 16].
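As a reference for the algorithmic core only (not the paper's hardware), a minimal Python sketch of textbook Montgomery reduction follows; choosing R = 2^r_bits is what turns the divisions into shifts.

```python
def montgomery_reduce(t, m, r_bits):
    """REDC: returns t * R^-1 mod m for R = 2^r_bits, m odd, 0 <= t < m*R."""
    r = 1 << r_bits
    m_prime = -pow(m, -1, r) % r               # m' = -m^-1 mod R (Python 3.8+)
    u = (t + (t * m_prime % r) * m) >> r_bits  # exact division by R
    return u - m if u >= m else u

def montgomery_multiply(a_bar, b_bar, m, r_bits):
    """Product of operands given in Montgomery form (a*R mod m, b*R mod m)."""
    return montgomery_reduce(a_bar * b_bar, m, r_bits)
```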
In this work, we introduce an FFT-oriented MMM algorithm under McLaughlin's framework,
in which the time-spectral domain transform without zero-padding is the repeated
essential operation. The proposed algorithm is named FMLM3. The hardware realization
of the FMLM3 targets high area-time efficiency; both hardware resource cost and cycle
requirements are examined for different parameter sets.
The rest of this paper is organized as follows. Section 2 gives the pipelined designs
of the FMLM3, and models with single and double butterfly structures are implemented.
Section 3 presents arithmetic enhancements of the FMLM3 and a general parameter set
selection strategy for the FMLM3. In the last section, Sect. 4, results are given.

2 FFT-based McLaughlin's Montgomery Modular Multiplication (FMLM3)

The FMLM3 incorporates two different moduli, R = 2^{2v} − 1 and Q = 2^{2v} + 1, and
accomplishes fast NWT without zero-padding.
The design of the FMLM3 is shown in Fig. 1. A controller block generates all the
control signals for the whole system. The RAM block consists of several RAM banks and
stores the pre-computed data, the intermediate results, and the final modular product.
This algorithm has a more complicated data flow compared with the iterative structure
method, and more operation units are required to handle the modular reductions and the
conditional selections. Operations of the FMLM3 at the top level are computed
sequentially, while pipelined structures are designed inside every block. The forward
and inverse NWTs are performed in the FFT/FFT−1 block. Point-wise multiplication and
addition are performed in the multiply adder block. The ripple carry adder (RCA), the
subtractor, and the shift module are responsible for the time-domain operations such
as the modulo-R and modulo-Q reductions, conditional selections, and so on.
The architecture targets a high clock frequency while keeping a small resource cost.
The pipelined butterfly structure (BFS) is adopted in our FFT/FFT−1 block to achieve
this goal. Figure 2 shows the FFT/FFT−1 block.

Fig. 1 FMLM3

Fig. 2 FFT/FFT−1 with two butterfly structures
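As a minimal software reference for what the transform block achieves, the sketch below uses a naive O(n²) number-theoretic transform over Z_q (the hardware instead pipelines FFT butterflies); the parameters in the example are illustrative.

```python
def ntt(a, omega, q):
    """Naive number-theoretic transform; omega must be an n-th root of unity mod q."""
    n = len(a)
    return [sum(a[j] * pow(omega, i * j, q) for j in range(n)) % q for i in range(n)]

def cyclic_convolution(a, b, omega, q):
    """Point-wise products in the spectral domain give cyclic convolution with no
    zero-padding, the idea underlying FMLM3's modular multiplication steps."""
    n = len(a)
    C = [(x * y) % q for x, y in zip(ntt(a, omega, q), ntt(b, omega, q))]
    n_inv, w_inv = pow(n, -1, q), pow(omega, -1, q)
    return [n_inv * sum(C[j] * pow(w_inv, i * j, q) for j in range(n)) % q
            for i in range(n)]

# Example: q = 257 (= 2^8 + 1), n = 4, omega = 16 (16^2 = -1 mod 257).
print(cyclic_convolution([1, 2, 3, 4], [5, 6, 7, 8], omega=16, q=257))  # [66, 68, 66, 60]
```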



Shift_ctrl signals are used in the multiplications, and the channel switcher reorders
the output digits. The operators A and B and the channel switcher (CS) consist of
eight 2-to-1 MUX clusters designed to ensure that the intermediate digits can be
written into the correct RAM location.
The FFT/FFT−1 block is structured with six inputs: four inputs for the BFS
computation and two inputs for the pre-computed upper bounds needed to limit the
results of NCT−1 before accumulation. The digits are non-negative and equal to x_n or
x_n + M. Because the lower bounds are non-positive integers while x_n ∈ [0, M), we
only have to check the upper bounds and correct the x_n + M cases by subtracting M.
An accumulator is designed to adjust the results of CT−1. Two neighboring digits are
added in each cycle, and the corresponding outputs are produced in two steps. In the
first step, it computes:

R0 = x_{i+1} · 2^u + x_i   (1)

where x_i and x_{i+1} denote the two input digits. In the second step, R0 is added to
the accumulation:

r_{i+1} = R0 + ⌊r_i / 2^{2u}⌋,   X_{i+1} · 2^u + X_i = r_i mod 2^{2u}   (2)

where X_i and X_{i+1} denote the two output digits and r_i denotes the value stored
in the accumulation register (r_0 = 0).
The control signal shift_ctrl0 transmits the twiddle factors (the powers of ω) to
handle the shift operation. When c ≥ 1, A = 2^c is an integer and ω has integer
powers. When c = 1/2, A ≡ √2 ≡ 2^{3·2^{v−3}} − 2^{2^{v−3}} mod M; consequently, one
subtraction, two shifts, and three modulo-M reductions are necessary to multiply by A.
For the case c = 1/2, the computation of NCT and NCT−1 needs additional operations
compared with CT and CT−1, owing to the non-integer power of ω and the size of the
irrational number. Considering the NCT computation when A = √2, the twiddle factors
are obtained as ω^{−⌊n/J⌋·J + J/2}, where J = 2^{v−1−j}. Choosing the lower output
during the last-stage computation by substituting J = 1 and ω = 2, the shift amounts
of the last stage are computed as

ω^{−⌊n/J⌋·J + J/2} = 2^{3·2^{v−3}−n} − 2^{2^{v−3}−n}   (3)



When A = √2, the final stage of NCT−1 is partitioned into two cases. When n is even,
A^n = 2^{n/2} is always an integer, so (A^n P)^{−1} ≡ 2^{2v−v−n/2} mod M is just a
power of 2. However, when n is odd, computing (A^n P)^{−1} is more complicated:
(A^n P)^{−1} ≡ (2^{3·2^{v−3}−1} − 2^{2^{v−3}−1}) · 2^{2v−v−(n−1)/2} mod M   (4)

2.1 Multiply Adder Unit (MAU)

The MAU implements the point-wise multiplication and the addition of the FMLM3 and is
suitable when the operand size is no larger than a few hundred bits. The multiply
adder unit works in pipelined mode with three (cP + 1)-bit inputs (denoted A, B, and
C) and one output producing (A × B mod M) + C.
To improve the performance of the multiplication, the Karatsuba method is applied;
the operand size at recursion level i, denoted d^(i), satisfies

d^(i+1) = ⌈(d^(i) + 1) / 2⌉   (5)

where d^(0) = cP + 1. The product of A and B is reduced by the modulus M. The pipeline
depth of the designed structure is

Δ_MA = 4 + 4n + Δ_mul   (6)

where n denotes the number of recursion levels and Δ_mul denotes the pipeline depth of
the core multiplication unit.
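A recursive software sketch of the Karatsuba scheme (the algorithm only; the hardware pipelines it and hands small operands to a core multiplier) is:

```python
def karatsuba(a, b, bits):
    """Karatsuba multiplication of a, b < 2^bits; each level roughly halves the
    operand size, matching Eq. (5): d(i+1) = ceil((d(i) + 1) / 2)."""
    if bits <= 16:                          # base case: core multiplier
        return a * b
    half = (bits + 1) // 2
    a_hi, a_lo = a >> half, a & ((1 << half) - 1)
    b_hi, b_lo = b >> half, b & ((1 << half) - 1)
    z2 = karatsuba(a_hi, b_hi, half)
    z0 = karatsuba(a_lo, b_lo, half)
    z1 = karatsuba(a_hi + a_lo, b_hi + b_lo, half + 1) - z2 - z0
    return (z2 << (2 * half)) + (z1 << half) + z0
```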

2.2 Time Domain Operation (TDO) Units

The RCA, shift module, and subtractor blocks are designed to execute the time-domain
operations, taking into account the data width of the RAM block that stores the
results of the NWT [16]. Since the data size is maintained during the FFT computation,
we split these large operands into two unit segments and compute one segment per cycle
to shorten the carry chain; accordingly, every unit is designed with three pipeline stages.

3 Modified Montgomery Multiplication

In our method, we propose two FCS-based Montgomery multipliers, designated the
FCS-MM-1 and FCS-MM-2 multipliers, comprising one five-to-two (three-level) and one
four-to-two (two-level) CSA architecture, respectively. Figure 3 shows

Fig. 3 FCS-MM-1 multiplier

Fig. 4 Modified Montgomery multiplication



the FCS-MM-1 multiplier, and Fig. 4 illustrates the process of modified Montgomery
multiplication.

4 Results and Discussion

The modular multiplication steps of the FMLM3 are computed by the FFT strategy
directly, with no zero-padding, so that a lower complexity is achieved. A modified
variant of the FMLM3 is introduced, which reduces the number of domain transforms
from 7 to 5. A general parameter set selection technique is proposed for a given
operand size to support the best FFT computation. Pipelined models with single and
double butterfly structures are designed and implemented in order to investigate the
relation between the cycle requirement and the number of butterfly structures. The
top-level block of the proposed design is shown in Figs. 5 and 6.
The Virtex-6 FPGA implementation results show that the proposed FMLM3 with both one
and two butterfly structures has better area-latency efficiency than the
state-of-the-art FFT-based Montgomery modular multiplication. Furthermore, the
processing speed of the proposed multiplier is also comparable, particularly for
large transform lengths (for example, P = 64 or higher). Figures 7 and 8 show the
RTL schematic and simulation waveforms, respectively.

Fig. 5 Top level block



Fig. 6 Detailed signal view

5 Conclusion

In this work, we presented a modified variant of the FFT-based Montgomery modular
multiplication algorithm under McLaughlin's framework (FMLM3). By employing cyclic
and nega-cyclic convolutions to compute the modular multiplication steps, the
zero-padding operation is avoided and the transform length is reduced considerably
compared with ordinary FFT-based multiplication. Furthermore, we showed that in a few
special cases the number of transforms can be reduced from 7 to 5 without additional
computational effort, so the FMLM3 can be accelerated further. A general method for
efficient parameter set selection has been derived for a given operand size.
Also, pipelined models with one and two butterfly structures are designed for high
area-latency efficiency. We additionally investigated the relation

Fig. 8 Simulation

between the number of butterfly structures and the cycle requirement. The evaluation
results demonstrate that a practical physical implementation can be realized, which
can trade area cost for higher speed.

References

1. R.L. Rivest, A. Shamir, L. Adleman, A method for obtaining digital signatures and public-key
cryptosystems. Commun. ACM 21, 120–126 (1978)
2. R.L. Rivest, A description of a single-chip implementation of the RSA cipher. Lambda. 1, 4–18
(1980)
3. E. Barker, W. Barker, W. Burr, W. Polk, M. Smid, P.D. Gallagher, Recommendation for key
management–part 1: general, vol 32 (NIST Special Publication, 2012)
4. P.L. Montgomery, Modular multiplication without trial division. Math. Comput. 44, 519–521
(1985)
5. A. Karatsuba, Y. Ofman, Multiplication of multidigit numbers on automata. Soviet Physics
Doklady 7, 595–601 (1963)
Low Complexity and High Speed Montgomery … 703

6. S.A. Cook, S.O. Aanderaa, On the minimum computation time of functions. Trans. Am. Math.
Soc. 23, 291–314 (1969)
7. A. Schönhage, V. Strassen, Schnelle Multiplikation großer Zahlen. Computing 7, 281–292
(1971)
8. M. Fürer, Faster integer multiplication. SIAM J. Comput. 39, 979–1005 (2009)
9. D. Harvey, J. Van Der Hoeven, G. Lecerf, Even faster integer multiplication. arXiv:1407.3360
(2014)
10. S. Covanov, E. Thomé, Fast arithmetic for faster integer multiplication. arXiv preprint arXiv.34
(2015)
11. A.F. Tenca, C.K. Koc, A scalable architecture for modular multiplication based on Mont-
gomery’s algorithm. IEEE Trans. Comput. 52, 1215–1221 (2003)
12. M.D. Shieh, W.C. Lin, Word-based Montgomery modular multiplication algorithm for low-
latency scalable architectures. IEEE Trans. Comput. 59, 1145–1151 (2010)
13. M. Morales-Sandoval, A. Diaz-Perez, Scalable GF(p) Montgomery multiplier based on a digit–
digit computation approach. IET Comput. Digi. Tech. (2015)
14. M. Huang, K. Gaj, T. El-Ghazawi, New hardware architectures for Montgomery modular
multiplication algorithm. IEEE Trans. Comput. 60, 923–936 (2011)
15. G.C. Chow, K. Eguro, W. Luk, P. Leong, A karatsuba based Montgomery multiplier, in
IEEE International Conference in Field Programmable Logic and Applications (FPL) (2010),
pp. 434–437
16. M.K. Jaiswal, R.C.C. Cheung, Area-efficient architectures for large integer and quadruple
precision floating point multipliers, in 2012 IEEE 20th Annual International Symposium in
Field-Programmable Custom Computing Machines (FCCM) (2012), pp. 25–28
An Efficient Group Key Establishment
for Secure Communication to Multicast
Groups for WSN-IoT Nodes

Thirupathi Durgam and Ritesh Sadiwala

Abstract Wireless sensor networks (WSNs) are an important element of Internet of
Things (IoT) technologies. Broadcast, multimedia, and group communications deliver an
effective means of communication between resource-limited nodes in IoT-WSNs compared
with device-to-device communication. Limits on the processing capacity and power
usage of sensor nodes also make it impossible to apply the encryption strategies
developed for traditional networks. This paper introduces group key establishment
protocols for secure multicast communications between resource-limited IoT devices.
We present a new matrix-based scheme for heterogeneous wireless sensor networks
(HWSNs). In an HWSN, the cluster heads have more energy, communication, and
processing capability than the cluster members; this heterogeneity reduces the
security overhead of the clusters, since the cluster heads can perform the expensive
computations on behalf of the cluster members. Our system has several advantages in
energy consumption compared with other classical key management systems. The
experimental study shows that our system can maintain complete network connectivity,
control configurations, explicitly set up pairwise keys between neighboring cluster
members, and minimize storage overhead.

Keywords Internet of Things (IoT) · Wireless sensor networks (WSNs) · Heterogeneous wireless sensor networks (HWSNs) · Session key establishment

1 Introduction

The IoT has become a powerful aspect of networking technology for the next decade.
Wireless sensor networks (WSNs) form a core building block of IoT applications [1].
With Internet-connected sensors and intelligent devices increasing, security service
providers have a fair opportunity. International Business Machines (IBM) recently
revealed the IoT Solutions Practice product; within this security suite, IBM offers
different security services. Cisco Systems has estimated that the number of
Internet-connected devices will exceed 50 billion by 2022 [2]. Many sensors (i.e.,

T. Durgam (B) · R. Sadiwala


Department of ECE, RKDF University (UGC), Bhopal, India


heat, pressure, humidity, radiation sensors), IoT devices, power actuators, moni-
toring equipment, and several networking devices have been included in the Internet-
linking devices. Data often swing rapidly as the growth rate of the connecting devices
increases. Contributions and organizations to the relevant work were regularly inves-
tigated and checked based on security based on key control, trust management, and
authentication. Sensor systems with reduced battery power and computing capacity
are usually called resource-restricted devices [3, 4]. Ensuring for key group estab-
lishment is a prime feature to ensure that message transmission in such multicast
communities is integral, authenticated, and confidential [5]. In addition, group key
establishment protocols must accommodate IoT-powered WSNs devices and network
features, including resource limits, scalability, and the creation of dynamic groups.
Data consistency and group verification are likewise the minimum standards for
multicast security. These requirements are met by encrypting the multicast messages
with a cryptographic traffic encryption key (TEK), defined as the group key [6, 7].
The group key controller (GKC) produces and distributes the group key to all group
participants. To preserve data privacy, the group key should be updated regularly;
the GKC alters and distributes it according to a group key control algorithm. Any
group rekey algorithm must provide backward secrecy, which ensures that a passive
opponent who knew an old subset of group keys cannot reveal the next keys, and
forward secrecy, which ensures that an adversary who knew a subset of group keys
cannot identify the previous ones. These algorithms impose communication and
computation overheads both on the GKC and on each group member during the computation
of the new group key. Secret sharing is used in various WSN protection protocols,
including key administration and confidentiality of records. The authenticated group
key transmission protocol suggested in [8] calls for the generation and distribution
of the group key through an online key generation center (KGC), which increases the
overhead and reduces the versatility of the mechanisms implementing it. This work
paves the way for the keying system reproduced in [9, 10]. A more complex system
works without trusted KGCs: the group leader is among the group members, and the
final key derivation includes all the stakeholders as well. However, such a scheme
does not include ubiquitous cipher-suites for globally linked IoT systems and
contains pairing-based calculations.
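The GKC-driven rekeying pattern described above can be sketched as follows (a toy construction, not any cited protocol; a real deployment would wrap the TEK with an AEAD):

```python
import os, hmac, hashlib

def wrap(pairwise_key, tek, nonce):
    """Toy key wrap: TEK XOR PRF(pairwise key, nonce)."""
    pad = hmac.new(pairwise_key, nonce, hashlib.sha256).digest()[:len(tek)]
    return bytes(t ^ p for t, p in zip(tek, pad))

def gkc_rekey(member_keys, leaving=None):
    """GKC generates a fresh TEK and distributes it wrapped under each remaining
    member's pairwise key; excluding a member enforces the secrecy properties."""
    tek, nonce = os.urandom(16), os.urandom(16)
    msgs = {mid: wrap(k, tek, nonce)
            for mid, k in member_keys.items() if mid != leaving}
    return tek, nonce, msgs
```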
We use this scheme as our core scheme and refine its model implementation, showing
that knowledge of sensor deployment can help boost the performance of a pairwise key
predistribution scheme. Structured modeling of an adequate group key establishment
protocol in the IoT-enabled WSN implementation paradigm is needed to secure
multicasting. Performance metrics such as efficiency, scalability, and integrity are
compared with related work, and the applicability of the protocol is defined. We
justify the proposal as alleviating the current security limitations of other
traditional systems while offering improved performance for secure communications.
The main contributions and organization of this paper are summarized as follows:
In Sect. 2, we describe literature review of IoT-based WSN systems. Section 3 is
proposed work. Section 4 is results and discussion. Finally, in Sect. 5, we concluded
the paper.

2 Related Work

The authors of [11] addressed the effectiveness of messages transmitted in group communications by broadcasting and multicasting to resource-restricted sensor nodes in IoT-enabled WSNs. Secure key management is essential to ensure that multicast communications are accurate, integrity-protected, and confidential. That work establishes two group key protocols for secure multicast communication between resource-constrained IoT devices. The main deployment conditions and specifications of each protocol are defined in terms of particular IoT implementation scenarios. In addition, an extensive study of the efficiency, scalability, and security of these protocols is evaluated, and the applicability of both protocols is justified. In [12], the authors deliberately avoid constructing a distributed system that reflects the network's topology. Their protocol exchanges more complicated messages and carries out some additional local computations so as to handle more complicated networks; however, these additional computations are simple and spread evenly among the network participants, contributing to a good energy balance. They test protocol performance in a simulated environment and compare their findings with established group key establishment protocols. The protocol's security rests on the Diffie-Hellman problem, and its elliptic-curve analog is used in the experiments. Their results essentially show the feasibility of implementing the protocols in actual solutions that must meet energy and time requirements. The authors in [13] introduced, based on the recently suggested IP-based multicast protocol, a new key protocol for controlling group-based communication in dense WSNs. Operations based on confidentiality, integrity, and authenticity are created. The adapted protocol relies on a cloud-based network multicast manager (NMM) for creating, controlling, and authenticating groups within the WSN, but the NMM cannot extract the actual group key. The procedure comprises three major stages. First, the nodes register by submitting a message to the NMM. Second, in the key construction stage, the group members calculate the mutual group key. Two separate approaches, various groups and multihop contact, have been tested to examine the effect of the suggested frameworks on the WSN. These simulations show that the multicast solution is less time-consuming and more energy-efficient and uses almost the same amount of code memory as the unicast approach.
In [14], the authors report a secure process for geocasting in a spatially deployed WSN. The virtual framework on which they built the protocol allows for distributed network use and especially efficient use of base station (BS) control. The security solution is an elliptic curve cryptography (ECC) key management feature that produces symmetric keys and resolves the key exchange issue between network sensor nodes after deployment, avoiding many threats. The protocol is simple and energy-efficient besides integrating the critical elements of security. Considering the aforementioned effects, many challenges remain: the authors expect to investigate fault-tolerant geocast in three dimensions, to ensure that stations receive objects within a limited amount of time, and to handle the case where the base station is not the network core and some nodes cannot be reached. In [15], the authors provide methods to guarantee users' security before connections to the sensor network facilities are made and data are obtained. They attempt to speed up the computation of scalar multiplications through a concurrent strategy that distributes the calculation into individual tasks processed in parallel by multiple nodes. They develop the stable Crypto-ECC multicast routing protocol, which reflects the limitations of the WSN. Finally, the suggested approach is tested on TelosB sensors.

3 Materials and Methods

The field of multicast applications is as diverse as the IoT technology field itself, including smart homes, intelligent businesses, environmental monitoring, and health care. The following smart industry case helps in understanding the major criteria for multicast support. Figure 1 shows the first use case, controlling switches in a smart industry. The industrial surveillance network gathers data on pressure, temperature, and level from all parts of the industry and provides aggregated data at a central gateway. Based on the received data, the central gateway can trigger synchronized operations on several machines in the industrial section (e.g., on and off commands). Security services are required to ensure the secrecy and confidentiality of these communications, and they are not feasible without secure and reliable key management. Thus, the primary emphasis in peer group protection work is on key control and authentication. However, if any participant (member) is free to join the group in dynamic peer groups (DPGs), group key management is pointless, since anyone could access the group key. Methods to monitor membership are also required. Access control is therefore essential to allow only authorized users (members) to enter the group. This is a central concern, since all other DPG security resources are group-based. In brief, group admission control is used to determine eligibility for membership and to bootstrap other essential security resources, such as secure group key management and routing. Admission monitoring is therefore a prerequisite for group key management.
System model
To construct the model, we consider a multicast network with n nodes (in our case, n = 3; see Fig. 1); one of them is treated as the initiator node that starts the process, and the remaining (n − 1) are group members. The group members, often called the responder nodes, are assigned identities Uj for j = 1, 2, …, (n − 1) by the originator. A standard secret key, known to the initiator and the responders, is used for secure communication within the multicast network.
Sign the message
(1) Initially, take a random number y ∈ Z*p and compute Y = yG.

Fig. 1 Use case of smart industry for multicast group key management

(2) Compute x = h(Ui||M||R||Y) and z = y + s·x, where M denotes the message and h denotes the hash function.
(3) The appended signature is (R, x, z).
The signer transmits message M in combination with the signature (R, x, z).
Verify the message
(1) Compute c = h(Ui||R).
(2) Validate that x = h(Ui||M||R||(zG − x(R + c·Qi))),
where Qi is the public key of the signer (i.e., the sender), which is already known to the receiver.
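To make the algebra concrete, the following minimal Python sketch runs the sign/verify steps end to end. It is an illustration only: for readability the group is modeled additively over integers modulo a prime q (so kG is simply k·G mod q), which makes the discrete logarithm trivial and offers no security; a real deployment would use an elliptic-curve group. The identity, message, and toy parameters q and G are assumptions, and s is formed as in the session key establishment step described below (s = r + di·h(Ui||R)).

```python
import hashlib
import secrets

# Toy additive group: "points" are integers mod the prime q and kG = k*G mod q.
# This exposes only the algebra of the scheme; the discrete log is trivial
# here, so it offers NO security. A real system would use an elliptic curve.
q = 2**127 - 1                       # assumed toy group order (a Mersenne prime)
G = 5                                # assumed toy generator

def pt(k):                           # scalar multiplication kG in the toy group
    return (k * G) % q

def h(*parts):                       # hash of the "||"-concatenated fields, mod q
    data = b"||".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

U_i, M = "U_1", "multicast payload"  # assumed identity and message
d_i = secrets.randbelow(q)           # signer's long-term private key
Q_i = pt(d_i)                        # public key Q_i = d_i*G, known to receivers

r = secrets.randbelow(q)
R = pt(r)
s = (r + d_i * h(U_i, R)) % q        # as in the session-key establishment step

# Sign: Y = yG, x = h(U_i||M||R||Y), z = y + s*x
y = secrets.randbelow(q)
Y = pt(y)
x = h(U_i, M, R, Y)
z = (y + s * x) % q

# Verify: c = h(U_i||R); accept iff x == h(U_i||M||R||(zG - x(R + c*Q_i)))
c = h(U_i, R)
Y_rec = (pt(z) - x * ((R + c * Q_i) % q)) % q   # recovers Y
print(x == h(U_i, M, R, Y_rec))                 # True: signature verifies
```

The check succeeds because zG = Y + x·sG and sG = R + c·Qi, so the verifier recomputes Y without ever learning y or s.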
This work employs signatures to ensure the integrity of data so that it can be
used in conjunction with message transmission. As IoT networks are generated from
IPv6 addresses, the use of device IDs with the signature method would be an added
advantage. We propose in this work a system for establishing a key for sessions
between the gateway and the cluster head, so that the cluster head may communicate
encrypted data to the gateway. In order to provide additional security, our proposal
should remove the vulnerabilities of the existing schemes. Furthermore, the design
of the system should not only take account of security but also energy costs, since the
nodes of the WSN are battery powered. For this purpose, the proposed system offers
features for establishing a session key between the head of the cluster and the gateway
and for encrypting and decrypting the data used. The session key ought not to be
revealed to an attacker trying to eavesdrop on the sent messages, since the strategy
provided secures the data with a session key. In addition, although long-term cluster
head parameters are revealed to an attacker, it is not appropriate for the attacker to
estimate future or previous session keys.
Session Key establishment
To begin the transaction, a message is sent from the initiator I, which picks a random sequence r ∈ Z*p and computes R = rG. With the help of the initiator identity Ui, the hash function h, and the private key di, the value s is computed as s = r + di·h(Ui||R), as depicted in Fig. 2.
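One consequence of this construction, continuing the toy-group sketch above (reusing q, G, pt, h, and the key material), is that any respondent can reconstruct the point sG from public values alone, which is exactly what the verification step exploits:

```python
# sG = (r + d_i*h(U_i||R))*G = R + h(U_i||R)*Q_i, so the public values
# R and Q_i determine the point sG even though s itself stays secret.
c = h(U_i, R)
assert pt(s) == (R + c * Q_i) % q
```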
Process of Rekeying and Vault
Vault uses Shamir's secret sharing algorithm, in which a secret is split into a number of parts so that no single participant has complete access to the system; a subset of those parts is required to rebuild the original secret. Vault makes extensive use of this algorithm as part of its unsealing method. When a vault server is started, vault produces a master key and immediately breaks it into several key shares following Shamir's secret sharing algorithm. The master key is never stored by vault; it can only be regenerated from a quorum of unseal keys. The master key is used to decrypt the encryption key, and vault uses this encryption key to encrypt the data at rest in the storage backend, such as the file system or Consul. Each of these key shares is usually allocated to a trustworthy party in the organization, and these parties must bring their key shares together to "unseal" the vault.
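The following self-contained Python sketch shows the splitting and quorum-reconstruction steps just described; it is a minimal illustration of Shamir's scheme over a prime field (the field prime, share count n = 5, and threshold k = 3 are assumed example parameters, not Vault's actual implementation).

```python
import secrets

P = 2**127 - 1   # prime field for share arithmetic (assumed example)

def split(secret, n, k):
    """Split `secret` into n shares; any k of them reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num = den = 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        secret = (secret + yj * num * pow(den, -1, P)) % P
    return secret

master_key = secrets.randbelow(P)
shares = split(master_key, n=5, k=3)          # five key shares, quorum of three
assert reconstruct(shares[:3]) == master_key  # any three shares "unseal"
# Rekeying then amounts to generating a NEW master key and re-running
# split, which invalidates all previously issued shares.
```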

Fig. 2 Process of message flow in session key establishment

The rekeying procedure usually involves decrypting the secured data with the old key and re-encrypting it with the new one, which is the more expensive method for data center maintenance, as shown in Fig. 3. We regenerate the master key and key shares whenever anyone joins or leaves the multicast network or security needs require adjusting the number of shares or the share threshold, and we exchange the master key regularly to comply with enforcement mandates. Besides renewing the master key, the underlying encryption key that vault uses to encrypt data at rest can also be rotated. In vault, rekeying and rotating are two distinct processes: "rekeying" is the method of producing a new master key and reapplying Shamir's algorithm, whereas "rotation" is the mechanism of creating a fresh encryption key to encrypt data at rest. Both rekeying and rotating the encryption key are fully online operations, and vault maintains uninterrupted service to applications during each of these procedures.

Fig. 3 Rekeying process with key shares

4 Results and Discussions

The energy consumption of computation, transmission, and reception is determined on the responder side for both the standard and the proposed protocols over different network sizes n, as shown in Fig. 4. As seen in Fig. 4, the energy cost of key establishment at the end nodes in our suggested system is considerably lower than in the conventional scheme as the group grows.
We estimate the energy consumption of the proposed approach (data transmission and reception with ten bytes of protocol headers) by adding the energy required for computation to the energy required for communication. The deduced values, used as the energy model, are summarized in Table 1.
From Fig. 5, it is clear that the energy cost of the proposed system is lower in all three communication cases than that of the traditional system. This occurs

Fig. 4 Energy costs for key predistribution versus network size for the protocols used (x-axis: size of the network n, 0–1000; y-axis: energy cost for predistribution in J; curves: proposed vs. traditional)

Table 1 Estimated energy costs of nodes

Task                  Energy cost of traditional (μJ)   Energy cost of proposed (μJ)
1-byte transmission   6.72                              5.35
1-byte reception      7.32                              6.142
Encryption            41.13                             31.54
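As a rough sanity check of this energy model, the per-byte and per-encryption costs of Table 1 can be combined for a single secured message; the 32-byte payload below is an assumed example size, while the ten-byte header comes from the text.

```python
# Per-byte / per-operation costs from Table 1 (in microjoules)
TX = {"traditional": 6.72, "proposed": 5.35}     # 1-byte transmission
RX = {"traditional": 7.32, "proposed": 6.142}    # 1-byte reception
ENC = {"traditional": 41.13, "proposed": 31.54}  # one encryption

payload, header = 32, 10   # bytes; payload size is an assumption
for scheme in ("traditional", "proposed"):
    e = (payload + header) * (TX[scheme] + RX[scheme]) + ENC[scheme]
    print(f"{scheme}: {e:.1f} uJ per transmitted-and-received message")
# traditional: 630.8 uJ, proposed: 514.2 uJ -- the lower per-byte and
# encryption costs of the proposed scheme compound with message size.
```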

Fig. 5 Energy costs for different tasks for the communication

because the group key management communicates directly with the central gateway without wasting the resources of the group members.

5 Conclusion

IoT is spreading quickly across the Internet, which requires secure connectivity. In practice, an appropriate security algorithm decides the privacy of both documented and undocumented IoT data sources. While all the protocols relate directly to scalability and security characteristics, the proposed system often outperforms conventional schemes in energy consumption. The traditional scheme is more suitable for dispersed IoT implementations, where group members must contribute significantly and need greater randomness in key computation. Since that places a larger energy cost on the responder side, the proposed approach is best suited to centralized IoT implementations, where the major source of cryptography is an object with a low-energy profile at the edge nodes.

References

1. L. Harn, C.-F. Hsu, Z. Xia, Z. He, Lightweight aggregated data encryption for wireless sensor networks (WSNs). IEEE Sens. Lett. 5, 1–4 (2021). https://doi.org/10.1109/LSENS.2021.3063326
2. K. Taeeun, The internet-connected device vulnerability information management system in IoT environment. Int. J. Internet of Things Big Data 4, 17–22 (2019). https://doi.org/10.21742/IJITBD.2019.4.1.03
3. G. Chakma, N. Skuda, C. Schuman, J. Plank, M. Dean, G. Rose, Energy and area efficiency in neuromorphic computing for resource constrained devices, in Proceedings of the 2018 Great Lakes Symposium on VLSI (2018), pp. 379–383. https://doi.org/10.1145/3194554.3194611
4. V.G. Kiran, C. Rai, FPGA implementation of simple encryption scheme for resource-constrained devices. Int. J. Adv. Trends Comp. Sci. Eng. 9, 5631–5639 (2020). https://doi.org/10.30534/ijatcse/2020/213942020
5. J. Carracedo, A. Corona, Cryptanalysis of a group key establishment protocol. Symmetry 13(332) (2021). https://doi.org/10.3390/sym13020332
6. M. Kumar, A. Kishor, Network traffic encryption by IPSec. Int. J. Comput. Sci. Eng. 7, 912–915 (2019). https://doi.org/10.26438/ijcse/v7i5.912915
7. O. Ahmedova, U. Mardiyev, O. Tursunov, Generation and distribution secret encryption keys with parameter, in 2020 International Conference on Information Science and Communications Technologies (ICISCT), vol. 1 (2020), pp. 1–4. https://doi.org/10.1109/ICISCT50599.2020.9351446
8. P. Jaiswal, S. Tripathi, An authenticated group key transfer protocol using elliptic curve cryptography. Peer-to-Peer Netw. Appl. 10 (2017). https://doi.org/10.1007/s12083-016-0434-7
9. C.-Y. Lee, Z.-H. Wang, L. Harn, C.-H. Chang, Secure key transfer protocol based on secret sharing for group communications. IEICE Trans. 94-D, 2069–2076 (2011). https://doi.org/10.1587/transinf.E94.D.2069
10. Y. Sun, Q. Wen, H. Sun, W. Li, Z. Jin, Z. Huan, An authenticated group key transfer protocol based on secret sharing. Proc. Eng. 29, 403–408 (2012). https://doi.org/10.1016/j.proeng.2011.12.731
11. P. Porambage, A. Braeken, C. Schmitt, A. Gurtov, M. Ylianttila, B. Stiller, Group key establishment for secure multicasting in IoT enabled wireless sensor networks (2015). https://doi.org/10.1109/LCN.2015.7366358
12. I. Chatzigiannakis, E. Konstantinou, V. Liagkou, P. Spirakis, Design, analysis and performance evaluation of group key establishment in wireless sensor networks. Electron. Notes Theor. Comput. Sci. 171, 17–31 (2007). https://doi.org/10.1016/j.entcs.2006.11.007
13. M. Carlier, A. Braeken, Symmetric-key-based security for multicast communication in wireless sensor networks. Computers 8, 27 (2019). https://doi.org/10.3390/computers8010027
14. A. Bomgni, E. Fute, G. Brel, G. Mdemaya, A. Anastasie, K. Donfack, C.T. Djamegni, A. Leo, Energy efficient and secured geocast protocol in wireless sensor network deployed in space (3D). Int. J. Wirel. Mobile Netw. 10(11) (2018). https://doi.org/10.5121/ijwmn.2018.10202
15. W. Jerbi, A. Guermazi, H. Trabelsi, Crypto-ECC: a rapid secure protocol for large-scale wireless sensor networks deployed in Internet of Things (2020). https://doi.org/10.1007/978-3-030-48256-5_29
Design of Sub-volt High Impedance Wide
Bandwidth Current Mirror for High
Performance Analog Circuit

P. Anil Kumar, S. Tamil, and N. Raj

Abstract The demand for high-performance, long-battery-life portable wearable devices has forced the electronics industry to come up with new methods of circuit realization so as to achieve better performance at sub-volt supplies. In this paper, the performance of a widely used analog block, the current mirror, is enhanced. The proposed current mirror is a flipped voltage follower-based structure whose performance enhancement is achieved in terms of output resistance. In order to boost the resistance, the output section of the current mirror uses a regulated cascode stage, which helps increase the output resistance from 880 kΩ to 32 MΩ. The regulated cascode uses a feedback concept that not only provides the resistance-boosting factor but also reduces capacitance, leading to a bandwidth improvement; the bandwidth observed for the proposed current mirror is 2.2 GHz. The complete analysis is done using MOSFET models of 180 nm technology at a dual supply voltage of 0.5 V.

Keywords Current mirror · Flipped voltage follower · Regulated cascode · Bandwidth · Output resistance

1 Introduction

The performance of any system is decided by the circuit configurations used for its implementation. For analog systems, one such fundamental and extensively used block is the current mirror. Common applications of current mirrors are in the biasing of amplifiers, active loads, current amplification, filtering, level shifting, etc. [1]. The role of a current mirror is to generate an output current as a function of the input current. The ideal characteristics of a current mirror that determine its performance include wide dynamic range and bandwidth, low input resistance, and high output resistance. Apart from this, the operating voltage is also an important parameter, as it decides the amount of power consumption. However, at low supply, the fulfillment

of these ideal requirements becomes difficult. The main obstacle in the design of low-power current mirrors is the threshold voltage of the MOS transistor, the minimum voltage required to turn on the MOSFET. A number of techniques have been adopted in current mirror realization to achieve the desired performance [2], among which a few recently reported current mirror architectures can be found in [3–10]. The current mirror proposed in this paper is based on the low-voltage block flipped voltage follower (FVF) [11] and a regulated cascode structure. The FVF is a modified form of the conventional source follower configuration that operates at a low supply voltage. FVF-based current mirrors reported in the literature can be found in [12–18]. The current mirror proposed in this paper has a wide dynamic range, gigahertz-range bandwidth, and high output impedance compared to its conventional design. The paper is organized as follows: a brief analysis and discussion of the proposed current mirror, including its mathematical analysis, is given in Sect. 2. The simulation results are shown in Sect. 3, followed by the conclusion in Sect. 4.

2 Proposed Current Mirror

The conventional current mirror based on the flipped voltage follower is shown in Fig. 1a. It includes four N-type MOS transistors (M1–M4). As the drain current of M3 is held constant by the current source IB1, any change in the input current is sensed by M1, which accordingly produces a suitable change in its gate-to-source voltage (Vgs,M1) that modulates the output current (Iout). Vbias is the DC voltage applied to maintain M3 and M4 in saturation.

Fig. 1 a Conventional FVF current mirror; b proposed FVF current mirror

Performing a routine small signal analysis gives the input and output resistances of the FVF current mirror as (1/gm1) and (gm4·r04·r02), respectively, where gmi and r0i denote the transconductance and output resistance of the related transistor. The resulting output resistance of (gm4·r04·r02) lies in the kilo-ohm range, which is not sufficient for precise applications. In the proposed design, the output section is modified by a cascode approach, i.e., a regulated cascode, as shown in Fig. 1b. The output section is first modified using transistor M5. The voltage headroom of M2 is regulated via the M4 and M5 transistors. The feedback loop amplifier implemented using M5 and IB1 prevents variations in the drain-to-source voltage of M2, ensuring better stability. The working is similar to that of a cascode; however, this configuration yields an additional multiplying factor of (gm·r0) in the output resistance [19]. Moreover, Cgd,M4 does not appear in the current input path as it does in the traditional cascode current mirror, and the reduced capacitance improves the bandwidth of the circuit. The effective output resistance, compared to the conventional FVF current mirror design, is boosted by (gm·r0) and turns out to be in the mega-ohm range.

2.1 Small Signal Analysis

During the analysis, the symbols used have their usual meanings and match the standard SPICE model parameters of MOS transistors. All MOS transistors are assumed to operate in the saturation region.

2.1.1 Input Resistance


 
The small signal model for calculating the input resistance (Rin,prop.) of the proposed current mirror is shown in Fig. 2.

Fig. 2 Small signal model for calculating input resistance (nodes 1 and 2; elements gm1, gm3, r01, r03)

At node one,

iin = −gm3·V2 + (V1 − V2)/r03    (1)

At node two,

gm1·V1 + V2/r01 + gm3·V2 + (V2 − V1)/r03 = 0    (2)

Since gm·r0 ≫ 1,

V2 = −(gm1/gm3)·V1    (3)

From (1) and (3),

Rin,prop. = V1/iin ≈ 1/gm1    (4)

From (4), it can be observed that, compared to the conventional current mirror, there is almost no change in the input resistance, since no changes were made to the input section.
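Equations (1)–(4) can also be checked symbolically. The short SymPy sketch below (an illustration using the paper's symbol names, not part of the original analysis) solves the two nodal equations and confirms that the exact input resistance collapses to 1/gm1 once the output resistances are large.

```python
import sympy as sp

V1, V2, iin = sp.symbols('V1 V2 iin')
gm1, gm3, r01, r03 = sp.symbols('gm1 gm3 r01 r03', positive=True)

# Nodal equations (1) and (2) of the input-resistance model
eq1 = sp.Eq(iin, -gm3*V2 + (V1 - V2)/r03)
eq2 = sp.Eq(gm1*V1 + V2/r01 + gm3*V2 + (V2 - V1)/r03, 0)

sol = sp.solve([eq1, eq2], [V1, V2], dict=True)[0]
Rin = sp.simplify(sol[V1] / iin)                # exact Rin = V1/iin
print(sp.simplify(sp.limit(Rin, r01, sp.oo)))   # -> 1/gm1, matching Eq. (4)
```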

2.1.2 Output Resistance


 
The small signal model for calculating the output resistance (Rout,prop.) of the proposed current mirror is shown in Fig. 3.
At node four,

iout = gm4·V53 + (V4 − V3)/r04,  where V53 = V5 − V3    (5)

At node three,

V3 = iout·r02    (6)

At node five,

V5 = −gm5·r05·V3    (7)

From (5), (6), and (7),



Fig. 3 Small signal model for calculating output resistance (nodes 3, 4, and 5; elements gm4, gm5, r02, r04, r05)

Rout = V4/iout = r04 + gm4·gm5·r05·r02·r04 + r02    (8)

Since gm·r0 ≫ 1,

Rout,prop. ≈ (gm4·r04)(gm5·r05)·r02    (9)

whereas for conventional FVF current mirror, it is given as

Rout,conv. ≈ (gm4·r04)·r02    (10)

Comparing (9) with (10), a multiplying factor of (gm·r0) is observed over the output resistance of the conventional FVF current mirror, where (gm·r0) is the result of using the regulated cascode configuration. The effective resistance of the proposed current mirror is thus enhanced from the kilo-ohm to the mega-ohm range.
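The same kind of symbolic check applies to Eqs. (5)–(9). The sketch below (again illustrative, reusing the paper's notation with V53 = V5 − V3) derives the exact output resistance and shows that its dominant term is the (gm4·r04)(gm5·r05)·r02 product of Eq. (9).

```python
import sympy as sp

V3, V4, V5, iout = sp.symbols('V3 V4 V5 iout')
gm4, gm5, r02, r04, r05 = sp.symbols('gm4 gm5 r02 r04 r05', positive=True)

eqs = [sp.Eq(iout, gm4*(V5 - V3) + (V4 - V3)/r04),  # node 4, Eq. (5)
       sp.Eq(V3, iout*r02),                          # node 3, Eq. (6)
       sp.Eq(V5, -gm5*r05*V3)]                       # node 5, Eq. (7)

sol = sp.solve(eqs, [V3, V4, V5], dict=True)[0]
print(sp.expand(sol[V4] / iout))
# -> r02 + r04 + gm4*r02*r04 + gm4*gm5*r02*r04*r05
# For gm*r0 >> 1 the last term dominates: Rout ~ (gm4*r04)*(gm5*r05)*r02.
```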

3 Simulation Results

The current mirror circuits shown in Fig. 1a, b were simulated using MOS models of 180 nm technology at a ±0.5 V supply. The simulation results match the mathematical analysis. The MOS widths and lengths, along with the other parameters assumed for the circuit simulations, are listed in Table 1. The input bias current is set to 65 uA, ensuring a low offset in the circuit. The output characteristic is shown in Fig. 4, where the input current is swept from 0 to 200 uA in steps of 50 uA.

Table 1 W and L of MOS transistors used in proposed current mirror


MOSFET W (um) L (um) MOSFET W (um) L (um)
M1 25 0.24 M5 2 0.24
M2 25 0.24 M4 5 0.24
M3 5 0.24
Supply = ±0.5 V, IB1 = 10 uA, IB2-IB4 = 30 uA

Fig. 4 Output
characteristics

As seen, the proposed current mirror operates with minimal error. The frequency response, input resistance, and output resistance plots are shown in Figs. 5, 6, and 7, respectively. As seen in Fig. 5, the bandwidth is slightly extended due to the reduced capacitance: for the proposed current mirror it is 2.2 GHz, whereas for the conventional one it is around 1.5 GHz. There is no change in the input resistance, which remains the same in both current mirrors at 820 Ω, as seen in Fig. 6. However, a drastic improvement in the output resistance can be seen in Fig. 7: for the proposed mirror it is found to be 32 MΩ, while for the conventional one it is 880 kΩ.

Fig. 5 Frequency responses



Fig. 6 Input resistances

Fig. 7 Output resistances

The complete simulation results of the conventional and proposed FVF current mirrors are tabulated in Table 2, where they are also compared with recently reported low-power FVF current mirrors.

Table 2 Comparison of parameters of proposed current mirror with FVF current mirrors
Parameters [14] [15] [16] [17] [18] Conv. CM Prop. CM
Input current range (uA) 300 300 100 1000 0–500 0–200 0–200
Input resistance (ohm) 13.3 12.8 496 68.3 17 820 820
Output resistance (ohm) 34.3G 39.5G 1M 10.5G 750 K 880 K 32 M
Bandwidth (Hz) 210 M 216 M 181 M 402 M 4.5G 1.5G 2.2G
Supply (V) 1 1 0.9 1 ±0.5 ±0.5 ±0.5
Power (uW) 42.5 42.5 150 110 140 110 130
Technology (nm) 180 180 180 180 180 180 180

4 Conclusion

A low-voltage current mirror with gigahertz-range bandwidth and high output resistance has been presented in this paper. The output resistance achieved in the mega-ohm range with the help of the cascode approach suits its applicability in precision amplifiers. The gigahertz-range bandwidth can also serve a number of applications in high-speed circuits. The complete design has been implemented in 180 nm technology at a dual supply of 0.5 V. The microwatt power dissipation encourages its application in low-power electronic devices.

References

1. M. Akbari, A. Javid, O. Hashemipour, A high input dynamic range, low voltage cascode current
mirror and enhanced phase-margin folded cascode amplifier. Iranian Conf. Electr. Eng. 77–81
(2014)
2. F. Khateb, S. Bay, A. Dabbous, S. Vlassis, A survey of non-conventional techniques for low-
voltage low-power analog circuit design. Radioengineering 415–427 (2013)
3. X. Zhang, E. El-Masry, A regulated body-driven CMOS current mirror for low-voltage applications. IEEE Trans. Circ. Syst. II: Express Briefs 571–577 (2004)
4. P.S. Manhas, S. Sharma, K. Pal, L.K. Mangotra, K.K.S. Jamwal, High performance FGMOS-
based low voltage current mirror. Indian J. Pure Appl. Phys. 355–358 (2008)
5. F. Esparza-Alfaro, A.J. Lopez-Martin, J. Ramírez-Anguloa, R.G. Carvajal, Low-voltage highly-
linear class AB current mirror with dynamic cascode biasing. Electron. Lett. 1336–1338 (2012)
6. N. Raj, A.K. Singh, A.K. Gupta, Low power high output impedance high bandwidth QFGMOS
current mirror. Microelectron. J. 1132–1142 (2014)
7. F. Esparza-Alfaro, A.J. Lopez-Martin, R.G. Carvajal, J. Ramirez-Angulo, Highly linear
micropower class AB current mirrors using Quasi-Floating Gate transistors. Microelectron.
J. 1261–1267 (2014)
8. N. Raj, A.K. Singh, A.K. Gupta, Low-voltage bulk-driven self-biased cascode current mirror
with bandwidth enhancement. Electron. Lett. 23–25 (2014)
9. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high output impedance bulk-driven quasi-floating
gate self-biased high-swing cascode current mirror. Circ. Syst. Signal Process. 2683–2703
(2016)
10. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high performance bulk-driven quasi-floating gate
self-biased cascode current mirror. Microelectron. J. 124–133 (2016)
11. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high bandwidth self-biased high swing cascode
current mirror. Indian J. Pure Appl. Phys. 1–7 (2017)
12. R.G. Carvajal, J. Ramirez-Angulo, A.J. Lopez-Martin, A. Torralba, J.A.G. Galan, A. Carlosena,
F.M. Chavero, The flipped voltage follower: a useful cell for low-voltage low-power circuit
design. IEEE Trans. Circ. Syst. I: Regular Papers 1276–1291 (2005)
13. V.I. Prodanov, M.M. Green, CMOS current mirrors with reduced input and output voltage
requirements. Electron. Lett. 104–105 (1996)
14. S. Azhari, H.F. Baghtash, K. Monfaredi, A novel ultra-high compliance, high output impedance
low power very accurate high performance current mirror. Microelectron. J. 432–439 (2011)
15. Y. Bastan, E. Hamzehil, P. Amiri, Output impedance improvement of a low voltage low power
current mirror based on body driven technique. Microelectron. J. 163–170 (2016)
16. L. Safari, S. Minaei, A low-voltage low-power resistor-based current mirror and its applications.
J. Circ. Syst. Comput. 175–180 (2017)

17. M.S. Doreyatim, M. Akbari, M. Nazari, A low-voltage gain boosting-based current mirror with
high input/output dynamic range. Microelectron. J. 88–95 (2019)
18. N. Raj, Low voltage FVF current mirror with high bandwidth and low input impedance. Iranian
J. Electr. Electron. Eng. 1–7 (2021)
19. A. Torralba, R.G. Carvajal, J. Ramirez-Angulo, E. Munoz, Output stage for low supply voltage
high-performance CMOS current mirrors. Electron. Lett. 1528–1529 (2002)
Low-Voltage Low-Power Design
of Operational Transconductance
Amplifier

Rajesh Durgam, S. Tamil, and Nikhil Raj

Abstract In this paper, the design of a low-power, low-voltage operational transconductance amplifier is presented. The amplifier realization is basically current-mirror-based, which has wide application in analog circuits. For low-voltage operation, the bulk-driven technique is used. As the bulk-driven approach has the disadvantage of low transconductance, the proposed amplifier uses the bulk-driven transistor in quasi-floating-gate mode. The combined feature results in a highly boosted transconductance, thereby improving gain and bandwidth with a power dissipation equal to that of the bulk-driven circuit. The design and analysis have been done using 0.18 micron technology at a dual supply of 0.5 V.

Keywords OTA · QFG MOSFET · Bulk-driven · Gain · Transconductance

1 Introduction

The rapid increase in demand for efficient portable equipment for biomedical applications has pushed industry to design low-voltage, low-power analog and mixed-signal ICs for long-term use. The general trend followed in SoC design is technology downscaling, which is easily implemented in digital circuits but not in analog circuits. The common accompanying approach is a scaled supply voltage [1]. However, the threshold voltage of the MOS transistor creates an obstacle to lowering the supply voltage beyond a certain limit: using the gate-driven (GD) technique, the supply cannot be lowered below the threshold voltage of the MOSFET. In view of this, various low-power (LP) techniques have been proposed in the literature. A few commonly known ones are bulk-driven (BD), level shifter, floating gate (FG), and quasi-floating gate (QFG) [2–5]. Among the stated techniques, BD has attracted considerable interest for low-power design due to its simple architecture. In BD MOS transistors, the gate terminal is fixed to a voltage so as to create the channel that turns the MOS transistor on, whereas the

input signal is applied at the bulk, which controls the IDS current. The issues with BD are low transconductance and poor frequency response; the decreased transconductance is visible as poor gain and bandwidth. The objective of this paper is to exploit the advantage of using the BDQFG technique over BD, which results in enhanced transconductance and, hence, improved unity gain bandwidth (UGB). The effects of using the LP techniques QFG, BD, and BDQFG on the performance of a CM-based operational transconductance amplifier (OTA) have been analyzed. The paper is organized as follows: Sect. 2 covers a brief overview of bulk-driven and bulk-driven quasi-floating-gate MOSFETs. Section 3 details the proposed OTA design realized using BDQFG as well as GD, BD, and QFG MOSFETs. The simulation results and conclusion are given in Sects. 4 and 5, respectively.

2 Low-Power Technique

The MOSFET is a four-terminal device whose fourth terminal is the bulk. Using the bulk terminal as the signal input, the threshold voltage limitation can be removed. Based on this, the BD technique was first reported in [6]. A few recent articles based on BD for realizing LP circuits can be found in [7–10]. Recently, a new approach named BDQFG has been proposed in the literature, which uses a QFG MOS transistor in BD mode. This approach results in high transconductance and, hence, an improvement in bandwidth over BD- and QFG-based circuits [11–19]. The schematics of the BD and BDQFG MOS transistors are shown in Fig. 1a, b, respectively. In Fig. 1b, the bulk is tied to the input of the QFG MOS transistor MN. Under DC analysis, it works as a standard BD transistor, whereas under AC, it combines the features of BD and QFG.

Fig. 1 N-channel: a bulk-driven (BD) and b bulk-driven QFG MOS transistor

3 Proposed OTA

In this section, the standard OTA chosen is the current-mirror-based OTA [20, 21]. The current mirror OTA uses three simple two-transistor CM topologies: the N-type CM (M7, M8) together with the P-type CMs (M3, M5) and (M4, M6) are the three basic CMs used to build the OTA. The N-type CM (M9, M10) acts as a tail current source biased using a constant current source (Ibias). The schematic of the OTA based on the standard GD approach is shown in Fig. 2. The generalized schematic of the OTA design based on a low-power technique is shown in Fig. 3, where the input signal processing block is represented by an LP technique block with three terminals A, B, and C. This LP technique block can be realized with the QFG, BD, or BDQFG technique.
The design equations governing the OTA performance parameters of Fig. 3 are:
(i) Transconductance

Fig. 2 N-channel CM-based OTA using LP technique

Fig. 3 N-channel LP technique block: (i) QFG, (ii) BD, and (iii) BDQFG
  
Gm = √(μn·Cox·(W/L)2·I10)    (1)

(ii) Output resistance

Rout = 1/(gds6 + gds8 ) (2)

(iii) DC gain
    
AV = Gm·Rout = √(μn·Cox·(W/L)2·I10) · 1/(gds6 + gds8)    (3)

(iv) Dominant pole

ω−3dB = (gds6 + gds8)/CL = 1/(Rout·CL)    (4)

(v) Unity gain bandwidth


   
UGB = Gm/CL = √(μn·Cox·(W/L)2·I10)/CL    (5)

From Eq. (3), the DC gain is a function of the input gate transconductance of M2 (gm2) and the effective output resistance (Rout) of the amplifier. Using GD, the maximum transconductance is achieved, whereas the use of the mentioned LP techniques results in a lower transconductance of M2. As is well known, QFG uses a capacitor divider network that attenuates the effective gate voltage of M2 and thus reduces the transconductance, whereas in the case of BD, the transconductance of M2 is the body transconductance (gmb2), which is usually (0.2–0.4) times the gate transconductance. For BDQFG, the effective transconductance of M2 is the combined effect of the QFG MOST gate transconductance (gm2,QFG) and the bulk MOST body transconductance (gmb2), i.e., gm2,BD−QFG = gm2,QFG + gmb2.
The BDQFG technique offers high transconductance, low output impedance, and high DC gain. The effect of the output impedance is visible in the dominant pole location, which is inversely proportional to the output impedance. In summary, the OTA unity gain bandwidth (UGB) is maximized for BDQFG at a low power consumption level. The only drawback associated with the BDQFG technique is its sensitivity to nonlinear effects, which cause DC offsets. These offsets can be removed by proper matching of the input MOS transistors and the gate input capacitors.
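To see the scale of the quantities that Eqs. (1)–(5) predict, the short script below plugs in illustrative numbers. The process constant, tail current, and output conductances are assumed values for demonstration, not parameters extracted from the simulated design; only (W/L)2 and CL follow Table 1 of Sect. 4.

```python
import math

# Illustrative values (assumptions, not the paper's extracted parameters)
un_cox = 300e-6            # un*Cox in A/V^2 (assumed process constant)
w_over_l2 = 15.12 / 1.26   # (W/L) of M2, from Table 1
i10 = 20e-6                # tail current I10 in A (assumed)
gds6 = gds8 = 1.25e-6      # output conductances in S (assumed)
cl = 1e-12                 # load capacitance CL = 1 pF, from Table 1

gm = math.sqrt(un_cox * w_over_l2 * i10)   # Eq. (1)
rout = 1 / (gds6 + gds8)                   # Eq. (2)
av = gm * rout                             # Eq. (3), in V/V
f3db = 1 / (2 * math.pi * rout * cl)       # Eq. (4), converted to Hz
ugb = gm / (2 * math.pi * cl)              # Eq. (5), converted to Hz

print(f"Gm = {gm*1e6:.0f} uA/V, Av = {20*math.log10(av):.1f} dB")
print(f"f3dB = {f3db/1e3:.0f} kHz, UGB = {ugb/1e6:.1f} MHz")  # UGB/f3dB = Av
```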

4 Simulation Results

The CM OTA of Fig. 2 has been simulated with 0.18 micron technology at a dual supply of 0.5 V. The performance of the OTA has been evaluated with the QFG, BD, and BDQFG techniques and compared to the GD approach. The dimensions of the MOS transistors and the other parameters assumed for simulation of the OTA are listed in Table 1. The effective transconductances of the respective OTAs are shown in Fig. 4. From the plots, it is seen that the decreased gate voltage of QFG results in a lower transconductance compared to that of the standard GD configuration, i.e., Gm,QFG ≈ (0.8 − 0.9)Gm, whereas for BD it is very low. However, in the case of BDQFG, the highest transconductance is observed, which is reflected in improved gain compared to BD and QFG, as shown in Fig. 5. The decreased output impedance in BDQFG results

Table 1 W and L of MOS transistors


Transistors W (um) L (um) Transistors W (um) L (um)
M1 15.12 1.26 M6 15.12 0.72
M2 15.12 1.26 M7 15.12 1.26
M3 15.12 1.98 M8 15.12 1.26
M4 15.12 0.72 M9 15.12 1.98
M5 15.12 1.98 M10 15.12 1.98
MP1 0.24 0.24 MP2 0.24 0.24
C1 = C2 = 1 pf, CL = 1 pf, Ibias = 10 uA

Fig. 4 Transconductance of CM OTA



Fig. 5 Frequency response of CM OTA

in a better −3 dB frequency compared to the other techniques. The overall effect of the high gain and decreased output impedance of BDQFG is observed in the UGB product, which is found to be equivalent to that of the conventional GD-based CM OTA but with the advantage of low power consumption. The transient analysis results are shown in Fig. 6. From the simulations, it can be concluded that BDQFG is the better choice among the others for low power consumption. The comparative performance of all the abovementioned CM OTAs, as obtained by simulation, is summarized in Table 2.

Fig. 6 Transient response of CM OTA



Table 2 Comparative analysis of performance metrics of CM OTAs


Parameters                     GD      QFG     BD      BDQFG
Transconductance (Gm) (uA/V)   102     90.5    14.9    103
DC gain (dB)                   38.4    36.81   21.51   38.31
f−3dB (kHz)                    183.05  183.83  189.12  189.14
UGB (MHz)                      14.11   12.47   2.25    14.30
Phase margin (deg)             65      68      90      64
Power (uW)                     27.37   23.47   23.78   23.78
Output impedance (kΩ)          805     798     785     785
Supply                         ±0.6 V  ±0.5 V  ±0.5 V  ±0.5 V

5 Conclusion

In this paper, a low-power, low-voltage design of an operational transconductance amplifier is presented. For low-voltage operation, the bulk-driven quasi-floating-gate approach has been used. For comparison, the proposed design is set against its conventional counterparts based on the GD, BD, and QFG techniques. It has been observed that the processing of the input stage determines the overall circuit performance. As observed, BDQFG has a significant advantage over BD in its parameters.

References

1. B.J. Blalock, P.E. Allen, G.A. Rincon-Mora, Designing 1-V op amps using standard digital
CMOS technology. IEEE Trans. Circ. Syst. II Analog Digital Sig. Process. 769–780 (1998)
2. M. Gupta, R. Pandey, Low-voltage FGMOS based analog building blocks. Microelectron. J.
903–912 (2011)
3. J.M.A. Miguel, A.J. Lopez-Martin, L. Acosta, J. Ramirez-Angulo, R.G. Carvajal, Using floating
gate and quasi-floating gate techniques for rail-to-rail tunable CMOS transconductor design.
IEEE Trans. Circ. Syst. I Regul. Pap. 1604–1614 (2011)
4. C. Garcia-Alberdi, A. Lopez-Martin, L. Acosta, R.G. Carvajal, J. Ramirez-Angulo, Tunable
class AB CMOS Gm-C filter based on quasi-floating gate techniques. IEEE Trans. Circ. Syst.
I Regul. Pap. 1300–1309 (2013)
5. N. Raj, A.K. Singh, A.K. Gupta, Low power high output impedance high bandwidth QFGMOS
current mirror. Microelectron. J. 1132–1142 (2014)
6. A. Guzinski, M. Bialko, J.C. Matheau, Body driven differential amplifier for application in
continuous-time active C-filter. Proc. ECCD, Paris France 315–319 (1987)
7. N. Raj, R.K. Sharma, Modeling of human voice box in VLSI for low power biomedical
applications. IETE J. Res. 345–353 (2011)
8. H. Khameh, H. Shamsi, On the design of a low-voltage two stage OTA using bulk-driven and
positive feedback techniques. Int. J. Electron. 1309–1315 (2012)
9. L. Zuo, S.K. Islam, Low-voltage bulk-driven operational amplifier with improved transcon-
ductance. IEEE Trans. Circ. Syst. I Regul. Pap. 2084–2091 (2013)
10. J. Gak, M.R. Miguez, A. Arnaud, Nanopower OTAs with improved linearity and low input
offset using bulk degeneration. IEEE Trans. Circ. Syst. I Regul. Pap. 1–10 (2013)

11. F. Khateb, Bulk-driven floating-gate and bulk-driven quasi-floating-gate techniques for low-
voltage low-power analog circuits design. AEU-Int. J. Electron. Commun. 64–72 (2013)
12. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high output impedance bulk-driven quasi-floating
gate self-biased high-swing cascode current mirror. Circ. Syst. Sig. Process. 2683–2703 (2015)
13. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high performance bulk-driven quasi-floating gate
self-biased cascode current mirror. Microelectron. J. 52, 124–133 (2016)
14. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high bandwidth self-biased high swing cascode
current mirror. Indian J. Pure Appl. Phys. 55, 245–253 (2017)
15. N. Raj, Low voltage FVF current mirror with high bandwidth and low input impedance. Iranian
J. Electr. Electron. Eng. 1–7 (2021)
16. F. Khateb, W. Jaikl, M. Kumngern, P. Prommee, Comparative study of sub-volt differential
difference current conveyors. Microelectron. J. 1278–1284 (2013)
17. N. Raj, A.K. Singh, A.K. Gupta, Low-voltage bulk-driven self-biased cascode current mirror
with bandwidth enhancement. Electron. Lett. 23–25 (2014)
18. N. Raj, A.K. Singh, A.K. Gupta, High performance current mirrors using quasi-floating bulk.
Microelectron. J. 11–22 (2016)
19. F. Khateb, The experimental results of the bulk-driven quasi-floating-gate MOS transistor. AEU
Int. J. Electron. Commun. 462–466 (2015)
20. P.E. Allen, D.R. Holberg, CMOS Analog Circuit Design (Oxford University Press, USA, 2002)
21. S. Ali, A power efficient gain enhancing technique for current mirror operational transconduc-
tance amplifiers. Microelectron. J. 183–190 (2015)
Automatic Detection of Cerebral
Microbleed Using Deep Bounding Box
Based Watershed Segmentation
from Magnetic Resonance Images

T. Grace Berin and C. Helen Sulochana

Abstract Cerebral microbleeds (CMBs), also referred to as cerebral microhemorrhages, are brought about by structural abnormalities of the small vessels of the brain. They have been identified as a prime diagnostic biomarker for many cerebrovascular diseases and cognitive dysfunctions. In the current clinical routine, cerebral microbleeds are manually labeled by radiologists, but this method is difficult, time-consuming, and error-prone. In this paper, we propose a new automatic method to detect cerebral microbleeds from magnetic resonance images. The usual drawbacks of watershed segmentation are oversegmentation and sensitivity to false edges. This paper introduces a novel scheme to overcome the listed limitations by first applying the deep learning bounding box (DLBB) and then watershed segmentation. All the tests have been conducted on a genuine computed tomography and magnetic resonance image dataset that we collected from the Worldwide Cancer Center, Neyyoor, India. The collected image dataset consists of 900 images of 74 patients, where each category has 300 MRI images. The dataset also contains the scan images generated from a 64-slice Somatom magnetic resonance scanner with a voxel dimension of 0.412 × 0.412 × 5.1 mm and image slice dimensions of 512 × 512, and all images were in DICOM format. The proposed method demonstrates a significant improvement that may serve as a computer-aided tool for radiologists in detecting microbleeds in magnetic resonance images, and it achieved a high sensitivity of 96%.

Keywords Cerebral microbleed · Deep learning bounding box · Watershed segmentation

1 Introduction

Cerebral microbleeds (CMBs) are tiny perivascular hemosiderin deposits caused by discharge through the cerebral small vessels. They can result from cerebrovascular


disease, dementia, or simply normal aging. MRI sequences that are sensitive to hemosiderin deposition, such as T2*-gradient recalled echo (GRE), will reveal the microbleeds [1, 2]. Modern imaging protocols such as susceptibility weighted imaging (SWI), which are regularly run at high resolution (≤1 mm³) and long echo time and use the phase image to enhance contrast, are much more sensitive in identifying small bleeds than established protocols [3]. Recent publications have shown that when SWI is weighed against standard gradient echo imaging, there is a three- to six-fold increase in the number of CMBs seen [4]. The examination of microbleeds is strenuous and time-consuming, as the radiologist or expert needs to verify slice by slice while distinguishing the black dots from mimics. CMBs are prevalent in patients with cerebrovascular and cognitive diseases (such as stroke and dementia), and they are also present in healthy aging individuals. Apart from indicating these vascular diseases, CMBs can also structurally affect their nearby brain tissues and further cause neurologic dysfunction, cognitive impairment, and dementia [5]. The observer variability in the detection of CMBs is large [6]. Additionally, the manual detection of CMBs is a time-consuming task, which can take more than two hours per traumatic brain injury (TBI) patient. A computer-aided detection (CAD) system can mitigate these drawbacks, and several CAD systems have been developed for the detection of CMBs in other patient populations. The existence of CMBs and their distribution patterns have been recognized as important diagnostic biomarkers of cerebrovascular diseases; for example, a lobar distribution of CMBs suggests probable cerebral amyloid angiopathy [7]. Image segmentation is very important and can be classified as the most difficult function in image processing. Segmentation is defined as the grouping of data that share the same characteristics, such as color intensities [8]. Generally, the watershed transformation is applied to the image gradient and shows the segmentation results as watershed lines that separate the regions. This image gradient method usually produces noisy results and poor segmentation quality or oversegmentation [9]. To reduce the effect of oversegmentation, numerous approaches have been proposed, for example, the marker-based watershed technique [10], the scale space method [11], the region merging method [12], partial differential equation methods for image enhancement [13], and the combined wavelet-watershed transformation technique [14]. In watershed segmentation, the separation of the image basically depends on the image gradient; theoretically, the image gradient corresponds to the homogeneous gray level of the image, and low-contrast images generate small gradient areas, causing distinct regions to be erroneously merged [15, 16]. Sliding window processing has been employed and produces good generalization results [17]. To speed up the convergence of the network in segmentation, a multimodal fusion technique has been implemented [18], along with a deep learning method [19] and an SVM sub-classifier [20]. This paper discusses the segmentation of the image by the deep learning bounding box and the watershed transform, in which an image enhancement technique is used to prevent oversegmentation and at the same time reduce the noise.

2 Proposed Methodology

Figure 1 shows an overview of the proposed framework, which is composed of two stages: preprocessing and segmentation. The input images are acquired from the MRI scanner in DICOM format. In the preprocessing stage, the input image is converted into gray scale. The converted image contains noise, which is removed

Fig. 1 Overview of proposed framework

Fig. 2 Preprocessed image: a original image, b filtered image

by an anisotropic filter; the image is thus sharpened and smoothened. Subsequently, in the segmentation stage, the bounding box and watershed transformation are applied.

2.1 Preprocessing: Gray Scale Conversion and Noise Removal

The first step of preprocessing is to convert the RGB image into a grayscale image. The basic purpose of applying color conversion is to reduce the number of colors. High-frequency noise is present in magnetic resonance images, and it is usually removed by a filtering process. The anisotropic diffusion filter (ADF) was proposed to adaptively remove the noise from the CMB image while maintaining the image edges. After the image is converted to a grayscale image, it is given as input to the anisotropic filter. The anisotropic type of filter forms the basis of most sharpening methods: the image is sharpened when the contrast is enhanced between adjoining areas with little variation in brightness or darkness. In the anisotropic filter, the noise is decreased in a way that keeps the high-frequency information of the image. The anisotropic filter increases the brightness of the center pixel of the kernel: a single positive value is found at the center of the kernel array, which is completely surrounded by negative values. The anisotropic diffusion filter thus smoothens the magnetic resonance images while preserving structure. Figure 2 compares the original image with the filtered image; the ADF preserves the sharpness of edges better than Gaussian blurring.
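For reference, a minimal Perona-Malik-style anisotropic diffusion loop is sketched below in Python. The iteration count, conduction constant kappa, and step size are assumed illustrative parameters, not the settings used in this work.

```python
import numpy as np

def anisotropic_diffusion(img, niter=10, kappa=30.0, lam=0.2):
    """Minimal Perona-Malik sketch: iteratively smooths the image while
    preserving edges via a conduction term that falls off with gradient
    magnitude. img: 2-D float array (grayscale)."""
    out = img.astype(np.float64).copy()
    for _ in range(niter):
        # Nearest-neighbour differences (north, south, east, west)
        dN = np.roll(out, -1, axis=0) - out
        dS = np.roll(out, 1, axis=0) - out
        dE = np.roll(out, -1, axis=1) - out
        dW = np.roll(out, 1, axis=1) - out
        # Exponential conduction coefficient (Perona-Malik g1):
        # small where the local gradient is large, so edges are kept
        cN = np.exp(-(dN / kappa) ** 2)
        cS = np.exp(-(dS / kappa) ** 2)
        cE = np.exp(-(dE / kappa) ** 2)
        cW = np.exp(-(dW / kappa) ** 2)
        out += lam * (cN * dN + cS * dS + cE * dE + cW * dW)
    return out
```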

2.2 Deep Bounding Box and Watershed Transformation

The goal of segmentation is to simplify or change the representation of an image into something easier to analyze. Watershed segmentation is an intuitive and simple method in which parallelization is possible for fast computation. Complete partitioning of a CMB image with poor contrast is possible with this approach, and contour joining is not necessary. This method is entirely different from other edge-based segmentation methods because the boundaries in a CMB image will be connected and closed; the region boundaries thus obtained belong to the contour of the microbleeding image. The segmentation efficiency of the above algorithm increases if the foreground objects and background regions are identified and marked separately. This concept is referred to as marker-controlled watershed segmentation (MCWS). Once the bounding box segmentation is over, a region merging process is started: different regions of an image are merged to form a single region under some similarity criterion. Bounding box segmentation is a fast and simple technique that can efficiently group the pixels of a CMB image having similar properties into large regions or objects. This method receives a predefined set of seed pixels along with the input image, and these seed pixels point to the objects to be segmented. Each seed pixel is compared to all unallocated neighboring pixels in the image, which enables the region to grow iteratively; the similarity measure s is defined as the difference between a candidate pixel and the mean of the pixels in the region, and pixels with sufficiently small s are allocated to the corresponding region. This process repeats until all the pixels in the image are allocated to one of the regions. Figure 3 shows the segmented bleeding region.
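A compact sketch of the marker-controlled watershed step is given below, using scikit-image as one possible implementation; the intensity thresholds that seed the markers are assumed, dataset-dependent values rather than the ones used in this work.

```python
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def marker_controlled_watershed(gray, low=0.2, high=0.6):
    """Minimal MCWS sketch: seed foreground/background markers from
    assumed intensity thresholds, then flood the gradient image.
    gray: 2-D float image scaled to [0, 1]."""
    gradient = sobel(gray)                  # edge-strength map
    markers = np.zeros_like(gray, dtype=np.int32)
    markers[gray < low] = 2                 # dark CMB candidates (foreground)
    markers[gray > high] = 1                # bright background seeds
    labels = watershed(gradient, markers)   # closed, connected boundaries
    return labels == 2                      # binary microbleed mask
```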
To implement the deep learning bounding box (DLBB) algorithm after the preprocessing phase, the steps to be followed are: initially, the number of DLBB clusters is selected and the cluster centers are initialized; the elements of the partition matrix are then calculated from the cluster centers; and finally, the cluster center values are recomputed. The bounding box algorithm is represented by the matrix U, whose values lie between 0 and 1 and represent the membership of each data point in every cluster, while hard c-means uses only the two values 0 and 1 for the membership function.

Fig. 3 Detected CMB from MR images: a original image, b filtered image, c locating the seed bounding box, d CMB region marked, e segmented bleeding region, f DBB segmented coloring image, g area marked in red
738 T. G. Berin and C. H. Sulochana


J(u, v) = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m ‖X_j − V_i‖²    (1)

where u is the membership matrix; V is the cluster center matrix; n is the number of pixel points; c is the number of clusters; X_j is the jth measured pixel point; and V_i is the center of cluster i.
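A minimal NumPy sketch of the alternating updates that minimize the objective in Eq. (1) is given below; the fuzzifier m = 2 and the iteration count are assumed defaults, and this is a generic fuzzy-clustering illustration rather than the full DLBB pipeline.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, niter=50, seed=0):
    """Minimal sketch of the objective in Eq. (1):
    J = sum_i sum_j u_ij^m ||X_j - V_i||^2, minimized by alternating
    the standard membership and center updates. X: (n, d) pixel features."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                     # memberships of each pixel sum to 1
    for _ in range(niter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)   # cluster centers V_i
        # Distances d_ij = ||X_j - V_i||, guarded against division by zero
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        p = 2.0 / (m - 1.0)
        U = (d ** -p) / (d ** -p).sum(axis=0)          # membership update
    return U, V
```

Each pixel's final label is then the argmax over the membership columns of U.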

3 Results

Cerebral microbleed identification and detection is a crucial and important task in the segmentation process. Diagnosing the image properties and generating quality and timely results for medical purposes is a challenging task. The deep learning bounding box algorithm is implemented to detect brain bleeding from the MRI brain image data. The proposed strategy has been tested on a 64-bit workstation with a 3.70 GHz Intel(R) Xeon(R) CPU E5-1630 v4 and 32 GB RAM with a Quadro K1200 CUDA device, using MATLAB 2018b. Two experiments have been performed on two different datasets. In the first experiment, the image dataset is split into 20% testing and 80% training, whereas in the second experiment we performed tenfold cross-validation on the image dataset. The methodology starts with the preprocessing stage, followed by the segmentation process and finally the validation process.
In Table 1, the three methods are compared using the peak signal-to-noise ratio (PSNR) and the mean square error (MSE), which measure the squared error between the original image and the reconstructed image. There is an inverse relationship between PSNR and MSE, so a higher PSNR value indicates a higher quality of the image.

Table 1 Performance analysis of preprocessing

Metrics  Unsharp masking  Bilateral filter  Proposed method
PSNR     26.54            29.61             36.94
SSIM     0.7462           0.8351            0.9156
MSE      0.0058           0.0029            0.00046

Table 2 Performance analysis of segmentation method

Metrics               Thresholding [5]  Region growing [3]  Bounding box  Watershed segmentation
Dice (Mean)           92.44             90.44               98.10         98.58
Dice (SD)             5.56              6.84                3.66          1.28
Jaccard score (Mean)  90.23             89.45               96.47         97.24
Jaccard score (SD)    8.46              8.95                5.56          2.33

From Table 2, comparing the methods by the mean Dice similarity coefficient, the proposed method gives a mean of about 98%. The proposed method had a very high sensitivity of 96%. The SVM classifier is able to remove most of the false positives at the loss of some sensitivity. The automated processing had an overall accuracy of 98.16% and a specificity of 95.6%.

4 Discussion and Conclusion

Cerebral microbleeds (CMB), also referred to as cerebral microhaemorrhages, are
caused by structural abnormalities of the small vessels of the brain. They have been
identified as a major diagnostic biomarker for many cerebrovascular diseases and
cognitive dysfunctions. In the current clinical routine, CMBs are manually labeled by
radiologists, but this method is difficult, time-consuming and error prone. In this paper,
we propose a new automatic method to detect CMBs from magnetic resonance (MR)
images. At present, the analysis of microbleeds is performed by a skilled neurologist
drawing on experience: the image is scanned, black dots are detected, and each black
dot is identified as either a microbleed or a mimic. We propose an efficient and robust
technique for segmenting the cerebral microbleed image using a deep learning bounding
box and watershed segmentation. Before segmentation, the image is converted into gray
scale, and an anisotropic filter is used to remove noise. The anisotropic filter was chosen
over other denoising filters for the following reasons: (1) the Gaussian filter does not
handle gray-scale images well; (2) the bilateral filter is used only for sharpening, not for
smoothing; (3) in histogram equalization, intensity is stretched from low to peak; (4)
adaptive equalization is used only for color images; and, unlike the proposed filter, these
filters are non-iterative. Because of these limitations, we use the anisotropic filter for
preprocessing, which smoothens and sharpens the image and removes noise through
iteration. After filtering, the bounding box with watershed transformation is applied to the
cerebral microbleed magnetic resonance images. When this technique is applied, the
black dots, i.e., the bleeding regions, are segmented clearly, and with the watershed
transformation all the major and minor bleeding regions become visible. The proposed
method achieves appealing performance with a high sensitivity of 96%. Experimental
results demonstrate that it outperforms previous methods by a large margin, with higher
detection sensitivity and fewer false positives, and it can be easily adapted to other
detection and segmentation tasks. Overall, the evaluation metric values of DLBB-based
watershed segmentation were observed to be higher than those of the other methods.

References

1. S.M. Greenberg, M.W. Vernooij, C. Cordonnier, R.A. Salman, F. Edin, S. Warach, J. Lenore,
M. Van Buchem, M.M.B. Breteler, Cerebral microbleeds: a field guide to their detection and
interpretation. J. Lan. Neurol. 8, 165–174 (2009)
2. V. Mok, J.S. Kim, Prevention and management of cerebral small vessel disease. J. Stroke.
17(2), 111–122 (2015)
3. A. Charidimou, D.J. Werring, Cerebral microbleeds: detection, mechanisms and clinical
challenges. J. Future Neurol. 6, 587–611 (2011)
4. M. Ayaz, A.S. Boikov, E.M. Haacke, D.K. Kido, W.M. Kirsch, Imaging cerebral microbleeds
using susceptibility weighted imaging one step toward detecting vascular dementia. J. Magn.
Reson. Imaging. 31, 142–148 (2009)
5. R.N. Nandigam, A. Viswanathan, P. Delgado, M.E. Skehan, E.E. Smith, J. Rosand, S.M.
Greenberg, B.C. Dickerson, MR imaging detection of cerebral microbleeds: effect of
susceptibility-weighted imaging, section thickness, and field strength. J. Neuroradiol. 30,
338–343 (2009)
6. A. Charidimou, D.J. Werring, Cerebral microbleeds and cognition in cerebrovascular disease
an update. J. Neurolog. Sci. 322(1), 50–55 (2012)
7. B. Geurts, T. Andriessen, B. Goraj, The reliability of magnetic resonance imaging in traumatic
brain injury lesion detection. J. Brain Inj. 26, 1439–1450 (2012)
8. R. Yogamangalam, B. Karthikeyan, Segmentation techniques comparison in image processing.
J. Eng. Technol. 5, 307–313 (2013)
9. W. Bieniecki, Oversegmentation avoidance in watershed-based algorithms for color images. J.
Mod. Probl. Radio Eng. Telecommun. Comput. Sci. 169–172 (2004)
10. A. Fazlollahi, Efficient machine learning framework for computer-aided detection of cerebral
microbleeds using the radon transform. Conf. ISBI. 113–116 (2014)
11. H.J. Kuijf, Efficient detection of cerebral microbleeds on 7.0T MR images using the radial
symmetry transform. J. NeuroImage. 59, 2266–2273 (2012)
12. W. Bian, C.P. Hess, S.M. Chang, S.J. Nelson, J.M. Lupo, Computer-aided detection of radiation-
induced cerebral microbleeds on susceptibility-weighted MR images. J. NeuroImage, Clin. 2,
282–290 (2013)
13. S.R. Barnes, Semiautomated detection of cerebral microbleeds in magnetic resonance images.
J. Magn. Resonan. Imag. 29, 844–852 (2011)
14. C.R. Jung, Combining wavelets and watersheds for robust multiscale image segmentation. J.
Image Vis. Comput. 25, 24–33 (2007)
15. P.R. Hill, C. NishanCanagarajah, D.R. Bull, Image segmentation using a texture gradient based
watershed transform. J. IEEE Trans. Image Process. 12, 1618–1633 (2003)
16. H. Ramadan, C. Lachqar, H. Tairi, A survey of recent interactive image segmentation methods.
J. Computat. Vis. Media 6, 355–384 (2020)
17. S. Lu, K. Xia, S.-H. Wang, Diagnosis of cerebral microbleed via VGG and extreme learning
machine trained by Gaussian map bat algorithm. J. Ambient İntell. Humanized Comput. (2020)
18. D. Nie, L. Wang, E. Adeli, C. Lao, W. Lin, D. Shen, 3-d fully convolutional networks for
multimodal isointense infant brain image segmentation. J. IEEE Trans. Cybern. 49, 1123–1136
(2018)
19. S. Liu, Cerebral microbleed detection using susceptibility weighted imaging and deep learning.
J. NeuroImage. 198, 271–282 (2019)
20. G.M. Himabindu, R. Murty, Extraction of texture features and classification of renal masses
from kidney images. J. Eng. Technol. 7, 1057–1063 (2018)
New Efficient Tunable Window Function
for Designing Finite Impulse Response
Digital Filter

Raj Kumar and R. P. Rishishwar

Abstract A tunable window function is introduced in this paper for designing
and implementing finite impulse response (FIR) digital filters. Main-lobe width
(MLW), peak side-lobe amplitude (SLA), leakage factor, and stop-band attenuation
are considered as the parameters determining the quality of the proposed window.
These parameters are compared with those of standard windows. The paper provides
a tunable window function with varying parameters that can be used in designing
differently featured digital filters. The proposed window efficiently meets the desired
frequency spectrum specifications and application requirements by providing smaller
side-lobe peaks, a lower leakage factor, and greater stop-band attenuation. A
general-purpose window with certain parameters is introduced that can work
efficiently in all specifications, and a low-pass filter is implemented that provides
better filtering operation.

Keywords Band pass filter (BPF) · Band stop filter (BSF) · Finite impulse
response (FIR) · High-pass filter (HPF) · Low-pass filter (LPF) · Main-lobe width
(MLW) · Side-lobe amplitude (SLA)

1 Introduction

Digital filters play a crucial role in processing and implementing digital signals.
By passing desired signals through them, digital filters can improve certain
parameters of a signal. A filter passes the particular pass-band frequencies and
discards the undesired frequencies while filtering. The basic sub-categories of filters
are the low-pass filter (LPF), high-pass filter (HPF), band pass filter (BPF), and band
stop filter (BSF). Digital filters, on the basis of impulse response, have two categories:
infinite impulse response (IIR) filters and finite impulse response (FIR) filters [1].
A digital FIR filter can be designed using any of three methods: frequency truncation,
windowing, and optimization. But the window method is considered to

R. Kumar (B) · R. P. Rishishwar


Department of Electronics, Sri Aurobindo College (University of Delhi), Malviya Nagar, New
Delhi, India


be quite effective [2]. Frequency truncation generates unwanted oscillations (the Gibbs
phenomenon). These oscillations can be reduced by using a window that tapers
smoothly to zero at each end [2]. Some standard window functions are the
rectangular, Hann, Hamming, and Kaiser windows. A window's effectiveness is judged
from its frequency and application parameters. Preferable frequency spectrum
parameters for a window are a narrow main-lobe width (calculated at −3 dB), small
peak side-lobe amplitude, and a reduced leakage factor [3]. Factors like stop-band
attenuation (As) and transition width comprise the application parameters.
The remainder of the paper is organized as follows: Sect. 2 presents the proposed
window function and its frequency spectrum. Section 3 compares the proposed window
function with standard windows on the basis of frequency response. Section 4 deals
with designing a low-pass FIR digital filter using the proposed window function for
application purposes. Lastly, the conclusion of the paper is summarized in Sect. 5.

2 Proposed Window

2.1 Description of Window

The causal window of length M is represented by the function w(nT), where n ranges
over 0 ≤ n ≤ M − 1. All calculations in this paper have been done considering the
sampling period T = 1 s [4]. The value M − 1 is the order of the window, represented
by N.

2.2 Proposed Window

The proposed window function is shown in Eq. (1)

w(n) = 0.8 × w1 + 0.2 × w2 (1)

where w1 is the modified Hamming window proposed by Mottaghi and Shayesteh


[4], given as,
   
$$w_1 = a_0 - a_1 \cos\left(\frac{2\pi n}{M-1}\right) - a_3 \cos\left(\frac{6\pi n}{M-1}\right) \tag{2}$$

with coefficients a0 , a1 , and a3 as shown below

$$a_0 = 0.5363 - \frac{0.14}{M-1}, \quad a_1 = 0.996 - a_0, \quad a_3 = 0.004 \tag{3}$$
M −1

and w2 is the Kaiser window given as:


$$w_2 = \frac{I_0\!\left(\beta \sqrt{1 - \left(\frac{2n}{M-1}\right)^{2}}\right)}{I_0(\beta)} \tag{4}$$

where I_0 is the zero-order modified Bessel function. The tuning parameter β governs
the preferred trade-off between MLW and SLA. The main reason for using 80% of the
modified Hamming window and 20% of the Kaiser window is that a small amount of
the Kaiser window provides a lower leakage factor and better performance at higher
orders, whereas increasing its share further would result in larger side-lobe peaks [3, 6].
The proposed function has two varying parameters: the window order (N) and the
tuning parameter (β). By changing the value of β at different values of N, we can
achieve the desired frequency domain specifications and application requirements.
The frequency transform of Eq. (1) is given by:

$$\begin{aligned} W(f) = {} & 0.8\, a_0 H_{\mathrm{rect}}(f) \\ & - 0.8\, a_1 \left[ H_{\mathrm{rect}}\!\left(f - \tfrac{1}{M-1}\right) + H_{\mathrm{rect}}\!\left(f + \tfrac{1}{M-1}\right) \right] \\ & - 0.8 \times 0.004 \left[ H_{\mathrm{rect}}\!\left(f - \tfrac{3}{M-1}\right) + H_{\mathrm{rect}}\!\left(f + \tfrac{3}{M-1}\right) \right] \\ & + 0.2 \times \frac{M}{I_0(\beta)} \times \frac{\sin\!\sqrt{\beta^2 - (M\pi f)^2}}{\sqrt{\beta^2 - (M\pi f)^2}} \end{aligned} \tag{5}$$

where

$$H_{\mathrm{rect}}(f) = \frac{\sin(\pi M f)}{\sin(\pi f)} \, e^{-i\pi f (M-1)} \tag{6}$$

|W(f)| is the magnitude of the frequency transform. Figure 1 depicts its normalized dB
graph at β = 7 for N = 200.
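Since the implementation platform is MATLAB, a minimal MATLAB sketch for generating the proposed window of Eqs. (1)-(4) follows. It is an illustrative reconstruction rather than the authors' code: the built-in kaiser function is used for w2, and the FFT length and plotting choices are ours.

% Proposed window, Eq. (1): w = 0.8*w1 + 0.2*w2, with N = 200 (M = 201).
M = 201; beta = 7; n = 0:M-1;
a0 = 0.5363 - 0.14/(M-1); a1 = 0.996 - a0; a3 = 0.004;      % Eq. (3)
w1 = a0 - a1*cos(2*pi*n/(M-1)) - a3*cos(6*pi*n/(M-1));      % Eq. (2)
w2 = kaiser(M, beta).';                                     % Kaiser window, Eq. (4)
w  = 0.8*w1 + 0.2*w2;                                       % Eq. (1)
Wf = fft(w, 8192);                                          % spectrum for plotting
plot(linspace(0, 1, 4096), 20*log10(abs(Wf(1:4096))/max(abs(Wf))));
xlabel('Normalized Frequency (\times\pi rad/sample)'); ylabel('Magnitude (dB)');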

3 Performance Evaluation

This section compares the frequency spectrum parameters of the proposed window
with some standard window functions and evaluates the effectiveness of the new
window. Here, we have simulated the Hamming window, modified Hamming window,
Kaiser window, and proposed window using the MATLAB Window Designer tool.


Fig. 1 Magnitude representation of frequency spectrum by simulation at order N = 200 and tuning
parameter β = 7

3.1 Hamming Window

Since the Hamming window [5] and modified Hamming window [4] yield smaller
peak side-lobe levels, our window should be tuned to the value of β that results in
smaller peak side-lobe levels; β = 7 gives the desired window characteristics. We then
vary the order N and evaluate the best possible results.
Comparing the frequency spectrum of the proposed window with the Hamming
window, Fig. 2a, b shows the smaller peak side-lobe of the proposed window at the
expense of main-lobe width for N = 40 and N = 60. But when the window order is
increased (N = 200), it offers a smaller peak side-lobe amplitude accompanied by the
same main-lobe width as the Hamming window; the comparison is shown in Fig. 2c.
As per the data in Table 1, the proposed window (in comparison with the Hamming
window) offers a 4–4.6 dB smaller peak side-lobe amplitude along with a smaller
leakage factor (0.02%) and equal main-lobe width (for higher window orders).

3.2 Modified Hamming Window

Figure 2a, b also compares the frequency spectrum of the proposed window (β = 7)
with the modified Hamming window for N = 40 and N = 60. The proposed window
has a smaller peak side-lobe amplitude at the cost of a larger main-lobe width. But for
higher order (N = 200), the main-lobe widths become equal along with a smaller peak


Fig. 2 Magnitude variation (dB) of frequency spectrum for proposed window, modified Hamming
window, and Hamming window functions by simulation for order, a N = 40, b N = 60, c N = 200

side-lobe, as shown in Fig. 2c. Table 1 also shows that the proposed window has a
1.8–2 dB smaller peak side-lobe with equal main-lobe width and a lower leakage
factor (0.02%) in comparison with the modified Hamming window (for higher window
orders). Since the Hamming window has a smaller side-lobe peak than the Hann and
Bartlett windows, the proposed window will have smaller side-lobe amplitude than the
Hann and Bartlett windows as well.


Table 1 Simulated frequency spectrum data for comparing windows

Window           | Order   | Main-lobe width (×π rad/sample) | Side-lobe peak (dB) | Leakage factor (%)
Proposed         | N = 40  | 0.107 | −46.77 | 0.01
                 | N = 60  | 0.072 | −46.73 | 0.01
                 | N = 200 | 0.02  | −46.7  | 0.02
Hamming          | N = 40  | 0.103 | −42.1  | 0.04
                 | N = 60  | 0.068 | −42.4  | 0.03
                 | N = 200 | 0.02  | −42.6  | 0.04
Modified Hamming | N = 40  | 0.102 | −44.9  | 0.02
                 | N = 60  | 0.068 | −44.8  | 0.02
                 | N = 200 | 0.02  | −44.7  | 0.03

3.3 Kaiser Window

To concentrate maximum energy in the main-lobe, just as in the Kaiser window, we
need to tune the parameter β to a value less than 7. This characteristic is nearly optimal
in the sense of the peak's concentration around its frequency. The Kaiser window
function [4] contains the zero-order Bessel function. Since the proposed window
behaves better at larger orders, we operate it at N = 200. Figure 3a compares the
Kaiser window function and the proposed window function using frequency spectrum
specifications for N = 200 at β = 5.57. The comparison results in equal MLW and a 3.7 dB


Fig. 3 Magnitude variation (dB) of frequency spectrum for proposed and Kaiser window function
by simulation at order N = 200, a β = 5.57, b β = 5.3

smaller peak side-lobe amplitude for the proposed window (−45 dB compared to
−41.24 dB). Decreasing the tuning parameter to β = 5.3 (for N = 200), the window
comparison results in equal MLW and a 4.46 dB smaller peak side-lobe level for the
proposed window (−43.9 dB compared to −39.44 dB). The comparison is shown in
Fig. 3b.

Table 2 Characteristics of FIR filters designed by different window functions for different orders
at tuning parameter β = 5.57 and order M = 200

Order   | Stop-band attenuation (As) in dB
        | Hamming window filter | Modified Hamming window filter | Kaiser window filter | Proposed window filter
N = 40  | −52.3 | −58.2  | −58.49 | −58.9
N = 60  | −53.4 | −58.68 | −58.66 | −59.35
N = 200 | −53.7 | −59.1  | −59.00 | −59.8

4 Application Example

This section tests the proposed window in an application environment and assesses its
utility in practical use. Consider a low-pass finite impulse response filter designed by
windowing the ideal infinite impulse response. The impulse response of an ideal
low-pass filter is given as [4]:

$$h_n(nT) = \begin{cases} \dfrac{w_c T}{\pi}, & n = 0 \\[4pt] \dfrac{\sin(w_c n T)}{n\pi}, & n \neq 0 \end{cases} \tag{7}$$

where w_c is the cut-off frequency. The required FIR filter is then obtained by
windowing h_n with the proposed window [6].
In the application example, for cut-off frequency w_c = 0.3 × π rad/sample, the
stop-band attenuation (As) of filters designed using the proposed window is compared
with those of the standard and modified Hamming windows at different orders.
Simulation is done using the MATLAB Filter Designer tool.
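A hedged MATLAB sketch of this design step follows; fir1 performs exactly this windowing of the ideal response in Eq. (7) and expects a window of length N + 1. This is an illustrative reconstruction, not the authors' Filter Designer session.

% Low-pass FIR filter (wc = 0.3*pi rad/sample) windowed with the proposed window.
M = 201; beta = 5.57; n = 0:M-1;
a0 = 0.5363 - 0.14/(M-1); a1 = 0.996 - a0; a3 = 0.004;
w  = 0.8*(a0 - a1*cos(2*pi*n/(M-1)) - a3*cos(6*pi*n/(M-1))) + 0.2*kaiser(M, beta).';
b  = fir1(M-1, 0.3, w);     % order N = 200, cutoff 0.3 (normalized to pi)
freqz(b, 1, 2048);          % magnitude response comparable with Fig. 4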
Table 2 gives the simulated data for comparison. It can be seen that the proposed
window filters exhibit better performance at every order when the tuning parameter is
β = 5.57. For higher order (N = 200), the proposed window filter gives a finer
stop-band attenuation value (−59.8 dB), an improvement of 0.7–6.01 dB over the other
windows. Figure 4 compares the magnitude response (dB) of the FIR proposed
window low-pass filter with standard window low-pass filters (M = 200).

5 Conclusion

Summarizing all the simulation results, we can conclude that when the window
specifications require small side-lobe peaks, β should be tuned to 7, which provides
2–4.1 dB smaller side-lobes. When maximum energy concentration in the main-lobe is
required, β is tuned to 5.57 (or less than 7), which gives 3.76–4.46 dB smaller
side-lobes. With the same main-lobe width and a smaller leakage factor, the proposed
window works better at higher orders. In the application example, the FIR low-pass filter of the proposed


Fig. 4 Comparing magnitude response (dB) of proposed window low-pass filter with standard
window low-pass filters for order M = 200 and tuning parameter β = 5.57, a Hamming window
filter, b Modified Hamming window filter, c Kaiser window filter


window exhibits higher stop-band attenuation (by 0.7–6.01 dB) compared to the other
windows. In general, if a general-purpose window is desired, β should be tuned to 5.57
for N = 200, which gives better frequency domain specifications (0.3–3.76 dB) along
with the application requirements (0.7–6.01 dB). Moreover, properties like stability
and linearity are also met, except for the property discussed in the modified Hamming
window paper [4].

Acknowledgements I am extremely thankful to University Grants Commission (UGC) for


financial support in the form of Junior Research Fellowship (JRF).

References

1. Sulistyaningsih, Performance comparison of Blackman, Bartlett, Hanning and Kaiser window
for radar digital signal processing. in 4th International Conference on Information Technology,
Information Systems and Electrical Engineering (ICITISEE), IEEE (2019)
2. D. Kalaiyarasi, T. Kalpalatha Reddy, A hybrid window function to design finite impulse response
low pass filter with an improved frequency response. in International Conference on Energy,
Communication, Data Analytics and Soft Computing (ICECDS), IEEE (2017)
3. S.K. Mitra, Digital Signal Processing (McGraw-Hill Publication) pp. 446–458
4. M. Mottaghi-Kashtiban, M.G. Shayesteh, New efficient window function, replacement for the
Hamming window. IET Sig. Process. (2011)
5. M. Shil, An adjustable window function to design an FIR filter, in 2017 IEEE International
Conference on Imaging, Vision and Pattern Recognition (ICIVPR), IEEE (2017)
6. A. Oppenheim, R. Schafer, J. Buck, Discrete-Time Signal Processing (Prentice Hall Publication,
2nd Edition) pp. 474–482
Brain Tumour Detection Using
Convolutional Neural Networks in MRI
Images

Dontha Pavani, Kavali Durgalaxmi, Bingi Sai Datta, and D. Nagajyothi

Abstract Brain tumour detection is perhaps the most remarkable and difficult task
in the field of medical sciences because manual segmentation can result in incorrect
results and findings. Furthermore, it is most difficult when there is a large amount of
data to be categorised. The appearance of brain tumours varies greatly, and because
tumour and normal tissues have a similar appearance, extracting the tumour is
challenging. In this paper, a strategy is proposed to extract the brain tumour from
2D magnetic resonance (MR) brain images using convolutional neural network
techniques. The experimental study was carried out on a real-time data set with
various tumour sizes, areas, shapes and distinctive image intensities. The techniques
used are AlexNet, DenseNet, ResNet and a combination of VggNet19 and DenseNet.
The accuracies obtained for AlexNet, DenseNet, ResNet and the VggNet19 + DenseNet
combination are 98.5%, 92.3%, 94.6% and 95.4%, respectively. The principal aim of
this paper is to detect brain tumours and also to observe which technique gives the
best accuracy using MATLAB.

Keywords MRI images · Brain tumour · CNN · Deep learning

1 Introduction

The vast majority of tumours are catastrophic. Primary brain tumours arise in the
brain itself, whereas in the secondary type the tumour results from cancer spreading
to the brain from other parts of the body [1]. If the tumour is detected and treated
early in the tumour formation process, the patient's chances of recovery are very
good [2].

D. Pavani (B) · K. Durgalaxmi · B. S. Datta · D. Nagajyothi


Department of Electronics and Communication Engineering, Vardhaman College of Engineering,
Shamshabad, Hyderabad 501 218, India
D. Nagajyothi
e-mail: d.nagajyothi@vardhaman.org


A brain tumour is a collection of abnormal cells that form within the brain. It is of
two types: malignant and benign. Malignant tumours are cancerous, and benign
tumours are non-cancerous. Brain tumours are also classified as primary and secondary.
Primary tumours originate in the brain, and most of them are benign; secondary
tumours are caused when cancer cells spread to the brain from other organs such as
the lungs and breast. These are called metastatic brain tumours.
A brain tumour is one of the most serious cancers that can affect both children and
adults. Initial recognition, classification and analysis of brain tumours are especially
important in order to treat the tumour effectively [3]. Early detection of brain tumours
can play a critical role in improving treatment options and increasing the likelihood
of survival [4]. Segmentation of brain tumours manually is a time consuming and
critical task because of the huge data generated in the medical field. Therefore, MRI is
used to detect the tumour, using MRI is also burdensome because of the considerable
amount of data. In this paper, we proposed a method which is the combination of
VggNet and DenseNet for brain tumour detection along with other convolution neural
network techniques.

1.1 Literature Review

Hossain [4] used the BRATS data set, which consists of non-tumour and tumour MRI
images. He used FCM for image segmentation and, for classification, six traditional
classifiers, namely K-nearest neighbour (KNN), support vector machine (SVM),
multilayer perceptron (MLP), Naïve Bayes, random forest and logistic regression,
implemented in scikit-learn, amongst which SVM achieved the highest accuracy,
i.e. 92.42%. The other proposed method used a five-layer CNN to classify tumoured
and non-tumoured images and achieved an accuracy of 97.87%. Vinoth [5] used a
convolutional neural network for MRI image segmentation, where the HGG and LGG
parts of the tumour are found, obtaining a sensitivity of 96.54%; for tumour
classification, an SVM classifier is used [6–8]. S. Pereira proposed a CNN-based
method for brain tumour segmentation where the CNN is built over 3 × 3 kernels.
They also showed that better segmentation is obtained using intensity normalisation,
and found LReLU more effective than ReLU in training the CNN.

2 Proposed Techniques

Transfer learning
Transfer learning has several advantages; the primary benefits are reduced training
time, generally better performance of neural networks, and not requiring a large
amount of data.

Normally, a lot of data is required to train a neural network from scratch, but so
much data is not always available. This is where transfer learning proves useful: with
transfer learning, a strong AI model can be built with relatively little training data
because the model is already pre-trained.
As expert knowledge is needed to create such large labelled data sets, transfer
learning is all the more valuable. Training time is also reduced, because it can
sometimes take days or even weeks to train a deep neural network from scratch to
solve a complex task.

3 Convolutional Neural Network

One of the most useful types of machine learning is deep learning. A CNN is a special
type of DNN; it is divided into several layers and has a complex hierarchical
structure. A CNN incorporates an input layer, an output layer, convolutional layers,
pooling layers, normalisation layers and fully connected layers. In other words, every
CNN is made up of several layers, the primary ones being the convolutional layer and
the sub-sampling layer. In this paper, we have implemented four networks: AlexNet,
DenseNet, ResNet and VggNet [9–12].
Alexnet:
The network is quite closely related to Yann LeCun et al.'s LeNet but has many more
filters per layer and stacked convolutional layers. It includes convolutions, max
pooling, dropout, data augmentation, ReLU activations and SGD with momentum,
with ReLU activations attached to every convolutional and fully connected layer.
AlexNet is an eight-layer-deep convolutional neural network. The network has been
trained on millions of images from the ImageNet database, and we can load this
pre-trained network. It can categorise images into 1000 different object categories,
such as pencil, mouse and so on; the network has therefore learned feature extraction
for a wide range of images. The input size for this network is 227 × 227.
ResNet:
At ILSVRC 2015, Kaiming He et al. introduced an architecture with "skip connections"
and heavy batch normalisation, called the residual neural network (ResNet). These
skip connections, also known as gated units, bear a strong resemblance to the
successful elements later used in RNNs. With this methodology, they were able to
build a CNN with 152 layers whilst maintaining a lower complexity than VggNet.
It achieves a top-5 error rate of 3.57% on the ImageNet data set, outperforming
human-level performance.
754 D. Pavani et al.

VggNet:
Simonyan and Zisserman introduced VggNet. This network is made up of 16
convolutional layers and has a fairly consistent architecture. VggNet is extremely
similar to AlexNet but uses only 3 × 3 convolutions and contains a large number of
filters. It took 2–3 weeks to train on four GPUs. Currently, VggNet is the most
preferred network for image feature extraction; it has been used in many applications
as a feature extractor, as its weight configurations are publicly available. However,
the network comprises 138 million parameters, making it tough to manage.
DenseNet:
In DenseNet, each layer obtains extra inputs from all the preceding layers and passes
its own feature maps to every succeeding layer; the network uses concatenation, so
each layer receives "aggregate information" from the previous layers. Since each layer
gets feature maps from every former layer, the network can be closely packed, i.e. the
number of channels per layer can be small. The growth rate k denotes the number of
additional channels added at each layer. Therefore, DenseNet has higher computational
efficiency and uses memory effectively.
Optimization:
Optimising the network is crucial to improving the performance of a CNN. We use
optimization in the training and testing phases. In this implementation, we have
utilised a technique called stochastic gradient descent [13].
Stochastic Gradient Descent (SGD):
Batch gradient descent updates the parameters along the negative gradient of a cost
function evaluated over the entire training set, which is computationally very expensive
for a CNN. This motivated the use of SGD, which computes the gradient after seeing
only a few training examples; it overcomes the cost issue and still yields good results.
Equation (1) shows the batch update:

∅ = ∅ − α∇∅ E[J(∅)]   (1)

The letter E in the above equation stands for expectation: the expected value obtained
by averaging the cost and gradient over the entire training set. In SGD, the gradient of
the parameters is instead computed using a single training sample or a small number
of them, replacing the expectation. Equation (2) shows the update:

∅ = ∅ − α∇∅ J(∅; x^(i), y^(i))   (2)
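As an illustration of the update in Eq. (2), a minimal MATLAB sketch of SGD on a toy least-squares problem follows; the quadratic loss and the data X, y are hypothetical stand-ins for the CNN cost J(∅).

% One SGD pass per epoch: theta = theta - alpha * grad J(theta; x_i, y_i).
X = randn(100, 3); theta_true = [1; -2; 0.5];
y = X * theta_true + 0.01*randn(100, 1);        % toy regression data
theta = zeros(3, 1); alpha = 0.05;
for epoch = 1:20
    for i = randperm(size(X, 1))                % visit samples in random order
        g = (X(i, :)*theta - y(i)) * X(i, :).'; % gradient of 0.5*(x_i*theta - y_i)^2
        theta = theta - alpha * g;              % update of Eq. (2)
    end
end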

4 Proposed Methodology

The proposed method, shown in Fig. 1, uses CNN techniques to identify whether the
input brain image contains a tumour or not. The four pre-trained models mentioned
above, AlexNet, DenseNet, ResNet and VggNet, are used to achieve this.
The MRI images of the brain are taken from the Kaggle community. The data set
consists of 260 images, of which 127 are positive and 133 are negative. Here, positive
denotes images that contain a brain tumour and negative denotes images that do not.
The size of the input image is taken as 224 × 224. With the data set in hand, the next
step is to use it to test the convolutional neural networks. Some of the input images
are shown in Fig. 2.
We used a variety of networks in our proposed strategy, with the DenseNet201 and
VggNet19 combination providing the best results amongst the combinations. There
are several ways of combining two networks; in our work, we trained the two networks
independently and, just like an ensemble model, combined their outputs. The model
returns the output of whichever pre-trained model has the highest probability, so that
the given input image can be classified more accurately.
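A hedged MATLAB sketch of this highest-probability combination follows; the score vectors p_vgg and p_dense are hypothetical softmax outputs of the two trained networks, not values from the paper.

% Max-confidence fusion over the classes {tumour, no tumour}.
p_vgg   = [0.91 0.09];            % hypothetical VggNet19 class scores
p_dense = [0.35 0.65];            % hypothetical DenseNet201 class scores
[confV, clsV] = max(p_vgg);
[confD, clsD] = max(p_dense);
if confV >= confD
    finalClass = clsV;            % keep the more confident prediction
else
    finalClass = clsD;
end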
We have used MATLAB R2021a to implement our proposed method. MATLAB
provides several pre-trained models, trained on the huge ImageNet database, which
contains 1000 object categories and about 1.2 million images. MATLAB has different
toolboxes; here, we used the Deep Learning Toolbox and installed all four networks
from it. In the toolbox, the pre-trained AlexNet has 25 layers, ResNet has 177 layers,
VggNet19 has 47 layers, and DenseNet has 708 layers. ReLU is the activation function.
The graphs obtained after training the networks are shown below: the X axis shows
the number of iterations, and the Y axis denotes accuracy. The graphs show how the
accuracy of the networks improved, and the error rate reduced, as the given 2D MRI
images were trained repeatedly. For training, 130 images are used. For validation, we
have split the data set in a 70:30 ratio. The validation accuracies obtained for DenseNet,

Fig. 1 Block diagram



Fig. 2 MRI images of brain

ResNet and VggNet are 98.33%, 91.67% and 97.50%, respectively. Figures 3, 4, 5
and 6 depict the graphs obtained after training AlexNet, DenseNet, ResNet and VggNet.

5 Experimental Results

From the database, we have taken a few images into the GUI by clicking the browse
button. By calling the respective network, it identifies whether the image has a tumour
or not. A few of the results are shown in Fig. 7.

Fig. 3 Graph obtained after training for AlexNet

Fig. 4 Graph obtained after training for DenseNet

Fig. 5 Graph obtained after training for ResNet

Calculation of accuracy
From training, we obtained the results below [14]. AlexNet achieved the highest
accuracy of all, 98.5%, and amongst the combinations of networks, the DenseNet and
VggNet combination gave the maximum accuracy of 95.4%.

Fig. 6 Graph obtained after training for VggNet

Fig. 7 Detecting the brain tumour from the 2D-MRI images



Table 1 Comparison of different performance metrics of the four models

Pre-trained models | Accuracy | Precision | Recall | F1
AlexNet            | 0.985 | 0.965 | 1     | 0.982
ResNet             | 0.946 | 0.984 | 0.913 | 0.947
VggNet             | 0.969 | 0.937 | 1     | 0.967
DenseNet           | 0.923 | 0.953 | 0.897 | 0.924
VggNet + DenseNet  | 0.954 | 0.906 | 1     | 0.950

Metrics Used
To evaluate the performance of the presented models, we used four metrics: accuracy,
precision, recall and F1 score. Equations (3), (4), (5) and (6) below are used to
calculate accuracy, precision, recall and F1, respectively.

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{3}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$

$$F1 = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{6}$$

Table 1 above reports these metrics. Here, "TP" denotes true positive; "TN" indicates
true negative; "FP" describes false positive, and "FN" represents false negative.
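For completeness, a small MATLAB sketch computing Eqs. (3)-(6) from confusion-matrix counts follows; the counts themselves are hypothetical, as the paper reports only the final metric values.

% Metrics of Eqs. (3)-(6) from hypothetical confusion-matrix counts.
TP = 127; TN = 120; FP = 5; FN = 8;
accuracy  = (TP + TN) / (TP + FN + FP + TN);
precision = TP / (TP + FP);
recall    = TP / (TP + FN);
F1        = 2 * (precision * recall) / (precision + recall);
fprintf('Acc %.3f  Prec %.3f  Rec %.3f  F1 %.3f\n', accuracy, precision, recall, F1);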

6 Conclusion

Analysis of medical images has become a tough task because of the large scale of the
databases. Analysing such large amounts of data and accurately predicting the disease
is the biggest problem we currently face in the medical field. Using CNN techniques
to predict the disease reduces inaccurate human prediction and the time needed to
identify the disease. In this paper, we have discussed four pre-trained CNN techniques
and a combination of two models. AlexNet achieved the highest accuracy of 98.5%.
Since the main purpose of medical image analysis is to use the application in real
life, in future work we will try to improve the models to make them suitable for
3D images.

References

1. D.V. Gore, V. Deshpande,Comparative study of various techniques using deep learning for
brain tumour detection. in 2020 International Conference for Emerging Technology (INCET),
2020, pp. 1–4. https://doi.org/10.1109/INCET49848.2020.9154030
2. M. Siar, M. Teshnehlab, Brain tumour detection using deep neural network and machine
learning algorithm. in 2019 9th International Conference on Computer and Knowledge
Engineering (ICCKE), (2019), pp. 363–368. https://doi.org/10.1109/ICCKE48569.2019.896
4846
3. N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, M. Shoaib, A deep learning
model based on concatenation approach for the diagnosis of brain tumour. IEEE Access 8,
55135–55144 (2020). https://doi.org/10.1109/ACCESS.2020.2978629
4. T. Hossain, F.S. Shishir, M. Ashraf, M.A. Al Nasim, F. Muhammad Shah, Brain tumour detec-
tion using convolutional neural network. in 2019 1st International Conference on Advances in
Science, Engineering and Robotics Technology (ICASERT) (2019), pp. 1–6. https://doi.org/10.
1109/ICASERT.2019.8934561
5. R. Vinoth, C. Venkatesh, Segmentation and detection of tumour in MRI images using CNN and
SVM classification. Conf. Emerg. Devices Smart Syst. (ICEDSS) 2018, 21–25 (2018). https://
doi.org/10.1109/ICEDSS.2018.8544306
6. S. Pereira, A. Pinto, V. Alves, C.A. Silva, Brain tumour segmentation using convolutional
neural networks in MRI images. IEEE Trans. Med. Imaging 35(5), 1240–1251 (2016). https://
doi.org/10.1109/TMI.2016.2538465
7. Z. Sobhaninia, S. Rezaei, N. Karimi, A. Emami, S. Samavi, Brain tumor segmentation by
cascaded deep neural networks using multiple image scales. in 2020 28th Iranian Conference
on Electrical Engineering (ICEE) (2020), pp. 1–4. https://doi.org/10.1109/ICEE50131.2020.
9260876
8. S. Kumar, A. Negi, J.N. Singh, H. Verma, A deep learning for brain tumor MRI images semantic
segmentation using FCN. in 2018 4th International Conference on Computing Communication
and Automation (ICCCA) (2018), pp. 1–4. https://doi.org/10.1109/CCAA.2018.8777675
9. Z. Jia, D. Chen, Brain tumor identification and classification of MRI images using deep learning
techniques. in IEEE Access (2020), pp. 1–1. https://doi.org/10.1109/ACCESS.2020.3016319
10. H. Ucuzal, S. Yasar, C. Colak, Classification of brain tumor types by deep learning with convolu-
tional neural network on magnetic resonance images using a developed web-based interface. in
2019 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies
(ISMSIT) (2019). https://doi.org/10.1109/ismsit.2019.8932761
11. H. Mohsen, E.-S.A. El-Dahshan, E.-S.M. El-Horbaty, A.-B.M. Salem, Classification using
deep learning neural networks for brain tumors. Future Comput. Inf. J. 3(1), 68–71 (2018).
https://doi.org/10.1016/j.fcij.2017.12.001
12. A. Isn, C. Direkoglu, M. Sah, Review of MRI-based brain tumor image segmentation using
deep learning methods. Procedia Comput. Sci. 102, 317–324 (2016). https://doi.org/10.1016/
j.procs.2016.09.407
13. D.V. Gore, V. Deshpande, Comparative study of various techniques using deep Learning for
brain tumor detection. in 2020 International Conference for Emerging Technology (INCET)
(2020). https://doi.org/10.1109/incet49848.2020.9154030
14. T. Saba, A.S. Mohamed, M. El-Aendi, J. Amin, M. Sharif, Brain tumor detection using fusion
of hand crafted and deep learning features. Cognitive Syst. Res. 59, 221–230 (2020). https://
doi.org/10.1016/j.cogsys.2019.09.007
15. M. Siar, M. Teshnehlab, Brain tumor detection using deep neural network and machine learning
algorithm. in 2019 9th International Conference on Computer and Knowledge Engineering
(ICCKE) (2019), pp. 363–368. https://doi.org/10.1109/ICCKE48569.2019.8964846
Design of Circular Patch Antenna
with Square Slot for Wearable
Ultra-Wide Band Applications

S. Rekha, Boga Vaibhav, Guda Rahul Teja, and Pulluri Sathvik

Abstract In this paper, an ultra-wide band antenna made of a simple circular patch
is designed. Various design evolutions of the circular patch antenna are discussed.
A conventional circular patch with a partial ground is designed to cover an ultra-wide
frequency band. Since there is an impedance mismatch at mid frequencies, a slit is
etched in the ground to improve the matching. A square slot is then etched in the
circular patch to obtain better performance characteristics. The antenna is designed
on an FR4 substrate of 1.6 mm thickness. The circular patch with the square slot
operates from 2.7 to 12.1 GHz, which is more than the ultra-wide frequency range.
The other antennas are applicable at WiMAX (3.2–3.8 GHz), WLAN (5.1–5.3 GHz)
and X-band (8–12 GHz) frequencies. The proposed antenna is also applicable for
wearable applications, as its SAR value is 1.5 W/kg. The proposed design is simulated
using Ansys HFSS simulation software.

Keywords Ultra-wide band frequency · Circular patch · Square slot · WiMAX ·


WLAN · X-band · FR4 substrate

1 Introduction

An antenna is a device used to transmit and receive electromagnetic signals. The
principal considerations for an antenna are robustness, flexibility, miniaturization,
and low cost. Since the Federal Communications Commission (FCC) released

S. Rekha (B) · B. Vaibhav · G. R. Teja · P. Sathvik


Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and
Technology, Ghatkesar, Yamnamper, Secunderabad 501301, Telangana, India
e-mail: rekhas@sreenidhi.edu.in
B. Vaibhav
e-mail: 17311a04cj@sreenidhi.edu.in
G. R. Teja
e-mail: 17311a04dg@sreenidhi.edu.in
P. Sathvik
e-mail: 17311a04cd@sreenidhi.edu.in


3.10–10.60 GHz for commercial applications, there has been massive demand for
ultra-wide band devices [1] because of the wide bandwidth. The UWB band finds its
applications in satellites, wireless local area networks, military applications,
biomedical applications, and so on.
The history of the ultra-wide band antenna goes back to the early stages of antenna
development, with designs such as conical antennas and horn antennas [2]. Nowadays,
microstrip antennas are preferred for attaining a broad frequency range because they
are compact. The simplest of the UWB antennas is the monopole antenna, but the gain
of monopole antennas is moderate [3]. Sometimes there is a need to eliminate certain
frequencies, which can be achieved using slots [4]. In some cases, a partial ground is
used to achieve better performance [5]. Slots may also be introduced in the ground,
which is called a defected ground structure (DGS) [6]. A single slot or multiple slots
are cut in the patch in order to achieve the necessary performance [7]; slots are also
employed to obtain notch-band characteristics [8]. A planar monopole antenna has low
gain [9] compared to slot antennas. These slots take different shapes, such as circle,
square, rectangle, cone, semi-circle, diamond, and so on [10], and various structures
are used for different applications. Another factor that defines the performance of the
antenna is the dielectric substrate used to build it [11]. Some antennas are designed so
that the ground and the radiator lie on the same plane, using co-planar waveguide
feeding. Certain antennas consist of more than one port [12]; these are termed MIMO
antennas.
In this paper, a square-slotted circular patch antenna is designed for ultra-wide band
applications. A circular patch is simple in structure and possesses high gain. In Sect. 2,
the techniques, parameters, and dimensions of the antenna design are discussed. In
Sect. 3, the results (reflection coefficient and gain) are elaborated. The paper is
concluded in Sect. 4.

2 Antenna Design

In this research paper, the design and evolution of a simple circular patch antenna are
discussed. The circular patch antenna is presented as four antenna models, as in Fig. 1.
All the designs are etched on FR4 lossy epoxy substrate, as it is easily available and
cost effective. Figure 1a is the conventional circular patch with a partial ground.
Figure 1b is the circular patch with a deep cut in the ground plane to improve the
impedance matching characteristics. Figure 1c is the circular patch with a square slot
at the middle. Figure 1d is the circular patch with the square slot as well as the deep
cut in the ground plane. These designs are simulated using the full-wave simulator
Ansys HFSS. The total dimension of the design is 32.75 × 37.5 × 1.6 mm³. The
dimensions of the proposed antenna are displayed in Table 1.
The circular patch is excited through a microstrip feed with an impedance of
50 ohms. The addition of slots to the antenna changes the reactance (capacitance or
inductance), which in turn increases the operating frequency band. A square slot
(4 mm × 4 mm) is created at the center of the circular patch as in Fig. 1c, d.

Fig. 1 Geometry of evolution of antennas, a antenna 1—simple circular patch, b antenna 2—


circular patch having slot in ground, c antenna 3—circular patch having square slot, d antenna
4—circular patch having square slot and ground slot

Later, the slot is adjusted slightly to achieve better impedance matching over the UWB
frequency range. Similarly, ground plane etching is implemented to obtain good
resonance and gain characteristics. The results for the antennas are discussed in
Sect. 3.

Table 1 Dimensions of the proposed antenna

Parameters                    | Dimensions (mm)
Area of the substrate (L × W) | 32.75 × 37.5
Radius of circular patch (r)  | 8.5
Feedline width (t)            | 3
Ground height (h)             | 16
Thickness of substrate        | 1.6
Square slot side (a)          | 4
Ground slot height (d)        | 4.75

3 Results and Discussions

The performance of the antenna is analyzed in terms of the S parameters and the gain.
In addition, the SAR value is determined to show that the antenna is suitable for
wearable purposes.

3.1 Reflection Coefficient and Gain

The simple circular patch antenna (antenna 1) with partial ground operates over
2.85–4.8 GHz and 5.8–11.84 GHz; the UWB range is not completely covered. In order
to improve the operating frequency range, antenna 2 is designed, in which a
rectangular slot is etched in the ground plane to match the impedance from 4.8 to
5.8 GHz. The operating frequency then extends from 2.7 to 12.06 GHz, covering the
complete UWB range. The reflection coefficients of antennas 1 and 2 are compared
in Fig. 2.
In Fig. 3, antennas 3 and 4 are compared. In order to analyze the effect of slots on
the circular patch, a simple square slot is etched at the center of the circle as shown

Fig. 2 Comparison of reflection coefficient of antenna 1 and 2

Fig. 3 Comparison of reflection coefficient of antenna 3 and 4

in antenna 3. The operating frequency of antenna 3 is observed over 2.88–4.71 GHz
and 5.5–11.36 GHz, which does not cover the complete UWB. Hence, a slot is etched
in the ground plane (antenna 4) to match the impedance over the uncovered
frequencies. Antenna 4 then radiates from 2.7 to 12.11 GHz, showing the maximum
bandwidth of all the designs.
Figure 4 exhibits the gain plotted against frequency for all four antennas (φ = 0°).
All the antenna models have good gain values, indicating that the input power is
efficiently converted into radiation. The electric field and magnetic field patterns at
9 GHz for the four antenna models are shown in Fig. 5a–d. The radiation is
appreciable for all four antennas and is similar to an omni-directional pattern, in
which the H-plane plot forms approximately a circle (in 2D) and the E-plane plot
varies with the shape of the radiator. The detailed performance comparison of the
four antennas is listed in Table 2.

Fig. 4 Comparison of gain plots of 4 antenna models (φ = 0°)


Fig. 5 E field and H field radiation pattern at 9 GHz of a antenna 1, b antenna 2, c antenna 3, d
antenna 4

3.2 Specific Absorption Rate (SAR)

The proposed antenna can also be used for wearable applications. The specific
absorption rate (SAR) is a measure of the radiation absorbed by the human body while
using a device, and limits are placed on SAR values: in India, SAR must be less than
1.6 W/kg per 1 g of tissue. The SAR of the proposed antenna is below this maximum
value. The SAR plot for the proposed model (antenna 4) is given in Fig. 6, which
shows that the average SAR value of antenna 4 is 1.5 W/kg, below the SAR limit. The
proposed model is suitable for wearable purposes because of its SAR value and the
compact size of the antenna.

Table 2 Performance comparison of antennas 1, 2, 3 and 4

Antenna | Resonant frequency (GHz) | Bandwidth (GHz) | Max. gain at 4.5 GHz (dB) | Max. gain at 9 GHz (dB) | Applications of the operating band
1       | 2.85–4.80 and 5.8–11.84  | 1.95 and 6.04   | 4.8 | 5.2 | WLAN, WiMAX, X-band
2       | 2.7–12.06                | 9.36            | 4.5 | 5.1 | UWB
3       | 2.88–4.71 and 5.5–11.36  | 1.82 and 5.86   | 3.5 | 5.7 | WLAN, WiMAX, X-band
4       | 2.7–12.11                | 9.41            | 5.0 | 6.9 | UWB

Fig. 6 SAR field for the antenna model 4

4 Conclusion

The design of a simple circular patch antenna applicable in the UWB frequency range
has been discussed in this paper. A square slot in the circular patch and a slot in the
partial ground plane have improved the performance of the antenna. The antenna
operates from 2.7 to 12.11 GHz, which covers more than the UWB range. The gain of
the antenna is 5.1 dB and 6.9 dB at 4.5 GHz and 9 GHz, respectively. This antenna
can be used for

wearable purposes because its average SAR value is 1.5 W/kg. The proposed antenna
is simple and cost effective and is applicable in the UWB range.

References

1. First report and order on ultra-wide band technology (FCC, Washington D.C., 2002)
2. B. Anupama, A. Singh, S. Kavitha, K. Shet, D. Prasad, Bowel shaped and loaded conical dielec-
tric substrate horn antenna for ultra-wide band operation. in 2019 International Conference on
Intelligent Computing and Control Systems (ICICCS), India (2019), pp. 1143–1146
3. S. Rekha, M. Nesasudha, Design of circularly polarized planar monopole antenna with
improved axial ratio bandwidth. Microw. Opt. Technol. Lett. 59(9), 2353–2358 (2017)
4. P. Thongyoy, P. Rakluea, T.N. Ayudthaya, Compact thin-film UWB antenna with round corner
rectangular slot and partial circular patch. in 2012 9th International Conference on Elec-
trical Engineering/Electronics, Computer, Telecommunications and Information Technology,
Thailand (2012), pp. 1–4
5. N.M. Awad, M.K. Abdelazeez, Circular patch UWB antenna with ground-slot. in 2013 IEEE
Antennas and Propagation Society International Symposium (APSURSI), USA (2013), pp. 442–
443
6. P. Jain, B. Singh, S. Yadav, A. Verma, M.A. Zayed, A novel compact circular slotted microstrip-
fed antenna for UWB application. in 2015 International Conference on Communication,
Control and Intelligent Systems (CCIS), India (2015), pp. 22–24
7. I.K. Kim, J. Ghimire, J. Maharjan, I. Nadeem, S.W. Kim, D.Y. Choi, Ultra-wide band (UWB)
microstrip patch antenna with adjustable notch frequencies. in 2019 IEEE International
Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT),
Indonesia (2019), pp. 70–73
8. M. Karmugil, K. Anusudha, Design of circular microstrip patch antenna for ultra-wide band
applications. in 2016 International Conference on Control, Instrumentation, Communication
and Computational Technologies (ICCICCT), India (2016), pp. 304–308
9. T.F. Nayna, E. Haque, F. Ahmed, Design of a X band defected ground circular patch antenna
with diamond shaped slot and ring in patch for UWB applications. in 2016 International
Conference on Signal Processing, Communication, Power and Embedded System (SCOPES),
India (2016), pp. 559–562
10. M.A. Kango, S. Oza-Rahurkar, Effect of dielectric materials on UWB antenna for wear-
able applications. in 2017 IEEE International Conference on Power, Control, Signals and
Instrumentation Engineering (ICPCSI), India (2017), pp. 1610–1615
11. M.M. Sharma, I.B. Sharma, R. Agarwal, Circular edge cut diminutive UWB antenna for wire-
less communications. in 2019 IEEE Indian Conference on Antennas and Propagation (InCAP),
India (2019), pp. 1–4
12. X. Tang, Z. Yao, Y. Li, W. Zong, G. Liu, F. Shan, A high performance UWB MIMO antenna
with defected ground structure and U-shape branches. Int. J. RF Microwave Comput.-Aided
Eng. 31(2), e22270 (2020)
Design of Baugh-Wooley Multiplier
Using Full Swing GDI Technique

Vamshi Ponugoti, Seetaram Oruganti, Sahithi Poloju, and Srikanth Bopidi

Abstract This paper presents a four-bit Baugh-Wooley multiplier using the full swing
gate diffusion input (GDI) technique. In general, addition is a crucial arithmetic
operation and is heavily demanded in VLSI design; adders are widely used in digital
signal processing, accumulators, microprocessors and many other applications, so the
full adder's performance decides the overall system performance. The proposed design
reduces area, as it contains a smaller number of transistors compared to other logic
designs. The proposed design has shown a 44% decrease in PDP, and the area of the
Baugh-Wooley multiplier was decreased by 6.25%.

Keywords GDI · XOR-XNOR · Full adder · Full swing

1 Introduction

Due to the heavy demand in VLSI, area and power are the vital factors in chip design.
Nowadays, chips are the fundamental elements of applications like mobiles,
televisions and many other electronic gadgets. The full adder is the basic operation in
many designs, so improving the performance of the full adder automatically increases
the performance of the design [1–3]. In this paper, we design a full adder using the
full swing GDI technique, which consumes less power and less area, and using that
full adder we design a four-bit Baugh-Wooley multiplier, which likewise occupies less
area and has low power consumption.

V. Ponugoti (B) · S. Oruganti · S. Poloju · S. Bopidi


Department of Electronics and Communication Engineering, Vardhaman College of Engineering,
Hyderabad, India
S. Bopidi
e-mail: srikanth.vlsi.2011@vardhaman.org


2 Gate Diffusion Input (GDI) Technique

The GDI technique is an alternative to complementary metal oxide semiconductor
(CMOS) technology. In the GDI technique [4], designs consume low power and
occupy less area. A smaller number of transistors is used compared to CMOS
technology, which reduces the area required for the design. Using the GDI technique,
complex functions can be realized with just two transistors. The basic GDI cell
consists of two transistors, with inputs applied to the gate (G) and to the P and N
terminals, as shown in Fig. 1.
In the general GDI cell, the bulks of the PMOS and NMOS are connected to their
respective source terminals. In the modified GDI structure [5], the bulks of the PMOS
and NMOS are connected to VDD and ground, respectively. The modified GDI
structure is shown in Fig. 2.

Fig. 1 GDI cell

Fig. 2 Modified GDI cell structure

Table 1 GDI cell functions

N  | P  | G | Out         | Function
0  | B  | A | A′B         | F1
B  | 1  | A | A′ + B      | F2
1  | B  | A | A + B       | OR
B  | 0  | A | AB          | AND
C  | B  | A | A′B + AC    | MUX
0  | 1  | A | A′          | NOT
B′ | B  | A | A′B + AB′   | XOR
B  | B′ | A | A′B′ + AB   | XNOR

2.1 Basic GDI Functions

The Boolean functions listed in Table 1 can be obtained using the basic GDI cell.
The table shows the various logic functions that can be obtained using just two
transistors [6]. In conventional CMOS design [7], these logic functions require about
6–12 transistors. The functions are obtained simply by interchanging the signals
applied between the input terminals. In the table, F1 and F2 can be called universal
functions for the GDI technique, just like the NAND and NOR gates in CMOS
technology; using F1 and F2, every design can be realized with the GDI technique [8],
as the sketch below illustrates.
This technique is best suited to fabrication in twin-well CMOS or silicon-on-insulator
(SOI) processes, as these process styles give less propagation delay and lower power
consumption when used with GDI. The GDI cell structure differs from the usual PTL
techniques and has some distinctive features. These features make it easy to design
complex circuits, with improvements including a reduced transistor count and low
power dissipation. To understand the GDI cell properties, the basic cell must be
analysed in different cases and configurations.

2.2 Analysis of GDI Technique

The widespread problem in PTL is its low-swing output signals, which result from the threshold drop across single-channel pass transistors [9]. To overcome this problem, present-day PTL uses additional circuitry. In the same way, general GDI designs also give low-swing output signals; to obtain a full swing, an extra transistor is added to the existing design. Although the number of transistors increases, a full-swing output is obtained, and the count remains below the transistor count of the corresponding CMOS designs. To understand the low-swing issue in the GDI technique, consider the single function F1; the same reasoning applies to every GDI design. In F1, a low swing occurs when A = 0 and B = 0: at these input levels, the output of F1 is Vtp rather than 0, because of the PMOS pass transistor's poor high-to-low transition. The transition from A = 0, B = VDD to A = 0, B = 0 is the only situation in which this effect happens. In about 50% of the cases (for B = 1), the GDI cell operates as a regular CMOS inverter, which is widely used as a digital buffer for logic-level restoration.
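The degraded low level can be sketched with a toy switch-level model (our illustration; the VDD and Vtp values below are assumed, not the paper's process parameters):

```python
# Toy model of the F1 swing problem: a PMOS passes a strong '1' but a weak
# '0', so the output low level cannot fall below |Vtp|.
VDD, VTP = 1.0, 0.35   # illustrative supply and PMOS threshold magnitude

def f1_voltage(va, vb):
    # F1: N = 0 (at the NMOS source), P = B (at the PMOS source), G = A.
    if va >= VDD / 2:         # NMOS conducts: strong 0 from N
        return 0.0
    return max(vb, VTP)       # PMOS conducts: degraded low when B = 0

print(f1_voltage(0.0, 0.0))   # 0.35 V instead of 0 V -> low-swing output
print(f1_voltage(0.0, VDD))   # full VDD when B = 1
```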

2.3 Modified GDI Technique

This modification involves adding an extra transistor at the output, which restores the swing. Compared with simple GDI logic, this technique improves the output voltage level, controllability, and power-delay product, and the logic style can be fabricated in a regular CMOS process. With the full swing GDI technique, the threshold problem is solved and the degradation of the output swing is removed. The technique adds only one transistor per cell; although this increases the overall transistor count, the count remains below that of complementary metal-oxide-semiconductor designs.
The F1 and F2 functions in the full swing GDI technique are shown in Fig. 3. These logic functions produce full-swing outputs, use less power, are energy efficient, and occupy a small area.

3 Full Adder

The full adder is a basic functional block in many applications. A full adder circuit has three inputs (A, B, C) and two outputs (SUM, CARRY) [10]; it is a combinational circuit that adds three one-bit operands. To build a full adder, XOR and XNOR gates and a multiplexer are required, as shown in Fig. 4. So, to design the full adder in the GDI technique, we first designed the XOR and XNOR

Fig. 3 F1 and F2 functions



Fig. 4 Full adder block diagram

gates in the GDI technique [11]. The XOR and XNOR signals, together with their complement signals, determine the overall power consumption and propagation delay of the complete adder [12].

3.1 XOR and XNOR Gate

The XOR gate is a basic functional gate in many applications such as adders, multipliers, and comparators. The XOR function is expressed as in (1):

A ⊕ B = A′B + AB′ (1)

In the proposed XOR [13] circuit, four transistors are used, as shown in Fig. 5. The complete adder constructed with the full swing gate diffusion input technique contains 18 transistors. This design consumes less power than the conventional CMOS full adder, which requires 28 transistors, and takes less area because it contains fewer transistors. The designed full adder is therefore both area efficient and power efficient [14, 15]. The full adder circuit is shown in Fig. 6.
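As a behavioral companion to Fig. 4 (our sketch of the standard decomposition, not the authors' transistor netlist), the full adder can be expressed as an XOR stage followed by a multiplexer that selects the carry:

```python
# Full adder of Fig. 4: SUM = A xor B xor Cin; the MUX picks CARRY = Cin
# when A xor B is 1, else A (the majority function in disguise).
def full_adder(a, b, cin):
    axb = a ^ b                 # XOR stage (the gate family of Fig. 5)
    s = axb ^ cin               # second XOR produces SUM
    carry = cin if axb else a   # 2:1 MUX produces CARRY
    return s, carry

for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cy = full_adder(a, b, cin)
            assert (cy << 1) | s == a + b + cin  # matches binary addition
```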

4 Multiplier

The multiplier is a two-input, four-bit device whose output gives the product of the given inputs in binary form. There are diverse types of multipliers. The

Fig. 5 XNOR circuit using GDI technique

Fig. 6 Implementation of full adder using GDI technique



proposed design is a four-bit Baugh-Wooley multiplier using the full swing GDI technique [16]. The multiplier is designed as shown in Fig. 7. It differs slightly from the conventional Baugh-Wooley multiplier in that it contains a greater number of full adders. The design consists of two types of cells, grey and white: a grey cell combines a full adder with a NAND gate, and a white cell combines a full adder with an AND gate. The whole structure produces outputs p1 to p8, the bits of the product. In this multiplier, the power consumption is higher than in the base paper, because the design contains more full adders than the referenced one: a total of 20 full adders operating in parallel. The design nevertheless occupies less area because it contains fewer transistors in total; since every basic component of the multiplier is designed with the GDI technique, each component uses fewer transistors, which reduces the overall transistor count.
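For reference, the arithmetic implemented by the grey (NAND) and white (AND) cells is the standard 4-bit Baugh-Wooley scheme for two's-complement multiplication. The following behavioral sketch is our illustration of that scheme, not the authors' exact cell arrangement:

```python
def baugh_wooley_4x4(a, b):
    """4-bit two's-complement multiply in Baugh-Wooley form: AND terms for
    the low partial products, NAND (complemented) terms for the sign row and
    column, plus fixed correction bits at weights 2^4 and 2^7."""
    ab = [(a >> i) & 1 for i in range(4)]
    bb = [(b >> j) & 1 for j in range(4)]
    p = 0
    for i in range(3):
        for j in range(3):
            p += (ab[i] & bb[j]) << (i + j)        # white cells: AND
    for j in range(3):
        p += (1 - (ab[3] & bb[j])) << (3 + j)      # grey cells: NAND
    for i in range(3):
        p += (1 - (ab[i] & bb[3])) << (3 + i)      # grey cells: NAND
    p += (ab[3] & bb[3]) << 6
    p += (1 << 4) + (1 << 7)                       # correction constants
    return p & 0xFF                                # keep 8 product bits

for x in range(-8, 8):
    for y in range(-8, 8):
        assert baugh_wooley_4x4(x & 0xF, y & 0xF) == (x * y) & 0xFF
```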

Fig. 7 Block diagram of Baugh-Wooley multiplier



5 Results and Comparison

As the multiplier is designed using the GDI technique instead of CMOS technology, the number of transistors used to build the multiplier is reduced, and the power consumption is reduced as well.
The power and delay results are shown in Figs. 8, 9, 10, and 11 and in Table 2. The power calculated for the designed full adder was 211.7 × 10⁻⁸ W (2.117 µW). This power was much

Fig. 8 Brief overview on results of full adder implemented with different techniques

Fig. 9 Brief overview on power of full adder and Baugh-Wooley multiplier (BWM) implemented
with different techniques

Fig. 10 Brief overview on results Baugh-Wooley multiplier (BWM) implemented with different
techniques

Fig. 11 Percentage decrease of power, delay, PDP and transistor count in proposed design

Table 2 Power delay calculations and results

Circuit                          Power (µW)      Delay (ps)   PDP (×10⁻¹⁸ J)   Number of transistors
Full adder—CMOS                  287.6 × 10⁻²    37           106.41           28
Full adder—GDI                   617.6 × 10⁻²    34           209              23
Full adder—FSGDI                 211.7 × 10⁻²    30           63.51            18
Baugh-Wooley multiplier—CMOS     60              182          10,920           624
Baugh-Wooley multiplier—GDI      129             163          21,027           448
Baugh-Wooley multiplier—FSGDI    42              145          6090             420
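As a quick consistency check of Table 2 (our arithmetic on the table's own values), PDP is simply power × delay, and the headline figures from the abstract fall out directly:

```python
# PDP = power (uW) x delay (ps), giving units of 1e-18 J as in Table 2.
rows = {  # circuit: (power_uW, delay_ps)
    "FA-CMOS":  (2.876, 37),  "FA-GDI":  (6.176, 34),  "FA-FSGDI":  (2.117, 30),
    "BWM-CMOS": (60, 182),    "BWM-GDI": (129, 163),   "BWM-FSGDI": (42, 145),
}
pdp = {name: p * d for name, (p, d) in rows.items()}
print(round(pdp["BWM-FSGDI"]))                 # 6090, matching Table 2
print(1 - pdp["BWM-FSGDI"] / pdp["BWM-CMOS"])  # ~0.44 -> 44% PDP decrease
print(1 - 420 / 448)                           # 0.0625 -> 6.25% fewer transistors
```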

less than that of the full adder designed in CMOS technology. When these full adders are combined into the multiplier, the power becomes higher, since a total of 20 full adders, internally structured as grey and white cells, are combined. Because these cells operate in parallel, the power consumption is somewhat higher.

6 Conclusion

The Baugh-Wooley multiplier designed using the full swing GDI technique consumes less power and less area than the other technologies considered, and its output is full swing in nature. Only 18 transistors are used for the full adder circuit, and the overall area required to implement the multiplier is also reduced because the GDI technique is used. The circuit is designed in 45 nm technology, which further reduces the overall power. Since the multiplier appears in many common applications of daily life, having such an efficient, ready-made multiplier is valuable. The full swing GDI technique is therefore preferable to both the general GDI technique and CMOS technology. Among the advantages of the proposed design are a 44% decrease in PDP and a 6.25% reduction in the area of the Baugh-Wooley multiplier.

References

1. P. Kishore, P.V. Sridevi, K. Babulu, Low power and high speed optimized 4-bit array multiplier
using MOD-GDI technique. in 2017 IEEE 7th International Advance Computing Conference
(IACC) (2017). https://doi.org/10.1109/iacc.2017.0106
2. A.K. Nishad, R. Chandel, Analysis of low power high performance XOR gate using GDI tech-
nique. in 2011 International Conference on Computational Intelligence and Communication
Networks (2011). https://doi.org/10.1109/cicn.2011.37
3. N. Weste, D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, 4th edn. (Addison-Wesley, 2011)
4. A. Morgenshtein, I. Schwartz, A. Fish, Gate diffusion input (GDI) logic in standard CMOS
nanoscale process. in 2010 IEEE 26th Convention of Electrical and Electronics Engineers in
Israel (2011)
5. A. Morgenshtein, A. Fish, I. Wagner, Gate-diffusion input (GDI): a power-efficient method
for digital combinatorial circuits. IEEE Trans. Very Large-Scale Intergr. (VLSI) Syst. 10(5),
566–581 (2002)
6. M. Shoba, R. Nakkeeran, GDI based full adders for energy efficient arithmetic applications.
Eng. Sci. Technol. Int. J. (2015)
7. A.M. Shams, D.K. Darwish, M.A. Bayoumi Performance analysis of low power 1-bit CMOS.
IEEE Trans. VLSI Syst. 10(1), 20–29 (2002)
8. V. Adler, E.G. Friedman, Delay and power expressions for a CMOS inverter driving a resistive
capacitive load. Analog Integr. Circ. Sig. Process. 14, 29–39 (1997)
9. T. Bhagyalaxmi, S. Rajendra, S. Srinivas, Power-aware alternative adder cell structure using
swing restored complementary pass transistor logic at 45nm technology, in 2nd International
Conference on Nanomaterials and Technologies (CNT 2014) (2014)

10. C.-K. Tung, Y.-C. Hung, S.-H. Shieh, G.-S. Huang, A low-power high-speed hybrid CMOS
full adder for embedded system. in 2007 IEEE Design and Diagnostics of Electronic Circuits
and Systems conference (2007)
11. E. Abu-Shama, M. Bayoumi, A new cell for low-power adders. in Proceedings of International
Midwest Symposium Circuits System (1995)
12. K.K. Chaddha, R. Chandel, Design and analysis of a modified low power CMOS full adder
using gate diffusion input technique. J. Low Power Electron. 6(4), 482–490 (2010)
13. K. Ravi Kumar, P. Mahipl Reddy, M. Sadanandam, A. Santhosh Kumar, M. Raju, Design of
2T XOR gate based full adder using GDI technique. in International Conference on Innovative
Mechanisms for Industry Applications (ICIMIA 2017) (2017)
14. M.J. Garima, H. Lohani, Design, implementation and performance comparison of multiplier
topologies in power-delay space. Eng. Sci. Technol. Int. J. (2015)
15. A. Shams, T. Darwish, M. Bayoumi, Performance analysis of low-power I-bit CMOS full adder
cells. IEEE Trans. VLSI Syst. IO(1), 20–29 (2002)
16. O.A. Albadry, M.A. Mohamed El-Bendary, F.Z. Amer, S.M. Singy, Design of area efficient
and low power 4-bit multiplier based on full-swing GDI technique. in 2019 International
Conference on Innovative Trends in Computer Engineering (ITCE) (2019). https://doi.org/10.
1109/itce.2019.8646341
VLSI Implementation of the Low Power
Neuromorphic Spiking Neural Network
with Machine Learning Approach

K. Venkateswara Reddy and N. Balaji

Abstract For biomedical applications, analysis of data through modern computational methodologies is required. Machine learning-based architectures enhance the
way diagnosis is performed. The objective of this research work is to design a
neuromorphic system using nanoelectronics and artificial intelligence for feature
extraction and classification of medical data. The research area is a combination of
nanoelectronics, computer technology, and biology. This research paper presents an
extensive literature survey and describes the analysis, design, and implementation of
neural networks based on human brain functionalities. The spiking neural networks
(SNNs) architecture implementation in very large-scale integration will reduce the
power consumption and miniaturize the device. A detailed review on various spiking
neural networks architectures and methods is presented. In this research paper, an
effort is made for the VLSI implementation of the spiking neural architecture. The
implementation is carried out using Quartus tool and Spartan/Cyclone/Vertex Kits
in 90 nm and 65 nm technology. Power, delay, and area are taken as the performance
metrics.

Keywords FPGA · Spiking neural network · Artificial intelligence · Neuromorphic system · Low power · CMOS · Biomedical

1 Introduction

Biomedical systems require advanced computing methods and algorithms for the analysis of data, most of which is either one dimensional or two dimensional. For neuromorphic signals, hardware and software designs are complex because of the signal features. The diagnosis and classification processes are important blocks in the complete architecture. These signals mostly have spatiotemporal characteristics with peaks; hence, spiking neural networks are efficient for analyzing such data. These third-generation spiking neural networks are computationally

K. Venkateswara Reddy (B) · N. Balaji


Department of ECE, JNTU Kakinada, Kakinada, A.P, India


powerful and well suited to solving biomedical signal problems; they are also more bio-realistic than their predecessors. Biomedical data take the form of spikes, binary patterns, or noise bursts, which suit a neuro-spiking system. The ease of connecting biomedical information with a spiking neural network at low power, small area, or high speed is a major advantage. However, in very large-scale integrated circuit (VLSI) implementations [1], all parameters cannot be optimized simultaneously; only one of area, power, delay, or efficiency can typically be managed at a time [2–7]. Since a neural architecture occupies a large area, power reduction is a challenging task, but speed can be improved without compromising performance [8]. Spiking neurons represent the analog biomedical input signal in the time domain with binary outputs. This compatibility of spiking neural networks with existing digital systems, which output analog values in the voltage or current domain, is an added advantage [8, 9]. One VLSI chip for spiking neurons is based on the asymmetric spike timing dependent synaptic plasticity (STDP) algorithm, while others use the gradient descent algorithm. Bioinspired algorithms, such as spike timing dependent plasticity and bat algorithms, are implemented or combined with existing methods [10–12]. The next challenge lies in the training and optimization of the weight vectors in each layer of the neural network architecture. Training is performed through arithmetic calculations and computation of the particular algorithm with respect to the architecture. These conventional neural network training methods differ from the weight updating carried out in biological neural networks, so swarm- or artificial-intelligence-based learning must be used, and a purely arithmetic network cannot be applied. The main objective of this research work is to compare various spiking neural network (SNN) architectures reported in the literature with different neuron models, along with performance parameters such as power and delay.

2 Literature Survey

The previous research has presented wide variety of model types for neuromor-
phic systems like biologically-inspired [13] and artificial neural network models
such as Hodgkin-Huxley model, integrate-and-fire model, resonate-and-fire model,
quadratic-integrate-and-fire model, Izhikevich model, FitzHugh-Nagumo model,
Hindmarsh-Rose model, Morris-Lecar model, and Wilson model. The behavior and
characteristics of the biological neuron are to be linked with the artificial neural
model. Some of these models mimic the charge accumulation and neurons firing.
Few other models are based on nonbiologically principle or structure which does not
have neuroscience behaviors. The models are classified as biological plausible and
biological inspired. The former model behaves similar to biological neural systems
while the later replicates the biological neural systems behavior. The others may be of
just neuron models including axons, dendrites, or glial cells. McCulloch-Pitts model
defines the neural network models for spiking network which integrates the neuron
and fires. For the implementation of the algorithm in hardware, it replicates the
cell membrane dynamics, ion channel dynamics, neuron, delay components (axonal

models), and pre- and post-synaptic neurons (dendritic models) [14]. The biomed-
ical circuits use small-signal neural models to diagnose various pathologies
[15]. In literature, several spiking neural network models were implemented [16,
17]. The feed-forward neural networks, including multilayer perceptron, are used
for implementation of neural spike detectors with several learning methods. The
learning methods are classified as supervised and unsupervised [18–20]. The imple-
mentation of a machine learning approach in hardware is a challenging task. Low-priced embedded systems were less efficient, though suitable for portable devices, and DSP architectures have structures specific to particular purposes. The implementation is efficient only if the hardware supports higher computation blocks, the necessary memory, and fast communication. Biomedical applications have three main
blocks, namely preprocessing, feature extraction, and classification. For example,
in image processing application, the input image is processed block by block using
machine learning approaches. Wang et al. [21] presented a practical solution by
processing the input image using convolution neural networks (CNNs). For imple-
mentations, integrated (VLSI) architectures were used for color-based applications.
Complementary metal–oxide–semiconductor (CMOS) was used in dedicated chip
structures. Models were used to evaluate the performance of the learning algorithm
and network in both forward and reverse direction. Combining the advance tech-
nology in memory design, computing methods, and communication for a neuron,
a design was done by Seo et al. [22]. The SRAM provides better inter-neuron
communication among 256 neurons and 64 K binary synapses. The implementa-
tion was done in 45 nm SOI-CMOS. The network was extended for a neuromorphic
processor design by Seo and Seok [23]. Cao et al. [24] designed a spike-based hard-
ware which was energy efficient. Deep CNN was converted into SNN and is found
two orders of magnitude more energy efficient. The implementation was done in
FPGA. The memristor-based spiking neuromorphic networks can be demonstrated
as biology-plausible spike-time-dependent plasticity (STDP) windows in integrated
metal-oxide memristors [25]. Bodyanskiy et al. [26] introduced a Newton-type modification of a temporal-Hebbian-rule-based learning algorithm for a self-learning spiking neural network. The mapping of an NN onto a spiking system follows certain steps. Diehl
et al. [27, 28] mapped the RNN on a substrate of spiking neurons. The real-world
systems need practical spiking neuromorphic engine (SNE) which is time based
[13]. In applications like pattern recognition, parallel NN architecture is required
to perform the recognition or detection of patterns. Wang et al. [29] implemented a
large-scale neural networks hardware based on neural engineering framework (NEF)
in field programmable gate arrays (FPGAs). Reconfigurable mixed-signal spiking
neuromorphic architecture chip was developed by Luo et al. [30] with multichip
communication. For synaptic storage and computing, that chip was designed with 256
× 256 static random-access memory (SRAM) cells, 256 × 256 content addressable memory (CAM) cells, 2 × 256 synapses, and 256 neurons. The integrated chip neuron
implements spiking frequency adaptation and through address event representation
(AER) communication protocol provides communication. Pani et al. [31] presented
a modular and efficient FPGA design of an in silico spiking neural network using
Izhikevich model. The Xilinx Virtex 6-based device uses 1440 neurons. The synapses

connected neurons mimic the biologically-inspired associative memory. Wenke et al.


[32] designed and trained a crossbar spiking neural network with a three-terminal resistive RAM (RRAM) model; the three-terminal RRAM device was used to store analog current levels. Hsieh et al. [33] designed a chip with memory,
fabricated in 0.18 µm CMOS technology for probabilistic spiking neural network.
Zheng and Mazumder [34] proposed an online learning algorithm for supervised
learning in multilayer spiking neural networks (SNNs). Dorogyy and Kolisnichenko
[26] used the spiking neural networks in the pre-training phase. The advantages
of spiking neuron networks (unsupervised learning) and classical artificial neural
networks (supervised learning) were combined for better efficiency. Camunas-Mesa
et al. [35] presented a hybrid memristor-CMOS approaches to implement large-scale
neural networks with learning capabilities, offering a scalable, and lower-cost alterna-
tive to existing CMOS systems. Farsa et al. [36] presented a neural computing FPGA
hardware unit and a neuromorphic system architecture. The model was based on a
modified leaky integrate-and-fire neuron. The maximum frequency of the imple-
mented neuron model and spiking network is 412.371 MHz and 189.071 MHz,
respectively.

3 Background Methodology

3.1 Neuromorphic System Architecture

Spiking neural networks (SNNs) are fast, accurate, and computationally powerful. SNN models represent the nervous system more accurately, and other machine learning algorithms can be incorporated efficiently, which makes the architecture better than conventional artificial neural network (ANN) architectures. In the existing work by Farsa et al. [36], a machine learning approach toward spiking neuromorphic computing using the LIF model was made; the reported investigation shows the trade-off between computational complexity and biological accuracy. A spiking neural network with LIF neurons for pattern recognition is depicted in Fig. 1. The network consists of 25 dummy neurons in the input layer, 5 LIF neurons in the hidden layer, and 1 LIF neuron in the output layer.
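For intuition, the following is a minimal software sketch of a leaky integrate-and-fire neuron, the model family behind this network; the constants (tau, v_th, dt) are illustrative, not the hardware values used in the paper:

```python
import numpy as np

def lif_run(input_current, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    """Simulate one LIF neuron: leaky integration, threshold, reset."""
    v, spikes = 0.0, []
    for i_t in input_current:
        v += dt * (-v / tau + i_t)   # membrane potential leaks and integrates
        if v >= v_th:                # threshold crossing emits a spike
            spikes.append(1)
            v = v_reset              # membrane resets after firing
        else:
            spikes.append(0)
    return np.array(spikes)

rng = np.random.default_rng(0)
print("spike count:", lif_run(rng.uniform(0.0, 0.12, size=200)).sum())
```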
The neuromorphic system architecture used here for implementing the SNN model based on the NCHU is presented in Fig. 2. The input provider unit supplies the inputs, the control unit coordinates all units with proper timing, and the synaptic weights are stored in memory. The hardware design of the SNN model at RTL is shown in Fig. 3; it consists of the inputs, six NCHUs, a single SRAM block, and the output spikes. The NCHUs are implemented with a sequential, pipelined design whose values are stored in the SRAM. The SRAM is built from registers to hold the data passing between two consecutive layers in the network. Each NCHU executes its task based on the timing signal, and the results are stored in

Fig. 1 The proposed spiking neural network

Fig. 2 The neuromorphic system architecture

SRAM. The neurons work asynchronously, on demand, and independently. Figure 4 depicts the low-cost, high-speed NCHU based on the modified LIF neuron.
The probabilistic spiking neural network (PSNN) is a hardware-friendly algorithm that combines spiking neurons with probabilistic parameters and achieves higher learning performance under the stated constraints. The implementation was done by Hsieh et al. [33], and its learning capability and hardware compatibility were reported; the PSNN algorithm was implemented in analog VLSI.

Fig. 3 Hardware design of the spiking network model

Fig. 4 The neural computing hardware unit (NCHU)

4 Investigation and Experimental Results

The implementation was carried out on FPGAs using the Quartus tool. Several parameters, such as LUT utilization, power dissipation, and delay, were measured and tabulated; the results are shown in Tables 1, 2, and 3 for the various blocks. For implementation, Cyclone II, Cyclone III, Stratix II, and Stratix III devices were used. Cyclone II provided the lowest power dissipation compared with the other kits, even with a small delay. The Stratix III device has the largest number of available LUTs, reflecting its larger area, so its utilization for the NCHU block is the lowest. When the SRAM is considered, similar performance is observed.
Table 1 Parameter analysis of NCHU in SNN

FPGA family            Device           LUT (available / used / utilization)   Power dissipation (mW): core dynamic / core static / I/O thermal / total thermal   Delay (ns)
Cyclone II (90 nm)     EP2C5F256C6      4608 / 133 / 3%                        13.45 / 18.11 / 49.17 / 80.72                                                      6.747
Cyclone III (65 nm)    EP3C5F256C6      5136 / 133 / 3%                        12.21 / 46.17 / 34.53 / 92.91                                                      5.627
Stratix II (90 nm)     EP2S15F484C3     12,480 / 69 / <1%                      25.6 / 303.82 / 76.8 / 406.22                                                      5.057
Stratix III (65 nm)    EP3SL50F484C2    38,000 / 69 / <1%                      19.83 / 370.76 / 50.09 / 440.67                                                    6.588
Table 2 Parameter analysis of SRAM in SNN

FPGA family            Device           LUT (available / used / utilization)   Power dissipation (mW): core dynamic / core static / I/O thermal / total thermal   Delay (ns)
Cyclone II (90 nm)     EP2C5F256C6      4608 / 96 / 2%                         10.24 / 18.25 / 118.02 / 146.51                                                    5.766
Cyclone III (65 nm)    EP3C5F256C6      5136 / 96 / 2%                         6.30 / 46.19 / 68.43 / 120.92                                                      3.948
Stratix II (90 nm)     EP2S15F484C3     12,480 / 96 / <1%                      16.34 / 304.35 / 136.44 / 457.13                                                   4.616
Stratix III (65 nm)    EP3SL50F484C2    38,000 / 96 / <1%                      14.82 / 371.23 / 88.88 / 474.93                                                    5.149
Table 3 Parameter analysis of spiking neural network

FPGA family            Device           LUT (available / used / utilization)   Power dissipation (mW): core dynamic / core static / I/O thermal / total thermal   Delay (ns)
Cyclone II (90 nm)     EP2C5F256C6      4608 / 263 / <6%                       22.65 / 80.72 / 243.74 / 347.12                                                    7.227
Cyclone III (65 nm)    EP3C5F256C6      5136 / 263 / 5%                        13.16 / 52.01 / 141.67 / 206.84                                                    5.466
Stratix II (90 nm)     EP2S15F484C3     12,480 / 134 / <1%                     35.24 / 305.97 / 272.14 / 613.35                                                   5.218
Stratix III (65 nm)    EP3SL50F484C2    38,000 / 134 / <1%                     19.22 / 370.71 / 159.31 / 549.24                                                   5.129

4.1 Implementation of the Neurocomputing Hardware Unit (NCHU) in SNN

The neurocomputing hardware unit (NCHU) performance for the various FPGA kits is presented in Table 1. From the table, it can be observed that the static power varies far more across devices than the dynamic power. In addition, the newer 65 nm technology consumes more static power than the older 90 nm technology. The delay varies by about 20%, while the LUT usage remains the same within each family.
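These comparisons can be reproduced from the tabulated values; a small sketch (our arithmetic, using Table 1's NCHU numbers):

```python
# NCHU rows of Table 1: device -> (dynamic mW, static mW, delay ns).
nchu = {
    "Cyclone II (90nm)":  (13.45, 18.11, 6.747),
    "Cyclone III (65nm)": (12.21, 46.17, 5.627),
    "Stratix II (90nm)":  (25.60, 303.82, 5.057),
    "Stratix III (65nm)": (19.83, 370.76, 6.588),
}
dyn = [v[0] for v in nchu.values()]
stat = [v[1] for v in nchu.values()]
print(max(stat) / min(stat))   # ~20x spread in static power across devices
print(max(dyn) / min(dyn))     # ~2x spread in dynamic power
print(6.747 / 5.627 - 1)       # ~0.20 -> ~20% delay gap between Cyclone nodes
```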

4.2 Implementation of the SRAM in SNN

The SRAM performance for the various FPGA kits is presented in Table 2. From the table, it can again be observed that the static power varies far more across devices than the dynamic power, the delay varies appreciably, and the LUT usage remains the same. For this block, the newer 65 nm technology consumes more static power but less dynamic power than the older 90 nm technology. A similar analysis holds for the SNN block, in which the dynamic power is reduced at the lower technology node with a nominal delay.

4.3 Implementation of the Spiking Neural Network

The SNN performance for the various FPGA kits is presented in Table 3. From the table, it can be observed that the static power varies far more across devices than the dynamic power, and the dynamic power is reduced at the 65 nm node compared with the 90 nm node.

5 Conclusion

This paper presents a detailed analysis and implementation of a spiking neural network on a field programmable gate array. Several architectures were reviewed, and a comparative analysis was made. The design is suitable for biomedical applications, where efficiency in diagnosis cannot be compromised. Details of the architecture and its implementation are reported. For implementation, the Quartus tool with Cyclone and Stratix kits was used at the 90 nm and 65 nm nodes, and the results are presented. The neurocomputing hardware unit, SRAM, and spiking neural network unit are analyzed with respect to power, delay, and LUT usage. From the analysis, it is found that power is higher when delay is lower, and vice versa; based on the requirements of the medical application, the appropriate choice can be made.

References

1. S. Roy, A. Banerjee, A. Basu, Liquid state machine with dendritically enhanced readout for
low-power, neuromorphic VLSI implementations. IEEE Trans. Biomed. Circuits Syst. 8(5),
681–695 (2014). https://doi.org/10.1109/TBCAS.2014.2362969
2. B. Deng, M. Zhang, F. Su, J. Wang, X. Wei, B. Shan, The implementation of feedforward
network on field programmable gate array. in IEEE 2014 7th International Conference on
Biomedical Engineering and Informatics (BMEI) (2014), pp. 483–487
3. P. Dondon, J. Carvalho, R. Gardere, P. Lahalle, G. Tsenov, V. Mladenov, Implementation of
a feed-forward artificial neural network in vhdl on fpga. in IEEE 2014 12th Symposium on
Neural Network Applications in Electrical Engineering (NEUREL) (2014), pp. 37–40
4. H. Mostafa, A. Khiat, A. Serb, C.G. Mayr, G. Indiveri, T. Prodromakis, Implementation of a
spike-based perceptron learning rule using tio2- x memristors. Front. Neurosci. 9, 357 (2015)
5. G.-M. Lozito, A. Laudani, F.R. Fulginei, A. Salvini, Fpga implementations of feed forward
neural network by using floating point hardware accelerators. Adv. Electr. Electron. Eng. 12(1),
30 (2014)
6. A. Perez-Garcia, G. Tornez-Xavier, L. Flores-Nava, F. Gomez- Castaneda, J. Moreno-Cadenas,
Multilayer perceptron network with integrated training algorithm in fpga. in IEEE 2014
11th International Conference on Electrical Engineering, Computing Science and Automatic
Control (CCE) (2014), pp. 1–6
7. R. Hasan, T.M. Taha, Enabling back propagation training of memristor crossbar neuromorphic
processors. in IEEE 2014 International Joint Conference on Neural Network (IJCNN) (2014),
pp. 21–28
8. F. Castanos, A. Franci, The transition between tonic spiking and bursting in a six-transistor
neuromorphic device. in 2015 12th International Conference on Electrical Engineering,
Computing Science and Automatic Control (CCE), IEEE (2015), pp. 1–6
9. F.L.M. Huayaney, H. Tanaka, T. Matsuo, T. Morie, K. Aihara, A VLSI spiking neural network
with symmetric STDP and associative memory operation. Int. Conf. Neural Inf. Process. 381–
388 (2011). https://doi.org/10.1007/978-3-642-24965-5_43.
10. M. Nouri, M. Jalilian, M. Hayati, D. Abbott, A digital neuromorphic realization of pair-based
and triplet-based spike-timing-dependent synaptic plasticity. IEEE Trans. Circuits Syst. II
Express Briefs 65(6), 804–808 (2018). https://doi.org/10.1109/TCSII.2017.2750214
11. D. Yamashita, K. Saeki, Y. Sekine, IC implementation of spike-timing-dependent synaptic
plasticity model using low capacitance value. in 2014 IEEE Asia Pacific Conference on Circuits
and Systems (APCCAS), Ishigaki (2014), pp. 221–224. https://doi.org/10.1109/APCCAS.2014.
7032759
12. H. Hsieh, K. Tang, Hardware friendly probabilistic spiking neural network with long-term and
short-term plasticity. IEEE Trans. Neural Netw. Learn. Syst. 24(12), 2063–2074 (2013). https://
doi.org/10.1109/TNNLS.2013.2271644
13. T. Liu, W. Wen,A fast and ultra low power time-based spiking neuromorphic architecture for
embedded applications. in 2017 18th International Symposium on Quality Electronic Design
(ISQED), Santa Clara, CA (2017), pp. 19–22. https://doi.org/10.1109/ISQED.2017.7918286
14. E.M. Izhikevich, Which model to use for cortical spiking neurons? IEEE Trans. Neural Netw.
15(5), 1063–1070 (2004)
15. A. Basu, Small-signal neural models and their applications. IEEE Trans. Biomed. Circ. Syst.
6(1), 64–75 (2012)
16. F. Grassia, T. Levi, T. Kohno, S. Saighi, Silicon neuron: digital hardware implementation of
the quartic model. Artif. Life Robot. 19(3), 215–219 (2014)
17. S. Hashimoto, H. Torikai, A novel hybrid spiking neuron: bifurcations, responses, and on-chip
learning. IEEE Trans. Circ. Syst. I: Regul. Pap. 57(8), 2168–2181 (2010)
18. M. Hu, H. Li, Y. Chen, Q. Wu, G.S. Rose, R.W. Linderman, Memristor crossbar-based neuro-
morphic computing system: a case study. IEEE Trans. Neural Netw. Learn. Syst. 25(10),
1864–1878 (2014)

19. J. Burger, C. Teuscher, Volatile memristive devices as short-term memory in a neuromorphic


learning architecture. in Proceedings of the 2014 IEEE/ACM International Symposium on
Nanoscale Architectures. ACM (2014), pp. 104–109
20. Z. Dong, S. Duan, X. Hu, L. Wang, H. Li, A novel memristive multilayer feedforward small-
world neural network with its applications in pid control. Scient. World J. 2014 (2014)
21. L. Wang, J.P. De Gyvez, E. Sanchez-Sinencio, Time multiplexed color image processing based
on a CNN with cell-state outputs. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 6(2),
314–322 (1998). https://doi.org/10.1109/92.678895
22. J. Seo et al.,A 45nm CMOS neuromorphic chip with a scalable architecture for learning in
networks of spiking neurons. in 2011 IEEE Custom Integrated Circuits Conference (CICC),
San Jose, CA (2011), pp. 1–4. https://doi.org/10.1109/CICC.2011.6055293
23. J. Seo, M. Seok, Digital CMOS neuromorphic processor design featuring unsupervised online
learning. in 2015 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-
SoC), Daejeon (2015), pp. 49–51. https://doi.org/10.1109/VLSI-SoC.2015.7314390
24. Y. Cao, Y. Chen, D. Khosla, Spiking deep convolutional neural networks for energy-efficient
object recognition. Int. J. Comput. Vision 113(1), 54–66 (2015)
25. M. Prezioso et al.,Spiking neuromorphic networks with metal-oxide memristors. in 2016 IEEE
International Symposium on Circuits and Systems (ISCAS), Montreal, QC (2016), pp. 177–180.
https://doi.org/10.1109/ISCAS.2016.7527199
26. Y. Bodyanskiy, A. Dolotov, I. Pliss, M. Malyar, A fast learning algorithm of self-learning
spiking neural network. in 2016 IEEE First International Conference on Data Stream Mining &
Processing (DSMP), Lviv (2016), pp. 104–107. https://doi.org/10.1109/DSMP.2016.7583517
27. P.U. Diehl, G. Zarrella, A. Cassidy, B.U. Pedroni, E. Neftci, Conversion of artificial recurrent neural networks to spiking neural networks for low-power neuromorphic hardware. in IEEE International Conference on Rebooting Computing (ICRC) (2016), pp. 1–8
28. P.U. Diehl, B. U. Pedron, A. Cassidy, P. Merolla, E. Neftci, G. Zarrella, Truehappiness: neuro-
morphic emotion recognition on truenorth. in IEEE 2016 International Joint Conference on
Neural Networks (IJCNN) (2016), pp. 4278–4285
29. R. Wang, C.S. Thakur, G. Cohen, T.J. Hamilton, J. Tapson, A. van Schaik, Neuromorphic
hardware architecture using the neural engineering framework for pattern recognition. IEEE
Trans. Biomed. Circuits Syst. 11(3), 574–584 (2017). https://doi.org/10.1109/TBCAS.2017.
2666883
30. C. Luo, Z. Ying, X. Zhu, L. Chen, A mixed-signal spiking neuromorphic architecture for
scalable neural network. in 2017 9th International Conference on Intelligent Human-Machine
Systems and Cybernetics (IHMSC), Hangzhou (2017), pp. 179–182. https://doi.org/10.1109/
IHMSC.2017.47
31. D. Pani, P. Meloni, G. Tuveri, F. Palumbo, P. Massobrio, L. Raffo, An FPGA platform for real-
time simulation of spiking neuronal networks. Front. Neurosci. 11(90), 1–13 (2017). https://
doi.org/10.3389/fnins.2017.00090
32. S. Wenke, A. Rush, T. Bailey, R. Jha,Novel spiking neural network utilizing short-term and
long-term dynamics of 3-terminal resistive crossbar arrays. 2017 IEEE 60th International
Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA (2017), pp. 432–435.
https://doi.org/10.1109/MWSCAS.2017.8052952
33. M. Atsumi, Sequence learning and planning on associative spiking neural network. in
Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02 (Cat.
No.02CH37290), Honolulu, HI, USA (2002), pp. 1649–1654. https://doi.org/10.1109/IJCNN.
2002.1007765
34. N. Zheng, P. Mazumder, Online supervised learning for hardware-based multilayer spiking
neural networks through the modulation of weight-dependent spike-timing-dependent plas-
ticity. IEEE Trans. Neural Netw. Learn. Syst. 29(9), 4287–4302 (2018). https://doi.org/10.
1109/TNNLS.2017.2761335
35. L.A. Camunas-Mesa, B. Linares-Barranco, T. Serrano-Gotarredona,Neuromorphic spiking
neural networks and their memristor-CMOS hardware implementations. Materials 12, 1–28
(2019). https://doi.org/10.3390/ma12172745

36. E.Z. Farsa, A. Ahmadi, M.A. Maleki, M. Gholami, H.N. Rad, A low-cost high-speed neuro-
morphic hardware based on spiking neural network. IEEE Trans. Circ. Syst. II Express Briefs
66(9), 1582–1586 (2019). https://doi.org/10.1109/TCSII.2019.2890846
IoT-Based Energy Saving
Recommendations by Classification
of Energy Consumption Using Machine
Learning Techniques

G. Siva Naga Dhipti, Baggam Swathi, E. Venkateswara Reddy,


and G. S. Naveen Kumar

Abstract The rapid pace of urban development in the previous decade requires sensible and practical solutions to save electricity. One of these solutions has been the creation of energy-saving policies based on energy forecasting in smart environments. This paper presents and investigates a predictive energy consumption model based on the Internet of things for smart cities. Governments all over the world are actively pursuing research on smart cities in an effort to manage urban complexity and sustainability. To make a city smart, the optimization of electrical energy consumption must be carried out effectively. With the advancement of smart cities, it is expected that a large number of Internet of things devices will be installed in various buildings across a city and that a great deal of energy-use data will be measured; when numerous Internet of things devices are connected to the Web, a lot of data is generated. The data used here include measurements from the temperature and humidity sensors of a wireless network, weather data from a nearby airport station, and the recorded energy use of lighting fixtures. The machine learning pipeline includes data filtering to eliminate non-predictive parameters, together with feature ranking. The house temperature and humidity conditions were monitored with a ZigBee wireless sensor network. This proposal introduces an Internet of things-based solution that runs a data-driven model between the edge and cloud layers to predict the energy use of a building in real time. To achieve a real-time response from the system, the machine learning (ML) computation runs in the edge node to predict the building's energy demand for the next hour.

Keywords Internet of things · Electrical energy consumption · Feature ranking · Wireless sensor network · Machine learning · ZigBee

G. S. N. Dhipti · B. Swathi · E. V. Reddy (B) · G. S. N. Kumar


Department of CSE, Malla Reddy University, Hyderabad, India


1 Introduction

1.1 Concepts

The industrial revolution is commonly partitioned into four stages. During the first, new sources of energy were discovered for operating machines; the large-scale use of coal and the creation of steam engines are notable milestones of this stage [1]. The subsequent stage, characterized by large-scale engineering and electric power, was a time of vast growth in trade. The third revolution introduced personal computers and the initial stage of communication technologies, which enabled the automation of supply chains [2].
A huge range of current technologies, such as communication systems, intelligent robots, the Internet of things, AI, and ML, belongs to the fourth industrial revolution [3, 4]. The Internet of things is a combination of interrelated computing devices, mechanical and digital machines, or individuals that have unique identifiers and the ability to transfer information over a network without human or computer intervention. IoT can potentially improve performance in various areas, including health services, smart cities, manufacturing, agribusiness, irrigation, and the energy sector [5].

1.2 Motivation

Energy efficiency offers financial rewards in the long term by diminishing the cost of fuel imports/supply and energy generation, and by decreasing emissions from the energy sector. For improving energy efficiency and achieving more optimal energy management, an effective analysis of the real-time data in the energy supply chain plays a key role [6]. The Internet of things uses sensors and communication technologies for sensing and transmitting real-time data, which enables fast computations and well-informed decisions [7]. In this proposal, we discuss the role of the Internet of things in all stages of the energy flow.
The aim here is to point out the potential contribution of the Internet of things to the effective use of energy, the reduction of energy consumption, and an increased share of non-conventional (renewable) sources of electric energy. The flow of energy supply is shown in Fig. 1.

2 Internet of Things

The Internet of things is an emerging technology that exploits the Web and aims to provide network connections among physical devices or "things" [8]. These physical devices include home

Fig. 1 Energy supply chain

appliances and industrial equipment. Taking advantage of suitable sensors and communication networks, this equipment can receive important instructions and provide authorized supervision for individuals. The major components of IoT are shown in Fig. 2: data are collected from smart devices, processed according to the protocols, and taken to the cloud for analysis, yielding a well-defined system for efficient usage.

Fig. 2 Major components of IoT



3 Implementing Technologies

3.1 Sensor Devices

Sensing components play a critical role in the Internet of things [9]. These components are used to gather and transfer information in real time; they improve effectiveness and functionality and play a basic part in the success of IoT [10, 11]. A temperature sensor is used to detect fluctuations while heating or cooling a system [12]. On the side of electrical energy usage, temperature sensors are used to maximize the performance of a system as its temperature varies during typical operation. A humidity sensor is used to detect moisture in the atmosphere. The ratio of the moisture present in the air to the highest amount of moisture the air can hold at a particular temperature is called relative humidity [13].
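In code form, this definition is just a ratio (a trivial sketch; the vapor quantities below are illustrative example numbers):

```python
# Relative humidity = actual moisture / maximum moisture the air can hold
# at the same temperature, expressed as a percentage.
def relative_humidity(actual_vapor_g_m3, max_vapor_g_m3):
    return 100.0 * actual_vapor_g_m3 / max_vapor_g_m3

print(relative_humidity(8.0, 17.3))  # ~46% RH near 20 C (example values)
```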

3.2 Actuators

An actuator is a machine part or framework that moves or controls a mechanism or system. Sensors in the device sense the environment, and control signals are then generated for the actuators according to the actions to be performed. Actuators take electrical input from the automation system, convert the input into action, and act on the devices and machines within the IoT framework [14]. The study in [15] puts forward a wireless sensor and actuator network to provide an Internet of things system.

3.3 Data Transmission System

Wireless transmission systems play a significant part in realizing the Internet of things. The wireless system interfaces the sensor devices to the IoT protocols and carries every instruction and data exchange among the objects of the Internet of things. The transmission technology is chosen by comparing the relevant standards; Table 1 evaluates and classifies the main technologies used with the Internet of things.
ZigBee is a data-transmission innovation intended to create personal area networks for small-scale applications. It is easy to implement and is designed to provide low-cost, low-data-rate, and very reliable networks for low-power applications [16, 17]. In the electrical energy domain, typical IoT applications of ZigBee include lighting systems, smart grids, home automation, and industrial automation.

Table 1 Comparison of different wireless technologies

Technology   Range                  Data rate       Power usage (battery life)   Security   Installation cost   Example application
LoRa         ≤50 km                 0.3–38.4 kbps   Very low (8–10 years)        High       Low                 Smart buildings
NB-IoT       ≤50 km                 ≤100 kbps       High (1–2 years)             High       Low                 Smart grid comm
LTE-M        ≤200 km                0.2–1 Mbps      Low (7–8 years)              High       Moderate            Smart meter
Sigfox       ≤50 km                 100 bps         Low (7–8 years)              High       Moderate            Smart buildings
Weightless   <5 km                  100 kbps        Low (very long)              High       Low                 Smart meter
Bluetooth    ≤50 m                  1 Mbps          Low (few months)             High       Low                 Smart home
ZigBee       ≤100 m                 250 kbps        Very low (5–10 years)        Low        Low                 Smart metering, RE
Satellite    Very long (>1500 km)   100 kbps        High                         High       Costly              Solar and wind power plants

3.4 Data and Evaluation

Processing IoT information is a difficult issue, since IoT data, known as big data, consist of tremendous amounts of structured and unstructured information generated by sensors, programs, and smart or intelligent gadgets. Because of the attributes of big data, the 3 V's (volume, velocity, and variety) [18], they must be processed and investigated efficiently [19]. The steps involved in the evaluation are shown in Fig. 3.

4 IoT in the Field of Power Industry

IoT can play a pivotal role in lessening energy losses and lowering the release of carbon dioxide into the atmosphere [20]. IoT-based management of electrical energy can monitor real-time electric energy utilization and intensify the level of awareness of energy usage at any stage of the energy flow [21, 22]. IoT technologies can help to instrument every device in a city: buildings, metropolitan systems, energy organizations, and services can all be fitted with sensors. These connections can guarantee an energy-efficient smart city through continuous monitoring of the information gathered from the sensors. As for the cooperative effect of the smart grid, Fig. 4 shows how, in an automated city fitted out with

Fig. 3 Proposed methodology

Fig. 4 A centralized data connectivity in a smart city concept

Internet of things networks, various segments of the city can be connected together [23].

4.1 Smart Meters

The energy utilization in urban communities can be partitioned into various parts: private structures (domestic) and business (services). The domestic energy utilization in the private sector includes lighting, heating, cooling, and ventilation (Fig. 5). Electrical energy typically represents 50% of the consumption in commercial buildings. Consequently, supervision of the heating, ventilation,

Fig. 5 Share of residential energy consumption

and air-conditioning (HVAC) system is substantial in lowering the usage. IoT devices can play a noteworthy role in lessening the wastage of electrical energy. By deploying indoor thermostats that are occupancy-aware, unoccupied areas can be identified; when an unused region is detected, several possible steps can be taken to bring down energy utilization.

5 Results

The statistical analysis of the various attributes, such as Appliances, lights, T1 (temperature in the kitchen area), RH_1 (humidity in the kitchen area), T2 (temperature in the living room), RH_2 (humidity in the living room), T3 (temperature in the laundry room), RH_3 (humidity in the laundry room), T4 (temperature in the office room), RH_4 (humidity in the office room), T5 (temperature in the bathroom), RH_5 (humidity in the bathroom), T6 (temperature outside the building), RH_6 (humidity outside the building), T7 (temperature in the ironing room), and RH_7 (humidity in the ironing room), is summarized in Table 2. Univariate density plots for the important attributes Appliances, lights, T1, and RH_1 are drawn in Figs. 6, 7, 8, and 9, respectively, and a bivariate plot of Appliances versus time is shown in Fig. 10. Outlier identification on the Appliances attribute is shown in Fig. 11. Finally, after a complete analysis of various machine learning techniques in support of the proposed method, a comparison of the classification performance of the proposed and conventional methods is shown in Table 3.
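A minimal sketch of the classification pipeline described above, assuming a CSV file with the columns summarized in Table 2; the file name, bin edges, and split ratio are illustrative, not the paper's exact settings:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("energy_data.csv")            # hypothetical file name
X = df.drop(columns=["date", "Appliances"])    # sensor features (T1, RH_1, ...)
# Pre-process the numeric target into discrete energy levels (three here).
y = pd.cut(df["Appliances"],
           bins=[0, 50, 100, df["Appliances"].max()],
           labels=["low", "medium", "high"])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```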
Table 2 Summary of the attributes Appliances, lights, T1, RH_1, T2, RH_2, etc.

Var          n        mean   sd    median   trimmed   mad   min   max    range   skew   kurtosis   se
Date         19,735   NaN    NA    NA       NaN       NA    Inf   −Inf   −Inf    NA     NA         NA
Appliances   19,735   98     103   60       73        30    10    1080   1070    3      14         1
Lights       19,735   4      8     0        2         0     0     70     70      2      4          0
T1           19,735   22     2     22       22        1     17    26     9       0      0          0
RH_1         19,735   40     4     40       40        4     27    63     36      0      0          0
T2           19,735   20     2     20       20        2     16    30     14      1      1          0
RH_2         19,735   40     4     40       41        4     20    56     36      0      1          0
T3           19,735   22     2     22       22        2     17    29     12      0      0          0
RH_3         19,735   39     3     39       39        3     29    50     21      0      −1         0
T4           19,735   21     2     21       21        2     15    26     11      0      0          0
RH_4         19,735   39     4     38       39        5     28    51     23      0      −1         0
T5           19,735   20     2     19       19        2     15    26     10      1      0          0
RH_5         19,735   51     9     49       50        6     30    96     67      2      5          0
T6           19,735   8      6     7        8         6     −6    28     34      1      0          0
RH_6         19,735   55     31    55       56        40    1     100    99      0      −1         0
T7           19,735   20     2     20       20        2     15    26     11      0      0          0
RH_7         19,735   35     5     35       35        5     23    51     28      0      −1         0
T8           19,735   22     2     22       22        2     16    27     11      0      0          0
RH_8         19,735   43     5     42       43        5     30    59     29      0      0          0
Fig. 6 Univariate analysis: plotting on appliances

Fig. 7 Univariate analysis: plotting on lights

Fig. 8 Univariate analysis: plotting on T1

Fig. 9 Univariate analysis: plotting on RH_1

Fig. 10 Bivariate analysis: plotting on appliances and time

Fig. 11 Identification of outliers on appliances

6 Conclusion

The proposed RF model tends to achieve better results with a lower number of energy levels when compared with the conventional technique. Rather than performing regression-based load forecasting as in the conventional technique, the developed classifier pre-processes the numeric-valued data into discrete levels and then predicts them using a more straightforward classification procedure. Both classifiers perform better with a lower number of energy levels.



Table 3 Classification performance of proposed and conventional method

Energy levels   Classifier model   Classification accuracy (std_min / std_ave / std_max)
Three-level     Conventional       0.0032 / 0.0106 / 0.9131
                Proposed           0.0012 / 0.0048 / 0.0100
Five-level      Conventional       0.0024 / 0.0093 / 0.0206
                Proposed           0.0023 / 0.0068 / 0.0148
Seven-level     Conventional       0.0049 / 0.0100 / 0.0175
                Proposed           0.33 / 0.0070 / 0.0123

References

1. P.N. Stearns, Reconceptualizing the industrial revolution. J. Interdiscip. Hist. 42, 442–443
(2011)
2. M. Jensen, The modern industrial revolution, exit, and the failure of internal control systems.
J. Financ. 48, 831–880 (1993)
3. H. Kagermann, J. Helbig, A. Hellinger, Wahlster, W, Recommendations for Implementing the
Strategic Initiative Industrie 4.0: Securing the Future of German Manufacturing Industry;
Final Report of the Industrie 4.0 Working Group; Forschungsunion: Frankfurt/Main, Germany
(2013)
4. S.K. Datta, C. Bonnet, MEC and IoT based automatic agent reconfiguration in industry
4.0. in Proceedings of the2018 IEEE International Conference on Advanced Networks and
Telecommunications Systems (ANTS), Indore, India, 16–19 December 2018, pp. 1–5
5. G.S. Naveen Kumar, V.S.K. Reddy, Detection of shot boundaries and extraction of key frames
for video retrieval. Int. J. Know.-Based Intell. Eng. Syst. 24(1), 11–17
6. Y.S. Tan, Y.T. Ng, J.S.C. Low, Internet-of-things enabled real-time monitoring of energy
efficiency on manufacturing shop floors. Procedia CIRP 61, 376–381 (2017)
7. K. Tamilselvan, P. Thangaraj, Pods—a novel intelligent energy efficient and dynamic
frequency scalings for multi-core embedded architectures in an IoT environment. Microprocess.
Microsyst. 72, 102907 (2020)
8. K. Haseeb, A. Almogren, N. Islam, I. Ud Din, Z. Jan, An energy-efficient and secure routing
protocol for intrusion avoidance in IoT-based WSN. Energies 12, 4174 (2019)
9. S.D.T. Kelly, N.K. Suryadevara, S.C. Mukhopadhyay, Towards the implementation of IoT for
environmental condition monitoring in homes. IEEE Sens. J. 13, 3846–3853 (2013)
10. E. Venkateswara Reddy, M.Ramesh, M. Jane, A comparative study of clustering techniques
for big data sets using apache mahout. in 3rd IEEE International Conference on Smart City
and Big Data 2016, Sultanate of Oman, April 2016
11. G. Di Francia, The development of sensor applications in the sectors of energy and environment
in Italy, 1976–2015. Sensors 17, 793 (2017)
12. ITFirmsCo. 8Types of Sensors that Coalesce Perfectly with an IoT App. (2018)
13. A.S. Morris, R. Langari, Level measurement. in A.S. Morris, R. Langari (eds) Measurement
and Instrumentation (2nd ed, Academic Press, Boston, MA, USA, 2016), pp. 531–545
14. V. Reddy Eluri, C. Ramesh, S.N. Dhipti, D. Sujatha, Analysis of MRI based brain tumor detection using RFCM clustering and SVM classifier. in International Conference on Soft Computing and Signal Processing (ICSCSP-2018), Springer Series, June 22, 2018
15. J. Blanco, A. García, J. Morenas, Design and implementation of a wireless sensor and actuator
network to support the intelligent control of efficient energy usage. Sensors 18, 1892 (2018)
16. G.S. Naveen Kumar, V.S.K. Reddy, High-performance video retrieval based on spatio-temporal
features. in Microelectronics, electromagnetics and telecommunications (Springer, Singapore),
pp. 433–441
IoT-Based Energy Saving Recommendations by Classification of Energy … 807

17. I. Froiz-Míguez, T. Fernández-Caramés, P. Fraga-Lamas, L. Castedo, Design, implementation


and practical evaluation of an IoT home automation system for fog computing applications
based on MQTT and ZigBee-WiFi sensor nodes. Sensors 18, 2660 (2018)
18. S. Baggam, K. Murthy, An effective modeling and design of closed-loop high step-up DC–DC
boost converter. in Intelligent System Design (Springer, Singapore, 2021), pp. 303–311
19. I. Stojmenovic, Machine-to-machine communications with in-network data aggregation,
processing, and actuation for large-scale cyber-physical systems. IEEE Internet Things J. 1,
122–128 (2014)
20. M. Hossain, N. Madlool, N. Rahim, J. Selvaraj, A. Pandey, A.F. Khan, Role of smart grid in
renewable energy: an overview. Renew. Sustain. Energy Rev. 60, 1168–1184 (2016)
21. Intel IT Centre. Big Data Analytics: Intel’s IT Manager Survey on How Organizations Are
Using Big Data (Technical Report; Intel IT Centre: Santa Clara, CA, USA, 2012)
22. A. Bhardwaj, Leveraging the Internet of Things and Analytics for Smart Energy Management
(TATA Consultancy Services, Mumbai, India, 2015)
23. S.P. Mohanty, Everything you wanted to know about smart cities: the internet of things is the
backbone. IEEE Consum. Electron. Mag. 5, 60–70 (2016)
Author Index

A Chavan, Pundalik, 11
Aathukuri, Lokesh, 197 Chopra, Ashish, 597
Abisheek, K., 505 Chourasia, Bharti, 643
Acharya, Dinesh U., 1
Adusumalli, Vyshnavi, 219
Agajyelew, Bekele Worku, 381
D
Agrawal, Shruti, 597
Das, Arunava, 187
Ahuja, Akansha, 119
Ajay, K. D. K., 447 Datta, Bingi Sai, 751
Akuri, Sree Rama Chandra Murthy, 197 Deepak, N. R., 11
Amuru, Deepthi, 663 Deore, Siddhesh, 295
Ankam, Praveen, 83 Dey, Ranadeep, 107
Arthi, K., 145 Dhipti, G. Siva Naga, 795
Dileep, P., 257
Dimmita, Nandini, 197
B Dinesh Kumar, R., 505
Bachche, Ruturaj, 295 Durani, Homera, 429
Balaji, N., 781 Durga Devi, P., 553
Bale, Mahesh Babu, 219 Durgalaxmi, Kavali, 751
Bansal, Pratosh, 175 Durgam, Rajesh, 725
Barlapudi, Mounika, 655 Durgam, Thirupathi, 705
Berin, T. Grace, 733
Bharathi, B., 305
Bhatt, Nirav, 429 F
Bhave, Sameer, 175 Firdausi, Tauseef Jamal, 187
Bhise, Pratibha R., 371
Bichave, Aditya, 295
Bodkhe, Aryak, 535
Boggavarapu, Venkata Bharath Krishna, G
219 Gaikwad, Vinayak, 535
Bopidi, Srikanth, 769 Ganatra, Amit, 35
Geetha, M., 1, 165
Giri Prasad, M. N., 457
C Gottumukkala, V. S. S. P. Raju, 239
Challa, Archana, 219 Goyal, Shimpy, 49
Chandrakala, M., 553 Gupta, Sudhanshu, 535

H Malladhi, Nagarjuna, 673


Harivinod, N., 621 Malleswara Rao, V., 447
Harshini, Malladi, 83 Mandava, Ajay Kumar, 681
Harshwardhan, Chougule, 133 Mangalampalli, Sudheer, 263
Heera, Cherukumpalem, 633 Maniyar, Simran N., 371
Massoudi, Mahboob, 97
Megana, K. R. S., 571
J Mehul, Lokhande, 133
Jana, Saikat, 187 Menon, Riya, 119
Jarathi, Shivani, 83 Mohan Kumar Naik, B., 491
Jayasree, P. V. Y., 571 Mohan, K. Venkata Murali, 273
Jhunjhunwala, Anuj, 597 Munagekar, Ameya, 413
Jones, Aida, 505 Munot, Bhushan, 327
Joshi, Meghana Mukunda, 23
Jyothi, B., 693
Jyothsna, Undrakonda, 563 N
Nagajyothi, D., 751
Nagarathna, N., 23
K Naga Satish, G., 479
Kalyan, Madhari, 251 Nagpal, Paras, 597
Kanade, Vijay A., 611 Nandi, Piyush, 187
Katarya, Rahul, 97 Narisetty, Srinivasa Rao, 155
Kathirvel, A., 393 Naveen Kumar, G. S., 405
Kaur, Prabhpreet, 59 Nazeem, Hina, 663
Keshav, Kirti, 585 Noorbasha, Sayedu Khasim, 469
Khalandar Basha, D., 521 Nuchu, Yeswanth Surya Srikar, 155
Khan, Mohammed Omar, 23
Kocherla, Ravi Teja, 263
Kodati, Sarangam, 273, 283 O
Koppula, Venkata Koti Reddy, 655 Oruganti, Seetaram, 769
Krishna, Vempati, 479
Kulkarni, Sonali B., 371
Kumar, Akansha, 413 P
Kumaran, N., 239 Pangare, Jui, 327
Kumar, Ch. Manohar, 563 Patel, Vibha, 35
Kumar, Gaurav, 441 Patibandla, Anitha, 693
Kumar, G. S. Naveen, 795 Patil, Rachana, 133, 295
Kumar, K. M. V. Madan, 273 Paul, Swarna Kamal, 187
Kumar, P. Anil, 715 Pavani, Dontha, 751
Kumar, Raj, 741 Poloju, Sahithi, 769
Kumbale, Niyathi Srinivasan, 23 Ponnapalli, V. A. Sankar, 563
Kumbhare, Vedant Arvind, 145 Ponugoti, Vamshi, 769
Pradhan, Ashish Kumar, 585
Prasad, D. Rajendra, 643
L Prasad, M., 381
Lalitha, R. V. S., 479 Priya, J., 305
Lunawat, Sonali Sagarmal, 327 Priya, R. L., 119
Pujari, Sneha, 327

M
Madesh, M., 505 R
Madhan, E. S., 393 Raghavendran, Ch. V., 479
Madhura Prabha, R., 315 Raj, N., 715, 725
Mahammad, Eliyaz, 673 Raju, Rollakanti, 521
Author Index 811

Ramamurthy, Garimella, 343, 351 Sucharitha, M., 693


Ram, G. Bhaskar Phani, 673 Sudha, Gnanou Florence, 469
Rampurawala, Abduttayyeb, 327 Sujithra, T., 205
Ranjana, S., 305 Sulochana, C. Helen, 733
Rao, B. Dinesh, 165 Suman, Bandi, 545
Rao, K. V. Narayana, 263 Suneetha, J., 11
Rao, R. Chinna, 571 Sunny, Dhadiwal, 133
Rao, Siddi Madhusudhan, 337 Suprasen, K., 571
Rao, S. Srinivasa, 571 Swamy, Tata Jagannadha, 343, 351
Rapaka, Anuj, 263 Swathi, Baggam, 405, 795
Ravi, G., 273, 283 Swathi, N., 563
Rayapudi, Shivani, 197
Reddy, E. Venkateswara, 795
Reddy, Kumbala Pradeep, 283 T
Reddy, T. Vinay Simha, 563 Tabassum, Husna, 11
Reddy, Yaminidhar, 343 Tailor, Jaishree, 35
Reddy, Y. Vijay Bhaskar, 219 Tamil, S., 643, 715, 725
Reenu Rita, P. S., 305 Teja, Guda Rahul, 761
Regentova, Emma E., 681 Thakurta, Parag Kumar Guha, 107
Rekha, S., 761 Thawal, Siddhi, 327
Rishishwar, R. P., 741 Thorat, Chetana, 327
Rohit, Naikade, 133 Tikkisetty, Manikanta, 197
Roy, Chandan Kumar, 359 Tolamatti, Gauravi, 119
Tolani, Namrata, 119

S
Sadiwala, Ritesh, 359, 705 U
Sandeep, M., 251 Uma Devi, G., 545
Sannareddy, Varshitha, 655 Upadhya, K. Jyothi, 165
Sasikala, S., 315
Sathvik, Pulluri, 761
Satyanarayana, A. N., 545 V
Satyanarayana, D., 457 Vaibhav, Boga, 761
Seelam, Nagarjuna Reddy, 655 Vanitha, K., 457
Sekhar, V. Chandra, 239 Velakanti, Gouthami, 83
Selvaraj, Navaneethan, 393 Venkataram, Pallapa, 585
Shah, Ronak, 535 Venkateswara Reddy, E., 405
Shankar, Vadthyavath, 633 Venkateswara Reddy, K., 781
Shastry, Nikhil S., 23 Vijayakamal, M., 337
Shekar, B. H., 621 Vijayalakshmi, V., 491
Shetty, Roopashri, 1 Vijay, T. K., 381
Shreya, K., 571 Vishal, B., 205
Shyamala, G., 1 Vuduthuri, Gali Reddy, 655
Singh, Amritanshu Kumar, 145 Vuppu, Shankar, 83
Singh, Rajiv, 49
Siva Naga Dhipti, G., 405
Somanaidu, U., 521 W
Sreekanth, Nara, 283 Walia, Harnehmat, 59
Sreelekha, A., 257
Sreenivasu, Morukurthi, 381
Sree, Pokkuluri Kiran, 263 Y
Srinivas Reddy, G., 521 Yaduvanshi, Rajveer Singh, 441
Srinivas, T., 585 Yeshwanth, G. Srinivasa, 571
Sriram, K. G., 205 Yeshwanth, K., 673
