2022 CHVR Lalitha ICSCSP 2021 Proceedings
V. Sivakumar Reddy
V. Kamakshi Prasad
Jiacun Wang
K. T. V. Reddy Editors
Soft Computing
and Signal
Processing
Proceedings of 4th ICSCSP 2021
Advances in Intelligent Systems and Computing
Volume 1413
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland
Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de
Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
perception and vision, DNA- and immune-based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
Indexed by DBLP, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and
Technology Agency (JST).
All books published in the series are submitted for consideration in Web of Science.
For proposals from Asia please contact Aninda Bose (aninda.bose@springer.com).
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Conference Committee
Chief Patron
Sri. Ch. Malla Reddy, Hon’ble Minister, Govt. of Telangana, Founder Chairman,
MRGI
Patrons
Conference Chair
Convener
Publication Chair
Co-Convener
Organizing Chair
Organizing Secretaries
Coordinators
Organizing Committee
Web Developer
Session Chairs
Proceedings Committee
Publicity Committee
Registration Committee
Hospitality Committee
Certificate Committee
Decoration Committee
Transportation Committee
Preface
for his valuable support and encouragement till the successful conclusion of the
conference.
We express our heartfelt thanks to our Chief Patron Sri. Ch. Malla Reddy,
Founder Chairman, MRGI, Patrons Sri. Ch. Mahendar Reddy, Secretary, MRGI,
Sri. Ch. Bhadra Reddy, President, MRGI, Convener Prof. P. Sanjeeva Reddy, Dean,
International Studies, and Dr. T. Venugopal, Dean, MRCET.
We would also like to thank the Organizing Secretaries Dr. K. Mallikarjuna, HOD,
ECE, Dr. T. Venu Gopal, HOD, CSE, and Dr. G. Sharada, HOD, IT, for their valuable
contribution. Our thanks also go to all the coordinators, the organizing committee,
and all the other committee members for their contribution to the successful conduct
of the conference.
Last but certainly not least, our special thanks to all the authors without whom
the conference would not have taken place. Their technical contributions have made
our proceedings rich and praiseworthy.
About the Editors
books and research papers and is Associate Editor of several international journals.
He has also served as a program chair, a program co-chair, a special sessions chair
and a program committee member for several international conferences. He is the
secretary of the Organizing and Planning Committee of the IEEE SMC Society and
has been a senior member of IEEE since 2000.
1 Introduction
Data mining has major applications in the medical field. Medical practitioners have
come up with many algorithms that help in the prediction of diseases. The K-nearest
neighbors (KNN) algorithm is a supervised data mining method widely used in the
classification of disease [1].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 1
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_1
2 R. Shetty et al.
2 Literature Review
The performance of the KNN classifier [3] depends mainly on the metric used to
calculate the distance when identifying the K-nearest neighbors of a data point;
generally, the standard Euclidean distance is used. A large store of historical data is
used in that work so that diagnosis can be based on past cases. It computes the
probability of occurrence of a particular ailment using the KNN algorithm, which
increases the accuracy of the diagnosis. The algorithm can be used to enhance
automated diagnosis, including diagnosis of multiple diseases with similar symptoms.
Garcia et al. [4] discussed various data preprocessing techniques, such as data
reduction, normalization, integration, transformation, handling missing values,
feature selection, and dealing with noisy data.
Jiang et al. [5] summarized various drawbacks of KNN and discussed a method
to overcome them: the distance function is improved by eliminating the least
relevant attributes when calculating the distance between two data points.
Parvin et al. [6] applied weighted KNN to test samples after checking the validity
of all samples in the training dataset. The validity measure considers the robustness
and stability of each training sample with respect to all of its neighbors.
Data Preprocessing and Finding Optimal Value of K for KNN Model
Song et al. [7] proposed a clustering-based feature selection algorithm named FAST.
FAST divides the features into clusters; from each cluster, the feature most highly
associated with the target class is selected, and together these selections form a
subset of features. This produces a subset of useful, independent features. A minimum
spanning tree method is used for the clustering.
Li et al. [8] discussed various feature selection methods, such as stable feature
selection, multi-view feature selection, distributed feature selection, and online
feature selection. The problems with these methods and their applications are also
analyzed and discussed.
Salama et al. [9] attempted to improve Parkinson's diagnosis using a multiple feature
evaluation approach (MFEA) and classification using machine learning algorithms.
MFEA selects the best set of features, which helped improve the performance
of the model.
Chan et al. [10] proposed a feature selection method based on a KNN ensemble
classifier. It finds the significant attributes using an iterative approach. When the
number of extracted features is large compared to the number of observations, the
effectiveness and robustness of the model are increased.
The literature review shows that the KNN algorithm is comparatively slow, since
all instances must be reviewed for every new data point, and that its performance
degrades when irrelevant attributes are present in the dataset. Deciding the optimal
value of K helps improve the performance of the system. This work therefore
concentrates on filtering irrelevant attributes out of the dataset and finding a
suitable value of K for the KNN algorithm.
3 Methodology
Fig. 1 Schematic representation of the proposed algorithm
3.1 Preprocessing
The records are read from the dataset using the command:
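The command itself is not reproduced in the chapter. A typical way to read records in Python (the chapter's implementation language) is `pandas.read_csv`; a dependency-free sketch, with hypothetical column names and values, is:

```python
import csv
import io

# Hypothetical records inlined so the sketch is self-contained; in practice
# this would be csv.DictReader(open("dataset.csv")) with the actual file.
csv_text = "age,bmi,label\n63,27.1,1\n41,22.4,0\n"
records = list(csv.DictReader(io.StringIO(csv_text)))
```

Each record is then a dict keyed by column name, ready for preprocessing.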
The algorithm is implemented in Python. The dataset is split in the ratio 70:30
for training and testing, respectively. Ten-fold cross-validation is used to validate
the model.
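The split and validation just described can be sketched as follows (pure Python; the 100-row stand-in dataset and the seed are illustrative, not from the chapter):

```python
import random

def train_test_split(rows, test_frac=0.30, seed=42):
    """Shuffle and split rows 70:30 for training and testing."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * (1.0 - test_frac))
    return rows[:cut], rows[cut:]

def k_folds(rows, k=10, seed=42):
    """Partition rows into k folds for ten-fold cross-validation."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    return [rows[i::k] for i in range(k)]

data = list(range(100))          # stand-in for the real records
train, test = train_test_split(data)
folds = k_folds(train)
```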
The distance between a new data point and the points of the existing training
samples is calculated using the Euclidean distance, and the K points with the smallest
Euclidean distances are selected. The new data point is assigned to the class to which
the majority of those K points belong.
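A minimal sketch of this nearest-neighbor vote (the coordinates and labels below are toy values, purely illustrative):

```python
import math
from collections import Counter

def knn_predict(train_rows, new_point, k=3):
    """train_rows: list of (features, label) pairs. Select the k rows nearest
    to new_point by Euclidean distance and return the majority label."""
    nearest = sorted(train_rows, key=lambda row: math.dist(row[0], new_point))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

train_rows = [((0.0, 0.0), "benign"), ((0.1, 0.2), "benign"),
              ((0.2, 0.1), "benign"), ((5.0, 5.0), "malignant"),
              ((5.2, 4.9), "malignant")]
print(knn_predict(train_rows, (0.05, 0.05), k=3))  # -> benign
```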
The attributes in the correlated-features set are irrelevant and have to be removed
from both the training and the test data, which can be done with the following
code snippet:
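The snippet itself is not reproduced in the text. A common approach, typically written with pandas' `DataFrame.corr()` and `drop()`, is to flag one feature from each highly correlated pair; a dependency-free sketch with a hypothetical 0.9 threshold:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlated_features(columns, threshold=0.9):
    """columns: {name: values}. Return names to drop because they are
    highly correlated with an earlier-listed feature."""
    names = list(columns)
    drop = set()
    for i, a in enumerate(names):
        if a in drop:
            continue
        for b in names[i + 1:]:
            if b not in drop and abs(pearson(columns[a], columns[b])) > threshold:
                drop.add(b)
    return drop

cols = {"f1": [1, 2, 3, 4], "f2": [2, 4, 6, 8], "f3": [4, 1, 3, 2]}
to_drop = correlated_features(cols)  # f2 duplicates f1's information
```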
Using the feature selection method, 17 features are removed from the original
dataset; these are shown in Fig. 3.
The KNN algorithm requires the user to input the value of K. To avoid this manual
input, a method is needed that decides the value of K automatically; a suitable
value of K improves the overall performance of the classifier.
Two strategies are considered to get the suitable value of K.
1. By computing the misclassification error: KNN is executed for different values
of K, and their misclassification errors are compared. The model with the
minimum misclassification error is chosen. Misclassification errors for
different values of K are shown in Fig. 4.
2. By finding the accuracy of the model for different values of K: the accuracies
of models with K values 2, 3, 4, and 6 are plotted, and the model with the
best accuracy is chosen. The accuracy plot for different values of K is shown in
Fig. 5.
Both strategies indicate that 3 is a suitable value for K.
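The first strategy, sweeping K and comparing misclassification errors on held-out points, can be sketched like this (toy data; the chapter's actual code and dataset are not shown):

```python
import math
from collections import Counter

def knn_error(train_rows, valid_rows, k):
    """Fraction of validation points misclassified by a k-nearest-neighbor vote."""
    def predict(point):
        nearest = sorted(train_rows, key=lambda r: math.dist(r[0], point))[:k]
        return Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
    wrong = sum(1 for feats, lbl in valid_rows if predict(feats) != lbl)
    return wrong / len(valid_rows)

train_rows = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0),
              ((5, 5), 1), ((5, 6), 1), ((6, 5), 1)]
valid_rows = [((0.5, 0.5), 0), ((5.5, 5.5), 1)]
errors = {k: knn_error(train_rows, valid_rows, k) for k in (1, 2, 3, 4, 6)}
best_k = min(errors, key=errors.get)  # K with the lowest misclassification error
```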
Results are analyzed for the KNN algorithm executed in two configurations:
1. Without preprocessing and with manual input of K (KNN)
2. With the feature selection method for preprocessing and an automatically
chosen value of K (modified KNN)
Figure 6 shows the performance of KNN, and Fig. 7 shows the performance of
modified KNN.
Confusion matrix for KNN is shown in Fig. 8.
Total, n = 171,
Accuracy = (T.P. + T.N.)/Total = (94 + 54)/171 = 0.865
Misclassification rate = (F.P. + F.N.)/Total = 23/171 = 0.135.
For modified KNN: Total, n = 171,
Accuracy = (T.P. + T.N.)/Total = (104 + 49)/171 = 0.895
Misclassification rate = (F.P. + F.N.)/Total = 18/171 = 0.105.
From the above analysis, it can be observed that the accuracy of KNN is 86.5%,
and the accuracy of modified KNN is 89.5%.
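The arithmetic above follows directly from the confusion-matrix counts and can be checked in a few lines:

```python
def rates(tp, tn, total):
    """Accuracy and misclassification rate from true-positive/true-negative
    counts, as computed in the chapter (n = 171 test samples)."""
    acc = (tp + tn) / total
    return round(acc, 3), round(1 - acc, 3)

knn = rates(94, 54, 171)        # plain KNN
modified = rates(104, 49, 171)  # modified KNN
```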
The KNN algorithm is one of the simplest classifiers for classifying medical data,
but the performance of the model depends on the data used and the value of K
chosen. Hence, preprocessing the data by removing irrelevant attributes from the
dataset and choosing a suitable value for K help improve the performance of the
KNN model.
The dataset considered here is small, and the work can be extended to large datasets.
Other preprocessing techniques, such as normalization, can be used to improve the
quality of the data.
References
1. Z. Deng et al., Efficient kNN classification algorithm for big data. Neurocomputing 195, 143–
148 (2016)
2. H.K. Chantar, D.W. Corne, Feature subset selection for Arabic document categorization using
BPSO-KNN, in 2011 Third World Congress on Nature and Biologically Inspired Computing
(IEEE, 2011), pp. 546–551
3. H.S. Khamis, K.W. Cheruiyot, S. Kimani, Application of k-nearest neighbor classification in
medical data mining. Int. J. Inf. Commun. Technol. Res. 4(4) (2014)
4. S. Garcia, J. Luengo, F. Herrera, Data Preprocessing in Data Mining (Springer, 2015)
5. L. Jiang et al., Survey of improving k-nearest-neighbor for classification, in Fourth International
Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), vol. 1 (IEEE, 2007),
pp. 679–683
6. H. Parvin, H. Alizadeh, B. Minaei-Bidgoli, MKNN: Modified k-nearest neighbor, in Proceed-
ings of the World Congress on Engineering and Computer Science, vol. 1 (Citeseer,
2008)
7. Q. Song, J. Ni, G. Wang, A fast clustering-based feature subset selection algorithm for high-
dimensional data. IEEE Trans. Knowl. Data Eng. 25(1), 1–14 (2011)
8. Y. Li, T. Li, H. Liu, Recent advances in feature selection and its applications. Knowl. Inf. Syst.
53(3), 551–577 (2017)
9. S.A. Mostafa et al., Examining multiple feature evaluation and classification methods for
improving the diagnosis of Parkinson’s disease. Cogn. Syst. Res. 54, 90–99 (2019)
10. C.H. Park, S.B. Kim, Sequential random k-nearest neighbor feature selection for high-
dimensional data. Expert Syst. Appl. 42(5), 2336–2342 (2015)
Prediction of Cardiac Diseases Using
Machine Learning Algorithms
Abstract Coronary illness and heart strokes are among the most common diseases
across the world. To reduce their impact, heart strokes need to be predicted. Many
machine learning techniques already exist in the medical field, and machine learning
algorithms can be used to predict coronary illness or heart strokes. In this paper,
various machine learning algorithms are applied to a dataset to predict heart strokes.
The dataset used is the Cleveland dataset, which contains 14 attributes representing
medical parameters of patients; a few attributes are results obtained from medical
tests. Algorithms such as decision tree, logistic regression, random forest, MLP, and
bagging are applied to the dataset to predict heart strokes. Implementation is done
by dividing the dataset into training and testing sets and applying the algorithms to
find their prediction accuracies, using RStudio and WEKA. The same implementation
is also applied after eliminating a few attributes from the dataset. The idea behind
reducing (eliminating) attributes is that some attributes are results of medical tests
done on the patients, and medical tests can be costly; if the same accuracy can be
obtained with fewer attributes, prediction can be done with a smaller number of
medical tests. The implementation and analysis show that logistic regression
provides the highest accuracy, followed by decision tree.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 11
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_2
12 J. Suneetha et al.
used in predicting heart strokes or heart attacks. Heart stroke, or coronary illness,
is a major and commonly occurring disease. It can affect a person of any age
between 35 and 75, with the elderly affected most, and its death rate is very
high [4, 5].
To reduce the death rate, heart strokes must be predicted beforehand; prediction
is the main way to control the death rate and stop strokes from occurring. Prediction
can be done using machine learning, whose models and techniques are used for
analyzing data and making predictions. Many machine learning algorithms can
make such predictions, but an effective one must be chosen, i.e., the algorithm that
provides the best accuracy. This is the thought process behind choosing the title
"heart disease prediction using different machine learning algorithms."
Predictions are made using different algorithms, and their results are analyzed to
choose the best one: the algorithm giving the highest prediction accuracy is
considered the most effective. This work uses the Cleveland dataset from the UCI
repository. The Cleveland dataset contains 14 attributes, of which 13 are medical
parameters of patients and 1 is the end result, i.e., whether a heart stroke occurred
(1) or not (0). A few attributes are results obtained from medical tests carried out
on the patients. Thus, the Cleveland dataset is well suited for analyzing heart stroke
occurrences [7–9].
With the dataset in hand, the next step is prediction. To find the best algorithm, all
candidate algorithms are implemented on the dataset and their accuracies compared.
The main objective of this paper is therefore to predict heart strokes and to identify
an effective, suitable algorithm that provides the highest prediction accuracy among
those used. Once the best algorithm is found, the implementation is continued by
eliminating attributes from the dataset.
As mentioned, the Cleveland dataset contains the results of medical tests, which
can sometimes be expensive. If the same accuracy is obtained even after eliminating
a few attributes, the money spent on medical tests can be reduced. For example,
assume we get 85% accuracy with all 14 attributes and still get 85% accuracy after
eliminating a least significant attribute [10]; then some medical tests can be
eliminated or ignored, meaning predictions can be made with a smaller number of
medical tests. The objective here is to obtain high-accuracy results even after
eliminating some attributes.
2 Existing System
In Ref. [1], the authors provide a summary of current research on predicting
cardiovascular diseases, covering the types of heart diseases, the types of data
mining techniques, and the various data mining tools available for analyzing and
predicting the disease. They also suggest that more data cleaning and pruning could
further improve prediction accuracy [1]. The authors of [2] implemented the C4.5
and PCL approaches on rules from traditional data and proteomic profiling data;
in general, these are rule-based classifiers. Implementation was done on biomedical
data taken from the UCI repository. C4.5, a decision tree-based single classifier, is
the most preferred method but has two issues that affect its accuracy, the single
coverage constraint and the fragmentation problem. This weakness is overcome by
PCL, which is superior to bagging and boosting: PCL uses significant rules followed
by decision trees, which helps overcome the issues of C4.5 [2].
Neural systems have proven to be among the most well-known and fastest-advancing
parts of machine learning. In [3], a multilayer perceptron, a supervised neural
network algorithm, is used to predict the heart disease rate. It has three layers: an
input layer, an output layer, and hidden layers between them. General patient data
are collected, such as age, sex, blood pressure, diabetes, cholesterol, obesity, and
heart rate, gathered from devices and sensors such as Fitbit, AliveCor, and similar
healthcare devices. The authors note that the data in the Cleveland dataset are
results of expensive medical tests, so they considered their own generic parameters
and applied a multilayer perceptron [3].
In [4], the authors used logistic regression, which fits a logistic (S-shaped) curve.
The underlying linear model is y = b0 + b1x (the familiar y = mx + c, where c is
the intercept and m is the slope, which defines the steepness of the curve). Logistic
regression and decision tree are used to make predictions [4]. The authors of [5]
implemented naive Bayes and hidden naive Bayes (HNB) to predict heart strokes.
Hidden naive Bayes performs remarkably better than the traditional naive Bayes
algorithm, and they used it to provide an accurate model for cardiovascular disease;
with respect to attribute dependence, hidden naive Bayes is the more accurate
classifier. It is a structure-extension-based approach and needs more training time.
The proposed approach applies discretization and an IQR filter to increase HNB
efficiency; with the dependent attributes, they achieved 100% accuracy [5]. The
authors of [6] proposed the use of a data mining algorithm for identifying heart
disease with an accuracy of 52.33%, combining ECG-related attributes with the
clinical symptoms of the patients to detect the disease.
The algorithms used by this system are the naive Bayes algorithm, the decision list
algorithm, and the KNN algorithm [6]. In extreme learning machine (ELM)
techniques, a feedforward neural network is used for classification and regression;
the main advantage of ELM is that it is a fast learning algorithm without
re-iterations. The dataset used is the Cleveland dataset with 14 attributes. The
prediction model is designed so that the output falls into groups over the range 0–4:
instead of predicting heart strokes as 0 or 1, it provides a range of health conditions,
which is done to increase the accuracy. The accuracy obtained is 80% [7].
The authors of [8] proposed a scalable model that monitors heart disease using the
SPARK and Cassandra frameworks. The framework relies on a real-time
classification model for continuous tracking of patients and has two main objectives:
stream processing and visualization. It is not a predictive model; it continuously
monitors heart disease [8]. Learning vector quantization (LVQ) is a neural network
algorithm, a nearest-vector classifier. In [9], the algorithm is implemented for
different numbers of epochs and different numbers of neurons, using the
14-attribute dataset from the UCI repository. The predictions were based on the
accuracies obtained from the implementations, and the performance of the
algorithm with different numbers of epochs and different numbers of neurons is
calculated and compared.
The accuracy obtained by learning vector quantization is 85% [9]. In [10], two
classifiers are used: the naive Bayes classifier and the decision tree. The authors
used the WEKA tool for building the model and predicting heart strokes; WEKA
tools apply data mining techniques and machine learning algorithms and reduce
the complexity of writing code. A comparative analysis of the algorithms concluded
that the naive Bayes classifier is more accurate than the decision tree [10].
3 Proposed Work
This section discusses the flow of the implementation and the methodologies
followed. Before using the data, a few fundamental checks need to be done, such
as checking whether the dataset has missing values. If missing values are found,
they are filled in using the mean, median, or mode method. In the Cleveland dataset,
there are no missing values.
The block diagram in Fig. 1 shows the outline of the process.
The dataset here is the Cleveland dataset. Data preprocessing, which includes
cleaning the data, can be done before or after splitting. The dataset is divided into
two parts: a training dataset and a testing dataset. Once the data are split, the
algorithms are trained on the training data and then tested on the testing dataset;
the actual prediction accuracy is obtained on the testing data. The block diagram
gives only the outline of the process; Fig. 2 presents the flow diagram of the
implementation work in more detail.
From the flow diagram, it is clear that the data are split into training and testing
datasets and that preprocessing of the data is needed. Forward selection and
backward substitution are performed to obtain a subset of attributes, and then the
machine learning algorithms are applied.
Once training is done, the testing part provides the accuracy of the predictions. The
implementation is carried out on the Cleveland dataset. Algorithms such as decision
tree, logistic regression, bagging, random tree, multilayer perceptron, and random
forest are used to predict heart strokes. RStudio and WEKA are used to implement
the algorithms on the dataset: decision tree, bagging, and logistic regression are
implemented in RStudio, while random forest, random tree, and multilayer
perceptron are done using WEKA. The following sequence of steps is followed in
the implementation.
Step 1: Data preprocessing; missing values are handled.
Step 2: Divide the dataset into training and testing sets; this implementation uses
(60, 40), (70, 30), and (80, 20) as training:testing ratios.
Step 3: Apply each algorithm and obtain the prediction accuracy for the 14-attribute
dataset.
Step 4: Record the accuracy obtained for each model.
Step 5: Compare and analyze the accuracies and identify an efficient model for
predicting heart disease.
Step 6: Eliminate the least significant attributes, repeat the implementation, and
find the accuracies.
Step 7: Analyze both sets of accuracies and report the results.
First, any missing values in the dataset are handled; they can be replaced by the
mean, median, or mode of the particular attribute. In the Cleveland dataset, there
were no missing values. Second, the dataset is divided into training and testing
data. Generally, the training and testing ratios are 2/3 and 1/3, respectively; here,
three different ratios are used: 60:40, 70:30, and 80:20. All the algorithms are
implemented on these training and testing datasets.
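The chapter's implementation uses RStudio and WEKA; as an illustration only, the imputation and the three split ratios can be sketched in Python (the 303-record size is the commonly cited size of the Cleveland dataset; the seed is arbitrary):

```python
import random
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values;
    median or mode can be substituted the same way."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def split(rows, train_frac, seed=7):
    """Shuffle and split rows at the given training fraction."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

rows = list(range(303))  # stand-in for the Cleveland records
splits = {frac: split(rows, frac) for frac in (0.60, 0.70, 0.80)}
```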
The machine learning algorithms used are decision tree, random forest, logistic
regression, multilayer perceptron, and bagging. After implementing all the
algorithms on the full dataset and obtaining their accuracies, another dataset with
reduced attributes is created by eliminating the two attributes "ca" and "thal."
The same implementation is then applied to the reduced dataset to find its
accuracies. The reason for reducing the attributes is to check whether the same
accuracy can be obtained as on the full dataset.
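Dropping the two attributes before re-running the classifiers can be sketched as follows (a Python illustration; the chapter's actual runs were in RStudio and WEKA, and the record values here are made up):

```python
DROPPED = ("ca", "thal")  # the two attributes eliminated in the chapter

def reduce_attributes(records, dropped=DROPPED):
    """Return copies of the records without the dropped attributes."""
    return [{k: v for k, v in rec.items() if k not in dropped} for rec in records]

full = [{"age": 63, "ca": 0, "thal": 6, "target": 1}]
reduced = reduce_attributes(full)
print(sorted(reduced[0]))  # -> ['age', 'target']
```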
Decision tree is a supervised machine learning algorithm that can be used on both
numerical and categorical data; it ultimately gives solutions in categorical form,
i.e., 0/1 or TRUE/FALSE. The graphical representation of the decision tree is given
in Fig. 3.
The accuracies are calculated using the confusion matrix, which is generally used
to describe the performance of an algorithm; once the confusion matrix is obtained,
the accuracy can be calculated. The decision tree algorithm is applied for the three
training:testing ratios. For 60:40, the accuracy obtained is 75.806%; for 70:30, it is
82.002%; and for 80:20, it is 77.409%. For the reduced-attribute dataset, the
accuracies are 69% for 60:40, 77% for 70:30, and 74% for 80:20.
Logistic regression is a probability model. Unlike linear regression, logistic
regression can handle nonlinear (binary-outcome) data. It is built on the linear
predictor y = mx + c, where y is the output, m is the slope, x is the input, and c is
the intercept; this linear predictor is passed through the logistic function to produce
a probability, p = 1/(1 + e^(-(mx + c))). The confusion matrix for the logistic
regression is given below in Fig. 4.
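The logistic function that maps a linear predictor to a probability can be sketched as follows (the coefficients b0 and b1 here are illustrative, not fitted values from the chapter):

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, b0, b1):
    """Probability of the positive class under a one-feature logistic model."""
    return sigmoid(b0 + b1 * x)

p_mid = predict_proba(0.0, b0=0.0, b1=1.0)  # z = 0 gives probability 0.5
```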
The logistic regression accuracies are as follows. For the 60:40 training:testing
ratio, it is 85% accurate. Table 1 is the confusion matrix for the 70:30 ratio, which
gives the highest accuracy, 88.76404%. Finally, for the 80:20 ratio, the accuracy
obtained is 84%. For the reduced-attribute dataset, the accuracies are 79.03% for
60:40, 78.91% for 70:30, and 77.04% for 80:20.
Table 2 provides the results, i.e., the accuracies obtained by each algorithm for the
different training and testing ratios. From Table 2, logistic regression has the highest
accuracy at 88.76%, followed by the decision tree; the least accuracy is given by the
multilayer perceptron. The main objective of this paper is to predict heart strokes,
and logistic regression provides the highest accuracy, while random forest and
bagging provide similar accuracies of around 70–75%.
From Table 2, it is evident that logistic regression provides better accuracy than
the remaining algorithms, giving 85% or above across the three training and testing
splits. The accuracy provided by logistic regression for predicting heart strokes is
88.76%, followed by the decision tree with 85.02%; the least accuracy is given by
the multilayer perceptron, while bagging and random forest give accuracies between
70 and 75%.
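The comparison protocol described above (the same classifiers evaluated at 60:40, 70:30 and 80:20 splits) can be sketched with scikit-learn; the data here is synthetic, since the paper's heart-stroke dataset is not reproduced:

```python
# Train each classifier at the three train/test ratios and record
# accuracy; the dataset is synthetic, standing in for the heart data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
models = {"logistic regression": LogisticRegression(max_iter=1000),
          "decision tree": DecisionTreeClassifier(random_state=0)}
results = {}
for test_size in (0.4, 0.3, 0.2):  # 60:40, 70:30 and 80:20 splits
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=test_size,
                                          random_state=0)
    for name, model in models.items():
        pred = model.fit(Xtr, ytr).predict(Xte)
        results[(name, test_size)] = accuracy_score(yte, pred)

for key, acc in results.items():
    print(key, round(acc, 3))
```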
Thus, by comparing the accuracy percentages, we can conclude that logistic
regression gives the highest prediction accuracy. Apart from comparing the
algorithms by their resulting accuracies, one more comparative study has been done,
i.e., an analysis of the accuracies of the reduced-attribute dataset against the full-
attribute dataset. I eliminated two attributes and implemented the same algorithms,
expecting accuracies similar or close to those obtained before. The details of the
accuracies are given in Table 3.
Comparing the accuracies obtained from the complete and reduced datasets, the
complete dataset yields higher accuracies than the reduced-attribute dataset; looking
at Table 3, logistic regression again provides the highest
Table 2  Accuracies in predicting heart strokes (%)

Train and test ratios    60:40   70:30   80:20
Decision tree            69      77      74
Logistic regression      79.03   78.31   77.04
Multilayer perceptron    52.35   44.4    39.44
Bagging                  55.65   58.49   60.27
Random forest            67.39   71.38   74.95
accuracy. From this, we can say that among the various algorithms used, logistic
regression provides the greatest accuracy.
5 Conclusion
From the experiments and analysis done, the prediction of heart strokes can be
done effectively using logistic regression, which provided the highest accuracy of
88.76%. This conclusion is also supported by the analysis of the results obtained
from the implementations on the reduced-attribute dataset, where logistic regression
again provided the highest accuracy. Thus, this analysis supports the second
objective, which is providing an effective algorithm suited for predicting heart strokes.
References
1. M. Learning, Heart disease diagnosis and prediction using machine learning and data mining
techniques: a review. Adv. Comput. Sci. Technol. (2017)
2. J. Li, L. Wong, Using rules to analyse bio-medical data: a comparison between C4.5 and PCL
(Institute for Infocomm Research, Singapore, 2005)
3. A. Gavhane, G. Kokkula, I. Pandya, K. Devadkar, Prediction of heart disease using machine
learning, in Proceedings of the 2nd International Conference on Electronics, Communication
and Aerospace Technology (ICECA 2018)
4. M.P. Kiran, N.R. Deepak, Crop prediction based on influencing parameters for different states
in India—the data mining approach, in 2021 5th International Conference on Intelligent
Computing and Control Systems (ICICCS) (2021), pp. 1785–1791. https://doi.org/10.1109/
ICICCS51141.2021.9432247
5. K. D’cruz, C. Kumar, A.M. Kumar, M. Gawali, A. Shivashankar, in Prediction of Heart Disease
Using Machine Learning Techniques. CIS 490 Machine Learning University of Massachusetts
6. M.A. Jabbar, S. Samreen, Heart disease prediction system based on hidden Naïve Bayes classi-
fier, in International Conference on Circuits, Controls, Communications and Computing (Oct,
2016)
7. A. Rajkumar, G.S. Reena, Diagnosis of heart disease using data mining algorithm. Glob. J.
Comp. Sci. Technol. 10, 38–43 (2010)
8. S. Ismaeel, A. Miri, D. Chourishi, Using the extreme learning machine (ELM) technique for
heart disease diagnosis, in IEEE Canada International Humanitarian Technology Conference
2015 (May, 2015)
9. N. Thanuja, N.R. Deepak, A convenient machine learning model for cyber security, in 2021 5th
International Conference on Computing Methodologies and Communication (ICCMC) (2021),
pp. 284–290. https://doi.org/10.1109/ICCMC51019.2021.9418051
10. A. Ed-Daoudy, K. Maalmi, Real-time machine learning for early detection of heart disease
using big data approach, in International Conference on Wireless Technologies, Embedded
and Intelligent Systems (April, 2019)
A Comprehensive Approach
to Misinformation Analysis
and Detection of Low-Credibility News
1 Introduction
Social media has become the source of news for many Internet users today. It is easy
to access, cheap and always available. News spreads particularly fast on social media,
as its user base spans a large age group that is active regularly. One of the main
problems is that information on social media is not investigated or cross-verified
before being posted to the public, which leads to unsubstantiated rumours spreading
like wildfire. Many people are susceptible to perceiving news on social media as
authentic and reliable.
The more a person is exposed to a certain article or news, especially from reliable
sources, the more easily they are persuaded by it. Bots play a pivotal role in the
spread of misinformation on the Internet. They can post, tag and comment at very
high frequencies, allowing this fake news to spread with extensive exposure.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_3
M. M. Joshi et al.
Another reason bots are able to spread news is their capability to search and retrieve
unvalidated information. Bots tend to post very regularly under hot topics
using trending hashtags. Once the bots have introduced the fake information on the
Internet, it is then the people who begin reposting it, thus giving the information more
exposure. Fake news tends to be more controversial and eye-catching which helps
create the turmoil needed for its spread. Detection of fake news is a field of study that
is still in its rudimentary form due to the several limitations it faces. One of the reasons
detection of fake news is difficult is the pace at which fake news spreads. Before it
is even classified as fake news, the damage would probably be done. The design of
alleviation and intervention techniques for misinformation has received less attention
in social media research, mainly due to the obstacle of designing applicable user
behaviour models [1]. Detection of bots is also difficult as it requires the establishment
of certain user behavioural characteristics which allow pristine distinction between a
regular user and a bot. Bots do have certain characteristics that make them stand out
when compared to a regular user. A regular user would spend time setting up their
social media profile, while a bot generally has only the most basic information
filled out. Bots tend to post at very high frequencies compared to the average interval
between posts of a regular user. Our model aims to deter the genesis of fraudulent
news articles or tweets spread via social neural bots. Early detection of such bots will
enable us to avert the spread of such unverified information, reducing or possibly
eliminating the negative impacts triggered by it.
Our solution allows social media users to be more aware of the information they
read online and its authenticity. Our solution consists of three main components—a
Bot Detection Model, Tweet Classifier and News Article Classifier which employ
various machine learning techniques, like XGBoost Classifier, Passive Aggressive
Classifier, etc. Our models are ultimately integrated into a web application using
Flask.
In today’s day and age, a method to analyse and detect the spread of fake news is
crucial. In [1], there is deliberation about the use of unsupervised machine learning
techniques and methods to define user behavioural categories over behaviour dimen-
sions. However, supervised machine learning models can instead be trained on
already-labelled datasets, which is the approach our solution takes. Further, [2]
presents a detailed review of detecting false and misleading news on social media,
covering existing algorithms from a data mining perspective, fake news characteri-
zations based on psychological and social theories, and representative datasets.
Furthermore, in [3], it is observed during analysis that most of the fake information
found on social media was generated by bots; these results suggest that curbing
social bots could be an effective strategy for mitigating the spread of misinformation.
Although social media platforms like Twitter and Facebook are praised for their
potential to convey essential information, their power is widely misused to influ-
ence people for several reasons. Twitter bots are considered popular misinformation
spreaders. Many methods for detecting these neural bots have been suggested, all of
which process vast volumes of social media posts and make use of network struc-
ture, temporal dynamics and sentiment analysis. Writers of [5] address an approach
to detecting Twitter bots using classifiers that are trained to differentiate between
real and fake accounts. They aim to identify features that are easy to extract while
maintaining accuracy while focusing on language-agnostic features. Chavoshi et al.
[6] developed a correlation finder to identify correlated user accounts on social
media platforms like Twitter. The observations concluded that if the users are highly
synchronous in nature, then they are most likely bots. A deep bot detection model is
proposed in [7] to learn latent representations of social media users and then detect
social bots by modelling social activity and content information jointly. Paper [8]
proposes a behaviour enhanced deep model (BeDM) for bot detection. Using the
deep learning approach, BeDM fuses content information and behaviour informa-
tion. The authors of [9] propose a deep neural network based on the contextual long
short-term memory (LSTM) architecture that detects bots at the tweet level using
both content and metadata. They demonstrate that their architecture can achieve high
classification accuracy (AUC > 96%). Simple user-profile-based features like default
profile, geo enabled, followers count are features that the writers of [5] have used in
their study. Similarly, other features such as content-based features can be extracted
for further analysis, which we have implemented in our bot detection model.
Fake news articles are generated using Natural Language Processing (NLP) tech-
niques. This is called “neural fake news”. Since these methods are being used to
generate fake news, they can also be used to detect it and study the characteristics as
well. The research proposed in [10] suggests the use of different machine learning
techniques for profiling fake news spreaders through stylometry and lexical features.
Large Language Modelling is an NLP modelling approach that has a wide variety of
functionalities, a popular use case being the one where models learn how to predict
missing words or the next few words of a sentence. Zellers et al. [17] propose the
use of three large language models (Grover from AllenNLP, BERT and the GPT-2
detector) to predict whether a chunk of text was written by a neural bot. The studies
presented in these papers indicate the high precision and accuracy associated with
using Grover for detecting neural fake news. Along the same lines, [18] delineates
a comparison between the GPT-2 and Grover large language models. Their studies
describe the GROVER-based discriminator, a refined version of GROVER that comes
in three sizes: GROVER-Base, GROVER-Large and GROVER-Small. Their findings
show that the GROVER model is superior to the GPT-2 model, since it has a larger
dataset and can detect neural fake news written by various large language models.
The benchmark study in [19] evaluates traditional machine learning models on three
datasets: Liar, Fake or Real News and the Combined Corpus. The revelation made
by the authors was that, when sentiment and lexical features were used, SVM and
logistic regression models operated the best compared to the other traditional machine
learning models. The use of natural language processing, text analysis, web crawling
and machine learning models together further increases the accuracy of our solution
in detecting the authenticity of a given piece of news.
The studies that we have come across each rely on a single method; to improve
accuracy, we propose a hybrid and holistic approach that relies on several attributes
and works as a combination of the aforementioned techniques.
The second component is a tweet classifier, which uses features associated with
tweets, such as tweet length, number of retweets and number of hashtags, to classify
a given tweet as generated by a bot or not. The aforementioned models
are aimed at understanding the source of the tweet and verifying the originality of
the Twitter account. The third component is a fake news article classifier which is
trained using scraped data from fake and authentic news sources and RSS Feeds.
This machine learning model classifies a given article as fake or real, as shown
in Fig. 1.
The use of these three models together makes for a holistic solution dependent on
linguistic, profile-based and context-based features, which improve the accuracy of
our solution. This makes our solution reliable and robust.
In our initial analysis, it was evident that using simple features to classify information
is not sufficient, and a combination of various features is necessary. Every machine
learning application requires extensive data and feature engineering and analysis.
Feature importance analysis refers to techniques that assign a score to input features
based on how useful they are at predicting a target variable. This helps understand
the data and models better. Our methodology involves rigorous data analysis and
feature engineering to improve accuracy and precision.
The bot detection feature of the solution has the following components:
Twitter API: Twitter API is used to extract user-profile features and other impor-
tant details from Twitter by simply providing the required username. The infor-
mation collected is then converted into a format that can be easily fed to the
trained models. The Twitter API script serves the main purpose of returning this
information which serves as the input to the model.
Cresci 2017 Dataset: The dataset used to train this model is the Cresci 2017
dataset, which includes user and tweet information for genuine, traditional and
social-spambot Twitter accounts.
XGBoost: The XGBoost algorithm is an implementation of gradient-boosted
decision trees that yields high speed and high accuracy and performs strongly on
classification and regression problems. It is essentially a decision tree-based
machine learning ensemble.
Feature Extraction: A principal component analysis was used to determine the
features that contribute significantly to the prediction output; these features are
shown in Table 1.
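As a minimal sketch of such PCA-based feature screening (on synthetic data, with feature indices standing in for the real features of Table 1), one can inspect which inputs load most heavily on the leading principal component:

```python
# Give one synthetic feature a much larger variance and verify that it
# dominates the first principal component's loadings.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 0] *= 10.0  # feature 0 gets a much larger variance

pca = PCA(n_components=2).fit(X)
loadings = np.abs(pca.components_[0])  # |weights| of PC1
dominant = int(np.argmax(loadings))
print(dominant)  # the high-variance feature dominates PC1
```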
The tweet classification model also makes use of the Cresci 2017 dataset, from which
the features shown in Table 2 are extracted. Similar to the bot detection model, the
tweet classification model yields the highest accuracy with the XGBoost algorithm
in comparison to logistic regression and other classification algorithms.
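A hedged sketch of the gradient-boosted-tree approach follows, using scikit-learn's GradientBoostingClassifier as a stand-in for XGBoost (the xgboost package exposes a similar fit/predict interface) on synthetic data:

```python
# Gradient-boosted decision trees sketched with scikit-learn's
# GradientBoostingClassifier, a stand-in for XGBoost; data is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(n_estimators=100, random_state=0)
acc = clf.fit(Xtr, ytr).score(Xte, yte)
print(round(acc, 3))
```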
Dataset and Web Scraper: To train the machine learning model, the data is a
combination of the Fake and Real News dataset on Kaggle and the content scraped
from 10 news article sources like CNN, NDTV, InfoWars, etc. The Python packages
feedparser and newspaper are used to scrape articles from news sites and RSS feeds.
Passive Aggressive Classifier: The passive aggressive classifier is an online-
learning machine learning algorithm similar to perceptron models. Since the data
is continuously scraped from sites and updated on a regular basis, this model helps
deal with the large amounts of incoming data. It yields the highest accuracy of 92%
in comparison to other classifiers like SVM, Naïve Bayes, etc.
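The online-learning property described above can be sketched as follows; the batches are synthetic stand-ins for newly scraped articles, and the labelling rule is a toy assumption:

```python
# scikit-learn's PassiveAggressiveClassifier is updated incrementally
# with partial_fit as each new (synthetic) batch arrives.
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

rng = np.random.default_rng(0)
clf = PassiveAggressiveClassifier(random_state=0)
classes = np.array([0, 1])  # e.g. fake vs. real

for _ in range(5):  # five simulated scraping rounds
    X = rng.normal(size=(40, 10))
    y = (X[:, 0] > 0).astype(int)  # toy labelling rule
    clf.partial_fit(X, y, classes=classes)

X_test = rng.normal(size=(100, 10))
y_test = (X_test[:, 0] > 0).astype(int)
acc = clf.score(X_test, y_test)
print(round(acc, 3))
```

Here `partial_fit` keeps the model current as new data arrives, without retraining from scratch on every scrape.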
For the web interface, a server is hosted through Flask, on which the project runs.
Flask is a popular Python web framework used for developing web applications. The
machine learning models are integrated with the UI for a seamless user experience.
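A minimal sketch of this integration is shown below; the `classify()` function is a placeholder for the trained models, not the authors' actual pipeline:

```python
# Minimal sketch of serving a classifier behind Flask.
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify(text):
    # stand-in for the trained tweet / article classifiers
    return {"label": "real" if text else "unknown"}

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json(force=True)
    return jsonify(classify(data.get("text", "")))

if __name__ == "__main__":
    app.run()  # serves the prediction endpoint locally
```

In practice, the trained models would be loaded once at startup and invoked inside `classify()`.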
After extensive analysis, implementation and integration of all the models
discussed, a working model was developed which is capable of producing highly
accurate results.
5 Results
The graph of average article sentiment in Fig. 2 shows the sentiment analysis
results, in which we can see a difference between the sentiments of fake and real
news. The sentiment in fake news articles sways between positive and negative,
implying that it is intended to influence people's perceptions. In Fig. 3, we can see
that the sentiment in the headlines of fake news articles has more variance, as most
people usually just read the headlines rather than the whole news article. We can
also see that the number of sentimentally neutral articles is higher among real news
articles than among fake news articles.
For the Twitter Bot Detection model, we implemented the XGBoost Machine
Learning algorithm, which gave us an accuracy of 95.14%. The accuracy of the
Tweet Classification Model is 75.92% using the XGBoost classifier. The Passive
Aggressive classifier yields the highest accuracy for fake news classification compared
to the other models such as Linear Regression, Support Vector Machines, Naïve
Bayes and Random Forest, which yielded an average accuracy of 80%. Through
this application, we can understand how various models can be tuned to higher
accuracies, and with further visualizations and graphical representations, we can
better understand the nuances behind the spread of misinformation.
6 Conclusion
The spread of misinformation has become a ubiquitous concern that has increased
the demand for smart detection systems to identify and analyse fake news and its
sources. With easy access to social media, the need for such systems is much higher
than it was a decade ago. Low-credibility news detection is a field of study still in
its primitive stage due to several limitations. Our extensive research study shows
that the authenticity of news articles can be tracked, traced, and validated against a
set of reliable sources by combining Artificial Intelligence, Machine Learning and
Data Mining techniques. The identification of fake news and Twitter bots depends
significantly on Natural Language Processing. Supervised machine learning models,
along with large language models, can be used to classify users as bots. Since
the classifiers depend on the training data, one possible challenge would be that
different classifiers would be required for articles of different lengths. Our solution
relies on linguistic, context-based, user-profile-based and social features to detect and
analyse fake news and misinformation, making it a holistic approach to detecting fake
news. Our solution shows that by employing extensive text analysis, natural language
processing and machine learning techniques, the spread of misinformation can be
detected early and stopped before the damage is done.
References
1. Z. Rajabi, A. Shehu, H. Purohit, User behavior modelling for fake information mitigation on
social web, in Social, Cultural, and Behavioral Modeling, ed. by R. Thomson, H. Bisgin, C.
Dancy, A. Hyder (SBP-BRiMS, 2019)
2. K. Shu, A. Sliva, S. Wang, J. Tang, H. Liu, Fake news detection on social media: a data mining
perspective. SIGKDD Explor. Newsl. 19(1), 22–36 (2017)
3. C. Shao, G.L. Ciampaglia, O. Varol et al., The spread of low-credibility content by social bots.
Nat. Commun. 9, 4787 (2018)
4. L. Tian, X. Zhang, Y. Wang, H. Liu, Early detection of rumours on twitter via stance transfer
learning, in Advances in Information Retrieval, ed. by J. Jose et al., ECIR 2020, Lecture Notes
in Computer Science, vol. 12035 (Springer, Cham, 2020)
5. J. Knauth, Language-agnostic twitter-bot detection, in Proceedings of the International
Conference on Recent Advances in Natural Language Processing (RANLP 2019)
6. N. Chavoshi, H. Hamooni, A. Mueen, Debot: twitter bot detection via warped correlation, in
Icdm (2016)
7. C. Cai, L. Li, D. Zeng, Detecting social bots by jointly modeling deep behavior and content
information, in Proceedings of the 2017 ACM on Conference on Information and Knowledge
Management (2017)
8. C. Cai, L. Li, D. Zengi, Behavior enhanced deep bot detection in social media, in 2017 IEEE
International Conference on Intelligence and Security Informatics (ISI)
9. S. Kudugunta, E. Ferrara, Deep neural networks for bot detection. Inform. Sci. 67, 312–322
(2018)
10. R. Manna, A. Pascucci, J. Monti, Profiling fake news spreaders through stylometry and lexical
features. UniOR NLP @PAN2020 Notebook for PAN at CLEF 2020
11. V.L. Rubin, N.J. Conroy, Y. Chen, S. Cornwell, Fake news or truth? using satirical cues to detect
potentially misleading news (Language and Information Technology Research Lab (LIT.RL),
Faculty of Information and Media Studies, University of Western Ontario, London, Ontario,
Canada)
12. A. Aggarwal, A. Chauhan, D. Kumar, M. Mittal, S. Verma, Classification of fake news by
fine-tuning deep bidirectional transformers based language model, 163973
13. H. Rashkin, E. Choi, J. Jang, S. Volkova, Y. Choi, Truth of varying shades: analyzing language
in fake news and political fact-checking (2017)
14. M.D. Ibrishimova, K. Li, A machine learning approach to fake news detection using knowledge
verification and natural language processing, in INCoS
15. M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy, Text document classification based on a
least square support vector machines with singular value decomposition. Int. J. Comput Appl.
27(7), 21–26 (2011)
16. M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy, S.C. Sapathy, A survey of cross-domain
text categorization techniques, in International Conference on Recent Advances in Information
Technology RAIT-2012, ISM-Dhanabad, IEEE Xplorer Proceedings (2012). 978-1-4577-0697-
4/12
17. R. Zellers, A. Holtzman, H. Rashkin, Y. Bisk, A. Farhadi, F. Roesner, Y. Choi, Defending
against neural fake news (2019). arXiv:1905.12616
18. W. Zhong, D. Tang, Z. Xu, R. Wang, N. Duan, M. Zhou, J. Wang, J. Yin, Neural deep fake
detection with factual structure of text (2020). arXiv:2010.07475
19. J.Y. Khan, Md.T.I. Khondaker, A. Iqbal, S. Afroz, A benchmark study on machine learning
methods for fake news detection (2019)
Evaluation of Machine Learning
Algorithms
for Electroencephalography-Based
Epileptic Seizure State Recognition
Abstract Epileptic seizures are caused by abnormal brain activities in which a person
with epilepsy exhibits unusual behaviour, sensations and sometimes loss of aware-
ness. Recognition of seizure states could aid in predicting epileptic seizures and
providing better treatment. The electroencephalogram (EEG) is the generally used
technique to record the electrical activity of the brain. EEG can be used to predict
epileptic seizures by identifying the preictal state of the EEG signal. The work
presented here focuses on comparing the performance of traditional machine learning
algorithms with and without feature extraction methods for recognizing the state of
seizure.
Standard traditional machine learning algorithms, such as k-nearest neighbour, deci-
sion tree, Gaussian naive Bayes, multilayer perceptron, quadratic discriminant anal-
ysis, random forest and support vector machine have been used for the classification
of epileptic seizure states. Various performance evaluation parameters used for the
comparative analysis are: accuracy, sensitivity, specificity, precision, false positive
rate, F1-score, S1-score and area under ROC curve. The standard dataset of Bonn
University has been used to perform the experimentation. The work proves that
feature extraction approaches improve the performance of machine learning classi-
fiers in EEG-based epileptic seizure state recognition problems. Random forest and
Gaussian naïve Bayes outperform all other classifiers considering binary and ternary
classification approaches.
V. Patel (B)
Department of Computer Engineering, C. G. Patel Institute of Technology, UTU, Bardoli, Gujarat,
India
e-mail: vibha.patel@utu.ac.in
J. Tailor
Shrimad Rajchandra Institute of Management and Computer Application (SRIMCA), UTU,
Bardoli, Gujarat, India
e-mail: jaishree.tailor@utu.ac.in
A. Ganatra
Department of Computer Engineering, DEPSTAR, Charotar University of Science and
Technology(CHARUSAT), Anand, Gujarat, India
e-mail: amitganatra.ce@charusat.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 35
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_4
1 Introduction
Epileptic seizure state recognition problem focuses on identifying the state of the
seizures based on the electroencephalography (EEG) data of the person with epilepsy.
EEG data can be collected by either invasive method, called iEEG (intracranial EEG)
or non-invasive method, called sEEG (scalp EEG). There are four states of seizure in
EEG: interictal, preictal, ictal and postictal [1–4]. Recognition of these states leads
to the applications called epileptic seizure detection and epileptic seizure prediction.
The distinction between these two approaches is elusive, and much overlapping
literature has been reported in the analysis of seizure detection and seizure prediction
algorithms. The difference between epileptic seizure detection and
prediction is described as follows:
Seizure Detection: It is the problem of detecting the ‘ictal’ state amongst all, which
is used by practitioners to identify the presence of seizure in the recorded EEG
signals. This task is generally carried out manually by experts, and it is prone to
human errors. A. Sharmila et al. have used DWT-based feature extraction method
for seizure detection using linear and nonlinear classifiers [5]. Ali Shoeb et al. have
used support vector machine to detect onset of epileptic seizures on EEG recordings
in a patient-specific approach [6]. Xiashuang Wang et al. have used novel random
forest model combined with grid search optimization for epileptic EEG detection [7].
Cristian Donos et al. have used random forest classifier to provide a seizure detection
algorithm that can be used for an implantable closed-loop stimulation device [8]. Md
Mursalin et al. have presented a novel analysis method for detecting epileptic seizure
from EEG signal using improved correlation-based feature selection method with
random forest classifier [9]. Mohammad Khubeb Siddiqui has presented a review
of machine learning classifiers for epileptic seizure detection [10].
Seizure Prediction: The problem of seizure prediction can be simplified as detecting
the ‘preictal’ state amongst all. It is clinically proven that there are early signs of
seizures before it actually occurs, which can be differentiated in EEG signals as
preictal state. If the preictal state is identified a sufficient amount of time in advance,
it can act as an alarm for the person with epilepsy or their caregivers. Han-Tai
Shiao et al. have used SVM-based seizure prediction system that achieves robust
prediction of preictal and interictal iEEG segments from dogs with epilepsy [11].
Theoden Netoff et al. have proposed a patient-specific classification algorithm based
on support vector machine to distinguish preictal and interictal features extracted
from EEG recordings [12]. Yanli Yang et al. have proposed support vector machine-
based classifier that used permutation entropy for epileptic seizure prediction [13].
Piotr W. Mirowski has compared L1-regularized logistic regression, convolutional
networks and support vector machine for epileptic seizure prediction from iEEG [14].
Khansa Rasheed et al. have presented a vast review on machine learning approaches
for predicting epileptic seizures using EEG signals [15].
Though ample work has been done in the field of epileptic seizure state detection
and prediction, there is no clinical applicability to date. This is because of the
requirement of high sensitivity with a very low false positive rate in highly
imbalanced and noisy data. Also, in this era of high computing power and cloud-
based resource availability, research is more inclined towards deep learning-based
approaches. This work contributes in the direction of evaluating the performance
of classical machine learning approaches. Experimentation was conducted to test
the following hypothesis: traditional machine learning algorithms work best when,
first, the dataset is small-scale and not noisy; second, balanced classes are considered;
and third, appropriate feature extraction methods are used before the classification
phase.
The following machine learning algorithms have frequently been used in the litera-
ture for the task of epileptic seizure state recognition: support vector machine,
random forest, Gaussian naïve Bayes, multilayer perceptron, k-nearest neighbour,
quadratic discriminant analysis and logistic regression. Fernandez-Delgado et al.
[16] evaluated 179 classifiers on 121 datasets representing the UCI database and
other real-world problems. They found random forest to be the best performer,
followed by SVM with Gaussian and polynomial kernels, neural networks and
boosting ensembles. This work evaluates the performance of eight machine learning
algorithms: k-nearest neighbour (KNN), decision tree classifier (DTC), Gaussian
naïve Bayes (GNB), multi-layer perceptron (MLP), quadratic discriminant anal-
ysis (QDA), support vector machine with Gaussian kernel (SVM-G), support vector
machine with polynomial kernel (SVM-P) and random forest (RF). These algorithms
were tested with various parameter values to record the best results on the specified
dataset.
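The parameter search described above can be sketched with scikit-learn's GridSearchCV; the classifier, grid and synthetic data below are illustrative assumptions, not the paper's actual settings:

```python
# Cross-validated grid search over an illustrative KNN parameter grid.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
grid = GridSearchCV(KNeighborsClassifier(),
                    param_grid={"n_neighbors": [3, 5, 7]}, cv=3)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```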
3 Feature Extraction
EEG feature extraction plays a very important role [17] in the performance of clas-
sification for seizure state recognition. Feature extraction methods can be classi-
fied into four broad categories [18]: (1) time domain, (2) frequency domain, (3)
time–frequency domain and (4) nonlinear methods. Amjed S. Al-Fahoum et al. [19]
have mentioned in their study that each method has its own pros and cons which
makes it suitable for special type of applications. Frequency domain methods may
not provide quality performance for some EEG signals, whereas time–frequency
methods may not provide detailed information on EEG analysis as much as frequency
domain methods. Previous work on automated detection of normal and epileptic
classes uses the following features: nonlinear pre-processing filter, entropy measures,
time and frequency domain features, wavelet transform-based features, FFT-based
features, relative wavelet energy, genetic programming-based features and cross-
correlation and PSD [18]. Amongst these, time and frequency domain features, FFT-
based features and wavelet transform-based features are repeatedly used by different
researchers.
The process of selecting features for EEG analysis is arbitrary to a large extent,
since the researcher often must guess the importance of features for each task.
This carries the risk of using less useful or redundant features while ignoring the
most important ones [20]. Deep learning comes with the advantage of automated
feature extraction and selection: it is handled entirely by the model training process,
i.e. no handcrafted features are required to build the model. However, this comes with
the requirement of high computing power and larger training data. The following
section describes the handcrafted features used in this work for evaluating the
performance of the machine learning algorithms.
In total, five features were extracted from the EEG signals: detrended fluctuation
analysis (DFA), Petrosian fractal dimension (PFD), Higuchi fractal dimension (HFD),
singular value decomposition entropy (SVD entropy) and Fisher information.
Detrended Fluctuation Analysis (DFA): DFA is a popular method for analysing long-
range temporal correlations in time series from many different research areas, and
in particular in electrophysiological recordings [21]. DFA provides unique
insights into the functional organization of neuronal systems [19]. It is a
mathematically simple yet efficient method for investigating the power law of
long-range correlations in non-stationary time series. The process to compute DFA has been
adapted from [22].
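The DFA procedure can be sketched as follows; this is a minimal illustration of the standard steps (integrate the signal, detrend linearly within windows, fit the log–log slope), not the exact implementation adapted from [22].

```python
# Minimal sketch of the standard DFA procedure; window sizes are illustrative.
import numpy as np

def dfa(signal, window_sizes=(4, 8, 16, 32, 64)):
    """Return the DFA scaling exponent alpha of a 1-D signal."""
    x = np.asarray(signal, dtype=float)
    y = np.cumsum(x - x.mean())                  # integrated profile
    flucts = []
    for n in window_sizes:
        n_windows = len(y) // n
        residuals = []
        for i in range(n_windows):
            seg = y[i * n:(i + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)  # local linear fit
            residuals.append(np.mean((seg - trend) ** 2))
        flucts.append(np.sqrt(np.mean(residuals)))        # fluctuation F(n)
    # alpha is the slope of log F(n) versus log n
    return np.polyfit(np.log(window_sizes), np.log(flucts), 1)[0]
```

For uncorrelated white noise, alpha comes out close to 0.5; long-range correlated signals yield larger values.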
Petrosian Fractal Dimension (PFD): For a time series, PFD is defined as follows
[22],
PFD = log10(N) / [log10(N) + log10(N/(N + 0.4Nδ))] (1)

where N is the series length and Nδ is the number of sign changes in the signal
derivative.
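Eq. (1) translates directly into code; the helper below is a sketch, with Nδ counted as sign changes of the first difference (the discrete derivative).

```python
# Sketch of Eq. (1): N is the segment length, n_delta the number of
# sign changes in the first difference of the signal.
import numpy as np

def petrosian_fd(signal):
    x = np.asarray(signal, dtype=float)
    diff = np.diff(x)
    n_delta = int(np.sum(diff[1:] * diff[:-1] < 0))   # sign changes
    n = len(x)
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))
```

A smooth sine wave yields a PFD just above 1, while a noisy signal scores noticeably higher.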
Evaluation of Machine Learning Algorithms … 39
4 Dataset Description
The standard Bonn University dataset [28] has been used for the experimentation.
This dataset is commonly used for epileptic seizure state recognition problems.
It consists of five different categories (denoted A–E) of EEG record-
ings. Sets A and B contain scalp EEG recordings of healthy volunteers, whereas
sets C and D contain intracranial recordings of persons with epilepsy in seizure-free
intervals. Set E contains intracranial EEG recordings of persons with epilepsy during
seizure periods. Each set contains 100 single-channel EEG segments of 23.6 s duration.
Each segment consists of 4097 samples (sampling rate 173.61 Hz).
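A hypothetical loader for this layout: each Bonn set is distributed as 100 plain-text files, one sample value per line, 4097 samples per segment; the directory and file names below are assumptions, not part of the dataset specification.

```python
# Hypothetical loader for the Bonn EEG dataset described above.
import numpy as np
from pathlib import Path

def load_bonn_set(directory):
    """Stack every *.txt segment in `directory` into a (segments, 4097) array."""
    files = sorted(Path(directory).glob("*.txt"))
    return np.vstack([np.loadtxt(f) for f in files])

# e.g. normal (set A) vs. ictal (set E) for the NS vs. IS scenario:
# X = np.vstack([load_bonn_set("setA"), load_bonn_set("setE")])
# y = np.array([0] * 100 + [1] * 100)     # 0 = normal, 1 = ictal
```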
5 System Design
Two types of model were tested in this work: classification algorithms with
handcrafted features and classification algorithms without handcrafted features.
Figure 1 shows the training phase, one variant extracting handcrafted
features and another without feature extraction. Figure 2 shows the predic-
tion phase of the models. The following sections further elaborate on the
experimentation details and the performance evaluation parameters considered.
5.1 Experimentation
The task of epileptic seizure state recognition can be defined as the classification of
the seizure states of the dataset under consideration into normal, ictal or interictal.
Four scenarios have been considered for evaluating seizure state recognition: (i)
normal state versus ictal state (NS vs. IS), (ii) normal state versus interictal state (NS
vs. IIS), (iii) interictal state versus ictal state (IIS vs. IS) and (iv) normal state versus
interictal state versus ictal state (NS vs. IIS vs. IS). Table 1 shows details of the same.
In total, 19 experiments have been carried out for each of the eight algorithms.
To reduce overfitting in the traditional machine learning algorithms, k-fold cross-
validation was followed. Parameters were manually tuned for the best performance,
including the maximum depth of the decision tree classifier, the value of k for
k-nearest neighbour and the number of layers in the multilayer perceptron.
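The tuning described above can be sketched with stratified k-fold cross-validation and a grid search; the grid values and synthetic data stand in for the paper's exact settings and EEG features.

```python
# Illustrative parameter tuning with stratified k-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(KNeighborsClassifier(),
                      param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
                      cv=cv, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```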
Accuracy = (TP + TN)/(TP + TN + FP + FN) = (TP + TN)/(P + N) (2)

Sensitivity = TP/(TP + FN) = TP/P (3)

Specificity (SP) = TN/(TN + FP) = TN/N (4)

Precision = TP/(TP + FP) (5)

Recall = TP/(TP + FN) (6)

False Positive Rate = FP/(TN + FP) = 1 − SP (7)

F1 = 2 × (Precision × Recall)/(Precision + Recall) (8)

S1 = 2 × (Sensitivity × Specificity)/(Sensitivity + Specificity) (9)
These equations are derived from the confusion matrix of the classification,
where TP is true positive, TN is true negative, FP is false positive, and FN is false
negative. TP and TN indicate the numbers of correct positive and negative predictions,
respectively. FP and FN indicate the numbers of incorrect predictions for negative and
positive cases, respectively. It is important to note that sensitivity is also known as
recall or true positive rate (TPR), and specificity is called true negative rate (TNR).
Precision is also called positive predictive value (PPV). The F1-score is the harmonic
mean of precision and recall, whereas the S1-score is the harmonic mean of sensitivity
and specificity. The area under the curve (AUC) quantifies the area covered by the
ROC curve. An ideal classifier exhibits an AUC of 1.0, which is not realistic to achieve;
however, an AUC in the range 0.6–0.9 is considered the performance of a good
classifier. A random classifier would exhibit an AUC of 0.5. Table 2 shows
the best and worst values of the various parameters considered here for
comparing the classifiers.
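The confusion-matrix metrics defined above can be computed in a few lines; the counts in the example call are illustrative.

```python
# Confusion-matrix metrics computed from the four counts.
def confusion_metrics(tp, tn, fp, fn):
    p, n = tp + fn, tn + fp                 # positives and negatives
    sens = tp / p                           # sensitivity = recall = TPR
    spec = tn / n                           # specificity = TNR
    prec = tp / (tp + fp)                   # precision = PPV
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": sens,
        "specificity": spec,
        "precision": prec,
        "fpr": 1 - spec,                    # false positive rate
        "f1": 2 * prec * sens / (prec + sens),
        "s1": 2 * sens * spec / (sens + spec),
    }

m = confusion_metrics(tp=90, tn=80, fp=20, fn=10)
print(m)
```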
6 Results
The performance of the various classifiers can be summarized as follows: the Gaus-
sian naïve Bayes and random forest algorithms outperform the other traditional machine
learning algorithms, namely k-nearest neighbour, decision tree classifier, multilayer
perceptron, quadratic discriminant analysis and support vector machines. Also, the
decision tree algorithm improves drastically when task-specific feature extraction
is applied to the input dataset. This supports the hypothesis that the
traditional machine learning algorithms work best when: first, the dataset is small-scale
and not noisy; second, balanced classes are considered; and third, appropriate
feature extraction methods are used before the classification phase. This work
also shows that traditional machine learning algorithms are more efficient
for problems satisfying the aforesaid conditions. Evaluating larger datasets with
higher dimensionality and noise is future work that could further test this
conclusion.
References
1. P. Bashivan, I. Rish, M. Yeasin, N. Codella, Learning representations from EEG with deep
recurrent-convolutional neural networks (2015)
2. X. Wei, L. Zhou, Z. Chen, L. Zhang, Y. Zhou, Automatic seizure detection using three-
dimensional CNN based on multi-channel EEG. BMC Med. Inform. Decis. Mak. 18 (2018).
https://doi.org/10.1186/s12911-018-0693-8
3. S.M. Usman, M. Usman, S. Fong, Epileptic seizures prediction using machine learning
methods. Comput. Math. Methods Med. 2017 (2017). https://doi.org/10.1155/2017/9074759
4. B. Świderski, S. Osowski, A. Cichocki, A. Rysz, Epileptic seizure prediction using Lyapunov
exponents and support vector machine. Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007). https://
doi.org/10.1007/978-3-540-71629-7_42
5. A. Sharmila, P. Geethanjali, DWT based detection of epileptic seizure from EEG signals using
Naive Bayes and k-NN classifiers. IEEE Access 4, 7716–7727 (2016). https://doi.org/10.1109/
ACCESS.2016.2585661
6. A. Shoeb, J. Guttag, Application of machine learning to epileptic seizure detection, in ICML
2010—Proceedings, 27th International Conference on Machine Learning (2010)
7. X. Wang, G. Gong, N. Li, S. Qiu, Detection analysis of epileptic EEG using a novel random
forest model combined with grid search optimization. Front. Hum. Neurosci. 13 (2019). https://
doi.org/10.3389/fnhum.2019.00052
8. C. Donos, M. Dümpelmann, A. Schulze-Bonhage, Early seizure detection algorithm based on
intracranial EEG and random forest classification. Int. J. Neural Syst. 25 (2015). https://doi.
org/10.1142/S0129065715500239
9. M. Mursalin, Y. Zhang, Y. Chen, N.V. Chawla, Automated epileptic seizure detection using
improved correlation-based feature selection with random forest classifier. Neurocomputing
241 (2017). https://doi.org/10.1016/j.neucom.2017.02.053
10. M.K. Siddiqui, R. Morales-Menendez, X. Huang, N. Hussain, A review of epileptic seizure
detection using machine learning classifiers. Brain Inf. (2020). https://doi.org/10.1186/s40708-
020-00105-1
11. H.T. Shiao, V. Cherkassky, J. Lee, B. Veber, E.E. Patterson, B.H. Brinkmann, G.A. Worrell,
SVM-based system for prediction of epileptic seizures from iEEG signal. IEEE Trans. Biomed.
Eng. 64 (2017). https://doi.org/10.1109/TBME.2016.2586475
12. Y. Park, L. Luo, K.K. Parhi, T. Netoff, Seizure prediction with spectral power of EEG using
cost-sensitive support vector machines. Epilepsia 52 (2011). https://doi.org/10.1111/j.1528-
1167.2011.03138.x
13. Y. Yang, M. Zhou, Y. Niu, C. Li, R. Cao, B. Wang, P. Yan, Y. Ma, J. Xiang, Epileptic seizure
prediction based on permutation entropy. Front. Comput. Neurosci. 12 (2018). https://doi.org/
10.3389/fncom.2018.00055
14. P.W. Mirowski, Y. LeCun, D. Madhavan, R. Kuzniecky, Comparing SVM and convolutional
networks for epileptic seizure prediction from intracranial EEG, in Proceedings of the 2008
IEEE Workshop on Machine Learning for Signal Processing, MLSP 2008 (2008). https://doi.
org/10.1109/MLSP.2008.4685487
15. K. Rasheed, A. Qayyum, J. Qadir, S. Sivathamboo, P. Kawn, L. Kuhlmann, T. O’Brien, A.
Razi, Machine learning for predicting epileptic seizures using EEG signals: a review. IEEE
Rev. Biomed. Eng. (2020). https://doi.org/10.1109/RBME.2020.3008792
16. M. Fernández-Delgado, E. Cernadas, S. Barro, D. Amorim, Do we need hundreds of classifiers
to solve real world classification problems? J. Mach. Learn. Res. 15, 3133–3181 (2014)
17. M.A. Rahman, W. Ma, D. Tran, J. Campbell, A comprehensive survey of the feature extraction
methods in the EEG research. Lecture Notes in Computer Science (including subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 7440 LNCS, pp. 274–283
(2012). https://doi.org/10.1007/978-3-642-33065-0_29.
18. U.R. Acharya, S. Vinitha Sree, G. Swapna, R.J. Martis, J.S. Suri, Automated EEG analysis of
epilepsy: a review. Knowledge-Based Syst. 45, 147–165 (2013). https://doi.org/10.1016/j.kno
sys.2013.02.014
19. A.S. Al-Fahoum, A.A. Al-Fraihat, Methods of EEG signal features extraction using linear
analysis in frequency and time-frequency domains. ISRN Neurosci. 2014, 1–7 (2014). https://
doi.org/10.1155/2014/730218
20. M.A. Mazurowski, M. Buda, A. Saha, M.R. Bashir, Deep learning in radiology: an overview of
the concepts and a survey of the state of the art with focus on MRI. J. Magn. Reson. Imaging.
49, 939–954 (2019). https://doi.org/10.1002/jmri.26534
21. G. Nolte, M. Aburidi, A.K. Engel, Robust calculation of slopes in detrended fluctuation analysis
and its application to envelopes of human alpha rhythms. Sci. Rep. 9, 1–16 (2019). https://doi.
org/10.1038/s41598-019-42732-7
22. F.S. Bao, X. Liu, C. Zhang, PyEEG: an open source python module for EEG/MEG feature
extraction. Comput. Intell. Neurosci. 2011 (2011). https://doi.org/10.1155/2011/406391
23. T.Q.D. Khoa, V.Q. Ha, V. Van Toi, Higuchi fractal properties of onset epilepsy elec-
troencephalogram. Comput. Math. Methods Med. 2012 (2012). https://doi.org/10.1155/2012/
461426
24. M. Čukić, M. Stokić, S. Simić, D. Pokrajac, The successful discrimination of depression from
EEG could be attributed to proper feature extraction and not to a particular classification
method. Cogn. Neurodyn. 14 (2020). https://doi.org/10.1007/s11571-020-09581-x
25. P. Boonyakitanont, A. Lek-uthai, K. Chomtho, J. Songsiri, A review of feature extraction and
performance evaluation in epileptic seizure detection using EEG. Biomed. Signal Process.
Control (2020). https://doi.org/10.1016/j.bspc.2019.101702
26. Y. Zhang, S. Yang, Y. Liu, Y. Zhang, B. Han, F. Zhou, Integration of 24 feature types to accurately
detect and predict seizures using scalp EEG signals. Sensors (Switzerland) 18 (2018). https://
doi.org/10.3390/s18051372
27. R.A. Fisher, Theory of statistical estimation. Math. Proc. Cambridge Philos. Soc. 22 (1925).
https://doi.org/10.1017/S0305004100009580
28. R.G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, C.E. Elger, Indications
of nonlinear deterministic and finite-dimensional structures in time series of brain electrical
activity: dependence on recording region and brain state. Phys. Rev. E 64, 061907 (2001).
https://doi.org/10.1103/PhysRevE.64.061907
29. D.M.W. Powers, Evaluation: from precision, recall and F-measure to ROC, informedness,
markedness & correlation. J. Mach. Learn. Technol. 2 (2011)
30. N. Moghim, D.W. Corne, Predicting epileptic seizures in advance. PLoS One 9 (2014). https://
doi.org/10.1371/journal.pone.0099334
31. U.R. Acharya, S.L. Oh, Y. Hagiwara, J.H. Tan, H. Adeli, Deep convolutional neural network
for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med.
100, 270–278 (2018). https://doi.org/10.1016/j.compbiomed.2017.09.017
Lung Disease Detection
and Classification from Chest X-Ray
Images Using Adaptive Segmentation
and Deep Learning
Abstract Detection and classification of lung diseases using recent method-
ologies have become an important research problem for smart computer-aided diag-
nosis (CAD) tools. The emergence of deep learning brings automation across the
different domains to address the concerns related to manual techniques. The chest
X-ray image remains one of the effective tools for detecting lung diseases such as pneu-
monia. This paper presents a framework for pneumonia detection and classification
from raw X-ray images. The proposed framework consists of image prepro-
cessing, adaptive segmentation, feature extraction, and automatic disease detection.
Raw X-ray images are preprocessed by applying a lightweight and effective filtering
algorithm. The region of interest in the preprocessed image is located
using the adaptive segmentation algorithm. We propose a dynamic threshold
mechanism followed by morphological operations for adaptive segmentation. A
hybrid feature vector has been implemented using visual, texture, shape, and inten-
sity features. For disease detection and classification, the hybrid features are normal-
ized using robust normalization, and then an automatic deep learning classifier model,
a recurrent neural network (RNN) with long short-term memory (LSTM), is designed.
The simulation results show that the proposed model outperformed state-of-the-art
similar methods.
1 Introduction
Lung diseases such as lung cancer, pneumonia, and the recent novel COVID-19
[1–3] have become a major threat to human beings. Compared to lung cancer, pneu-
monia is caused by a wider variety of factors, including COVID-19, and leads to a
significant mortality rate due to its infectious behavior. The detection of lung cancer is performed
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 49
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_5
50 S. Goyal and R. Singh
using techniques like X-ray, magnetic resonance imaging (MRI), computed tomog-
raphy (CT), and isotope imaging. Among these, CT and chest X-ray imaging techniques are
frequently used for the detection of various lung diseases. The X-ray technique is cost-
effective, with outcomes similar to a CT scan. Hence, many doctors
recommend the chest X-ray for the analysis of lung diseases like pneumonia [4,
5]. The recent progress in using the Internet of Things (IoT) for smart healthcare
systems requires a smart disease monitoring system as well [6–11]. The goal of
this paper is to propose a framework for lung disease detection and classification for
accurate health assessment using effective methods of computer vision and soft
computing. The objectives of this framework are to enhance the accuracy of pneu-
monia detection and to minimize the detection time. These objectives of
robust and reliable lung disease detection and classification motivate the framework
proposed in this paper. The rest of the paper is organized as follows: Sect. 2 presents the
study of related works. Section 3 illustrates the proposed model. Section 4 demon-
strates the simulation results. Section 5 concludes the proposed work and summarizes
the major findings.
2 Related Works
This paper proposes the fusion RNN-LSTM (FRNN-LSTM) model for efficient and
robust lung disease detection and classification from input X-ray images.
3 Proposed Methodology
This section presents the methodology and design of the proposed framework for
lung disease detection and classification using chest X-ray images. As shown in
Fig. 1, the input chest X-ray image is first preprocessed for quality enhancement
by removing noise and improving low-contrast regions; preprocessing plays a very
significant role in this step. The next step is adaptive segmentation, which aims
to localize the region of interest according to the image structure. From the
segmented image, four types of features are extracted and fused into a hybrid
feature vector.
The input chest X-ray image I is preprocessed in the proposed FRNN-
LSTM model by applying intensity value adjustment, median filtering, and histogram
equalization. The first operation focuses on adjusting the image intensity values
of low-contrast X-ray images. This technique is mainly used to enhance the contrast
as:

I1 = imadjust(I) (1)
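The paper implements this step in MATLAB (e.g. imadjust); an equivalent NumPy-only sketch of the three operations follows, with illustrative parameter choices (percentile limits, 3×3 median window, 256 histogram bins) that are assumptions rather than the paper's settings.

```python
# NumPy-only sketch of the preprocessing pipeline: intensity adjustment,
# median filtering, and histogram equalization.
import numpy as np

def contrast_stretch(img, low_pct=1, high_pct=99):
    """Intensity adjustment: map the [p1, p99] intensity range onto [0, 1]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)

def median_filter3(img):
    """3x3 median filter built from shifted views (edges handled by padding)."""
    padded = np.pad(img, 1, mode="edge")
    stack = [padded[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(3) for j in range(3)]
    return np.median(np.stack(stack), axis=0)

def hist_equalize(img, bins=256):
    """Histogram equalization of an image with values in [0, 1]."""
    hist, edges = np.histogram(img, bins=bins, range=(0, 1))
    cdf = hist.cumsum() / img.size
    return np.interp(img.ravel(), edges[:-1], cdf).reshape(img.shape)

def preprocess(img):
    return hist_equalize(median_filter3(contrast_stretch(img)))
```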
The proposed adaptive segmentation method focuses on addressing problems such as
over-segmentation, inaccurate extraction, poor adaptability, and high computation time.
The segmentation algorithm has been designed using a region growing approach and
morphological operations, and is shown in Table 1. The segmentation method aims
at accurate ROI extraction with minimum computation time. As shown in Algorithm 1,
segmentation starts with edge detection, followed by dividing the image into N grids.
For each grid, we applied a dynamic thresholding approach to perform the segmentation.
Once all grids are segmented, they are replaced in the original image. The
post-processing is performed using morphological operations.
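The grid-wise dynamic thresholding step can be sketched as follows; the per-grid threshold rule (the block mean) and the grid count are illustrative assumptions, and the edge detection and morphological post-processing of Algorithm 1 are omitted.

```python
# Sketch of grid-wise dynamic thresholding: split the image into blocks,
# threshold each block against its own statistics, reassemble the mask.
import numpy as np

def grid_threshold_segment(img, n_grids=4):
    mask = np.zeros_like(img, dtype=bool)
    h, w = img.shape
    gh, gw = h // n_grids, w // n_grids
    for i in range(n_grids):
        for j in range(n_grids):
            block = img[i * gh:(i + 1) * gh, j * gw:(j + 1) * gw]
            thr = block.mean()            # dynamic, per-grid threshold
            mask[i * gh:(i + 1) * gh, j * gw:(j + 1) * gw] = block > thr
    return mask
```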
Lung Disease Detection and Classification … 53
In this work, we calculated four types of features from the segmented images, namely
visual, texture, intensity, and geometric invariant features, collected into the vectors
F1, F2, F3, and F4. The visual features are extracted using the histogram of oriented
gradients (HOG) descriptor. The texture features were extracted using the gray-level
co-occurrence matrix (GLCM) with four offsets. Furthermore, eight geometric invariant
features were extracted from the segmented image. From each segmented image, the
fused feature vector is formed as:
F = {F1, F2, F3, F4} (3)
The fused feature vector contains different kinds of features extracted from the ROI
image, which leads to significant variations among them. Features with a larger range
play a decisive role in the training process of machine learning algorithms; therefore,
feature normalization is required to enhance speed and accuracy. Normalization is
used to bound feature values within a fixed range such as 0 to 1. We applied the
min–max and robust normalization methods represented in Eqs. (4) and (5), respectively.
Fmin_max = (F − min(F)) / (max(F) − min(F)) (4)
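Eq. (4) in code, together with a common median/IQR formulation of robust normalization; the robust variant below is an assumption, since the paper's exact form of Eq. (5) is not reproduced here.

```python
# Min-max normalization per Eq. (4), and a median/IQR robust variant
# (assumed form, not necessarily the paper's Eq. (5)).
import numpy as np

def min_max_norm(f):
    f = np.asarray(f, dtype=float)
    return (f - f.min()) / (f.max() - f.min())          # Eq. (4)

def robust_norm(f):
    f = np.asarray(f, dtype=float)
    q1, med, q3 = np.percentile(f, [25, 50, 75])
    return (f - med) / (q3 - q1)                        # median/IQR scaling

features = np.array([2.0, 4.0, 6.0, 100.0])             # one outlier
print(min_max_norm(features))   # the outlier compresses the rest toward 0
print(robust_norm(features))    # the outlier has bounded effect on the scale
```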
For classification, we have designed a hybrid deep learning model for
early prediction of lung diseases using an RNN with LSTM. The RNN-LSTM model is
formed to overcome the vanishing gradient problem and to improve performance
compared to other soft computing methods.
To validate the effectiveness of the proposed lung disease detection and classifi-
cation model, we have implemented the FRNN-LSTM model in MATLAB along
with other soft computing techniques such as support vector machine (SVM), arti-
ficial neural network (ANN), K-nearest neighbor (KNN), and ensemble classifiers.
The performances of ANN, SVM, KNN, and the ensemble classifier have been
compared with the RNN-LSTM technique by dividing the entire dataset into a 70%
training and 30% testing ratio. We have collected the lung disease dataset from a
well-known public research repository [23]. The dataset, called the COVID-19 Radiog-
raphy Database (C19RD), is a collection of chest X-ray images from Qatar University.
The C19RD consists of 2905 samples in three classes: normal chest (1341),
COVID-19 pneumonia (219), and viral pneumonia (1345).
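The stated class counts and the 70/30 split can be reproduced schematically with a stratified split; labels and indices below stand in for the actual images.

```python
# Schematic 70/30 stratified split over the C19RD class counts stated above.
import numpy as np
from sklearn.model_selection import train_test_split

labels = np.array(["normal"] * 1341 + ["covid"] * 219 + ["viral"] * 1345)
idx = np.arange(len(labels))                 # 2905 sample indices

train_idx, test_idx = train_test_split(
    idx, test_size=0.30, stratify=labels, random_state=0)
print(len(train_idx), len(test_idx))         # roughly 70% / 30% of 2905
```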
This section presents the comparative results on the aforementioned dataset for the
soft computing techniques SVM, ANN, KNN, ensemble methods, and RNN-LSTM
with varying feature normalization methods. Figures 2, 3, and 4 demonstrate the
results for accuracy, precision, and recall, respectively. The results show the outcomes
of the soft computing and feature normalization methods using the same computer
vision approaches for image enhancement and ROI extraction. Among all the feature
normalization techniques, the proposed feature normalization method improved
accuracy, precision, and recall compared to raw features and the min–max normalization
technique. Raw features without any normalization led to poor classification
performance. The soft computing methods are also investigated in the above results
using the C19RD dataset. The RNN-LSTM-based deep learning model FRNN-LSTM has
shown enhanced lung disease classification performance compared to the other methods.
Fig. 2 Accuracy analysis for a Features normalization analysis and b Soft computing techniques
analysis
Fig. 3 Precision analysis for a Features normalization analysis and b Soft computing techniques
analysis
Fig. 4 Recall analysis for a Features normalization analysis and b Soft computing techniques
analysis
Using the deep learning model RNN-LSTM, detection accuracy and F1-score have
been enhanced by approximately 4% compared to the second-best soft computing
technique in Figs. 2, 3 and 4.
We have implemented and evaluated the existing methods COVIDetec-
tioNet [15], CNN using ResNet23 (CNN-RN) [16], Se-ResNeXt-50 [17], and CNN using an
ensemble approach (CNN-E) [18]. All techniques were implemented on an Intel i5
processor with 4 GB RAM. The comparative analysis is presented using two
metrics: average detection accuracy and average processing time. Table 2 demon-
strates the comparative study of the proposed FRNN-LSTM model with the existing
methods. All methods were implemented with common hyperparameters
such as the number of epochs (70), minimum batch size (27), gradient threshold (1),
and execution environment (CPU). The number of hidden units has been set
to 100 in the proposed RNN-LSTM model. Under this hyperparameter setting, we
obtained the best classification performances. We selected these methods
as being closely related to the proposed model of lung disease detection using chest
X-ray image datasets; additionally, these methods have claimed significant results in this
domain, as shown in Table 2.
Table 2 Comparative analysis of the proposed FRNN-LSTM method

Methods         | Detection accuracy (%) | Training and detection time (seconds)
COVIDetectioNet | 91.34                  | 1879
CNN-RN          | 92.67                  | 2873
ResNeXt-50      | 91.58                  | 2762
CNN-E           | 92.54                  | 2489
FRNN-LSTM       | 95.04                  | 1289
5 Conclusions
This paper presented a framework for lung disease detection and classification from
chest X-ray images. We focused on pneumonia, which can be caused by either
COVID-19 or bacterial/viral infections. The model has been designed using
preprocessing, adaptive segmentation, hybrid feature extraction and normalization,
and automatic classification. The design of each phase has been elaborated in this
paper, with the core goals of improving recognition accuracy and minimizing
processing time. The simulation results reveal that the proposed method outperforms
the existing techniques and is able to predict lung disease with better accuracy in
less time.
References
1. D.S. Smith, E.A. Richey, W.L. Brunetto, A Symptom-based rule for diagnosis of COVID-19.
SN Compr. Clin. Med. 2, 1947–1954 (2020). https://doi.org/10.1007/s42399-020-00603-7
2. E. Elibol, Otolaryngological symptoms in COVID-19. Eur. Arch. Otorhinolaryngol. (2020).
https://doi.org/10.1007/s00405-020-06319-7
3. E. Salepci, B. Turk, S.N. Ozcan et al., Symptomatology of COVID-19 from the otorhino-
laryngology perspective: a survey of 223 SARS-CoV-2 RNA-positive patients. Eur. Arch.
Otorhinolaryngol. (2020). https://doi.org/10.1007/s00405-020-06284-1
4. A. Khatri, R. Jain, H. Vashista, N. Mittal, P. Ranjan, R. Janardhanan, Pneumonia identification
in chest X-ray images using EMD, in Trends in Communication, Cloud, and Big Data, ed. by
H. Sarma, B. Bhuyan, S. Borah, N. Dutta. Lecture Notes in Networks and Systems, vol. 99
(Springer, Singapore, 2020). https://doi.org/10.1007/978-981-15-1624-5_9
5. L.A. Rousan, E. Elobeid, M. Karrar et al., Chest x-ray findings and temporal lung changes
in patients with COVID-19 pneumonia. BMC Pulm. Med. 20, 245 (2020). https://doi.org/10.
1186/s12890-020-01286-5
6. H.B. Mahajan, A. Badarla, A.A. Junnarkar, CL-IoT: cross-layer Internet of Things protocol for
intelligent manufacturing of smart farming. J. Ambient Intell. Human Comput. (2020). https://
doi.org/10.1007/s12652-020-02502-0
7. R. Patel, N. Sinha, K. Raj, D. Prasad, V. Nath, Smart healthcare system using IoT. in Nanoelec-
tronics, Circuits and Communication Systems, ed. by V. Nath, J. Mandal. NCCS 2018. Lecture
Notes in Electrical Engineering, vol. 642 (Springer, Singapore, 2020). https://doi.org/10.1007/
978-981-15-2854-5_15.
8. H.B. Mahajan, A. Badarla, Application of Internet of Things for smart precision farming:
solutions and challenges. Int. J. Adv. Sci. Technol. Dec. 2018, 37–45 (2018)
9. M.M. Islam, A. Rahaman, M.R. Islam, Development of smart healthcare monitoring system
in IoT environment. SN Comput. Sci. 1, 185 (2020). https://doi.org/10.1007/s42979-020-001
95-Y
10. H.B. Mahajan, A. Badarla, Experimental analysis of recent clustering algorithms for wireless
sensor network: application of iot based smart precision farming. J. Adv. Res. Dyn. Control
Syst. 11(9). https://doi.org/10.5373/JARDCS/V11I9/20193162
11. H.B. Mahajan, A. Badarla, Detecting HTTP vulnerabilities in IoT-based precision farming
connected with cloud environment using artificial intelligence. Int. J. Adv. Sci. Technol. 29(3),
214–226 (2020)
12. D. Dansana, R. Kumar, A. Bhattacharjee et al., Early diagnosis of COVID-19-affected patients
based on X-ray and computed tomography images using deep learning algorithm. Soft Comput.
(2020). https://doi.org/10.1007/s00500-020-05275-y
13. I.D. Apostolopoulos, T.A. Mpesiana, Covid-19: automatic detection from X-ray images
utilizing transfer learning with convolutional neural networks. Phys. Eng. Sci. Med. 43,
635–640 (2020). https://doi.org/10.1007/s13246-020-00865-4
14. T.D. Pham, Classification of COVID-19 chest X-rays with deep learning: new models or fine
tuning? Health Inf. Sci. Syst. 9, 2 (2021). https://doi.org/10.1007/s13755-020-00135-3
15. M. Turkoglu, COVIDetectioNet: COVID-19 diagnosis system based on X-ray images using
features selected from pre-learned deep features ensemble. Appl. Intell. (2020). https://doi.org/
10.1007/s10489-020-01888-w
16. C. Butt, J. Gill, D. Chun, B.A. Babu, Deep learning system to screen coronavirus disease 2019
pneumonia. Appl. Intell. (2020). https://doi.org/10.1007/s10489-020-01714-3
17. S. Hira, A. Bai, S. Hira, An automatic approach based on CNN architecture to detect Covid-19
disease from chest X-ray images. Appl. Intell. (2020). https://doi.org/10.1007/s10489-020-020
10-w
18. N. Gianchandani, A. Jaiswal, D. Singh et al., Rapid COVID-19 diagnosis using ensemble deep
transfer learning models from chest radiographic images. J. Ambient Intell. Human Comput.
(2020). https://doi.org/10.1007/s12652-020-02669-6
19. M. Nath, C. Choudhury, Automatic detection of pneumonia from chest X-rays using deep
learning, in Machine Learning, Image Processing, Network Security and Data Sciences, ed. by
A. Bhattacharjee, S. Borgohain, B. Soni, G. Verma, X.Z. Gao. MIND 2020. Communications
in Computer and Information Science, vol. 1240 (Springer, Singapore, 2020). https://doi.org/
10.1007/978-981-15-6315-7_14
20. M.F. Hashmi, S. Katiyar, A.G. Keskar, N.D. Bokde, Z.W. Geem, Efficient pneumonia detection
in chest Xray images using deep transfer learning. Diagnostics 10(6), 417 (2020). https://doi.
org/10.3390/diagnostics10060417
21. https://www.kaggle.com/
22. G. Himabindu, M. Ramakrishna Murty, et al., Classification of kidney lesions using bee swarm
optimization. Int. J. Eng. Technol. 7(2.33), 1046–1052 (2018)
23. G. Himabindu, M. Ramakrishna Murty, et al., Extraction of texture features and classification
of renal masses from kidney images. Int. J. Eng. Technol. 7(2.33), 1057–1063 (2018)
A Quantitative Analysis for Breast
Cancer Prediction Using Artificial Neural
Network and Support Vector Machine
Abstract Medical data is increasing rapidly day by day. The number of patients
with different diseases is rising, and it is difficult for radiologists to analyze the
data and to detect and predict disease with accurate results. Since the databases are
large, it is important to achieve better performance and classify the disease using
different methodologies. Hence, a review of different state-of-the-art techniques using
machine learning and deep learning algorithms is included. The reviewed literature
covers the classification of breast cancer and different medical images. The diagnosis
and prediction are done using training (80%) and testing (20%) samples on a benchmark
dataset. Both the artificial neural network and the support vector machine are compared
using the parameters accuracy, precision, and recall. Experimental results show that
SVM performs better compared to ANN.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 59
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_6
60 H. Walia and P. Kaur
worker. Preprocessing is an important step before the classification of medical
images. It helps in obtaining better-quality images, which further helps in accurate
diagnosis. Many state-of-the-art denoising methods have been introduced over the
years to enhance the quality of medical images. ML and DL techniques are
gaining an important role in image denoising.
Medical images captured by CT, ultrasound and MRI use radiation that is harmful
to the human body, and in order to take clear images, the body has to be exposed to
stronger radiation for a longer period of time. A CAD system improves the quality of
images taken by CT and MRI and, hence, helps in correctly diagnosing diseases from
the images. CAD is used in the diagnosis of lung cancer, breast lesions in mammog-
raphy, diabetic retinopathy, brain tumors, etc. Computer-aided diagnosis (CAD) helps
radiologists in the ultrasound-based detection of disease and, hence, helps in lowering
the effort and the reliance on the operator [1]. CAD also helps in improving the sensitivity
and specificity of ailment diagnosis. It helps in the detection and diagnosis of lesions:
in the detection phase, the lesion is separated from the normal tissue, and in the diagnosis
phase, the lesion is assessed to give the diagnosis. In the CAD system, the feature
extraction is designed by humans, which affects the efficiency of the results. The final
diagnosis is made by the physician taking into account the output of the computer as
a second opinion [2].
Step 1: First, data is collected from hospitals and laboratories; this data may contain all kinds of information, such as medical images as well as patients' personal data. The required medical images are acquired and then preprocessed.
Step 2: In the preprocessing phase, image quality is enhanced. Background artifacts are removed, and the images are filtered using different filters to remove different kinds of noise. The various noises affecting images are discussed below.
Step 3: After preprocessing, the images are segmented. In the segmentation process, the medical image is divided into different segments, and only the useful part is used for further processing. Features are then extracted and selected using methods such as GLCM.
Step 4: Finally, the region of interest (ROI) is classified using algorithms such as support vector machine, K-nearest neighbor, and convolutional neural network. Figure 1 shows the block diagram of CAD.
A Quantitative Analysis for Breast Cancer Prediction … 61
Fig. 1 Block diagram of CAD: image data collected from hospitals and laboratories, followed by image acquisition, preprocessing, segmentation, feature extraction and selection, and classification
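Step 3's GLCM-based feature extraction can be sketched in a few lines. The following is a minimal NumPy illustration on a toy image (real pipelines would typically use scikit-image's `graycomatrix`): it builds a normalized gray-level co-occurrence matrix for one pixel offset and derives two common texture features.

```python
import numpy as np

def glcm(image, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    dr, dc = offset
    rows, cols = image.shape
    mat = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                mat[image[r, c], image[r2, c2]] += 1
    return mat / mat.sum()

def glcm_features(p):
    """Contrast and homogeneity, two common GLCM texture features."""
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
    return contrast, homogeneity

# Quantize a toy "image" to 8 gray levels and extract the features.
img = (np.arange(36).reshape(6, 6) * 8 // 36).astype(int)
p = glcm(img)
contrast, homogeneity = glcm_features(p)
```

The resulting feature values (one pair per offset and angle) would then be fed to a classifier in Step 4.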
3 Medical Images
Owing to advances in technology, there are various ways to procure images of human anatomy for the diagnosis of disease. The imaging modalities include ultrasound (US), magnetic resonance imaging (MRI), computed tomography (CT), X-ray, and positron emission tomography (PET). These images can assist in evaluating disease in different situations, but they contain noise that must be removed for accurate diagnosis. The different noises affecting the various medical images are discussed below.
Computed tomography is also known as X-ray CT. CT images provide information about the hard tissues of the human body [8]. CT combines multiple X-ray projections taken from various angles, which are processed by a computer to produce a tomographic image. CT is used in lesion and tumor detection; it is a non-invasive technique and hence painless [5]. CT images use ionizing radiation, which accumulates in the body and has harmful effects when the body is exposed to it for a long period of time. It is therefore recommended to use lower-intensity radiation so that less harm is caused. The image obtained may not be clear, but it can be processed using various methods and then used for diagnosis.
MRI is used to capture images of areas of the body that other imaging modalities cannot see well. It makes use of an oscillating magnetic field and radio waves: the field causes the hydrogen atoms in the water molecules of the tissue to become excited and align, and radio waves are then sent to deflect the atoms. After the magnetic field is removed, the hydrogen atoms release the absorbed electromagnetic energy in the form of radio waves [5]. MRI provides images of the brain, fetal movements, etc. It is a non-invasive technique and hence painless. Magnetic resonance images are used in brain imaging for the detection of tumors; they are non-radioactive and non-aggressive in nature [9] and typically contain Rician noise.
3.4 X-Rays
X-rays are a form of electromagnetic waves. The radiation can penetrate the human skin, and an image of the bones beneath it is formed by the differential absorption of the X-ray beam. Because this radiation is harmful to the body, exposure should be kept to a minimum. X-ray images help in the detection of bone fractures, bone dislocations, etc., and are affected mainly by Poisson noise [10].
Medical imaging devices such as MRI, CT, and US scanners generate a huge number of images, and these images may contain noise introduced by the surroundings while the images are captured or transferred. The noises present in medical images include speckle noise, Gaussian noise, and salt-and-pepper noise. These noises degrade the original quality of the picture and affect the diagnosis process, so the images need to be preprocessed before any diagnosis is performed.
Speckle noise occurs in images formed by scattering. A number of elementary scatterers inside each resolution cell reflect the incident wave, and the back-scattered waves undergo random constructive and destructive interference, forming a granular pattern called "speckle" that reduces visual image quality. It occurs in US and MR images and can be introduced during image transfer or by other internal or external factors such as air gaps and the beam-forming process. Speckle noise produces random variations in the received signal that raise the gray level in the images. It can be removed using the Lee filter or the Kuan filter [11].
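As a rough illustration of the Lee filter mentioned above, the NumPy sketch below uses a simplified formulation with an assumed noise variance (not the authors' implementation): each pixel is blended between its local mean and its own value, weighted by the local variance.

```python
import numpy as np

def lee_filter(img, win=3, noise_var=0.05):
    """Minimal Lee filter: blend each pixel with its local mean, weighted
    by local variance relative to an assumed speckle-noise variance."""
    pad = win // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty_like(img, dtype=float)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            window = padded[r:r + win, c:c + win]
            mean, var = window.mean(), window.var()
            k = var / (var + noise_var)      # ~0 in flat regions, ~1 on detail
            out[r, c] = mean + k * (img[r, c] - mean)
    return out

# Speckle is multiplicative: noisy = clean * (1 + n).
rng = np.random.default_rng(0)
clean = np.ones((32, 32))
noisy = clean * (1.0 + 0.2 * rng.standard_normal(clean.shape))
denoised = lee_filter(noisy)
```

In flat regions the weight `k` is small, so the filter averages speckle away; near edges `k` approaches 1 and detail is preserved.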
Salt-and-pepper noise appears as white (salt) and black (pepper) spots on images: bright pixels in dark regions and dark pixels in bright regions [12]. It yields a low-quality image. Because it is an impulsive noise, it is usually removed with the median filter, a nonlinear technique.
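A minimal median filter for salt-and-pepper noise can be written directly in NumPy (a sketch; production code would normally use `scipy.ndimage.median_filter`): because corrupted pixels are extreme outliers, the neighborhood median discards them.

```python
import numpy as np

def median_filter(img, win=3):
    """3x3 median filter: replace each pixel with the median of its
    neighborhood, which rejects isolated salt (255) / pepper (0) pixels."""
    pad = win // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            out[r, c] = np.median(padded[r:r + win, c:c + win])
    return out

# Corrupt a flat gray image with salt-and-pepper noise.
rng = np.random.default_rng(1)
img = np.full((16, 16), 128, dtype=np.uint8)
noisy = img.copy()
mask = rng.random(img.shape)
noisy[mask < 0.05] = 0       # pepper
noisy[mask > 0.95] = 255     # salt
restored = median_filter(noisy)
```

Unlike a mean filter, the median does not smear the impulses into their neighbors, which is why a nonlinear filter is preferred for this noise type.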
Poisson noise is present in X-ray and nuclear imaging such as PET. It arises from the random behavior of photons and is a form of quantum noise. The Poisson noise model assumes that each pixel value in the image is drawn from a Poisson distribution [14]. The different noises affecting medical images have now been discussed; removing them is important for obtaining clear, noise-free images that allow easy detection and diagnosis of disease by health-care workers or by systems such as CAD, as discussed above. The various image filtering techniques are therefore discussed next.
After the images are preprocessed, machine learning algorithms are used for feature extraction and for classifying the images. The classification algorithms are discussed below.
ANN is a self-learning approach. The basic unit of the network is the neuron, modeled on the biological neural networks present in humans. An artificial neural network architecture consists of an input layer, some number of hidden layers, and an output layer. In an ANN, pattern recognition is performed by learning from examples. ANNs are used for sequence and pattern recognition, medical diagnosis, etc., on both textual data and images, and they achieve high accuracy in text classification [16–18]. These algorithms have the limitations listed in Table 1, so hybrid algorithms are required for classification and image denoising.
Table 1 Medical image and machine learning
• Wang, S., Summers, R.M., "Machine learning and radiology" [21] (Elsevier, 2012). Application: text analysis, computer-aided detection and diagnosis, brain function from fMRI, content-based image retrieval for MRI. Dataset: varies application-wise. Parameters: accuracy, cost, propagating skills. Techniques: cluster analysis, support vector machine, naïve Bayes, artificial neural networks, linear models, ensemble learning, reinforcement learning. Advantages: "cost reduction, disseminating expertise, improves accuracy." Drawbacks: "the role of statistical machine learning approaches is not defined."
• Moon, W.K., Lo, C.M., Cho, N., Chang, J.M., Huang, C.S., Chen, J.H., Chang, R.F. [22]. Application: CAD for US of the breast. Dataset: 244 images in total (78 malignant and 166 benign). Parameters: accuracy, NPV, PPV, specificity, pAUC. Techniques: chi-square test, CAD with BI-RADS. Advantages: "the partial AUC is higher (0.90 vs. 0.76, P < 0.05) than the conventional CAD." Drawbacks: "no diversity in the images of malignant tumors used even before …"
• Huang, Q., Zhang, F., Li, X., "Machine Learning in Ultrasound Computer-Aided Diagnostic Systems" [28] (Hindawi, 2018). Application: breast lesion diagnosis, liver lesion diagnosis, fetal ultrasound standard plane detection, thyroid nodule diagnosis. Dataset: varies application-wise. Parameters: accuracy, sensitivity, specificity. Techniques: CNN, LSTM. Advantages: "feature extraction is done automatically, the scope of error is reduced, faster image processing." Drawbacks: "huge differences in size and modality of dataset employed by different methods."
6 Literature Survey
Wasule and Sonar [9] proposed the GLCM technique for extracting texture features and the SVM and KNN algorithms for classifying brain MRI. The method shows 96% accuracy for SVM and 86% for KNN; SVM performs better than KNN, and its performance improves further with a larger training set.
Kalyan et al. [19] used feature extraction methods for the classification of abnormalities with an ANN; on their performance evaluation, it is difficult to decide between GLRLM and the combined feature set.
Kokil and Sudharson [20] proposed a pre-trained residual learning network (RLN) for despeckling ultrasound images before diagnosis with a computer-aided diagnostic (CAD) system. The proposed method gives better PSNR and SSIM at all noise levels, and the RLN performs better in terms of the naturalness image quality evaluator (NIQE) than other methods.
Table 1 reviews medical imaging using ML and DL algorithms, showing the advantages and disadvantages of using them for classification purposes. The table specifies the datasets used and the change in parameters such as accuracy, SNR, and sensitivity when machine learning algorithms are applied. After reviewing the effect of ML and DL algorithms, Table 2 further reviews approaches that first denoise the medical images using different filtering methods (such as the Lee and median filters) or hybrid filtering techniques and then apply the ML and DL algorithms for classification. The table shows that denoising the images before classification improves the accuracy of diagnosis.
7 Gaps in Literature
Table 2 (excerpt) Diwakar, M., Singh, P., "CT image denoising using multivariate model and its method noise thresholding in non-subsampled shearlet domain" [8] (Elsevier, 2020). Parameters: PSNR, SSIM, ED, DIV. Dataset: 87 CT images (512 × 512). Technique: non-subsampled shearlet transform (NSST). Advantage: "increased noise reduction in CT images than other approaches."
• Filtering techniques can be used in combination with deep learning; this can increase the accuracy of diagnosis and other medical procedures.
• Hybridization of deep neural network algorithms such as the DAWN model shows improvement in SNR, and CNN-DAE models require only small training sets for denoising.
8 Case Study
The state-of-the-art support vector machine and artificial neural network techniques are used for breast cancer prediction. Figure 2 shows the proposed workflow using the ANN and its comparison with the SVM in terms of accuracy.
In the data preprocessing phase, the data is checked for null and unnamed values. Any null values found are either replaced or removed from the dataset. Attributes that are not required for processing, i.e., features that do not affect the disease prediction, are dropped. In the diagnosis column of the dataset, B is replaced with 0 and M with 1 for processing purposes.
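The preprocessing just described (dropping unused attributes and recoding the diagnosis column) might look like the following pandas sketch on a hypothetical miniature dataset; the column names here are illustrative, not the authors' exact schema.

```python
import pandas as pd

# Hypothetical miniature version of the dataset's diagnosis column:
# B (benign) -> 0, M (malignant) -> 1, as described above.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "diagnosis": ["B", "M", "M", "B"],
    "unnamed_col": [None, None, None, None],   # artifact column to drop
})

df = df.drop(columns=["unnamed_col"])          # drop an unneeded attribute
df["diagnosis"] = df["diagnosis"].map({"B": 0, "M": 1})
```

After this step the label column is numeric and ready for the classifiers described next.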
SVM: While training the SVM, the data is divided into training and testing sets using a 70–30 split. A linear kernel is used for accuracy, and the random state is set to 0.
ANN: While training the ANN, the data is divided into training and testing sets: 80% of the data is used for training (during which the epoch-wise loss is monitored), and the remaining 20% is used for testing. Thirty-one features are used for prediction. Two hidden layers are used, both with the ReLU activation function, and the output layer uses the sigmoid activation function. Figure 3 shows the ANN diagram.
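The SVM configuration described above (70–30 split, linear kernel, random state 0) can be sketched with scikit-learn. Here scikit-learn's bundled copy of the UCI breast cancer (Wisconsin) dataset stands in for the authors' data, so the exact scores will differ from those reported later.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score

# 30 features per sample; labels encode benign/malignant.
X, y = load_breast_cancer(return_X_y=True)

# 70-30 split as in the text; random_state fixed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)

clf = SVC(kernel="linear").fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred)
rec = recall_score(y_test, y_pred)
```

The same split could feed the ANN for a like-for-like comparison of accuracy, precision, and recall.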
Fig. 3 ANN
An Intel Core i7 with 16 GB RAM is used for processing the data. The system shown in Fig. 3 is used for predicting breast cancer as benign or malignant with the ANN and SVM algorithms. Google Colaboratory is used for programming in Python: it allows Python programs to be written with no configuration required on the local computer and makes code easy to share, which is why it is popular among research scholars and students running machine learning algorithms.
The performance of the SVM and the ANN is measured in terms of accuracy, which is the ratio of the number of correctly classified samples to the total number of samples.
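Accuracy, along with the precision and recall reported below, can be computed directly from the confusion-matrix counts; a small NumPy sketch:

```python
import numpy as np

def scores(y_true, y_pred):
    """Accuracy, precision, and recall from binary labels (1 = malignant)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))   # true positives
    tn = np.sum((y_true == 0) & (y_pred == 0))   # true negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))   # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))   # false negatives
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Toy example: 8 of 10 predictions are correct.
acc, prec, rec = scores([1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                        [1, 1, 1, 0, 0, 0, 0, 1, 1, 0])
```

The same counts define the sensitivity (true positive rate) and 1 − specificity (false positive rate) used for the ROC analysis below.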
For diagnostic test evaluation, the ROC curve is used; it plots the true positive rate against the false positive rate, where the true positive rate is the sensitivity and the false positive rate is 1 − specificity. The area under the ROC curve (AUC) is 0.994 for the ANN and 0.987 for the SVM. For the ANN, the training set gives an accuracy of 95.82% with a loss of 8.21%, whereas on the testing set the accuracy is 94.73% and the loss is 15.01%; the precision and recall scores are 0.939 and 0.952. The accuracy of the SVM algorithm is 95.98%, and its precision and recall scores are 0.872 and 0.976, respectively. Table 3 shows the performance of the respective algorithms, and the confusion matrices and ROC curves are shown in Figs. 4 and 5. SVM performs better than ANN for breast cancer prediction; the SVM accuracy is achieved with a kernel-based transform.
Fig. 5 a Heat map of confusion matrix for SVM and b ROC SVM
10 Conclusion
Machine learning and deep learning methods, along with different classification algorithms, have been reviewed thoroughly for medical images. Training and testing samples of the benchmark UCI breast cancer dataset are classified using the ANN and SVM methods. The SVM achieves an accuracy of 95.98%, and its precision and recall are also improved compared to the ANN. The qualitative evaluation using the ROC curve gives an AUC of 0.987 for the SVM. A hybrid approach using transfer learning and RNNs may prove better and is a possible direction for future work on this prediction task.
References
1. R.J. Ramteke, K. Monali, Automatic medical image classification and abnormality detection
using K-nearest neighbour. Int. J. Adv. Comp. Res. 2(6), 190–196 (2012)
2. S. Liu, Y. Wang, X. Yang, B. Lei, L. Liu, S.X. Li, D. Ni, T. Wang, Deep learning in medical
ultrasound analysis—a review, pp. 261–275 (2018)
3. T. Wang, Y. Lei, Y. Fu, W.J Curran, T. Liu, X. Yang, Machine Learning in Quantitative PET
Imaging (2020)
4. C. Bowles, B. Kainz, Machine learning for the automatic extraction and classification of foetal
features in-utero (2014)
5. S.V.M.M. Sagheer, S.N. George, A review on medical image denoising algorithms. Biomed.
Signal Process. Control, 1746–8094 (2020)
6. S. Kollem, K.R.L. Reddy, D.S. Rao, A review of image denoising and segmentation methods
based on medical images. Int. J. ML Comp., 288–295 (2019)
7. P. Kaur, G. Singh, P. Kaur, An intelligent validation system for diagnostic and prognosis of
ultrasound fetal growth analysis using Neuro-Fuzzy based on genetic algorithm. Egypt. Info.
J., 1110–8665 (2018)
8. M. Diwakar, P. Singh, CT image denoising using multivariate model and its method noise
thresholding in non-subsampled shearlet domain. Biomed Signal Process. Control (2020)
9. V. Wasule, P. Sonar, Classification of brain MRI using SVM and KNN classifier, in 3rd IEEE International Conference on Sensing, Signal Processing and Security (2017)
10. D.N.H. Thanh, V.B.S. Prasath, L.M. Hieu, A review on CT and X-Ray images denoising
methods. Informatica 43, 151–159 (2019)
11. C.S. Bedi, H. Goyal, Qualitative and quantitative evaluation of image denoising techniques.
Int. J. Comp. App. 8(14), 31–34 (2010)
12. N. Kumar, M. Nachamai, Noise removal and filtering techniques used in medical images.
Oriental J. Comp. Sci. Tech. 10(1), 103–113 (2017)
13. M. Chowdhury, J. Gao, R. Islam, Fuzzy Logic based filtering for image de-noising, in IEEE
Conference on Fuzzy Systems, pp. 2372–2376 (2016)
14. P. Subbuthai, K. Kavithabharathi, S. Muruganand, Reduction of types of noises in dental images. Int. J. App. Tech. Res., 436–442 (2013)
15. J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques, 3rd edn. (Elsevier, 2016)
16. B.F. Erickson, P. Korfiatis, Z. Akkus, T.L. Kline, Machine learning for medical imaging.
Radiographics RSNA, 505–515 (2017)
17. M.I. Razzak, S. Naz, A. Zaib, Deep learning for medical image processing: overview, challenges
and future
18. D. Levy, A. Jain, Breast mass classification from mammograms using deep Convolutional
Neural Networks (2016)
19. K. Kalyan, B. Jakhia, R.D. Lele, M. Joshi, A. Chowdhary, Artificial neural network application in the diagnosis of disease conditions with liver ultrasound images. Adv. Bioinf. (2014)
20. P. Kokil, S. Sudharson, Despeckling of clinical ultrasound images using deep residual learning.
Comp. Methods Prog. Biomed. 194 (2020)
21. S. Wang, R.M. Summers, Machine learning and radiology. J. Med. Image Anal. 16, 933–951
(2012)
22. W.K. Moon, C.M. Lo, N. Cho, J.M. Chang, C.S. Huang, J.H. Chen, R.F. Chang, Computer-aided diagnosis of breast masses using quantified BI-RADS findings. Comput. Meth. Prog. Biomed. 111(1), 84–92 (2013)
23. J. Shan, S.K. Alam, B. Garra, Y. Zhang, T. Ahmed, Computer-aided diagnosis for breast
ultrasound using computerized BI-RADS features and machine learning methods. Ultrasound
Med. Bio. 42(4), 980–988 (2015)
24. H. Ravishankar, S. Prabhu, V. Vadiya, N. Singhal, Hybrid approach for automatic segmentation
of fetal abdomen from ultrasound images using deep learning, in IEEE Conference, pp. 779–782
(2016)
25. K.J. Geras, S. Wolfson, Y. Shen, N. Wu, S.G. Kim, E. Kim, L. Heacock, U. Parikh, L. Moy, K. Cho, High-resolution breast cancer screening with multi-view deep convolutional neural networks, vol. 3 (2018)
26. M.H. Yap, G. Pons, J. Marti, S. Ganau, M. Sentis, R. Zwiggelaar, A.K. Davison, R. Marti,
Automated breast ultrasound lesions detection using convolutional neural networks. J. Biomed.
Health Inf. 22(4) (2018)
27. L.J. Brattain, B.A. Telfer, M. Dhyani, J.R. Grajo, A.E. Samir, Machine learning for medical
ultrasound: status, methods, and future opportunities. Abdomen Radio 43(4), 786–799 (2018)
28. Q. Huang, F. Zhang, X. Li, Machine learning in ultrasound computer-aided diagnostic systems.
Biomed. Res. Int. (2018)
29. S.A. Ali, S. Vathsal, K.L. Kishore, An efficient denoising technique for CT images using
window based multiwavelet transformation and thresholding. Eur. J. Sci Res. 48(2), 315–325
(2010)
30. N.K. Ragesh, A.R. Anil, R. Rajesh, Digital image denoising in medical ultrasound images: a
survey, in ICGST AIML-11 Conference, Dubai, UAE, pp. 67–73 (2011)
31. M. Malik, F. Ahsan, S. Mohsin, Adaptive image denoising using cuckoo algorithm. Soft.
Comput. 20(3), 925–938 (2014)
32. L. Gondara, Medical image denoising using convolutional denoising autoencoders, in IEEE
16th International Conference on Data Mining Workshops, pp. 241–246 (2017)
33. P.U. Hepsag, S.A. Ozel, A. Yazıcı, Using deep learning for mammography classification, in 2nd International Conference on Computer Science and Engineering, UBMK 2017, pp. 418–423 (2017)
34. A. Gnanaselvi, G.M. Kalavathy, Detecting disorders in retinal images using machine learning
techniques. J. Ambient Intell. Human Comput. (2020)
35. B. Meena, D. Bhavana, K.M.M. Avinash, P. Anuhya, M.S. Teja, K.B. Kumar, Brain Tumor
detection for MR Images using machine learning algorithm. J. Inf. Comput. Sci. 10(7)
36. L. Zhou, J.D. Schaefferkoetter, I.W.K. Tham, G. Huang, J. Yan, Supervised learning with
CycleGAN for low-dose FDG PET image denoising. Med. Image Anal. 65
37. D. Xie, Y. Li, H. Yang, L. Bai, T. Wang, F. Zhou, L. Zhang, Denoising arterial spin labeling
perfusion MRI with deep learning
38. C.Z. Basha, A. Likhitha, P. Alekhya, V. Aparna, Computerised classification of MRI
images using machine learning algorithms, in Conference on Electronics and Sustainable
Communication System (2020)
39. Z.U. Rehman, Z.U. Rehman, M.S. Zia, G.R. Bojja, M. Yaqub, F. Jinchao, K. Arshid, Texture
based localization of a brain tumor from MR-images by using a machine learning approach.
Medical Hypothesis (2020)
Heart Disease Prediction Using Deep
Learning Algorithm
Abstract Heart diseases, also known as cardiovascular diseases (CVDs), have been the major cause of death in the whole world over the last few decades and have risen to become the most dangerous diseases not only in India but throughout the world. Heart disease may refer to several conditions that affect the heart. Given the number of variables in the body that can contribute to heart disease, it is one of the most difficult diseases to foresee, and detecting and predicting it is a daunting job for physicians and researchers alike. As a result, a reliable, effective, and practical way to diagnose such life-threatening diseases and provide proper medication is needed. In this project, we try to solve this problem using different algorithms with the Cleveland dataset. Our project will be helpful and will provide an easy way to predict the occurrence of heart disease.
1 Introduction
This section covers how heart disease can be predicted using a deep learning model and presents test results on a heart disease diagnosis dataset obtained by implementing a deep learning algorithm. An important characteristic of deep learning is that it can extract features automatically, making learning simple and easy. Both unsupervised and supervised learning problems can be solved with the help of deep learning algorithms. Alongside deep learning, techniques such as random forests, logistic regression, and SVM with hyperparameter tuning and feature selection have also been used to predict heart disease.
Deep learning is an effective parametric classifier in supervised settings for predicting heart disease. A deep learning neural network model is an extensive multilayer perceptron, meaning a greater number of hidden layers with linear and nonlinear transfer functions, regularization, and an activation function suited to binary classification, such as the sigmoid. During this training process, all the output patterns are checked
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_7
84 G. Velakanti et al.
against the target variables to determine the errors, which the algorithm then corrects. The training process continues until the errors are minimized and all the epochs have been used.
After the model has been trained, the learned weights are fed to the model for prediction; the model can then predict and diagnose heart disease for new patients from their test data.
Another problem the deep learning model faces is over-fitting, which is common in the field of deep learning: the deep neural network classification model performs better on the training dataset than on the testing dataset. To solve the over-fitting issue, the model uses a regularization algorithm to decrease the complexity of the model while keeping the same number of parameters.
The dropout layer is an effective regularization technique for preventing over-fitting in a deep learning model. In each iteration of training within each epoch, it randomly eliminates some of the neural network's units (nodes) and their connections; by using a dropout layer, we can thus prevent over-fitting in the deep neural network model.
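A common way to realize this is "inverted" dropout, sketched below in NumPy as an illustration of the general technique (not the authors' code): units are zeroed at random during training and the survivors rescaled so the expected activation is unchanged, while at test time the layer passes activations through untouched.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: during training, zero a fraction `rate` of the
    units and rescale the rest; at test time, return activations as-is."""
    if not training or rate == 0.0:
        return activations
    keep = (rng.random(activations.shape) >= rate).astype(activations.dtype)
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
h = np.ones((4, 8))                 # hypothetical hidden-layer activations
h_train = dropout(h, rate=0.5, rng=rng)                  # half zeroed, rest doubled
h_test = dropout(h, rate=0.5, rng=rng, training=False)   # unchanged
```

Because the rescaling keeps the expected output constant, no extra adjustment is needed when dropout is switched off for inference.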
2 Literature Survey
In [1], the authors used support vector machine (SVM), logistic regression, and the naive Bayes algorithm to predict heart disease as a percentage risk. The system has a Web site where users can register to receive a report on their risk of heart disease based on predictive analysis, and its database contains a users table and a medical history table. The authors used 75% of the entries in the dataset for training and 25% for testing the accuracy of the algorithm.
In [2], the authors predicted heart disease using the ID3, naive Bayes, and K-means algorithms. From the experiment, they obtained higher accuracy with naive Bayes when the input data are cleaned and well maintained; even though ID3 can clean the data itself, it cannot give accurate results every time. They treated the naive Bayes variables as independent and noted that a combination of algorithms, such as naive Bayes with K-means, can be used to improve accuracy.
In [3], different data mining and machine learning techniques are used, including artificial neural networks (ANN), decision tree, fuzzy logic, K-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). The paper gives awareness and an overview of existing work, introducing 26 different papers published on heart disease prediction. The authors also state that many feasible enhancements could be explored to improve the accuracy of prediction systems.
In [4], a heart disease dataset was collected from the UCI repository. The authors mention the different types of heart-related diseases that we are prone to. Different algorithms were used for classification, including logistic regression, K-nearest neighbor (KNN), support vector machine (SVM), naive Bayes, hyperparameter optimization (Talos), and random forest, and their accuracies were compared. Talos obtained the highest accuracy of 90.78%. The study also states that using deep learning models increases prediction accuracy.
In [5], the authors used an ANN model. They first divided the dataset into training and test data and then created a model with 13 input nodes and 4 hidden layers. They trained for 100 epochs with batch size 10 and obtained an accuracy of 85%.
In [6], the authors used a dataset from the UCI machine learning repository. They used four algorithms, namely the naïve Bayes classifier, decision tree, K-nearest neighbor, and random forest, to create a model with the maximum possible accuracy. Comparing all four algorithms, they obtained the highest accuracy with KNN.
In [7], the authors used background methods such as logistic regression, naïve
Bayes, SVM, KNN for comparison and produced accuracy for all the methods.
In [8], the authors evaluated classifiers optimized by FCBF, PSO, and ACO against other classification models used for heart disease classification to determine the most efficient one; they found the KNN algorithm to be the best.
In [9], the authors used a dataset from the Kaggle Web site and applied logistic regression, Gaussian naïve Bayes, KNN, and an artificial neural network. They worked with a stack of 11 ML algorithms and compared the results with their proposed model.
3 Proposed System
3.1 Objectives
• To build a project that provides an easy, simple, and time-saving way for people to predict the occurrence of heart-related diseases just by providing a set of features, i.e., the details of the patient.
• To reduce human involvement.
• To show the efficiency of the models and suggest the best one.
The implementation of the project is done in Python, using two different models: a categorical model and a binary model.
4 Methodology
The output layer predicts the final output, and in between lie the hidden layers, which perform the important computations of the algorithm. A recurrent network is one in which the output from the preceding step is fed as input to the present step: when the network wants to predict the following word of a sentence, the preceding words are needed, so it must remember the earlier words. The recurrent neural network was introduced to solve this problem with the help of the hidden layers between the input and output layers; its most distinctive feature is these hidden layers, because they memorize information about the data. In Fig. 1, starting with the input layer, two hidden layers are used, followed by an output layer; hidden layers one and two contain 16 and 8 neurons, respectively.
Attributes such as age, sex (male or female), chest pain, resting blood pressure, cholesterol, fasting blood sugar, electrocardiographic measurement, maximum heart rate, exercise-induced angina, ST depression induced by exercise relative to rest, peak exercise ST segment, number of major vessels, and thalassemia are fed into the input layer. Neurons in one layer are connected to neurons in the following layer through channels, and each channel is assigned a value known as a weight. The inputs are multiplied by the corresponding weights, and the resulting sum is sent as input to the neurons in the hidden layer.
Each neuron in the next layer also has a value called the bias, which is added to the weighted input sum. This result is then passed through an activation function, whose output decides whether the neuron is activated, i.e., whether its output is passed on to the next layer. The activated neurons transfer data to the neurons of the next layer through channels. In this manner, data propagates through the network; this is called forward propagation.
In the output layer, the neuron with the highest value, interpreted as a probability, determines the predicted output of the neural network. For example, when we feed in all the inputs for a single person whose outcome is known and get a wrong prediction, the network must be trained further by adjusting the weights. The forecast output is compared against the real output to measure the prediction error, and the weights are adjusted to reduce it. To know how much the error changes when a weight is adjusted, we calculate the derivative of each small step from the output back to that weight and multiply them all together, as shown in Fig. 2. This is called back propagation.
This cycle of forward and backward propagation is repeated continuously with multiple inputs until weights are found such that the network predicts the output correctly in most cases.
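The forward and backward propagation cycle described above can be condensed into a small NumPy training loop. This is a toy sketch: synthetic data stands in for the 13 Cleveland attributes, sigmoid units are used throughout, and the learning rate and epoch count are illustrative choices, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy stand-in for the heart data: 13 attributes, binary target.
X = rng.standard_normal((200, 13))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)

# Two hidden layers of 16 and 8 units and one output unit, matching
# the layer sizes described for Fig. 1.
W1, b1 = 0.5 * rng.standard_normal((13, 16)), np.zeros(16)
W2, b2 = 0.5 * rng.standard_normal((16, 8)), np.zeros(8)
W3, b3 = 0.5 * rng.standard_normal((8, 1)), np.zeros(1)

lr = 1.0
for epoch in range(3000):
    # Forward propagation: weighted sums plus bias, then activation.
    h1 = sigmoid(X @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    p = sigmoid(h2 @ W3 + b3)

    # Back propagation: derivative of each step, multiplied back to the weight.
    d3 = (p - y) / len(X)                  # output error (cross-entropy + sigmoid)
    d2 = (d3 @ W3.T) * h2 * (1.0 - h2)     # through hidden layer two
    d1 = (d2 @ W2.T) * h1 * (1.0 - h1)     # through hidden layer one
    W3 -= lr * (h2.T @ d3); b3 -= lr * d3.sum(axis=0)
    W2 -= lr * (h1.T @ d2); b2 -= lr * d2.sum(axis=0)
    W1 -= lr * (X.T @ d1); b1 -= lr * d1.sum(axis=0)

accuracy = float(((p > 0.5) == (y > 0.5)).mean())
```

Each epoch performs one forward pass, one application of the chain rule from the output error back through both hidden layers, and one weight update, exactly the cycle described in the text.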
5 Implementation
• In Fig. 3, the first part of the project deals with the collection of the dataset. The Cleveland patients dataset (from the Kaggle Web site) [10] was loaded using the read_csv method of the Pandas library.
• Data preprocessing and data cleaning are done to handle null values present in the data. Data cleaning checks for null, missing, or not-a-number values and updates or drops them as required; accordingly, missing values in the dataset are filled in. The last step of cleaning divides the dataset into input and output variables.
• To understand the dataset clearly, data visualization must be done. Data visualization gives a clear idea or overview of the data using pictorial representations; for this part, libraries such as matplotlib and Seaborn are required. In this project, histograms, a heat map, and cross-tabs are plotted.
– Histogram: A histogram is plotted for each of the variables present in the dataset. Since the “heart” dataset collected from Cleveland city patients (from the Kaggle Web site) [10] contains 14 variables, a total of 14 histograms are plotted, each revealing some information about its variable.
– Heat Map: A 2D graphical representation of data in which the individual values of a matrix are represented as colors; the color intensity reveals the correlation between two variables.
– Cross-tab: Similar to a bar graph; it computes the relation between two or more factors of the data against their frequency.
• The next part deals with importing the required modules and packages: sys (a Python module), Pandas (which provides data structures and operations to import and analyze data easily), NumPy (for dealing with arrays), Sklearn (statistical modeling of data), Keras (a deep learning library used for neural network projects), and matplotlib, pyplot, scatterplot_matrix, and Seaborn (all of which help in data visualization by providing an object-oriented API, grids, interfaces, and plotting functions).
• Splitting the dataset into training and testing data.
• Execution of the RNN Algorithm: It begins with importing the required packages: Sequential (from keras.models), Dense and Dropout (from keras.layers), Adam (from keras.optimizers), and the regularizers module from Keras. We have implemented it using two different models: a categorical model and a binary model. In both models, the input dimension has 13 neurons and there are two hidden layers. The 13 input neurons are connected to 16 neurons in the first hidden layer, which are in turn connected to 8 neurons in the succeeding hidden layer; both hidden layers use the normal kernel_initializer, an l2 kernel regularizer, and the “ReLU” activation function in both models. The difference lies in the output layer, which is explained below.
Heart Disease Prediction Using Deep Learning Algorithm 89
In the categorical model depicted in Fig. 4, the output variables are converted to categorical labels, and the output layer has two neurons with the “softmax” activation function.
In the binary model depicted in Fig. 5, the output variables are the integers 0 and 1, representing “no heart disease” and “heart disease,” respectively. In this model, the output layer consists of a single neuron with the “sigmoid” activation function.
• The final step is compiling both models, fitting them, checking the model losses and accuracies, and displaying the results. The models are compiled with the Adam optimizer, and graphs are plotted for the accuracy metric.
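The two-hidden-layer binary model described above can be sketched with tf.keras roughly as follows. This is a sketch under stated assumptions: the l2 strength and the Adam settings are invented, since the text does not give them.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Sketch of the described binary model: 13 input features, hidden layers of
# 16 and 8 ReLU neurons with the normal kernel initializer and an l2 kernel
# regularizer, and one sigmoid output neuron. The l2 factor 0.001 is an
# assumption, not a value from the paper.
model = keras.Sequential([
    keras.Input(shape=(13,)),
    layers.Dense(16, activation="relu", kernel_initializer="normal",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(8, activation="relu", kernel_initializer="normal",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=keras.optimizers.Adam(),
              loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

The categorical variant would differ only in its output layer (two neurons with softmax) and a categorical loss.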
6 Experimental Analysis
• In Figs. 6 and 7, with test size = 0.1, we get an accuracy of approximately 74% for both models.
• In Figs. 8 and 9, with test size = 0.3, we get accuracies of approximately 79% and 76% for the categorical and binary models, respectively.
90 G. Velakanti et al.
• In Figs. 10 and 11, with test size = 0.4, we get accuracies of approximately 80% and 82% for the categorical and binary models, respectively.
• In Figs. 12 and 13, if we add a hidden layer with 12 neurons, we get an accuracy of approximately 82% for both models. However, with this extra hidden layer the processing time increases and the output is generated more slowly.
• If we use the l1 regularizer with layers of 17, 12, 9, 7, and 2 neurons and 16, 12, 9, 7, and 1 neurons for the categorical and binary models, respectively, the accuracy drops to around 54%, with an “undefined metric” warning.
• In Figs. 14 and 15, if we use the l1 regularizer with layers of 17, 9, and 2 neurons and 16, 8, and 1 neurons for the categorical and binary models, respectively, both reach the same accuracy of approximately 82%.
The accuracies of the categorical and binary models for the different parameters are summarized below:

Parameters                                              Categorical (%)  Binary (%)
test size = 0.1                                         74               74
test size = 0.3                                         79               76
test size = 0.4                                         80               82
extra hidden layer of 12 neurons                        82               82
l1 regularizer, layers 17/12/9/7/2 and 16/12/9/7/1      ~54              ~54
l1 regularizer, layers 17/9/2 and 16/8/1                82               82
7 Results
The results have been visualized using model accuracy and model loss graphs. The model accuracy graph plots accuracy against the number of iterations (epochs) as depicted in Figs. 16 and 19, and the model loss graph plots the loss of our model against the number of iterations (epochs) as shown in Figs. 17 and 20. The performance of the model can be assessed using metrics such as precision, recall, F1-score, and support, as depicted in Figs. 18 and 21.
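Metrics such as precision, recall, F1-score, and support can be produced with, for example, scikit-learn's classification_report. The labels below are invented for illustration; in the project they would come from the test split and the trained model.

```python
from sklearn.metrics import classification_report

# Invented true and predicted labels (0 = no heart disease, 1 = heart disease).
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

# Prints per-class precision, recall, F1-score, and support.
print(classification_report(y_true, y_pred,
                            target_names=["no disease", "disease"]))
```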
Categorical Model
Binary Model
From the above project, we find that the categorical model has the higher accuracy, i.e., 85%. This project can be further extended by creating a web interface or an application to present the results in a better-organized way. Moreover, our dataset has only a few instances, so the accuracy and learning of this project may be improved by increasing the dataset size.
Limitations
The size of the dataset is small; a dataset with a greater number of instances should be tried. The use of a deep learning algorithm improves the accuracy of the project. Due to the pandemic, many instances could not be collected for the dataset.
References
Abstract When the pandemic struck and governments announced lockdowns, individuals locked themselves in their residences and turned to social media to stay updated on COVID-19 news and to pass the time. As a consequence, dealing with fake news about COVID-19 posed a significant challenge for the public. The World Health Organization (WHO) has therefore asked that its formally approved information and reports be portrayed as top results in any COVID-19-related search on Google, YouTube, Facebook, LinkedIn, Microsoft, Yahoo, and Twitter. In this study, we conducted a thorough investigation to assess and select appropriate solutions, knowledge, and skills for current issues, as well as the use of tools for tracking fake news within social media. The search for this work started with a manual search of scientific publications and papers concerning fake news related to COVID-19. During the pandemic, the majority of hospitals utilized artificial intelligence techniques to diagnose virus-infected patients, and most governments and businesses used robots to limit the virus’s exposure to their employees and customers, distribute sanitizers, and advise the public to “stay safe, stay home.”
1 Introduction
A cluster of serious pneumonia cases was discovered in Wuhan City, China, and was reported to the World Health Organization (WHO) at the end of December 2019. Acute pneumonia was originally diagnosed in these patients. Some of them worked in Wuhan’s fish market, where they developed fever, sore throat, fatigue, and, in more extreme cases, shortness of breath. However, these signs were not, as was first thought, those of acute pneumonia. With the growing number of cases, in early January 2020, China notified the WHO of the circumstances and the unknown cause [1]. As a result, the WHO has designated the virus as a coronavirus causing
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 97
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_8
98 M. Massoudi and R. Katarya
2 Related Works
Experts have defined fake news in a variety of ways, but they largely agree on the same definition: fake news is intentionally false information that is disseminated in order to deceive and mislead people into believing untruths or unverifiable facts. From this viewpoint, fake news is information that looks like a legitimate news article but contains incorrect information [6, 7]. At the same time, the spread of fake news on digital media networks is one of the massive challenges facing the world today, with the majority of users relying on such networks solely to obtain news. Thus, social media can be thought of as an amplifier for both real and fake news. For instance, Facebook, one of the most influential social media networks in 2019, had around 1.5 billion registered members, with 62 percent of them using
it to keep up with news [8]. Therefore, the majority of false information identification systems employ machine learning to assist users in determining whether or not the data they are accessing is false.
Tracking Misleading News of COVID-19 Within Social Media 99
This classification is obtained by comparing the provided data against certain preexisting datasets containing multiple false and true examples
[9]. Furthermore, before being used to develop a misleading information prediction model employing machine learning methods, all training data must go through the following stages: data preparation and preprocessing, feature selection and extraction, and model selection and establishment. These common steps make dealing with the massive amounts of data required to build prediction models much easier [10]. Many automated systems for detecting misleading information have been proposed so far. The authors of [11] conducted a comprehensive investigation into the identification of misinformation on social media sites. Xichen and Ali [12] presented a thorough review of the current findings in the field of false information. They also described the impact of online false information, presented cutting-edge identification techniques, and addressed the most frequently utilized datasets for building fake news classification models. Asaad and Erascu [13] described a solution that integrates numerous machine learning techniques for text classification to determine news credibility. They analyzed the performance of their system, which incorporates multinomial naive Bayes and support vector machine classifiers, on a dataset of fake and true news.
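As a toy illustration of the naive Bayes text-classification approach surveyed here, the sketch below builds a TF-IDF plus multinomial naive Bayes pipeline; the headlines and labels are invented and are not drawn from any cited dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented training headlines labeled 1 (fake) or 0 (real).
texts = [
    "miracle drink cures virus overnight",
    "officials report new vaccination schedule",
    "secret cure hidden by doctors",
    "health ministry publishes daily case counts",
]
labels = [1, 0, 1, 0]

# TF-IDF features feeding a multinomial naive Bayes classifier, as in the
# text-classification systems described in the works above.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["doctors hide miracle cure"]))  # classified as fake (1)
```

A real system would train on a large labeled corpus and add the preprocessing and feature-selection stages listed above.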
The authors of [3] assessed their proposed classifier on 3,047,255 COVID-19-related Twitter posts; among the 10 machine learning algorithms evaluated, the decision tree, neural network, and logistic regression classifiers provided the highest performance. Marina and Kin [14] looked at different descriptions of fake news and came up with one based on complete factual accuracy. They also present a fake news detection system that integrates both manual and computerized content authentication, as well as stylistic features. Moreover, the authors of [15] presented a framework for collecting, detecting, and visualizing fake news. They used fact-checking webpages to gather fake and real news articles, created a variety of fake news detection systems using news and social contact features, and ultimately demonstrated a conceptual platform for the discovered false news data. William [16] announced the launch of a new dataset for fake news detection. This data collection includes 12,800 individually labeled short statements from the POLITIFACT website in a variety of contexts. The dataset was notable for being the first large dataset dedicated to detecting fake news, and it is bigger than previous fake news datasets that have been made public. Finally, in the context of the COVID-19 infection in Morocco, the authors of [17] propose a system that uses a machine learning technique to analyze feedback in Facebook comments in order to identify false propaganda, along with an aggregation framework to detect and investigate fake stories.
3 Research Questions
The task of our comprehensive study is to assess and select appropriate solutions, knowledge, and skills for current problems, and the use of tools for monitoring fake news within social digital media networks. Consequently, we have formulated a few questions for which we want to find the best answers from the primary studies.
1. Which method, machine learning or deep learning, is better for detecting fake
news?
2. What are the elements that contributed to the dissemination of fake news during COVID-19?
3. How could artificial intelligence help to stifle the spread of COVID-19 and
protect us from false information during the pandemic?
4. In the fight against COVID-19, how could robots assist frontline workers?
The search for this work began with a manual search of scientific publications and systematic review papers on fake news during COVID-19. In addition, this research study has included articles from Science Direct, IEEE, Springer Link, ACM, and other related journals.
3.2 Results
We looked over all of the research questions and came up with the following solutions.
1. Which method, machine learning or deep learning, is better for detecting fake
news?
Detecting false news is becoming one of artificial intelligence researchers’ most important tasks, and machine learning and deep learning are the two primary detection approaches. The authors of [18] used a variety of deep learning and machine learning algorithms in their proposed model, including logistic regression, support vector machine, naive Bayes, recurrent neural network (RNN), and long short-term memory (LSTM), with the best result, an accuracy of 89.34%, coming from the support vector machine. Accordingly, in their model the machine learning techniques achieved the best results when comparing deep learning and machine learning techniques.
The authors of [19] compared different deep learning and machine learning techniques in their proposed system, and the highest performance was obtained by the recurrent neural network and a hybrid convolutional neural network model; in their proposed models, the deep learning algorithms outperformed the machine learning algorithms. Hence, deep learning algorithms are more stable and accurate than conventional methods, and they have demonstrated their effectiveness in a variety of applications, including false information, spam, rumor, fake news, and disinformation detection. Deep learning techniques are also extremely flexible and can therefore be easily adapted to a new challenge [20, 21]. In a nutshell, both machine learning and deep learning have displayed superior performance in various models so far [22], but the majority of researchers prefer deep learning techniques for identifying fake news.
2. What are the elements that contributed to the dissemination of fake news during COVID-19?
In early 2020, the COVID-19 outbreak led to widespread lockdowns around the globe. With billions of people stranded at home, they constantly turned to social media, which played a critical role in the spread of false news; people shared COVID-19 posts on social media even when they were inaccurate, in order to remain informed, assist others, communicate, or keep busy [23]. Moreover, since COVID-19’s discovery, fake news has spread across the internet, claiming to offer therapies for the virus and advice on how to handle it. A deluge of false information about the deadly virus has led several people to believe that they can be healed by drinking bleach or salty ocean water [18, 24]. Furthermore, the majority of people use social media to kill time and have fun, and users turned to it to keep busy during the total lockdowns imposed by the COVID-19 pandemic. As a result, people are less willing to check COVID-19 data before sharing it, potentially leading to the proliferation of fake and false news [25]. Figure 1 shows users looking for a variety of applications simultaneously.
3. How could artificial intelligence help to stifle the spread of COVID-19 and
protect us from false information during the pandemic?
Every minute of every day, people are surrounded by information: 98,000 tweets, 160 email messages, and 160 video clips are sent, received, and posted each minute. As a result, the best way to tackle fake news is to develop artificial intelligence-based automated systems [27]. Moreover, as the number of coronavirus cases increased throughout China, hospitals turned to artificial intelligence to help them diagnose infected patients more rapidly. With hospitals already overburdened by the pandemic’s scope, artificial intelligence has been used to “identify visual cues of the pneumonia linked with COVID-19 on photographs from lung CT scans,” as per Wired [28]. According to Professor Andrew Hopkins, artificial intelligence could also be used to develop antibodies and vaccines, as well as to design medications, to combat both present and future coronavirus outbreaks, due to its ability to cope with large amounts of data [29].
4. In the fight against COVID-19, how could robots assist frontline workers?
Because the COVID-19 virus is a “new” phenomenon, people have no immunity to it; anyone exposed to it can become infected, resulting in severe disease and death. To decrease the risk posed by healthcare practitioners’ interactions with ill patients, robots have been used at intake points, instead of putting personnel at hazard, to check for patients who may have symptoms such as high temperature or sneezing. Robots not only improve a hospital’s abilities but also reduce the danger to both patients and staff; robotic assistants could significantly enhance our ability to combat and eliminate this threat to our loved ones [30]. In addition, limiting populations’ exposure to COVID-19 is among the most effective efforts to tackle it. Robots allow businesses to keep functioning while defending their staff and customers in a time of social distancing [31]. Several South Korean companies have used robots to monitor temperatures and disseminate sanitizer in the wake of the COVID-19 outbreak. In a similar vein, the Singapore government has begun employing spy robots to remind citizens to “remain safe, stay at home,” using them to ensure that no public crowds in public areas spread the devastating coronavirus [32, 33]. Figure 2 shows an example of the use of robots in industry.
4 Discussion
The most important takeaway from this survey is that using artificial intelligence to combat pandemics whenever they occur is the best way forward. Because artificial intelligence is designed to mimic the human brain, whenever a task poses a risk to human life we can use artificial intelligence instead, particularly robotics, to complete it [34]. Undoubtedly, the use of robotics and artificial intelligence can aid in breaking the chain of human exposure to COVID-19, as well as reducing the number of COVID-19 cases seen on a daily basis. Soon after COVID-19 was identified, the World Health Organization (WHO) suggested that AI might be a useful tool for dealing with the virus’s crisis. In addition, AI has demonstrated the best performance in accurately recognizing COVID-19 patients. In a nutshell, artificial intelligence (AI) is a revolution for the world, particularly for human health.
The proliferation of COVID-19 is one of the most dangerous events of our time, so people look to social media for reliable information on how to defend themselves; individuals could even die as a result of misleading information. In this article, we conducted a systematic review of tracking misleading information during the COVID-19 outbreak and found the following. When the pandemic struck, people were confronted with two potentially dangerous phenomena: COVID-19 and fake news. Individuals use social media to pass the time and stay informed about the latest pandemic news, which is one of the motives for the diffusion of fake news during the pandemic; as a result, there is less willingness to evaluate COVID-19 information before sharing it, eventually leading to the spread of fake news. In addition, artificial intelligence and robots are playing a progressively significant role during the pandemic. For example, most hospitals use artificial intelligence to diagnose patients, and the majority of businesses use robots to reduce the risk of humans coming into contact with the virus and to distribute sanitizers. Researchers have also proposed methods for detecting COVID-19-related false news within social media using deep learning and machine learning algorithms to address the problem of false information dissemination.
In the future, we want to develop a model for identifying fake news and rumors within various social digital networks in both text and video formats. We will also look to expand the model to detect fake news written in languages other than English.
References
22. M. Massoudi, N.K. Jain, P. Bansal, Software defect prediction using dimensionality reduction
and deep learning, in Proceedings of the 3rd International Conference on Intelligent Commu-
nication Technologies and Virtual Mobile Networks, ICICV 2021, pp. 884–893 (2021). https://
doi.org/10.1109/ICICV50876.2021.9388622
23. How does fake news of 5G and COVID-19 spread worldwide? https://www.medicalnewstoday.com/articles/5g-doesnt-cause-covid-19-but-the-rumor-it-does-spread-like-a-virus#Factors-behind-the-spread-of-misinformation. Last accessed 22 April 2021
24. V. Lampos, M.S. Majumder, E. Yom-Tov, M. Edelstein, S. Moura, Y. Hamada, M.X. Rangaka,
R.A. McKendry, I.J. Cox, Tracking COVID-19 using online search. NPJ Digit. Med. 4 (2021).
https://doi.org/10.1038/s41746-021-00384-w
25. O.D. Apuke, B. Omar, Fake news and COVID-19: modelling the predictors of fake news
sharing among social media users. Telemat. Inf. 56, 101475 (2021). https://doi.org/10.1016/j.
tele.2020.101475
26. Our itch to share helps spread Covid-19 misinformation | MIT News | Massachusetts Institute
of Technology. https://news.mit.edu/2020/share-covid-19-misinformation-0709. Last accessed
22 April 2021
27. The role of AI in preventing fake news. https://www.mygreatlearning.com/blog/role-of-ai-in-preventing-fake-news-weekly-guide/. Last accessed 22 April 2021
28. The Role of AI during the Coronavirus Pandemic | Blue Fountain Media. https://www.bluefountainmedia.com/blog/role-ai-during-coronavirus-pandemic. Last accessed 22 April 2021
29. Coronavirus: How can AI help fight the pandemic? BBC News, https://www.bbc.com/news/
technology-51851292. Last accessed 22 April 2021
30. How robots are helping combat COVID-19. https://www.automate.org/blogs/how-robots-are-
helping-combat-covid-19. Last accessed 22 April 2021
31. 10 examples of robots helping to fight COVID. https://www.forbes.com/sites/blakemorgan/
2020/04/22/10-examples-of-robots-helping-to-fight-covid/?sh=768f77d0f4bf. Last accessed
22 April 2021
32. South Korea: Robot with artificial intelligence helps fight COVID-19 spread. https://www.republicworld.com/world-news/rest-of-the-world-news/robot-with-artificial-intelligence-helps-fight-covid-19-spread.html. Last accessed 22 April 2021
33. Coronavirus: Will Covid-19 speed up the use of robots to replace human workers? BBC News,
https://www.bbc.com/news/technology-52340651. Last accessed 22 April 2021
34. M. Massoudi, S. Verma, R. Jain, Urban sound classification using CNN, in Proceedings of the
International Conference on Inventive Computation Technologies (ICICT 2021), pp. 583–589
(2021). https://doi.org/10.1109/ICICT50816.2021.9358621
Energy aware Multi-chain PEGASIS
in WSN: A Q-Learning Approach
1 Introduction
Wireless sensor networks (WSNs) are groups of multiple low-powered sensor nodes that are responsible for collecting readings from the environment [1]. In order to obtain energy efficiency [2], the routing of information in WSNs follows a hierarchical structure of nodes. In a hierarchical WSN, at least one gateway assembles all sensed and processed data for future use. Sink nodes, along with the gateway, are responsible for collecting processed and aggregated raw data from the cluster heads (CHs) [3], as shown in Fig. 1. The sensor nodes send their data to their corresponding CHs, which in turn forward them to the respective sink nodes. Recently, chain-based hierarchical routing such as power-efficient gathering in sensor information systems (PEGASIS) has been used for its simple setup and easy maintenance
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 107
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_9
108 R. Dey et al.
[4]. In PEGASIS, all nodes are organized into a linear chain for data transmission, and, importantly, this chain can be formed via any sink node with a centralized approach [5]. Furthermore, PEGASIS supports a multi-chain topology for mobile or static hierarchical node structures in WSNs [6]. However, simple PEGASIS is not very robust or scalable with multi-path options [5].
In this paper, a battery level-aware Q-learning (BLAQL) technique-based PEGASIS for WSNs is proposed to improve energy efficiency and, hence, the lifetime of the nodes of the network. Here, the learning agent interacts through an action, and its Q-value is updated by the reward received from the working environment. In order to reach the gateway, the proposed routing method is applied to all sink-based multiple chains available in the network, where the Q-value of a source node is updated by a reward derived from the neighbors’ battery-level information in Fig. 1. As the cost function for any route is based on battery level, the feedback section of the data packet has to carry the Q-value component. With this routing technique, the source node learns the preferable routes toward the destination gateway, so that a better way of transmitting data packets is obtained as the rounds increase. The simulation results show the effectiveness of the proposed method over existing techniques.
The rest of this paper is organized as follows: Sect. 2 presents a brief review of related works for completeness. In Sect. 3, the system model is presented. Next, Sect. 4 discusses the proposed approach. The simulation studies are shown in Sect. 5, followed by the conclusions in Sect. 6.
2 Related Works
In recent times, WSNs are used to collect raw data from dynamic environments, process them, and send them to a gateway for future use. Chain-based routing techniques such as PEGASIS are used in cluster-based hierarchical WSNs [1–7] for several scenarios [8]. In [9], PEGASIS is used to maximize the lifetime of a WSN using sink mobility. An improvement of the PEGASIS routing protocol has been attempted in [10]. The PEGASIS protocol constructs the chain, and each node delivers the sensed data to its nearest neighbor node [9]. However, PEGASIS is not useful in dynamic scenarios with multiple chains. Hence, a chain-based hierarchical routing protocol needs to be designed to obtain better network performance [11] in terms of various quality of service (QoS) parameters such as packet loss tolerance, delay, network bandwidth, and energy awareness. Simultaneously, extensive research has been carried out on routing for WSNs using various machine learning (ML) techniques. A reinforcement learning (RL)-based energy-efficient routing protocol is discussed in [12]. In a WSN, it is always preferable to reduce the time needed for routing and to improve the energy efficiency of the different levels of nodes; a Q-learning-based technique [13] lets a source node learn to select a preferable, precise route toward the destination. For completeness of the proposed work, a brief outline of Q-learning is given next.
2.1 Q-learning
In reinforcement learning (RL) [14], the learning agent takes an action (At) in the environment, and the agent receives a reward (Rt) from the environment based on At. Given this reward, which may be positive or negative, the agent takes its next action in the environment. This procedure, as shown in Fig. 2, continues until the agent learns to take better actions in the future. With
Fig. 2 RL technique
iterations, the environment also sends the state of the task (St) to the agent, which helps the agent learn the scenarios of the system as feedback in Fig. 2.
A Q-learning [15] technique is a popular RL method. Here, the learning agent interacts through an action, and its Q-value is updated by the reward received from the working environment as follows:

Qi ← Qi + α(Rt + γ max Qnext − Qi) (1)

Here, Qi is the current Q-value of agent i, which is updated after taking an action, and max Qnext is the highest Q-value attainable from the next state. The parameters α and γ used in (1) denote the learning rate and the discount factor, respectively. One of the most important features of Q-learning is that it is a model-free RL technique that learns from the rewards of an action in a particular state. It does not require a fixed model of the environment, so it has extensive and efficient use in WSN routing problems [16].
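A minimal tabular Q-learning sketch using the learning rate α and discount factor γ referred to in (1) is shown below; the toy chain environment, its rewards, and all parameter values are illustrative only, not the paper's setup.

```python
import random

# Tiny illustrative chain environment: states 0..3, actions move left/right,
# reward 1 only on reaching the terminal state 3.
n_states, actions = 4, [-1, +1]
alpha, gamma, epsilon = 0.5, 0.9, 0.2
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

random.seed(0)
for _ in range(500):                      # training episodes
    s = 0
    while s != 3:
        # epsilon-greedy action selection
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == 3 else 0.0
        # Q-learning update: Q <- Q + alpha * (r + gamma * max Q' - Q)
        best_next = max(Q[(s2, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy prefers moving right, toward the rewarding state.
print(all(Q[(s, 1)] > Q[(s, -1)] for s in range(3)))
```

The same update structure, with a reward derived from battery levels, underlies the BLAQL routing proposed later.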
3 System Model
As already discussed with Fig. 1, the sensor nodes in the network collect unprocessed data from the environment, and aggregated data are available at the CHs [17]. All nodes in the hierarchical WSN are assumed to be stationary and homogeneous for this setup. Here, four sink nodes are considered, S1, S2, S3, and S4, and these are directly connected with 12 CHs, namely CH1, CH2, …, CH12, as shown in Fig. 3.
In order to send data toward the gateway, the sink nodes request the processed data from the CHs. For example, CH1 sends data toward the gateway via its immediate sink node S1, as shown in Fig. 3. However, this common scenario is not suitable: since each CH is connected to only one sink, a catastrophic result may occur when that sink fails, for any reason, to forward data packets toward the gateway. For example, if S2 fails, then CH5, CH6, and CH7 cannot send data packets to the destination gateway, as shown in Fig. 3. To overcome such a scenario, a hierarchical WSN structure with multiple available paths for routing data packets toward the gateway is considered in the proposed work, as shown in Fig. 4. Here, CH1 is connected with all four possible sink nodes of the network, and each of these four possible routes to the gateway via a sink node is one hop long. By following all such possible connections between the CHs and the sinks, a layer-wise mesh connection for the WSN is obtained, as shown in Fig. 5.
By selecting the desired path among the multiple alternatives shown in Fig. 5, the transmission traffic through any sink at a given moment is reduced in the hierarchical WSN, which in turn decreases the energy required for communication. Hence, the network shown in Fig. 5 can be mapped into the equivalent representation shown in Fig. 6, which the proposed work follows. Here, the CHs and the gateway are the sources and the destination, respectively. In addition, the sink nodes are the immediate neighbors of these source nodes, meaning that data packets are routed toward the destination via neighbors.
4 Proposed Approach
where D and N denote the destination node and the neighbor node, respectively. The
ΔQ_S^t(D, N) in (2) is the change factor after receiving the reward; (2) can be
further expressed as follows.

ΔQ_S^t(D, N) = α (p + q + T − Q_S^t(D, N))   (3)
Energy aware Multi-chain PEGASIS in WSN: A Q-Learning Approach 113
In (3), p denotes the time between receiving a message and forwarding it, and q
is the time the message spends traveling to the next node.
The proposed routing (BLAQL) technique depends on the neighbor's battery level as
well as the battery levels of all sink nodes, which may not be the same; moreover,
some sinks may fail to transmit packets at any time. Here, the Q-value of the source
node is updated by a reward based on the battery-level information (B_S) of the
neighboring nodes. So, if the source node S has m direct (one-hop) neighbors, then
the neighbor is selected as follows.

f_BL = Maximum of Battery_Level_i; i = 1, 2, 3, …, m   (4)
Whenever the network is reconstructed, the new route costs must be re-learned by
analyzing and updating the Q-values based on the neighbors' battery levels. Hence,
the proposed BLAQL technique teaches the agent (the source node) which chain or route
to prefer toward the destination gateway. This BLAQL-based multi-chain routing
procedure is described in Algorithm 1.
Algorithm 1: BLAQL-based multi-chain routing technique
Step 1: Sink nodes initiate the data packet transmission by sending a signal to all
CHs.
Step 2: After receiving the signal, CHs start to prepare the data packets for sending
and maintain a Q-value (Q_i) for the cost to reach the destination: Q_i = Q_S^t(D, N).
//* D → destination node, N → neighbor sink nodes, S → source node/CH,
and Q_S^t → the estimated time for transmission *//
Step 3: CHs send the packets to all possible neighbor sink nodes (m sink nodes)
of the network, with a battery-level (B_S) feedback section.
Step 4: All sink nodes send feedback to all CHs along with the information B_S.
Step 5: After receiving the feedback as a reward, each CH updates its Q_i using the
selection function f_BL and chooses the preferable neighbor for data transmission.
Step 6: Data packets are received by the destination gateway (D) following the route
selected from any source CH (S).
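The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the learning-rate value α = 0.5, and the sample battery levels are our assumptions.

```python
def update_q(q_value, p, q, t, alpha=0.5):
    # Eq. (3)-style update: alpha scales how far the old estimate moves
    # toward the observed delivery cost p + q + T
    # (p: wait before forwarding, q: travel time to the next node).
    return q_value + alpha * ((p + q + t) - q_value)

def select_neighbor(battery_levels):
    # Selection function f_BL of Eq. (4): pick the one-hop neighbor sink
    # reporting the maximum battery level.
    return max(battery_levels, key=battery_levels.get)

# One round of Algorithm 1 for a single CH (illustrative values):
battery = {"S1": 0.82, "S2": 0.00, "S3": 0.67, "S4": 0.91}  # S2 has failed
chosen = select_neighbor(battery)                    # "S4"
q_new = update_q(q_value=10.0, p=2.0, q=3.0, t=1.0)  # 8.0
```

A failed sink (here S2, reporting zero battery) is naturally avoided because its battery level can never be the maximum.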
5 Simulation Studies
To carry out the simulations for the proposed work, the simulation setup shown in
Table 1 is considered.
Figure 7 shows the residual energy of the proposed and existing works against the
number of rounds. It is observed that the energy loss is higher at the early stages
for the proposed BLAQL-based PEGASIS than for simple PEGASIS. This happens because
the source node must first learn which chain to choose to reach the destination;
for the initial rounds, the residual energy of BLAQL-based PEGASIS is therefore on
the lower side. However, the two curves converge to the same scale later.
114 R. Dey et al.
Figure 8 compares the normalized average energy used per round for the proposed
and existing works. Here, it is clearly seen that the energy used in the later
rounds is lower for BLAQL-based PEGASIS than for the existing work [9], as the
source nodes learn to take better actions.
In Fig. 9, the number of alive nodes is shown against the number of rounds for both
the proposed and existing works. In both cases, the number of alive nodes decreases
with the number of rounds; correspondingly, the number of dead nodes increases, as
the simulation is initiated with 100 nodes. Hence, the number of dead nodes can be
determined by comparing the alive nodes with the total number of initial nodes, and
an increasing number of dead nodes over the rounds counts as a shortcoming for any
routing technique.
The scenarios of alive and dead nodes change across every 10 simulations. Based on
these multiple simulations, the average rounds at which the first node dies (FND),
half of the nodes die (HND), and the last node dies (LND) [18] are determined for
both the proposed and existing works. In Fig. 10, it is observed that the average
FND, in terms of the number of rounds, is considerably higher for the proposed work
than for the existing one. However, the average HND and average LND are nearly equal
for the two procedures.
Perhaps, with increasing numbers of sink nodes alongside the sensor nodes, the
possibility of multiple chains increases, and all possible paths then come under
consideration for selection. From the above comparisons, it can readily be seen
that using the proposed BLAQL-based PEGASIS to teach the source node to choose a
better chain for delivering packets to the destination node provides a better
outcome than the existing one.
Fig. 10 Comparison on average FND, HND, and LND
6 Conclusion
References
13. K.-L. Alvin Yau, H.G. Goh, D. Chieng, K.H. Kwong, Application of Reinforcement Learning to
Wireless Sensor Networks: Models and Algorithms. Springer-Verlag Wien, © Springer (2014)
14. X. Wang, Q. Zhou, C. Qu, G. Chen, J. Xia, Location updating scheme of sink node based on
topology balance and reinforcement learning in WSN. IEEE Access 7 (2019)
15. A. Arya, A. Malik, R. Garg, Reinforcement learning based routing protocols in WSNs: a survey.
Int. J. Comput. Sci. Eng. Technol. (IJCSET) 4(11) (2013). ISSN 3345
16. M.A. Alsheikh, S. Lin, D. Niyato, H.-P. Tan, Machine learning in wireless sensor networks:
algorithms, strategies, and applications. IEEE Commun. Surv. Tutor. 16(4) (2014)
17. A. Diop, Y. Qi, Q. Wang, S. Hussain, An efficient and secure key management scheme for
hierarchical wireless sensor networks. Int. J. Comput. Commun. Eng. 1(4) (2012)
18. A. Mansura, M. Drieberg, A.A. Aziz, V. Bassoo, Multi-energy threshold-based routing protocol
for wireless sensor networks, in 2019 IEEE (ICSGRC 2019), Shah Alam, Malaysia (2019)
Textlytic: Automatic Project Report
Summarization Using NLP Techniques
Abstract Academic project reports can be very verbose and lengthy since they
include comprehensive descriptions, diagrams, tables, charts, graphs, and illus-
trations. Such reports tend to be too long and detailed for quick assessment or
perusal. The proposed system aims to generate a concise extractive summary of
technical project reports. As each section of the report contains important details and
contributes to a sequence, it must be summarized separately. To achieve this objec-
tive, the system accepts a multi-page document as input and performs section-wise
segregation before processing the contents. It summarizes each section, retaining
the topic structure of the original document in the resulting output. Additionally, the
proposed system implements figure and caption extraction for respective sections
and also generates a downloadable summary output file. The resultant summary was
evaluated using the BLEU metric on an open-source dataset. An average score of
38.996% was obtained.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 119
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_10
120 R. Menon et al.
1 Introduction
Across institutions and types of academic training, professors are found to spend
roughly half [1] of their working hours on non-teaching activities such as planning
and collaborating. Corrections and various forms of assessment are the major
activities among them and occupy a huge chunk of their valuable time. This added
load is often multifold in the scientific field.
Writing detailed reports is central to any scholarly pursuit, be it research, project
implementations, or any other academic work. Such reports are used to assess a
candidate for their academic year. These documents are typically lengthy and abound
with images, diagrams, graphs, and detailed descriptions of every stage and aspect
in the implementation of the project, making them tedious to assess.
In-depth coverage of a project is crucial for documentation and publication.
However, a succinct version usually suffices for the trained and experienced eye
of a mentor or professor who would be grading these. With piles of submissions to
assess and deadlines to meet, a summarization tool dedicated to and customized for
in-house project reports could prove to be useful.
In most of the similar existing systems, the structure and sequence of sections in
the report are not retained, and many of the tools do not allow PDF documents as
input, let alone multi-page documents. Moreover, these systems are usually not very
user-friendly. The proposed approach attempts to overcome these shortcomings by
implementing an interface that is easily navigable and user-friendly.
The paper presents the proposed solution as a web application that aims at summa-
rizing a project report by dividing the detailed project report into sections and then
preprocessing and summarizing each section to retain the important points under
respective headings.
The methodology divides the uploaded PDF of the report into sections using
the Python libraries PyPDF2 and Camelot. An extractive summarization
technique is then applied using word2vec followed by K-means clustering. Figures
and diagrams, if properly tabulated and indexed, are also extracted and retained in the
resulting summary. This avoids the accidental exclusion of any useful information
and, hence, allows a precise inference to be extracted.
The final output summary is generated by combining the summaries and respective
figures or diagrams of each of the sections, with a provision to download the
summary as a PDF for convenience. A brief survey of existing systems and algorithms
has been accounted for in the literature review (Sect. 2). Further, Sect. 3 elucidates
upon the workflow of the proposed model. Section 4 explains the various stages in
implementation and evaluation. Section 5 analyzes the final system and evaluation
results. Finally, Sect. 6 states the conclusion and discusses the possible improvements
in the future.
2 Literature Review
Several existing systems aim to summarize text effectively without missing out on
important details of the text and also doing justice to the gold standards. A few of
the prevalent and widely used algorithms are discussed below based on the survey.
2.2 TF-IDF
TF-IDF is used to determine the importance of sentences and to pick the top-ranked
ones. It gave good results [4] for feature extraction while scoring sentences.
Another implementation [5] ranked a sentence by the product of its TF-IDF score and
an index feature. Though TF-IDF performs better than the TextRank algorithm, scoring
the sentences does not always select the most important ones, and important details
may be lost when the sentences containing them receive a lower rank.
2.4 LSA
Latent semantic analysis uses SVD [7]. It determines the similarities between
sentences and weights each word accordingly; the most heavily weighted sentences
are then selected. Selecting sentences that introduce a new topic in the document
makes this method efficient. However, it assumes that a particular word has the
same meaning in every sentence; its context is not considered.
2.5 TextRank
Implementation of this method [9] resulted in better BLEU scores, and the
sentence-based model gave better results than graph-based models. However, this
model concentrated only on news articles and did not implement image or caption
extraction for the final summary.
3 Proposed Approach
The proposed approach focuses on splitting the PDF into sections and then
summarizing each resulting PDF so that every section is represented in the final
summary. The steps of the proposed approach, depicted in Fig. 1, can be understood
as follows.
Any document in PDF format containing an index of sections and their page numbers
is considered as data input for the proposed system. Besides this, the user is required
to enter the page number of the index table and the page number of the first section.
The proposed approach aims at section-wise summarization so as to retain the
structure of the document and to ensure that important details of every section
are included in the final generated summary.
The very first step in the proposed approach is to split the main PDF into sections
based on the topics mentioned in the index table using Camelot. Once the document is
divided into respective sections, cleaning the data is essential in the proposed system.
Before cleaning, PDF files are converted into text files using Python’s PyPDF2
library. Preprocessing of extracted data is further explained in the next section.
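In practice, Camelot reads the index table and PyPDF2 writes the per-section PDFs; the core page-range logic behind the split can be sketched as follows. The function name and the sample index are our assumptions for illustration.

```python
def section_page_ranges(index_entries, last_page):
    """Given (title, start_page) pairs taken from the index table, derive
    the inclusive page range covered by each section: a section ends one
    page before the next section starts, and the last section runs to
    the document's final page."""
    ranges = {}
    for i, (title, start) in enumerate(index_entries):
        end = index_entries[i + 1][1] - 1 if i + 1 < len(index_entries) else last_page
        ranges[title] = (start, end)
    return ranges

index = [("Introduction", 5), ("Literature Review", 9), ("Conclusion", 20)]
section_page_ranges(index, last_page=25)
# {'Introduction': (5, 8), 'Literature Review': (9, 19), 'Conclusion': (20, 25)}
```

Each resulting range can then be copied into its own PDF with PyPDF2's page-level reader/writer objects.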
Data cleaning forms the base of any summarizer to remove noise from the given text
while preparing the data. There are different preprocessing steps, and the ones used
are mentioned below:
Tokenization—“Tokenization [8] is essentially splitting a phrase, sentence, para-
graph, or an entire text document into smaller units, such as individual words or
terms. Each of these smaller units is called tokens.” This step helps to interpret the
words present in the text and increases the efficiency of the summarization process.
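The cleaning steps can be sketched with the standard library alone; the paper's actual pipeline uses NLTK, so treat this as a minimal stand-in, with a deliberately abbreviated stop-word list.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}  # abbreviated

def preprocess(text):
    # Tokenization, decapitalization, and stop-word removal, mirroring
    # the NLTK-based pipeline described in the text.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

preprocess("The system summarizes a report in sections.")
# ['system', 'summarizes', 'report', 'sections']
```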
After the preprocessing of text, word2vec and K-means algorithms are employed by
the proposed method. Word2vec generates vectors of words considering the syntactic
and semantic similarity of the words. These vectors are basically vectors of numbers
that represent the word. K-means clustering algorithm is used to group the word
vectors and form clusters containing similar word vectors. Along with these algo-
rithms, steps for image extraction and caption extraction are also carried out which
are elaborated upon in the upcoming sections.
Once every section has been summarized, the summaries of all the sections are
combined, and a final PDF is generated as output: the text files of the section
summaries are merged into a Word file, which is then converted into a PDF.
The final step in the proposed approach is to develop a web application with a user-
friendly interface. The functionalities included in the application are user registration
and authentication, a report upload form, a display of the final summary, and an option
to download the report summary.
4 Implementation
4.1 Preprocessing
Figure 2 depicts the document input for the proposed system before cleaning the text.
Tokenization, decapitalization, and stop word removal are the three preprocessing
techniques implemented, and the cleaned text is represented in Fig. 3. The natural
language toolkit (NLTK) libraries in Python were utilized. Preprocessing was also
done section-wise. Further, the resulting text was split along sentences, and new lines
were removed.
• Word2vec:
The word2vec model is a combination of the continuous bag-of-words (CBOW)
model and the Skip-gram model. Both models are neural networks [10, 11]. The
CBOW model takes the context of each word as input and predicts the target
word. The Skip-gram model predicts the context of a word by taking the target
word as input. Gensim’s word2vec model is used to generate vector representations
of all the words belonging to a particular section’s text [12]. The vectors are formed
in such a way that the words having syntactic and semantic similarities will be
close together in the vector space. Each sentence is associated with the average
of the word vectors present in that sentence.
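The sentence-level averaging described above can be sketched as follows; the toy two-word dictionary stands in for a trained Gensim word2vec model's vector lookup, and the names are ours.

```python
def sentence_vector(tokens, word_vectors, dim):
    # Average the vectors of the words in a sentence. Out-of-vocabulary
    # words are skipped; a sentence with no known words maps to zero.
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

toy_model = {"energy": [1.0, 0.0, 0.0], "sensor": [0.0, 1.0, 1.0]}
sentence_vector(["energy", "sensor", "xyz"], toy_model, dim=3)
# [0.5, 0.5, 0.5]
```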
• K-means clustering:
K-means clustering is an unsupervised learning algorithm that groups unla-
beled datasets into predefined clusters. The sentence vectors computed with the
help of the word2vec model act as input to the K-means clustering algorithm imple-
mented using Python’s Scikit learn machine learning library. The output is a list
of centroids computed for the predefined number of clusters in the vector space.
The Euclidean distance between the cluster’s centroid and sentences belonging to
that cluster is calculated using Python’s SciPy library [13]. The sentence closest
to each centroid is included in the final summary.
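The final selection step, picking the sentence nearest each centroid, can be sketched in plain Python (in the paper this uses Scikit-learn's KMeans output and SciPy distances; function names and vectors here are illustrative assumptions).

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pick_summary_sentences(sentence_vectors, centroids):
    # For each cluster centroid, keep the index of the nearest sentence
    # vector; the corresponding sentences form the extractive summary,
    # returned in original document order.
    chosen = set()
    for c in centroids:
        nearest = min(range(len(sentence_vectors)),
                      key=lambda i: euclidean(sentence_vectors[i], c))
        chosen.add(nearest)
    return sorted(chosen)

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
pick_summary_sentences(vectors, centroids=[[0.4, 0.4], [5.0, 5.0]])
# [0, 2]
```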
• Image and Caption Extraction:
Images and captions are an equally important part of the project report. For
extracting images, the PyMuPDF library in Python is used. For extracting the
captions, two methods are implemented—one using regular expressions (re library
in Python) and the other using an index table of figures that are a part of the project
report.
The first method, using regular expressions, was found to be too rigid and made
mapping a bit difficult as even the mentions of the figure were extracted, though it
extracted all captions perfectly. The index table for figures is typically a part of the
standard format of a project report. Using Camelot, just like the main index, this
table was also extracted, and the captions were saved into a list before being mapped
to the respective images. This method was found to be cleaner and more effective.
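A minimal sketch of the regex-based method, assuming captions of the form "Fig. N: …" (the pattern itself is our assumption). As the text notes, in-sentence mentions of a figure can also match, which is exactly the over-extraction that the index-table method avoids.

```python
import re

CAPTION_RE = re.compile(r"Fig(?:ure)?\.?\s*(\d+)[.:]?\s*(.+)")

def extract_captions(lines):
    # Map figure numbers to caption text; later matches for the same
    # number overwrite earlier ones.
    captions = {}
    for line in lines:
        m = CAPTION_RE.search(line)
        if m:
            captions[int(m.group(1))] = m.group(2).strip()
    return captions

extract_captions(["Fig. 1: Workflow of the system", "plain body text"])
# {1: 'Workflow of the system'}
```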
The in-built document function from the docx library in Python is used to combine the
text files generated as summaries of all the sections. This Word file is then converted
into the final PDF using the convert method from the docx2pdf library.
The proposed system uses an unsupervised approach owing to the need for speedy
development and a dearth of relevant data: not enough handwritten reference
summaries were available for the technical project reports the system is intended
to process. Currently, the most prevalent evaluation metrics for text summarization
models are ROUGE and BLEU [2–10, 14–16]. Both, however, rely on the degree of
overlap between the machine-generated summary and the reference (gold standard)
summary.
Scientific research papers and their reference summaries from an open-source
corpus by the WING NUS group [14] were used. The evaluation metric used to analyze
the performance of the proposed summarization tool is the bilingual evaluation
understudy (BLEU). Unigrams have been considered to measure the overlap. The BLEU
score lies between 0 and 1 for any given machine-generated summary and its
reference(s). Table 1 provides a helpful guide for interpreting these scores [15].
The previously mentioned corpus had research papers in XML format. Input text
was extracted from this format and written into .txt files. These were then given
as input against the reference summaries available in the corpus. BLEU score was
calculated using the nltk.translate package which is native to Python.
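The unigram-overlap score used here can be sketched with the standard library; this is a stand-in for the nltk.translate call restricted to 1-grams (the brevity-penalty form follows the usual BLEU definition), not the exact evaluation code.

```python
import math
from collections import Counter

def unigram_bleu(candidate, reference):
    # Clipped unigram precision times the brevity penalty: each candidate
    # word counts only up to its frequency in the reference, and short
    # candidates are penalized exponentially.
    cand, ref = candidate.split(), reference.split()
    clipped = sum((Counter(cand) & Counter(ref)).values())
    precision = clipped / len(cand)
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * precision

unigram_bleu("the summary covers key points", "the summary covers key points")
# 1.0
```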
The aforementioned functionalities and output are made accessible through a
user-friendly interface, whose homepage is shown in Fig. 4. Upon opening Textlytic,
the user first sees this homepage. Besides navigation to registration and login, a
brief explanation of how Textlytic works and what makes it different is available
for reading.
different is available for reading. The description has been written keeping in mind
that the system will majorly be used by faculty and examiners.
The system can be scaled to store a repository of an institute’s project reports. In
such cases, authentication is essential before the system is used.
Having successfully logged into the system, the user needs to provide a few
additional details before the summarization can begin. These are to be entered as
shown in Fig. 5. The page numbers where the index table starts and ends are required
so that the document can be split into corresponding sections correctly. A sizable
number of reports contain a tabulated list of figures at the beginning of the docu-
ment. This list is utilized for image and diagram extraction later. It also helps capture
respective captions since the sequence and number are now known. Most reports
have several common pages at the beginning like cover page, certificate(s), acknowl-
edgment, and the like. These are not to be considered for generating the summary.
Therefore, the user is asked for the page number that marks the beginning of the
actual contents/first section. After these inputs, the user can upload the PDF docu-
ment of the report to be summarized within a few clicks. Textlytic begins processing
the document after the “summarize” button is clicked in Fig. 5.
After the various steps of splitting, cleaning, preprocessing, processing, and
assembly have been applied to the input data, the final summary is available to
view section-wise as shown in Fig. 6. Each section, as mentioned earlier, has under-
gone processing independently. Each resulting summary is displayed beneath the
corresponding heading, as in the original document. Retaining the original structure
of the document helps with consistent outputs. Images, figures, or graphs present in
any section are also captured and make it to the relevant portion of the summary.
Images in sections like “implementation” and “results” are prioritized.
The summary visible in Fig. 6 is available to peruse only as long as the user is
logged in and is viewing that particular page. However, if the user wishes to save the
summary for later or keep a record, they can also download the generated summary
as a PDF on their local machine. The PDF would also contain the relevant figures
and diagrams. The format of the resulting document is as shown below in Fig. 7.
On an AMD Ryzen 5 4500U (with Radeon Graphics, 2.38 GHz) CPU with 8 GB RAM, the
system took 15.50 s to generate a summary for a 93-page input document. Finally,
the proposed system was evaluated using state-of-the-art evaluation metrics. The
testing was carried out for five research papers with one reference
(gold standard) summary available for each. Figure 8 puts forth the resulting BLEU
scores for the respective documents in the scisumm-corpus by the WING NUS group.
These scores are to be interpreted keeping in mind that even human translators do
not achieve a perfect score of 1.0 or 100% [15]. Every score falls within a range
whose significance is listed in Table 1.
Detailed and comprehensive project reports are frequently drafted in all sorts of
academic work. They help the reader understand, step by step, the author's work
and accomplishments. However, such a detailed account is often a hassle to go through
during assessment. The professor or mentor assessing the document is usually looking
only for the salient points and figures/charts if any. Instead, most have to go through
long documents and scan for the main points. The proposed system strives to elim-
inate this inconvenience by providing a crisp section-wise summary including the
important figures/charts. An extractive approach proves to be efficient for the same.
As interpreted in Table 1, the evaluation carried out suggests that the proposed system
is capable of generating relevant summaries.
To further improve the existing system, more efforts can be directed toward
reducing the number of constraints and prerequisites required for the input document.
Also, the efficiency of the algorithm could be improved by reducing or limiting the
number of temporary or intermediate files created during processing. A dataset of
technical book reports and their respective human summaries could also be
constructed to further train and refine the system.
References
1. OECD, How much time do teachers spend on teaching and non-teaching activities?, in Educa-
tion Indicators in Focus 29 (OECD Publishing, 2015). https://ideas.repec.org/p/oec/eduaaf/29-
en.html
2. A.K. Mohammad Masum, S. Abujar, M.A. Islam Talukder, A.K.M.S. Azad Rabby, S.A.
Hossain, Abstractive method of text summarization with sequence to sequence RNNs, in 2019
10th International Conference on Computing, Communication and Networking Technologies
(ICCCNT), Kanpur, India, pp. 1–5 (2019). https://doi.org/10.1109/ICCCNT45670.2019.894
4620
3. S. Modi, R. Oza, Review on abstractive text summarization techniques (ATST) for single and
multi documents, in 2018 International Conference on Computing, Power and Communication
Technologies (GUCON), Greater Noida, India, pp. 1173–1176 (2018). https://doi.org/10.1109/
GUCON.2018.8674894
4. S.S. Naik, M.N. Gaonkar, Extractive text summarization by feature-based sentence extraction
using rule-based concept, in 2017 2nd IEEE International Conference on Recent Trends in
Electronics, Information & Communication Technology (RTEICT), Bangalore, pp. 1364–1368
(2017)
5. T. Zhang, C. Chen, Research on automatic text summarization method based on
TF-IDF, in IISA 2019, AISC 1084, ed. by F. Xhafa et al. (Springer Nature
Switzerland AG, 2020), pp. 206–212
6. S. Erera, M. Shmueli-Scheuer, N. Guy, B. Ora, H. Roitman, D. Cohen, B. Weiner, Y. Mass,
O. Rivlin, G. Lev, A. Jerbi, J. Herzig, Y. Hou, C. Jochim, M. Gleize, F. Bonin,
D. Konopnicki, A Summarization System for Scientific Documents (2019)
7. D. Shen, Text summarization, in Encyclopedia of Database Systems, ed. by L. Liu, M.T. Özsu
(Springer, New York, NY, 2018). https://doi.org/10.1007/978-1-4614-8265-9_424
8. https://www.webstep.se/an-introduction-for-natural-language-processing-nlp-for-beginners
9. M. Haider, M. Hossin, H. Mahi, H. Arif, Automatic Text Summarization Using Gensim
Word2Vec and K-Means Clustering Algorithm, pp. 283–286 (2020). https://doi.org/10.1109/
TENSYMP50017.2020.9230670
10. T. Hailu, J.-Q. Yu, T. Fantaye, A framework for word embedding based automatic text summa-
rization and evaluation. Information (Switzerland) 11, 78 (2020). https://doi.org/10.3390/inf
o11020078
11. https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
12. https://radimrehurek.com/gensim/auto_examples/tutorials/run_word2vec.html
13. https://towardsdatascience.com/understanding-k-means-clustering-in-machine-learning-6a6
e67336aa1
14. https://github.com/WING-NUS/scisumm-corpus/tree/master/data/Training-Set-2019/Task2/
From-ScisummNet-2019
15. https://cloud.google.com/translate/automl/docs/evaluate
16. J.N. Madhuri, R. Ganesh Kumar, Extractive text summarization using sentence ranking, in
2019 International Conference on Data Science and Communication (IconDSC), Bangalore,
India, pp. 1–3
Management of Digital Evidence
for Cybercrime Investigation—A Review
1 Introduction
Laws play an important part in the judicial system of any country; they help in
maintaining peace and harmony. A crime, in simple terms, is an unlawful act
punishable by a state or other relevant authority. With advancements in technology
and various other problems, the diversity and frequency of crimes keep increasing
[1, 2]. This increases the workload of the authorities from the bottom to the top
level. Evidence plays an important role in proving the facts or convicting the
person involved in the crime.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 133
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_11
134 C. Harshwardhan et al.
With the increase in workload, it becomes difficult to manage and handle evidence,
and the chances of evidence mishandling and tampering grow, which would lead to
false decisions and punishments. Hence, it becomes crucial to ensure the integrity
and authenticity of the evidence. As shown in Fig. 1, according to NCRB India, the
number of cybercrime cases registered is much higher than the number of arrests
made. This discrepancy is due to the lack of a proper evidence management system.
1.1 Evidence
Evidence can be defined as anything that one sees, experiences, or reads, or any
fact that proves something is true or has really happened. Management of evidence becomes
critical in respect to the outcomes of criminal proceedings. If any aspects of evidence
management fail in protecting the evidence required for a prosecution, then it can
compromise the outcome of the judicial proceedings.
Types of Evidence
I. Scientific Evidence: This evidence either supports or rejects a scientific
theory, law, or hypothesis.
II. Digital Evidence: The electronic devices or gadgets related to the crime form
the electronic evidence. The data extracted in digital form from these sources act
as digital evidence, which can be used in trials.
The phases involved in crime scene investigation are preserving the scene;
surveying and searching for evidence; documenting the evidence and scene; and
reconstructing the scene. Documentation of evidence is the most important step,
crucial for maintaining a chain of custody. The actors involved in the management
of digital evidence are the victims and suspects, first responders, forensic
investigators, police officers, court experts, and the judicial authority.
Evidence must be managed and administered over its entire lifetime, which is
divided into various phases, from collection to disposal, as described in Fig. 2.
• Acquisition, which involves collection and capture in one place.
• Description, which involves combining, describing, and arranging the evidence.
• Analysis, which involves meaningful interpretation, scientific testing, and
investigation.
• Assessment, which involves evaluation and judgment of the outcomes, in this case
the facts.
• Presentation or disclosure in court trials and proceedings.
• Disposal, which involves the return, destruction, sale, or donation of the
evidence.
2 Literature Review
The authors in this paper [4] propose a blockchain-based system that brings
transparency to the chain of custody and ensures the integrity of evidence while
it is transferred from one participant to another within a blockchain. Here, the
evidence is encoded using the Base64 algorithm, transferred to the recipient, and
decoded to retrieve the original evidence. Base64 fits this purpose because it can
encode various files such as audio, images, and video into string format, which
can be transported over the network without any data loss. The system uses
chaincode to facilitate interaction between the application and the blockchain
ledger and to validate transactions.
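The lossless string transport that makes Base64 suitable here can be sketched with Python's standard library; the function names and sample payload are ours (note that Base64 is an encoding, not encryption, so it provides transport safety, not confidentiality).

```python
import base64

def encode_evidence(raw):
    # Turn arbitrary evidence bytes (audio, image, video) into an ASCII
    # string that can be sent over the network without data loss.
    return base64.b64encode(raw).decode("ascii")

def decode_evidence(encoded):
    # Recipient side: recover the original bytes exactly.
    return base64.b64decode(encoded)

payload = b"\x89PNG\r\n...raw image bytes"
assert decode_evidence(encode_evidence(payload)) == payload  # round trip is lossless
```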
The authors in this paper [5] propose a model based on blockchain technology that
helps secure evidence from external agents. A chain of blocks is maintained, where
each block stores the cryptographic hash of the previous block. The chain-of-custody
process implemented with this model ensures the data integrity, authenticity, and
security of the evidence, making the process tamperproof. The system is implemented
as components that interact with each other: the core modules execute the main
functions that change the state of the blockchain, and the participants, under a
consensus agreement, are connected to each other through a blockchain peer-to-peer
network. The blockchain-implemented chain of custody guarantees security,
integrity, and authenticity to authorized users.
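The hash-chaining idea at the core of this model can be sketched in a few lines; this is a toy illustration with our own names and record shapes, not the cited system, which runs on a real blockchain with consensus among peers.

```python
import hashlib
import json

def _block_hash(prev_hash, record):
    payload = json.dumps({"prev_hash": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def add_block(chain, record):
    # Each new block stores the hash of the previous block, so altering
    # any earlier record breaks every later link.
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"prev_hash": prev, "record": record,
                  "hash": _block_hash(prev, record)})
    return chain

def verify(chain):
    # Recompute every hash from the genesis value onward; any mismatch
    # means the custody record was tampered with.
    prev = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev or block["hash"] != _block_hash(prev, block["record"]):
            return False
        prev = block["hash"]
    return True

chain = []
add_block(chain, {"evidence_id": "E-01", "custodian": "first responder"})
add_block(chain, {"evidence_id": "E-01", "custodian": "forensic investigator"})
assert verify(chain)
chain[0]["record"]["custodian"] = "tampered"  # any edit invalidates the chain
assert not verify(chain)
```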
The authors in this paper [6] present a valid timestamping algorithm for the digital signature of evidence, to bring transparency at each stage of the investigation process. The timestamp, obtained from a secured third party, helps to identify each individual accessing the evidence. A hashing function generates a unique numeric value, called a hash value, based on its input. This hash value is generated for each piece of evidence and sent to a timestamp authority, which signs it with a private key and returns it to the client. The received timestamped value is verified with a public key and stored locally. Owing to the complexity of implementing timestamping in-house, this system depends on a third party to generate the timestamps.
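The request/sign/verify round trip can be sketched as follows. A real timestamp authority signs with an asymmetric private key; to keep this illustration dependency-free, an HMAC over the hash and timestamp stands in for the sign/verify pair, and the secret, field layout, and timestamp value are all made up:

```python
import hashlib
import hmac

TSA_SECRET = b"tsa-demo-secret"  # stand-in for the TSA's signing key

def tsa_sign(evidence: bytes):
    """Client hashes the evidence; the TSA binds a timestamp to that hash."""
    h = hashlib.sha256(evidence).hexdigest()
    ts = "2021-01-01T00:00:00Z"  # fixed stamp so the example is deterministic
    token = hmac.new(TSA_SECRET, f"{h}|{ts}".encode(), hashlib.sha256).hexdigest()
    return ts, token              # stored locally alongside the evidence

def tsa_verify(evidence: bytes, ts: str, token: str) -> bool:
    """Anyone holding the verification key can recheck the evidence later."""
    h = hashlib.sha256(evidence).hexdigest()
    expected = hmac.new(TSA_SECRET, f"{h}|{ts}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)

ts, token = tsa_sign(b"disk image bytes")
print(tsa_verify(b"disk image bytes", ts, token))  # True
print(tsa_verify(b"tampered bytes", ts, token))    # False
```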
The authors in this paper [7] describe the different phases of chain of custody followed in different countries. This is implemented by incorporating various layers in the technical domain, which brings accountability, legality, and authenticity. The system has two parts: first, procedural safeguards, which ensure transparency and privacy; second, data protection safeguards, which bring accountability measures into the system. The whole system safeguards privacy by using encrypted data at all stages, right up to the court of justice, where the encryption key is provided.
The authors in this paper [8] propose the following idea. Modern technology is advancing in terms of portability and power, so preserving evidence online is somewhat challenging. This study uses blockchain technology, which provides the integrity and security needed to collect, store, and analyze digital evidence, with a proof of concept in Hyperledger Composer. Digital evidence is considered admissible if it satisfies the properties of being authentic, complete, reliable, and believable. As digital evidence, it should also satisfy technical aspects of admissibility such as transparency and explainability. The objective is to develop a private blockchain, with roles such as creator, forensic investigator, prosecutor, defense, and court, using Hyperledger to store the history of handling digital evidence.
The authors in this paper [9] propose the following idea. As the existing systems impose a weak security model, this study supervises the entire evidence flow, from the collection of evidence by police investigators to court trials and juror votes. In this process, jurors can vote securely based on the evidence, and their votes are recorded for the decision. The main objective of this study is to develop a secure, integrated evidence management process from police investigation to court hearings.
The authors in this paper [10] propose a study focused on developing a blockchain-based system, as a proof of concept, to assure legal agreements. The application provides reception, storage, and maintenance of the collected digital evidence. In this study, users must not take actions that would modify the evidence contents; if a modification is made, its implications must be explained to the expert authorities.
The authors in this paper [5] propose the following idea. Most existing evidence management systems are centralized and not tamper-proof, so the evidence is prone to tampering before the actual law proceedings. As a result, the authors propose a solution that uses blockchain technology, built over a private Ethereum network, while maintaining a chain of custody (CoC), viz. a log file that stores the chronological sequence of evidence collection. The main objectives of the authors are to maintain integrity and evidence admissibility in the court of law and to allow only certain people to access the evidence. They use the Raft consensus algorithm to achieve these objectives; however, Raft is slower compared to other algorithms (e.g., IBFT).
The authors in this paper [4] aim to maintain the integrity and trustworthiness of digital evidence by using blockchain technology and the chain of custody. In the system, evidence is stored in blocks, and each block stores the cryptographic hash value of the previous block. The authors focus on the integrity, traceability, authenticity, and verifiability of the evidence.
The authors in this paper [12] focus on the trustworthiness and transparency of the evidence management process. Their proposed system implements three functionalities: the Digital Evidence Inventory (DEI), which is based on blockchain technology, is used to collect the evidence, is immutable, and is accessible to everyone; the Forensics Confidence Rating, which gives each piece of evidence a rating (score) reflecting its trustworthiness; and the Global Digital Timeline, which provides the time ordering of the evidence. Their framework thus gives the court and investigators better transparency and confidence in the evidence.
The authors in this paper [13] combine the chain of custody with blockchain technology. Their framework comprises a Blockchain-based Digital Evidence Cabinet (B-DEC), and the system is built over Ethereum.
The authors in this paper [11] propose the following idea. Generally, an investigation is initiated with the collection of evidence; investigating officials analyze the evidence to determine why and how the crime occurred. The evidence is uploaded to the blockchain to make it tamper-proof, since evidence must remain admissible in the court of law [14, 15]. The authors propose a friendly approach to access, so anybody can view the details, and the system satisfies all the requirements of the chain of custody. It provides integrity and authentication by issuing an identity for every user who logs in to use the database, and each block undergoes mining to make sure it is secure, transparent, and tamper-proof.
The authors in this paper [16] propose a blockchain-based system to bring transparency to the CoC. Each block contains a cryptographic hash of the previous block and a timestamp. Here, the authors consider implementing one of the three types of blockchain, such as a public blockchain, in which every transaction is verified by the participants in the network.
3 Analysis of Review
From the analysis and review of the existing schemes summarized in Table 1, it is observed that arrests in cybercrime cases are still on the rise. To overcome these issues, we are currently working on the development of a blockchain-based evidence management system that will manage evidence in such a way that its integrity is preserved.
Table 1 (continued)

• Petroni et al. [10] (platform: blockchain, chain of custody): develops a proof-of-concept blockchain-based CoC for digital evidence to assure legal agreements. Advantage: reduces malicious modifications of evidence. Shortcoming: a centralized approach in which only an expert authority has access to modify evidence.
• Raorane et al. [5] (platform: blockchain): develops a blockchain-based CoC, stored with timestamps in a database such as MongoDB. Advantage: the private blockchain enables even distribution of access to authorized personnel, thus guaranteeing security. Shortcoming: the current system is not capable of storing large amounts of evidence.
• Chopade et al. [4] (platform: blockchain): develops a CoC based on blockchain, where the evidence is encoded using the Base64 algorithm. Advantage: security from data breaches and attacks during transfer within the network, owing to the encoded evidence. Shortcoming: the Base64 output is large, which is not feasible.
• Ćosić et al. [6] (platform: timestamp authority): develops an improved system that uses a timestamp generated by a third party for the digital signature of evidence. Advantage: maintains a detailed and transparent chain of custody, securing the evidence cryptographically. Shortcoming: the system is not equipped to handle evidence safely and needs further research to add more authenticity.
• Gopalan et al. [11] (consensus: PoW): develops a CoC in a tenable way that is user-friendly to access and assures tamper-proof evidence. Advantage: the evidence is stored as a distributed ledger so that it remains admissible in the court of law. Shortcoming: as data volume rises, the flexibility and capability of the system are reduced.
• Khateeb et al. [16] (platform: blockchain, CoC): develops a system that supports DFIR by implementing a CoC. Advantage: better CoC documentation improves data availability and legibility with proof of validity, so data is not altered. Shortcoming: high complexity.
• Jeong et al. [17] (platform: Hyperledger Fabric): develops a blockchain-based model using Hyperledger Fabric. Advantage: assures the integrity of forensic data, as it is shared with all peers in the network. Shortcoming: complex to implement.

5 Conclusion

In this paper, we have analyzed and compared various systems and their algorithms. We were able to identify the challenges present in the current systems and architectures. These challenges form the base for further development and research of a system that would guarantee the security, integrity, and authenticity of the evidence at all stages of the investigation process, bringing transparency to court proceedings and trials.
References
1. R.Y. Patil, S.R. Devane, Network forensic investigation protocol to identify true origin of cyber
crime. J. King Saud Univ. Comput. Inf. Sci. (2019)
2. P.R. Yogesh, S.R. Devane, Primordial fingerprinting techniques from the perspective of digital
forensic requirements, in 2018 9th International Conference on Computing, Communication
and Networking Technologies (ICCCNT) (IEEE, 2018), pp. 1–6
3. R.Y. Patil, S.R. Devane, Unmasking of source identity, a step beyond in cyber forensic, in
Proceedings of the 10th International Conference on Security of Information and Networks
(2017), pp. 157–164
4. L. Ahmad, S. Khanji, F. Iqbal, F. Kamoun, Blockchain-based chain of custody: towards real-
time tamper-proof evidence management, in Proceedings of the 15th International Conference
on Availability, Reliability and Security (2020), pp. 1–8
5. S. Rao, S. Fernandes, S. Raorane, S. Syed, A novel approach for digital evidence management
using Blockchain (2020). Available at SSRN 3683280
6. J. Ćosić, M. Bača, (Im)proving chain of custody and digital evidence integrity with time stamp,
in The 33rd International Convention MIPRO (IEEE, 2010), pp. 1226–1230
7. J. Rajamäki, J. Knuuttila, Law enforcement authorities’ legal digital evidence gathering:
legal, integrity and chain-of-custody requirement, in 2013 European Intelligence and Security
Informatics Conference (IEEE, 2013), pp. 198–203
8. A.H. Lone, R.N. Mir, Forensic-chain: Blockchain based digital forensics chain of custody with
PoC in hyperledger composer. Digit. Investig. 28, 44–55 (2019)
9. M. Li, C. Lal, M. Conti, D. Hu, LEChain: A blockchain-based lawful evidence management
scheme for digital forensics. Futur. Gener. Comput. Syst. 115, 406–420 (2021)
10. B.C.A. Petroni, R.F. Gonçalves, P.S. de Arruda Ignácio, J.Z. Reis, G.J.D.U. Martins, Smart
contracts applied to a functional architecture for storage and maintenance of digital chain of
custody using blockchain. Forensic Sci. Int. Digit. Invest. 34, 300985 (2020)
11. S.H. Gopalan, S.A. Suba, C. Ashmithashree, A. Gayathri, V.J. Andrews, Digital forensics using
Blockchain. Int. J. Recent Technol. Eng. (IJRTE) 8(2S11) (2019). ISSN: 2277-3878
12. D. Billard, Weighted forensics evidence using blockchain, in Proceedings of the 2018
International Conference on Computing and Data Engineering (2018), pp. 57–61
13. E. Yunianto, Y. Prayudi, B. Sugiantoro, B-DEC: digital evidence cabinet based on Blockchain
for evidence management. Int. J. Comput. Appl. 975, 8887 (2019)
14. R.Y. Patil, S.R. Devane, Hash tree-based device fingerprinting technique for network forensic
investigation, in Advances in Electrical and Computer Technologies (Springer, Singapore,
2020), pp. 201–209
15. P.R. Yogesh, Formal verification of secure evidence collection protocol using BAN logic and
AVISPA. Proc. Comput. Sci. 167, 1334–1344 (2020)
16. H. Al-Khateeb, G. Epiphaniou, H. Daly, Blockchain for modern digital forensics: the chain-
of-custody as a distributed ledger, in Blockchain and Clinical Trial (Springer, Cham, 2019),
pp. 149–168
17. J. Jeong, D. Kim, B. Lee, Y. Son, Design and implementation of a digital evidence management
model based on hyperledger fabric. J. Inf. Process. Syst. 16(4) (2020)
18. P.R. Yogesh, Backtracking tool root-tracker to identify true source of cyber crime. Proc.
Comput. Sci. 171, 1120–1128 (2020)
Real-Time Human Pose Detection
and Recognition Using MediaPipe
1 Introduction
In our paper, we perform real-time human action detection through live video analysis using MediaPipe. MediaPipe is an open-source framework built for constructing sophisticated pipelines that take advantage of GPU or CPU acceleration. It provides accurate, fast, and tailored machine learning applications for students, developers, and researchers. Its key use is quick prototyping of perception pipelines with inference models and other reusable components [1]. It provides custom Python solutions [1] that can be used easily after installing the pandas, NumPy, and scikit-learn libraries.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_12
146 A. K. Singh et al.
MediaPipe essentially consists of three parts: (a) a framework for inference from sensory data, (b) tools for performance evaluation, and (c) a collection of reusable inference and processing components. This paper proposes to extract actions from our video data across different poses through MediaPipe Holistic. MediaPipe Holistic consists of built-in models that relate all of the landmark components, classifying the connections between different body parts to predict human body pose and emotions (refer to Fig. 2).
To classify actions based on the data, we use several machine learning classifiers in our paper. The random forest classifier is widely used in image classification, human pose detection, action recognition, etc. Ridge regression is another algorithm that, as a ridge classifier, can distinguish the different actions in a video sequence. Linear regression is a class of supervised machine learning (ML) algorithms that carries out regression tasks; the goal is to forecast a dependent variable value (y) from a given independent variable (x). The gradient boosting classifier is an incremental combination of base models whose errors are corrected in consecutive iterations by adding regression trees that fit the residuals (the previous stage's errors) to optimize an arbitrary loss function [1, 2].
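The four-classifier comparison just described can be sketched with scikit-learn; the synthetic dataset, the specific model choices (a ridge classifier and logistic regression standing in for the "ridge regression" and "linear regression" classifiers), and the hyperparameters here are illustrative, not the paper's exact configuration:

```python
# Compare several classifiers on the same scaled features and keep the best.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data; the paper uses landmark coordinates from its own CSV.
X, y = make_classification(n_samples=300, n_features=12, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # 70:30 split as in the paper

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "ridge": RidgeClassifier(),
    "random_forest": RandomForestClassifier(random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}
scores = {}
for name, clf in models.items():
    pipe = make_pipeline(StandardScaler(), clf)  # scale, then classify
    pipe.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, pipe.predict(X_test))

best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Each model sits behind the same `StandardScaler`, so the accuracy comparison reflects the classifiers rather than differing feature scales.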
OpenCV 4.0 permits sequences of OpenCV image processing functions to be specified in graph form; MediaPipe additionally provides native support for streaming data, which is far more suitable for audio and video analysis [1].
The major contributions are:
1. An efficient and robust four-stage human pose tracking pipeline that can detect
various human actions and emotions in real time.
2. A face, pose, and hand prediction model that is adept at estimating a 3-D
landmark model with only RGB as the input.
3. Open-source hand, pose, and face tracking pipeline framework, which is ready
to use, besides being able to form customizable machine learning models in
JavaScript as well as in Python.
2 Literature Review
pyramids. These characteristic attributes can encode details about the body, strength,
and contours and thus offer an insightful depiction of human positions.
3 Proposed Methodology
In this paper, we decode human body pose and implement MediaPipe Holistic, a
solution provided by the MediaPipe ML framework, made up of up to eight different
models that coordinate with each other in real time while minimizing memory transfer
to provide pose estimation, face detection, and hand tracking into a single efficient
end-to-end pipeline [7, 8].
As discussed in the workflow diagram (Fig. 1), we create a live dataset consisting
of 501 landmarks made up of pose, face, and hand landmark coordinates and then
perform train–test split in 70:30 ratio to obtain random train and test subsets, respec-
tively. We train a custom machine learning pipeline comprising four separate
machine learning classification models. Finally, using our best performing ML clas-
sification model (Table 1), we render the predictions onto the real-time device feed
using OpenCV (as shown in Fig. 5) [9, 10].
From the pose landmarks, the pipeline obtains three ROI crops: one each for the left and right hands and one for the face. It also employs a re-crop and tracking model, which lets the pipeline crop full-resolution input frames to obtain a better region of interest (ROI); it presumes that the detected person does not move significantly from frame to frame, using the result from the preceding frame to estimate the body ROI in the current one.
Furthermore, the low resolution of the frames captured by the human body pose model (256 × 256) means that the resulting regions of interest for the hands and face are too imprecise to guide the subsequent models. The re-cropping of those regions [7] acts as a spatial transformer that corrects this and is designed to remain lightweight, costing only around 10% of the body model's inference time.
3.2 Structure
For the purpose of identifying a human body, we use BlazePose’s pose detector.
Using this model, we are able to identify 33 3D pose landmarks of a body (as shown in Fig. 3) from a single RGB video frame, which is more than the current standard COCO topology offers. This method achieves real-time performance on mobile devices and in Python. It utilizes a two-step ML pipeline, wherein the
pipeline first detects a person’s region of interest (ROI) within the frame and performs
re-cropping on the frame to predict pose landmarks. This pipeline is implemented as
a subgraph of the MediaPipe graph.
MediaPipe Hands is a dedicated solution for hand and finger tracking. It can deduce up to 21 3D landmarks of a hand from only one picture (as shown in Fig. 4), with the ability to scale to multiple hands. It combines a palm detection model, which works on the whole picture and returns an oriented hand bounding box, with a hand landmark model, which operates on the cropped image region defined by the palm detector and returns 3D hand key points with high reliability. This pipeline is implemented as a subgraph of the MediaPipe graph and renders using a pose renderer subgraph.
There is also the freedom to build with CPU or GPU.
The MediaPipe Face Mesh calculates face geometry and estimates 468 three-dimensional [1] facial landmarks. It uses machine learning to deduce a three-dimensional surface configuration that requires only a single camera feed and no separate depth sensor [1]. With potential hardware acceleration, it can monitor landmarks on individual faces using a lightweight model across the processing pipelines. Furthermore, a three-dimensional coordinate space is established, and the on-screen positions of the facial landmarks are used to measure the facial morphology throughout the facial region. To promote a durable, effective, and mobile solution, Procrustes analysis is used. The analysis runs on the central processing unit, and inference of the machine learning model has a minimal speed/memory footprint.
We propose to extract action from our video data from different poses through Medi-
aPipe Holistic. MediaPipe Holistic incorporates separate models, namely pose, face,
and hand (as shown in Fig. 1). All these separate models provide landmarks which
when merged with their respective models yield 501 landmarks. These landmarks
consist of four coordinates (x, y, z, and visibility). Now we train a custom machine
learning model which shows the relationship between all of the landmark compo-
nents to classify the connection between different body parts to predict human body
pose and emotions.
4.1 Dataset
The data are presented in the form of landmarks which consist of four coordinates (x,
y, z, and visibility) that are exported in the form of numerical coordinates (x, y, z, and
v) to a CSV file. Here, 'v' represents the visibility coordinate, indicating whether the particular landmark is visible on the screen; it ranges from 0 to 1. Our dataset contains 2004 columns representing the 501 landmarks.
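The 501 × 4 layout can be sanity-checked with a small sketch; the landmark values below are random stand-ins for real MediaPipe output:

```python
import random

NUM_LANDMARKS = 501  # pose + face + hand landmarks, per the text

def flatten_landmarks(landmarks):
    """Flatten [(x, y, z, v), ...] into a single CSV-ready row."""
    row = []
    for x, y, z, v in landmarks:
        row.extend([x, y, z, v])
    return row

landmarks = [(random.random(), random.random(), random.random(), random.random())
             for _ in range(NUM_LANDMARKS)]
row = flatten_landmarks(landmarks)
print(len(row))  # 2004 columns, matching the dataset description
```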
4.2 Implementation
Once we have captured our live dataset consisting of 501 landmarks (including 33
pose landmarks and 468 face and hand landmarks), it is stored in a '.csv' file. We then use the pandas library to read the dataset from the CSV file into a dataframe. Once the dataframe containing all the landmark components has been created, we perform a train–test split in the proportion of 30% testing and 70% training using the 'train_test_split' module.
4.2.3 StandardScaler()
It normalizes the data by shifting the mean to 0 and scaling to unit variance. The standard score of a sample 'x' is calculated as:

z = (x − u)/s, (1)

where u is the mean of the training samples (or 0 if with_mean = False) and s is the standard deviation (SD) of the training samples (or 1 if with_std = False) [8].
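Equation (1) applied to a small feature column, as a dependency-free re-implementation of the transform (an illustration of the formula, not scikit-learn's source):

```python
def standard_scale(values):
    """z = (x - u) / s for one feature column."""
    n = len(values)
    u = sum(values) / n                                 # mean of the samples
    s = (sum((x - u) ** 2 for x in values) / n) ** 0.5  # population SD
    return [(x - u) / s for x in values]

print([round(z, 4) for z in standard_scale([2.0, 4.0, 6.0])])
# [-1.2247, 0.0, 1.2247]
```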
Performing predictions using each of the machine learning models in the pipeline, we obtain the accuracies listed in Table 1.
In our project, we choose the random forest classification algorithm over the other algorithms in the pipeline for the following reasons:
1. After executing the model multiple times in a real-time environment (as shown in Table 1), we achieved the highest accuracy, approximately 96.78%, with this algorithm.
2. The random forest classification model is more robust, owing to its randomness and reduced overfitting on the training data, making it more suitable than the other machine learning models used in this pipeline.
4.2.5 Predictions
Finally, using the pickle library, we dump and save our best machine learning model, chosen on accuracy metrics, in '.pkl' format. Then, again using pickle, we load the best performing ML classification model to render landmarks and predict body pose in the real-time device feed using OpenCV.
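The dump/load round trip looks like this; the dictionary below stands in for the trained classifier object, and the file name is illustrative:

```python
import os
import pickle
import tempfile

model = {"name": "random_forest", "accuracy": 96.78}  # stand-in for the model

path = os.path.join(tempfile.gettempdir(), "body_language.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)      # save the best-performing model to disk

with open(path, "rb") as f:
    restored = pickle.load(f)  # reload it for real-time prediction

print(restored == model)  # True
```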
The results have been shown (in Fig. 5). The model combines all landmarks (pose,
face, and hand) into one large array of landmark coordinates (x, y, z, and v) and uses
this to create a new dataframe. Then the model makes further detections on this new
dataframe. The model determines the class label and also predicts the maximum
probability of the class that is detected. The model then employs OpenCV’s cv2
module to render the results to the screen over the live prediction window (as shown
in Fig. 5). As showcased in results (Fig. 5), this model correctly detects emotions
as well as makes predictions on the human pose which proves the versatility of this
custom model. The random forest classification model used in the final steps to make
predictions on the real-time device feed has been successfully able to achieve the
highest classification accuracy of 96.78% on testing sequences.
5 Conclusion
This paper describes the way modern image processing is going to be used for a
vast number of use cases. In this paper, we have particularly shown how we can
analyze human actions and emotions in real time. To make our model more realistic, we created our own dataset, so that the model assimilates an authentic, real-life environment, rather than using a generic dataset. This can be used as a base for many
forthcoming technologies in the field of artificial intelligence, security, augmented
reality, and video analysis. Other machine learning classification algorithms can also
be used for this model. We suggest that this paper be expanded for the detection
of multi-human action subject to further improvements in the MediaPipe Holistic
library.
For this paper, accuracy may not be the best evaluation metric since the objective
has been a real-time implementation (as shown in Fig. 5). Accuracy may vary due
to different surroundings, discrete datasets made by separate users, user’s device
capability, lighting conditions, etc.
References
Charge the Missing Data with Synthesized Data

Abstract The performance evaluation of an algorithm is based on the data we feed to it, mainly when training machine learning models. If the data sent to the model is inconsistent or has missing values, this may lead to false predictions. For example, if the model belongs to a healthcare system that must predict a patient's condition over a period of time, its predictions must be accurate; here, the accuracy depends on the data fed to the ML model. In training an ML model, data preprocessing is a crucial step, and handling the missing data correctly within it is essential; otherwise, it may lead to inconsistent results. This work aims to introduce a new method, named SN-Sync, for charging (imputing) the missing data, to analyze the algorithm, and to compare its efficiency with some traditional techniques.
1 Introduction
Many researchers have found that filling in missing data without proper estimates gives inconsistent results. The following are the traditional techniques most often used to charge the missing values.
Literature study 1: deleting an entire column or row in a given data set. If the percentage of null values in a column is greater than or equal to 50%, the entire column is dropped. Similarly, if a row contains one or more null values, the entire row is dropped [1]. The disadvantages of this technique are the loss of valuable information and its inefficiency as the percentage of missing values increases.
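Literature study 1 can be sketched with plain Python lists, where `None` marks a missing value; the thresholds follow the text (drop a column when at least 50% of its values are null, then drop any row that still contains a null), and the data values are made up:

```python
data = [
    [1.0, None, 3.0],
    [4.0, None, 6.0],
    [7.0, 8.0, None],
    [9.0, 10.0, 11.0],
]

n_rows = len(data)
# Keep only columns whose null fraction is below 50%.
keep_cols = [j for j in range(len(data[0]))
             if sum(row[j] is None for row in data) / n_rows < 0.5]
trimmed = [[row[j] for j in keep_cols] for row in data]  # column 1 is dropped
cleaned = [row for row in trimmed if None not in row]    # rows with nulls dropped
print(cleaned)  # [[1.0, 3.0], [4.0, 6.0], [9.0, 11.0]]
```

The printed result shows both losses at once: one whole column and one whole row of otherwise valid readings disappear, which is exactly the drawback the text describes.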
Y. S. S. Nuchu (B)
Reputation.com, Hyderabad, India
S. R. Narisetty
Assistant Professor, Department of CSE, Lakireddy Bali Reddy College of Engineering
(Autonomous), Mylavaram, Krishna District, Andhra Pradesh 521230, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 155
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_13
156 Y. S. S. Nuchu and S. R. Narisetty
Literature study 2: imputing missing values for continuous data. For a given data set, this technique calculates the mean or median and replaces all null data blocks with the calculated value [2]. This technique works only if the data is numerical, and it can cause data leakage issues.

mean(X) = (1/n) Σ_{i=1}^{n} x_i (1)
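Equation (1) drives the simplest form of this imputation; a dependency-free sketch for one numeric column, where `None` marks a missing value:

```python
def impute_mean(column):
    """Replace each missing entry with the mean of the observed entries."""
    observed = [x for x in column if x is not None]
    u = sum(observed) / len(observed)  # mean(X) over observed values only
    return [u if x is None else x for x in column]

print(impute_mean([1.0, None, 2.0, 3.0]))  # [1.0, 2.0, 2.0, 3.0]
```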
Literature study 3: imputing missing values for categorical data. For the given information, the missing values are replaced by the mode. This technique works only with categorical data.
Literature study 4: implementing a model whose algorithm accepts missing values [3]. Example algorithms are k-nearest neighbors,

K-NN: similarity(x, y) = − Σ_{i=1}^{n} f(x_i, y_i) (2)

k-means clustering,

K-MC: J(v) = Σ_{i=1}^{c} Σ_{j=1}^{c_i} (||x_j − v_i||)² (3)

and naive Bayes and random forest, where the results are much more accurate compared to models trained on data containing missing values [4]. In this paper, we introduce a new method for charging the missing values using synthesized data.
In data science, synthetic data is the fastest-growing trend and an increasingly valuable tool. What precisely does synthetic data mean? Synthetic data is data that is not based on any real-world readings or events [5]; it is generated purely by a computer program based on use cases, scenarios, or a real-world data set.
The primary goal of generating a synthetic data set is to be flexible and powerful enough to train a machine learning model. There are numerous advantages to using synthetic data, and it is mainly used in data science [6]. The central use case of synthetic data in ML and data science is reducing the need to record real-world data and events.
Charge the Missing Data with Synthesized Data … 157
Thus, it is quicker to generate and construct such data than to wait for a data set generated from real-world events.
This use case is commonly valid for events that occur very frequently. Synthetic data has numerous use cases and can be applied in machine learning tasks [7]. Some of the common use cases for synthetic data are self-driving vehicles, health care, robotics, and security.
It is effortless and fast to generate synthetic data. Once the environment is ready, it is very cheap to generate as much data as needed. Synthetic data helps to generate labels that are very costly to produce from real-world events.
Synthetic environments are very flexible for modifying and improving model training. Synthetic data can be used to replace certain parts of the data, most notably sensitive data. For example, in some activities, personal data such as personally identifiable information and personal health information are protected [8]. To avoid such consequences, it is recommended to create and use synthetic data.
Generating any data involves a process with a few steps. Synthetic data is generated programmatically with machine learning techniques; traditional machine learning techniques like decision trees, as well as deep learning techniques, can be used. We used an R library named "synthpop" to generate the synthetic data. The syn() function returns synthesized data for a given original data set; we can set the bucket size and the generation method. Figure 1 shows one such example, which is commonly used when the input fields contain numerical data.
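The paper generates its synthetic data with R's synthpop; as a language-agnostic illustration of the same idea, the sketch below synthesizes numeric records by resampling observed values with small Gaussian noise. This mimics the concept only, not synthpop's CART method, and the observed values are made up:

```python
import random

random.seed(0)  # deterministic for the example

observed = [97.8, 98.2, 98.6, 99.1, 98.0, 98.7]  # e.g., body temperatures

def synthesize(values, n, noise_sd=0.1):
    """Draw n synthetic readings: resample an observed value, perturb slightly."""
    return [random.choice(values) + random.gauss(0.0, noise_sd) for _ in range(n)]

synthetic = synthesize(observed, n=100)
print(len(synthetic))  # 100 synthetic records from only 6 observations
```

This is the advantage the text points to: from a handful of real events, a much larger training set with a similar distribution can be produced on demand.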
A real-time data set [9] that contains body temperature, sex, heart rate as labels
is passed as an input to the Syn(). Figures 2 and 3 show the comparison between the
original and synthetic data. By varying the input parameters, the accuracy may change.
With a smaller number of events records, we can generate a large data set.
Figures 2 and 3 are generated by passing the method property value as cart and a minimum bucket size of 10. Figure 2 shows the comparison between the original and synthetic data for body temperature; the correlation between observed and synthetic values is very close. This gives a tremendous advantage in training various ML models quickly [10]. Figure 3 shows the same for heart rate.
Fig. 2 Comparing observed data and synthetic data for body temperature in cart mode
Fig. 3 Comparing observed data and synthetic data for heart rate in cart mode
Figures 5 and 6 are generated by passing the method property value as sample, which can take categorical values in the input; the comparison is shown in Fig. 7.
Fig. 5 Comparing observed data and synthetic data for heart rate in sample mode
Fig. 6 Comparing observed data and synthetic data for body temperature in sample mode
3 Proposed System
In this paper, we propose a new technique named SN-Sync to charge the missing data. Figure 9 shows the process flow of the SN-Sync technique. To test this process flow, we collected a real-time healthcare data set [9] that contains body temperature, sex, and heart rate as labels, and computed the results in R.
Step-1: Load the data into R space.
Step-2: Identify the columns and rows that contain missing data or null values and
find the correlation matrix for the input data set as shown in Table 1.
Step-3: Generate synthetic data for the columns and rows that contain missing data and null values. Based on the correlation matrix from step-2, find the groups of clusters using the K-means clustering algorithm [11].
Figure 8 shows the clusters of heart rate versus body temperature. Using the elbow method, the number of clusters is calculated to be 4:

km = KMeans(n_clusters = 4) (5)
Centroids for each of the four clusters are marked in Fig. 8 [12]; they give the average
cluster values for both heart rate and body temperature. Table 2 lists the mean values
of body temperature and heart rate in each cluster.
Step-4: Replace the missing values with the mean value of a cluster formed by the
synthesized data. To replace a null value in a row, take the value in that row of the
column with the highest correlation. For example, to replace a null value in body
temperature at row 14, match the column that has the highest correlation with it; in
this case, it is heart rate. Take the heart-rate value at row 14 and compare it with the
cluster means shown in Table 2. Replace the null with the corresponding cluster's
body-temperature mean, as in Fig. 9.
Step-5: Send the final data set to a machine learning algorithm and compare the
results with some traditional techniques.
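As a minimal sketch of Step-4, the nearest-cluster imputation rule can be expressed in Python; the cluster means below are illustrative placeholders, not the paper's actual Table 2 values:

```python
# Sketch of SN-Sync Step-4: a missing body temperature is replaced by the
# body-temperature mean of the cluster whose heart-rate mean is closest to
# the row's observed heart rate.

# hypothetical cluster means: (heart_rate_mean, body_temp_mean)
cluster_means = [
    (65.0, 97.8),
    (72.0, 98.2),
    (79.0, 98.6),
    (86.0, 99.1),
]

def impute_body_temp(heart_rate):
    """Return the body-temperature mean of the nearest cluster by heart rate."""
    nearest = min(cluster_means, key=lambda c: abs(c[0] - heart_rate))
    return nearest[1]

rows = [
    {"heart_rate": 64, "body_temp": 97.9},
    {"heart_rate": 85, "body_temp": None},   # missing value to be charged
]
for row in rows:
    if row["body_temp"] is None:
        row["body_temp"] = impute_body_temp(row["heart_rate"])

print(rows[1]["body_temp"])  # nearest cluster's mean, here 99.1
```

The same rule generalizes to any column: match on the most-correlated observed column, then copy that cluster's mean for the missing column.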
162 Y. S. S. Nuchu and S. R. Narisetty
4 Results
The decision tree classification machine learning algorithm is used to compare the
results of the SN-Sync technique with some standard traditional techniques: replacing
the missing data with the mean, replacing the missing data with the mode, and deleting
the entire row or column that contains null values or missing data.
Mean: All the missing data and null values are replaced by the mean calculated for
that column of the given data set. After replacing all the blanks with the mean value
and passing the data to the decision tree classification algorithm, the confusion matrix
and accuracy are
[[ 6 5]
[10 12]]
Accuracy: 0.5455
Mode: All the missing data and null values are replaced by the mode calculated for
that column of the given data set. After replacing all the blanks with the mode value
and passing the data to the decision tree classification algorithm, the confusion matrix
and accuracy are
[[ 4 7]
[8 14]]
Accuracy: 0.5455
Delete row or column: All rows or columns containing missing data or null values
are dropped from the given data set. After dropping all the blank values, the input
data is passed to the decision tree classification algorithm; the confusion matrix and
accuracy are
[[ 4 3]
[8 13]]
Accuracy: 0.6071
SN-Sync: All the missing data and null values are replaced with the mean value of
a cluster formed from the synthesized data. After replacing all the blank values and
passing the data to the decision tree classification algorithm, the confusion matrix
and accuracy are
[[ 2 2]
[7 13]]
Accuracy: 0.6250
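The reported accuracies follow directly from each confusion matrix as the ratio of the diagonal (correct predictions) to the total; a quick check:

```python
def accuracy(cm):
    """Accuracy from a 2x2 confusion matrix: diagonal (correct) over total."""
    return (cm[0][0] + cm[1][1]) / sum(sum(row) for row in cm)

# confusion matrices reported for each imputation strategy
results = {
    "mean":    [[6, 5], [10, 12]],
    "mode":    [[4, 7], [8, 14]],
    "drop":    [[4, 3], [8, 13]],
    "SN-Sync": [[2, 2], [7, 13]],
}
for name, cm in results.items():
    print(f"{name}: {accuracy(cm):.4f}")
```

SN-Sync's matrix yields 15/24 = 0.625, the highest of the four strategies.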
Charge the Missing Data with Synthesized Data … 163
Figure 10 shows the comparison graph between the mean, mode, dropping rows and
columns, and SN-Sync techniques.
5 Conclusion
References
1. M.S. Santos, R.C. Pereira, A.F. Costa, J.P. Soares, J. Santos, P.H. Abreu, Generating synthetic
missing data: a review by missing mechanism. IEEE Access 7, 11651–11667 (2019)
2. P. McMahon, T. Zhang, R.A. Dwight, Approaches to dealing with missing data in railway asset
management. IEEE Access 8, 48177–48194 (2020)
3. R. Madhuri, M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy, S.C. Satapathy, Cluster
analysis on different data sets using K-modes and K-prototype algorithms, in Advances in
Intelligent Systems and Computing, vol. 249 (Springer, 2014), pp. 137–144. ISBN
978-3-319-03094-4
4. T. Kose, S. Ozgur, E. Coşgun, A. Keskinoglu, P. Keskinoglu, Effect of missing data imputation
on deep learning prediction performance for vesicoureteral reflux and recurrent urinary tract
infection clinical study. BioMed Res. Int., 15 (2020). Article ID 1895076
5. B.S. Panda, R.K. Adhikari, A method for classification of missing values using data mining
techniques, in 2020 ICCSEA (Gunupur, India, 2020), pp. 1–5
6. P.J. García-Laencina, P.H. Abreu, M.H. Abreu, N. Afonoso, Missing data imputation on the
5-year survival prediction of breast cancer patients with unknown discrete values. Comput.
Biol. Med. 59, 125–133 (2015)
7. D.C. Howell, The treatment of missing data in The Sage Handbook of Social Science
Methodology (Sage, London, UK, 2007), pp. 208–224
8. S.R. Narisetty, S. Farzana, P. Maheswari, L-semi-supervised clustering for network intrusion
detection. IJEAT 8(3S) (2019). ISSN: 2249-8958
9. https://tuvalabs.com/datasets/body_temperature_sex__heart_rate/activities.
10. J.P. Reiter, J. Drechsler, Releasing multiply-imputed synthetic data generated in two stages to
protect confidentiality. IAB Discussion Paper 200720 (2007)
11. A. Naik, S.C. Satapathy, K. Parvathi, Improvement of initial cluster center of c-means using
teaching learning based optimization. Proc. Technol. 6, 428–435 (2012)
12. M.R. Murty, J.V.R. Murthy, P.V.G.D. Prasad Reddy et al., Dimensionality reduction text data
clustering with prediction of optimal number of clusters. IJARITAC 2(2), 41–49 (2011)
Discovery of Popular Languages from GitHub Repository: A Data Mining Approach
Abstract Usage of Open Source Software (OSS) has increased over the past
fifteen years among programmers and computer users. OSS communities work as
a “Bazaar” where the project developers and end-users meet and search
for suitable matches to their skills and requirements. OSS is emerging as a strong
competitor to commercial or closed software. GitHub is an OSS forge started in
2008 in order to simplify code sharing. It is a Web site and cloud-based service that
helps software developers store, manage, track, and control changes to their code.
When a GitHub project fails, it results in the loss of time, effort, and resources of this
large community. The current need is to build models that find interesting factors
that contribute to the success of these projects. The massive repositories make this
domain a good candidate for exploratory research using the data mining approach.
In this work, the FP-Growth method is used to find the popular two programming
language combinations and is validated using the SPSS tool. The outcome of this
work benefits the OSS community in terms of time and resources.
Usage of Open Source Software (OSS) has increased over the past fifteen years
among programmers and computer users. These communities work as a “Bazaar”
where the project developers and end-users meet and search for suitable
matches to their skills and requirements. Members of this community can view
and update the software for its improvement. They can also detect and fix bugs, and
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 165
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_14
166 K. J. Upadhya et al.
contribute to the software. It is said that this evolutionary process can work faster than
the traditional hierarchical and closed model and produce better-quality software.
Linux, Apache Server, and Mozilla Firefox are among the best outcomes of OSS
development [1, 2].
GitHub is an OSS forge set up in 2008 with the aim of simplifying code sharing. It is
a Web site and cloud-based service that assists software developers in storing, managing,
tracking, and controlling changes to their code. Usage of GitHub is rapidly increasing:
in 2017, the GitHub community comprised 24 million people working across 67
million repositories. The current need is to build models that can be used to detect
the factors that are critical in the most successful projects among GitHub repositories.
This will help users and corporations make the right choice of projects
[3].
Pattern mining is an important area within data mining that discovers interesting
patterns from a database. Rules found in a database can be classified as frequent
and rare rules. The support count (frequency) of an item set is the total number of
records in the database that contain the item set. A frequent item set is an item
set whose support count satisfies a user-defined minimum threshold; otherwise, it
is called a rare item set, which occurs infrequently and may represent unexpected or
previously unknown associations [4].
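As an illustration of these definitions, support counts of 2-item sets can be computed over a toy transaction database (hypothetical transactions, not the paper's data) and split into frequent and rare sets by a minimum threshold:

```python
from collections import Counter
from itertools import combinations

def itemset_supports(transactions, k=2):
    """Support count of every k-item set: number of transactions containing it."""
    counts = Counter()
    for t in transactions:
        for combo in combinations(sorted(set(t)), k):
            counts[combo] += 1
    return counts

# hypothetical transactions (languages used per user)
transactions = [
    {"Python", "C++"}, {"Python", "Javascript"},
    {"Python", "C++", "HTML"}, {"Javascript", "CSS"},
]
supports = itemset_supports(transactions)
min_support = 2  # user-defined minimum threshold
frequent = {s: c for s, c in supports.items() if c >= min_support}
rare = {s: c for s, c in supports.items() if c < min_support}
print(frequent)  # only the Python/C++ pair reaches the threshold
```

With a threshold of 2, the Python/C++ pair is frequent; every other pair occurs once and is rare.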
1 Problem Statement
Keeping the idea of free sharing, OSS communities allow access to the project source
code and other artifacts (e.g., e-mail communications, number of downloads, information
about developers, and bug reports) of these projects. As the GitHub user
community grows day by day, the necessity of tested models and hypotheses
for successful OSS projects also increases. The massive repositories
make this domain a good candidate for exploratory research using the data mining
approach. When an OSS project fails, it results in the loss of time, effort, and resources
of the community. Models are therefore required that can predict the outcome of OSS
projects and find the features that contribute to their success.
The remainder of the paper is organized as follows: Sect. 2 presents the literature
review of the research carried out. Section 3 describes the research methodology.
Section 4 puts forth the results and analysis. Inferences and future directions are
highlighted in Sect. 5.
2 Related Works
The papers reviewed have been surveyed in the direction of data mining techniques
used, types of association rules discovered, and different sources of data. Sanjay et al.
Discovery of Popular Languages from GitHub Repository … 167
[1] researched OSS by collecting data from the Sourceforge Web site, www.sourceforge.net,
with the goal of extracting the success patterns of the OSS projects. The
research discovered interesting classes of association rules by making use of a novel
concept called association rules network (ARN). The research concentrated on rules
with singleton consequents and the outcome of the research is validated using factor
analysis.
Raja et al. [2] developed a robust model which finds the factors that lead to success
of the OSS projects. The highlight of this research is the combined effects of logistic
regression (LR), neural networks (NN), and decision trees (DT). The research uses
SAS Enterprise Miner for the creation and validation of new models. This
work has shown that the projects developed after 2003 are in greater demand than
the older projects because of the growth of the OSS movement.
Andi et al. [5] collected projects from the Sourceforge Web site, www.sourceforge.net,
to study the success factors. In this research, two item set association rules are
extracted to find the success factors of OSS projects. It has considered the number
of downloads as the critical parameter for success and formulated six success factors
for OSS projects.
Fragkiskos et al. [3] gathered the projects from GitHub with the intention of
finding six different association rules for successful OSS projects. Here, the main
focus was on GitHub user behavior. Association rules are discovered using the Apriori
algorithm, and the collected data are discretized using the k-means algorithm.
Hu et al. [6] researched GitHub repositories with the focus of understanding
the importance and influence of GitHub repositories. Using GitHub user data
and monthly repository star data, a star graph was constructed, and a social
analysis was performed by applying the HITS algorithm to it. This research
demonstrated how a repository's influence value changes every month.
Pattern Mining Approaches: Pattern mining approaches can be divided into
Apriori-based or tree-based algorithms. In Apriori-like algorithms [7], a large number
of items are allowed to participate in item set generation; scanning the
database multiple times and generating large candidate sets degrade mining
performance.
Most of the pattern mining approaches follow a tree-based [8–10] method that
employs the traditional FP-Growth [11] approach, which constructs a frequent-pattern
tree (FP-tree). Here, the transactions are arranged in frequency order, and during
insertion of a transaction, if a common prefix is found, then the count of
all the items in that common prefix is incremented. The remaining part (if any) is
attached to the tree from the last node of that common prefix. If no part of the transaction
is found in the tree, then a new path is constructed in the tree by adding it to
the root node and initializing the support count of every node with value 1. This
tree is traversed in order to mine all the frequent patterns. The conditional pattern
base is generated for every item; it is a small database of the pattern counts that
appear with this item. This database is converted into a conditional FP-tree that is
recursively processed to discover the required patterns. Anindita et al. [12] performed
a comprehensive structural and empirical analysis of tree-based frequent and rare
pattern mining techniques.
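The FP-tree insertion described above (frequency-ordered items, counts incremented along shared prefixes, new paths initialized with count 1) can be sketched as follows; this is an illustrative fragment, not any cited implementation:

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its support count, and child links."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 1
        self.children = {}

def build_fp_tree(transactions, min_support=1):
    freq = Counter(i for t in transactions for i in set(t))
    root = Node(None, None)
    root.count = 0
    for t in transactions:
        # frequency-descending order (ties broken alphabetically)
        items = sorted((i for i in set(t) if freq[i] >= min_support),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            if item in node.children:      # shared prefix: increment the count
                node = node.children[item]
                node.count += 1
            else:                          # attach the remainder as a new path
                node.children[item] = Node(item, node)
                node = node.children[item]
    return root, freq

tree, freq = build_fp_tree([["a", "b"], ["a", "c"], ["a", "b", "d"]])
print(tree.children["a"].count)  # "a" heads all three inserted paths
```

Mining would then traverse this tree and build the conditional pattern base per item; only the construction step is shown here.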
3 Research Methodology
Step 2: As the Apriori method suffers from the drawback of multiple database scans
for finding support counts, the FP-Growth method is employed for building the model.
A model is built with the following specifications:
• Input: Programming languages used by each user
• Output: 2-programming language combinations and their support count
• Algorithm Used: Frequent pattern growth algorithm
– Input: Transaction database
– Output: 2-programming language combinations with support count
– Preprocessing: The transaction items (programming languages in text form)
are converted into numbers
– Programming Language Used: C language.
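The preprocessing step that converts language names to numbers might look like the following (sketched in Python for brevity, though the authors used C):

```python
def encode_transactions(transactions):
    """Map language names to integer ids, as in the preprocessing step."""
    ids = {}
    encoded = []
    for t in transactions:
        row = []
        for lang in t:
            if lang not in ids:
                ids[lang] = len(ids)   # assign the next unused id
            row.append(ids[lang])
        encoded.append(row)
    return encoded, ids

encoded, ids = encode_transactions([["Python", "C++"], ["Javascript", "Python"]])
print(encoded)  # [[0, 1], [2, 0]]
```

The integer ids keep tree nodes and comparisons cheap; the mapping is inverted at the end to report combinations by language name.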
Sample Table 1 shows the transaction-ids and the various programming languages used
by some users of the GitHub repository. In the first scan, the support counts of the
different programming languages are calculated. These programming languages are
sorted in descending order of support count and stored in the FP-tree
as shown in Fig. 3. Then, using the FP-Growth method, only the two programming
language combinations are generated.
Step 3: The hypothesis is built for the 2-programming language combinations
discovered and tested as shown in Sect. 4.
The GitHub user accounts are divided into two groups depending upon fan followers.
Top 2-programming language combinations and their support count extracted by
the model for the two groups are given in Tables 2 and 3. Independent sample
T-Test [13] is performed for the two groups, as shown in the first hypothesis
test analysis. Another independent sample T-Test is performed within the single
group with fan followers >70. According to this test, the 2-programming language
combinations containing either Javascript or Python show higher support counts.
The 2-programming language combinations discovered from these two programming
languages are considered popular languages compared with the remaining
2-programming language combinations. The result of the test is validated in the
second hypothesis analysis.
The 2-programming language combinations discovered for two different groups
are analyzed using the following hypothesis:
• Null Hypothesis H0: Usage of popular languages has no significant contribution
to the increase in the number of fan followers
• Alternate Hypothesis Ha: Usage of popular languages has a significant contribution
to the increase in the number of fan followers
As there are two different datasets according to the two different groups of fan
followers, the independent sample T-Test is used to test the hypothesis. The
corresponding result obtained using the SPSS tool for the independent sample T-Test
is used during the validation process.
The calculated and tabulated independent sample T-Test is obtained as below:
• Degrees of freedom: 29
• Level of significance: 0.05
Table 2 2-programming language combinations with support count for the more fan
followers group

2-programming language combination   Support count   Fan followers >70
Python-C++                           7               12,285
Javascript-Python                    6               14,198
Javascript-C++                       5               24,893
Python-Go                            4               1516
C++-HTML                             4               874
Javascript-HTML                      3               1564
Javascript-Go                        3               961
Python-HTML                          3               967
Python-Shell                         3               428
Javascript-Shell                     2               190
Javascript-CSS                       2               220
Javascript-Ruby                      2               1421
Python-PHP                           2               310
C++-Shell                            2               538
C++-PHP                              2               310
HTML-PHP                             2               310
Shell-CSS                            2               252
Javascript-PHP                       1               273
• Significance obtained for two-tailed = 0.049; one-tailed = 0.049/2 = 0.0245 < 0.05
• Tabulated independent sample T-Test value t_tabulated = 1.699
• Calculated independent sample T-Test value t_calculated = 1.892.
Since t_calculated ≥ t_tabulated, the null hypothesis is rejected. So the usage of
popular languages has a significant contribution to the increase in the number of
fan followers.
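For illustration, the pooled-variance independent-samples t statistic can be computed directly; the sketch below uses only the first ten support counts from Tables 2 and 3, so its t value and degrees of freedom differ from the paper's full computation (df = 29, t = 1.892):

```python
import math

def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    t = (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

# first ten support counts from Tables 2 and 3 (more vs. fewer fan followers)
more = [7, 6, 5, 4, 4, 3, 3, 3, 3, 2]
less = [2, 6, 3, 1, 0, 4, 0, 6, 1, 1]
t, df = pooled_t(more, less)
reject = t >= 1.734  # tabulated one-tailed critical value for df = 18, alpha = 0.05
print(round(t, 3), df, reject)
```

On this subset, t ≈ 1.84 exceeds the critical value, agreeing in direction with the paper's conclusion.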
The 2-programming language combinations generated by considering the single group
with the larger number of fan followers, given in Table 2, are analyzed using the
following hypothesis:
• Null Hypothesis H0: Usage of popular languages has no significant contribution
in the increase of the number of fan followers
• Alternate Hypothesis Ha: Usage of popular languages has a significant contri-
bution to the increase of the number of fan followers
As there are two different datasets according to the popularity of the 2-programming
language combinations, the independent sample T-Test is used to test the hypothesis.
The corresponding result obtained using the SPSS tool for the independent sample
T-Test is used during the validation process.
The calculated and tabulated independent sample T-Test is obtained as below:
Table 3 2-programming language combinations with support count for the less fan
followers group

2-programming language combination   Support count   Fan followers <70
Python-C++                           2               16
Javascript-Python                    6               175
Javascript-C++                       3               43
Python-Go                            1               26
C++-HTML                             0               0
Javascript-HTML                      4               83
Javascript-Go                        0               0
Python-HTML                          6               89
Python-Shell                         1               26
Javascript-Shell                     1               51
Javascript-CSS                       5               217
Javascript-Ruby                      2               39
Python-PHP                           0               0
C++-Shell                            0               0
C++-PHP                              2               29
HTML-PHP                             0               0
Shell-CSS                            1               51
Javascript-PHP                       31              85
• Degrees of freedom: 25
• Level of significance: 0.05
• Significance obtained for two-tailed = 0.079; one-tailed = 0.079/2 = 0.0395 < 0.05
• Tabulated independent sample T-Test value t_tabulated = 1.708
• Calculated independent sample T-Test value t_calculated = 1.892.
As t_calculated ≥ t_tabulated, the null hypothesis is rejected. So the usage of
popular languages has a significant contribution to the increase in the number of
fan followers.
The model designed using the FP-Growth algorithm extracts popular 2-programming
language combinations along with their support counts. These combinations are
analyzed and validated using the fan followers variable. The analysis shows that the
extraction of popular language combinations contributed to the success of the GitHub
repository. This work can be extended to discover popular triplet programming language
combinations and also frequent item sets using the other variables from the GitHub
repository which contribute to the success of the project. A new research direction can
be followed to search for association rules that find factors for the failure of the project.
References
1. S. Chawla, B. Arunasalam, J. Davis, Mining open source software (OSS) data using association
rules network, in Pacific-Asia Conference on Knowledge Discovery and Data Mining (Springer,
2003), pp. 461–466
2. U. Raja, M. Tretter, Investigating open source project success: a data mining approach to model
formulation, validation and testing, in Proceedings of SUGI, vol. 31 (2006)
3. F. Chatziasimidis, I. Stamelos, Data collection and analysis of github repositories and users,
in 2015 6th International Conference on Information, Intelligence, Systems and Applications
(IISA) (IEEE, 2015), pp. 1–6
4. Y.S. Koh, S.D. Ravana, Unsupervised rare pattern mining: a survey. ACM Trans. Knowl. Discov.
Data (TKDD) 10(4), 45 (2016)
5. A.W.R. Emanuel, R. Wardoyo, J.E. Istiyanto, K. Mustofa, Success factors of OSS projects from
source forge using data mining association rule, in International Conference on Distributed
Framework and Applications (DFmA) (IEEE, 2010), pp. 1–8
6. Y. Hu, J. Zhang, X. Bai, S. Yu, Z. Yang, Influence analysis of github repositories. SpringerPlus
5(1), 1268 (2016)
7. R. Agrawal, R. Srikant, et al., Fast algorithms for mining association rules, in Proceedings
of the 20th International Conference on Very Large Data Bases, VLDB. vol. 1215 (1994),
pp. 487–499
8. G. Grahne, J. Zhu, Fast algorithms for frequent itemset mining using fp-trees. IEEE Trans.
Knowl. Data Eng. 17(10), 1347–1362 (2005)
9. S.K. Tanbeer, M.M. Hassan, A. Almogren, M. Zuair, B.S. Jeong, Scalable regular pattern mining
in evolving body sensor data. Future Gener. Comput. Syst. 75, 172–186 (2017)
10. S. Tsang, Y.S. Koh, G. Dobbie, Rp-tree: rare pattern tree mining, in International Conference
on Data Warehousing and Knowledge Discovery (Springer, 2011), pp. 277–288
11. J. Han, J. Pei, Y. Yin, Mining frequent patterns without candidate generation. ACM Sigmod
Rec. 29, 1–12 (2000)
12. A. Borah, B. Nath, Tree based frequent and rare pattern mining techniques: a comprehensive
structural and empirical analysis. SN Appl. Sci. 1(9), 972 (2019)
13. C.R. Kothari, Research Methodology Methods and Techniques (New Age International
Publications, 2004)
Performance Analysis of Flower Pollination Algorithms Using Statistical Methods: An Overview
Abstract The flower pollination algorithm and its variants are bio-inspired
metaheuristics. The performance analysis of the flower pollination algorithm and its
variants has been carried out with the help of statistical analysis to a certain degree.
Their comparison with other metaheuristic algorithms has also been done, sometimes
with the help of statistical methods and mostly with the help of benchmarking
functions. More exploration can be done in this regard. This paper attempts to
take a bird's-eye view of some of the work done in the context of the flower pollination
algorithm and a few of its variants, and the insights gained for each one
of them, along with an overview of the statistical methods that have been used so far
in carrying out their performance analysis. The insights listed herein also
point toward further research that can possibly be conducted in this context.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 175
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_15
176 P. Bansal and S. Bhave
Algorithm and Firefly Algorithm are some of the well-known bio-inspired algo-
rithms. The fitness functions used in these algorithms help to ascertain whether the
problem-solving process is heading in the correct direction.
Xin-She Yang proposed the flower pollination algorithm in 2012 [1]. It mimics
the natural phenomenon of transferring pollen grains between different flowers of
the same type that is commonly known as cross pollination or biotic pollination.
Birds, wind, etc., play an important role in transferring pollen grains from one flower
to another; they are known as pollinators. The activity of transferring pollen from
one flower to another is termed global pollination and has been modeled after
the Levy distribution, popularly known as Levy flight. Local pollination takes
place within a single flower itself. The flower constancy ratio is related to the degree
of similarity between two flowers. It has been observed that the standard FPA works
because of flower constancy and long-distance pollinators. Both local and global
pollination mechanisms are controlled by a parameter P that lies in the interval [0, 1].
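A minimal sketch of the standard FPA on a sphere objective, with the switch parameter deciding between Levy-flight global pollination and local pollination (population size, iteration count, and p = 0.8 are illustrative defaults, not values prescribed by the original paper):

```python
import math
import random

def levy_step(dim, beta=1.5):
    """One Levy-distributed step per dimension (Mantegna's algorithm)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return [random.gauss(0, sigma) / abs(random.gauss(0, 1)) ** (1 / beta)
            for _ in range(dim)]

def fpa(objective, dim=2, n_flowers=20, p=0.8, iters=200, lo=-5.0, hi=5.0):
    """Minimize `objective`; p switches between global and local pollination."""
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_flowers)]
    best = min(pop, key=objective)
    for _ in range(iters):
        for i, x in enumerate(pop):
            if random.random() < p:      # global (biotic) pollination toward the best
                step = levy_step(dim)
                cand = [xj + sj * (bj - xj) for xj, sj, bj in zip(x, step, best)]
            else:                        # local (abiotic) pollination between flowers
                a, b = random.sample(pop, 2)
                eps = random.random()
                cand = [xj + eps * (aj - bj) for xj, aj, bj in zip(x, a, b)]
            cand = [min(max(c, lo), hi) for c in cand]
            if objective(cand) < objective(x):   # greedy replacement
                pop[i] = cand
        best = min(pop + [best], key=objective)
    return best

sphere = lambda x: sum(v * v for v in x)
random.seed(1)
best = fpa(sphere)
print(sphere(best))  # best fitness found; should be near 0
```

The greedy per-flower replacement keeps the best fitness monotonically non-increasing, which is one simple way to realize the exploitation side of the algorithm.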
The FPA was originally proposed for solving continuous optimization problems.
It can be used in its original form, or in a modified form, to solve other types of
problems as well; there have been several attempts in this direction, and there is a
plethora of possibilities to further explore with the FPA and its variants. Population
diversity needs to be maintained while using the FPA, as it helps to obtain precise
solutions (Fig. 1).
Many hybrid versions and modifications of the original FPA have been proposed
in recent years, and comparative studies have been conducted by running these
variants and a few other bio-inspired algorithms on the same data set pertaining to a
specific domain, or by executing them on the same benchmarking functions, to gain
insight into their performance. Nabil [2] added the cloning operator from the clonal
selection algorithm to the flower pollination algorithm; twenty-three test functions
were used with this modification, and the modified approach worked well with them.
Cui and He [3] used Orthogonal Learning Strategy and Catfish Effect Mechanism
for solving global optimization problems. The Orthogonal Learning Strategy is based
on Orthogonal Experiment Design, and it is embedded into the local pollination
operator of flower pollination algorithm. Catfish Effect Mechanism helped them to
maintain population diversity. There is still room for exploring the orthogonal learning
method on various sources of information. Moreover, the application of the OL-based
flower pollination algorithm to real-world problems in engineering, for example,
parameter identification and feature selection, is still to be done.
Abdel-Basset and Shawky [4] have carried out a detailed study of the original
flower pollination algorithm comparing it with Genetic Algorithm, Particle Swarm
Optimization, Ant Colony Optimization, Cuckoo Search, Gray Wolf Optimizer and
Grasshopper Optimization Algorithms. They concluded that flower pollination algo-
rithm needs to be improved on the grounds of avoiding premature convergence and
time required to execute.
Lukasik et al. [5] used Calinski-Harabasz Index in the cost function of flower polli-
nation algorithm and performed cluster analysis on various data sets. They concluded
that flower pollination algorithm based solution provides high clustering accuracy.
They have suggested the usage of other metaheuristic mechanisms with flower polli-
nation algorithm and comparative analysis of algorithms as future work in their paper.
Rodrigues et al. [6] have shown that the binary version of the flower pollination
algorithm, in which the positions of agents are converted into binary strings denoting
the presence or absence of features, with a discretization mechanism that maps them
to the Boolean lattice via a constrained sigmoid function, has results comparable
to some of the evolutionary techniques in vogue. They suggest exploring various
offshoots of the flower pollination algorithm for analytical purposes as well.
The binary flower pollination algorithm has also been used to solve the antenna
positioning problem and has performed well according to Dahi et al. [7]. The
mapping techniques used affect the performance of the nature-inspired optimization
algorithm under consideration. A modified flower pollination algorithm with
two quadratic objective functions has been used for solving multi-objective environ-
mental/economic dispatch problems by Gonidakis [8]. This version of flower pollina-
tion algorithm was able to find lower values for the concerned conflicting objectives.
Gonidakis [8] indicates flower pollination algorithm as a powerful metaheuristic that
can be further applied and explored in various domains.
Shambour et al. [9] endeavored to improve the exploration rate of flower polli-
nation algorithm by guiding the search process toward more promising areas of the
search space via modification to original flower pollination algorithm. The perfor-
mance of the new version of flower pollination algorithm was evaluated for Artifi-
cial Neural Network Weight Adjustment and numerical benchmark functions. Six
nature inspired algorithms including the modification proposed were compared. The
modified algorithm gave better or equal performance to standard flower pollination
algorithm in 80% of the total experiment cases.
Zhou et al. [10] applied the discrete greedy flower pollination algorithm for
spherical traveling salesman problem and also compared their approach with a few
version(s) of genetic algorithm and tabu search and concluded that their discrete
greedy flower pollination algorithm works faster in most cases and is relatively stable.
They had modified the biotic pollination process by using order-based crossover
and pollen discarding behavior. Order-based crossover accelerates the convergence
of discrete greedy flower pollination algorithm and pollen discarding step adapted
from artificial bee colony algorithm helps to avoid local minima trap and improves
the global search ability of the discrete greedy flower pollination algorithm. It has
been pointed out by these researchers that the efficiency of discrete greedy flower
pollination algorithm may decrease due to increase in the number of cities.
According to Yang et al. [11], nature-inspired algorithms have been applied effectively
in the areas of telecommunication, image processing, engineering design,
vehicle routing, etc. These algorithms are simple and flexible and can be applied
to hard problems but their computational cost is high owing to several internal evalu-
ations in these algorithms. They have even been used to solve hard problems like the
Traveling Salesman Problem where suboptimal solutions given by these algorithms
have been found to be very useful.
Li et al. [12] have proposed a new adaptive version of the flower pollination
algorithm based on opposition-based learning strategy and t-distribution. They have
named this new algorithm as OTAFPA. The t-distribution variation has been utilized
in OTAFPA to create a new search direction so that the population may be diverse,
and this assists this new algorithm to avoid getting trapped in the local optimum. The
initial pollen population has been optimized using the opposition-based learning
strategy. Eight test functions were used in simulation runs, and they support the
fact that OTAFPA has better optimization ability as compared to the standard flower
pollination algorithm and a few of its variants.
Ma and Wang [13] used the concept of random walk in local pollination, instead of
the Levy flight used in the standard flower pollination algorithm, to create a
modified algorithm. They also used the Clonal Selection Algorithm (CSA) for
generating the population. The solutions generated from random walks drawn from
a random uniform distribution in [0, 1] were found to converge faster as compared
to those generated via Levy Flights. They concluded that different functions have
varied requirements for the parameters and the rate of convergence and precision of
optimization are related very closely to the setting of parameters.
Galvez et al. [14] proposed the multimodal flower pollination algorithm that is a
modified version of the original flower pollination algorithm with multimodal capa-
bilities so that it can find all possible optima for a given optimization problem. Exper-
imentation has indicated that this version is more accurate and robust as compared
to some other multimodal optimization algorithms.
Several benchmarking functions like the Rastrigin function, Griewank function,
Schwefel function, Rosenbrock function, Sphere function, etc., have been used to
compare the performance of the flower pollination algorithm and its offshoot
algorithms with other nature inspired algorithms. These benchmark functions help
to gauge the effectiveness and robustness of optimization algorithms like flower
pollination algorithm and of course its variants.
There are yet many possibilities for designing better variants based on the stan-
dard flower pollination algorithm, to address specific needs of different optimization
problems. See Table 1 for a quick reference.
Abdel-Basset and Shawky [4] studied the flower pollination algorithm and its variants
in detail, along with other evolutionary algorithms, and stressed the need for better
statistical analysis of the results generated by these algorithms. Statistical analysis
may assist a lot in gauging the performance of nature-inspired algorithms, including
the flower pollination algorithm and its variants.
According to Chiroma et al. [15] the binary variant of flower pollination algorithm
has not been explored much and statistical analysis for validation of results has
not been done much for validation of experimental results with reference to the
existing variants of flower pollination algorithm, as of 2015. Very few research papers
have statistical analysis with reference to this algorithm and algorithms based on it.
Moreover, even if it has been carried out, only a few statistical aspects have been
taken into consideration.
Most of the studies done so far on the flower pollination algorithm and its variants from a statistical standpoint have focused on the mean, mode and standard deviation. Some researchers have included one or more statistical measures, such as Friedman's two-way analysis of variance by ranks, confidence intervals, the Wilcoxon signed-rank test, Markov chains and the Kruskal–Wallis test, to analyze the solution(s) obtained by applying FPA and/or its variants. There are many statistical tests that may give more insight into the performance of algorithms. Not all of them have been applied in every study so far, and hence there is a chance of obtaining more insights by applying a greater number of tests uniformly over a set of algorithms of the type shown in Table 1. It is not feasible to apply all of the tests in every situation, as some tests may be more suitable for analyzing the output of the algorithm in a given scenario, while others may not fit the same context well.
Table 1 A glimpse of a few advancements with respect to the family of flower pollination algorithm(s)
Here is an overview of the statistical techniques that have been employed at
times by researchers to evaluate the performance of flower pollination algorithm and
algorithms that have branched out of it:
The mean, mode and standard deviation of the results obtained by executing the standard flower pollination algorithm and its variants on different benchmark functions have been tabulated by a few researchers, but the domain of statistics has many other mechanisms for assessing the overall performance of algorithms. Statistical tests are of two types: parametric and non-parametric. Parametric tests focus on ratio and interval data; non-parametric tests focus on rank-based, ordinal and categorical data.
Friedman's two-way analysis of variance by ranks is a non-parametric test used to detect differences across several related samples. Nabil et al. [2] used this test to compare the performance of their modified version of the flower pollination algorithm with other algorithms. It has also been used by Dahi et al. [7] in the context of their modified version of the flower pollination algorithm for antenna positioning.
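As a concrete illustration, the Friedman test statistic can be computed from a table of per-function results, one row per benchmark function and one column per algorithm (lower is better); the values below are hypothetical.

```python
def friedman_statistic(results):
    """Friedman chi-square statistic over n blocks (rows) and k treatments (columns)."""
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda j: row[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            # Group tied values and give them the average rank (1-based).
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1
            for m in range(i, j + 1):
                ranks[order[m]] = avg
            i = j + 1
        for j in range(k):
            rank_sums[j] += ranks[j]
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Hypothetical best-fitness values of three algorithms on four benchmark functions:
results = [
    [0.01, 0.03, 0.02],
    [1.2,  1.5,  1.3],
    [0.5,  0.9,  0.7],
    [3.1,  3.4,  3.2],
]
print(friedman_statistic(results))  # → 8.0
```

The statistic is then compared against a chi-square critical value with k - 1 degrees of freedom to decide whether the algorithms differ significantly.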
A confidence interval is a range of values that is likely to contain the true value of the quantity being estimated. Liu et al. [16] have proposed a mechanism based on the visualization of confidence intervals to benchmark stochastic algorithms for global optimization problems. This is relevant to the flower pollination algorithm and the algorithms based on it, as this family of algorithms is also stochastic in nature.
The Wilcoxon signed-rank test is a non-parametric test used for paired data. Nabil et al. [2] used this test as well to compare the performance of their modified version of the flower pollination algorithm with other algorithms.
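As a sketch of how such a paired comparison works, the Wilcoxon signed-rank statistic W can be computed from paired results of two algorithms on the same benchmark functions. The data below is hypothetical, and the implementation assumes no tied absolute differences.

```python
def wilcoxon_w(a, b):
    """Smaller of the positive- and negative-rank sums for paired samples a, b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    w_plus = w_minus = 0
    for rank0, i in enumerate(order):
        rank = rank0 + 1  # assumes no ties in |d|; ties would need average ranks
        if diffs[i] > 0:
            w_plus += rank
        else:
            w_minus += rank
    return min(w_plus, w_minus)

# Hypothetical paired errors of a modified FPA (a) vs. the standard FPA (b):
a = [0.10, 0.20, 0.15, 0.30, 0.25]
b = [0.12, 0.28, 0.14, 0.45, 0.30]
print(wilcoxon_w(a, b))  # → 1
```

A small W relative to the critical value for the sample size indicates a significant difference between the paired results.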
Markov chains are statistical models of stochastic processes in which the next state depends only on the current one. He et al. [17] performed a global convergence analysis of the flower pollination algorithm using a discrete-time Markov chain approach. They assumed fixed parameter values and used simple vectors as solution vectors to keep the analysis tractable. According to them, there is a gap between the theory and practice of bio-inspired algorithms.
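The flavor of such convergence results can be conveyed with a toy absorbing Markov chain: if at every iteration the algorithm enters the optimal-state set with some fixed probability p > 0 and never leaves it, the probability of having reached that set tends to 1. This is only an illustrative caricature of the analysis in [17], with an arbitrary value of p.

```python
def prob_converged(p: float, t: int) -> float:
    """Probability the chain has hit the absorbing optimal state within t steps."""
    return 1.0 - (1.0 - p) ** t

# The hitting probability increases monotonically toward 1 with the iteration count.
for t in (1, 10, 100, 1000):
    print(t, round(prob_converged(0.05, t), 4))
```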
Much more theoretical analysis needs to be carried out to understand the convergence of these algorithms. The rate of convergence clearly depends on the parameter settings and the structure of the algorithm (see Table 2). Parameter tuning is also an important concern in this context. The standard flower pollination
Table 2 A glimpse of some research work with respect to the family of flower pollination
algorithm(s) and the use of statistical analysis therein
Serial number | Researcher(s) | Limitation | Details of the statistical analysis done
1  | Nabil [2] | Cloning ratio is fixed | Friedman's two-way analysis of variance by ranks and Wilcoxon's signed-rank test
2  | Cui and He [3] | Application to real-world problems such as parameter identification and feature selection is pending | Wilcoxon's signed-rank test
3  | Abdel-Basset and Shawky [4] | Limited statistical analysis | Friedman's test
4  | Lukasik et al. [5] | Hybridized versions are yet to be explored | Rand index, standard deviation and pairwise t-tests
5  | Rodrigues et al. [6] | Parameters of the flower pollination algorithm need to be considered | None
6  | Dahi et al. [7] | Better mapping techniques are required | Friedman's two-way analysis of variance by ranks, Kruskal–Wallis one-way analysis of variance test, Bartlett test for homogeneity of variance and Kolmogorov–Smirnov test for normality of distribution
7  | Gonidakis [8] | Hybridization with other approaches needs to be studied | None
8  | Shambour et al. [9] | Real-world application is lacking | Mean and standard deviation
9  | Zhou et al. [10] | Performance may decrease as the number of cities increases; this needs to be tackled | Mean, standard deviation and rank
10 | Li et al. [12] | No statistical analysis | None
11 | Ma and Wang [13] | No statistical analysis | None
12 | Galvez et al. [14] | No statistical analysis | None
13 | He et al. [17] | Restricted to the original flower pollination algorithm | Markov chains
184 P. Bansal and S. Bhave
algorithm has been shown, with the help of Markov models, to be globally convergent.
Dahi et al. [7] have also used the Kruskal–Wallis one-way analysis of variance test, the Bartlett test for homogeneity of variance and the Kolmogorov–Smirnov test for normality of distribution in the statistical analysis of their approach. Table 2 summarizes some of the research endeavors on the flower pollination algorithm and its variants, along with the limitations that can act as pointers for future research and the role of statistical tests in analyzing their respective performance.
Future work in this context includes the use of a greater number of statistical tests to obtain clearer insight into the overall performance of nature-inspired algorithms such as the flower pollination algorithm. A dedicated strategy for applying a set of widely applicable statistical tests may be created for this purpose, involving both computerized and manual statistical analysis of algorithms. Such detailed statistical analysis may lead to better conclusions regarding the application of nature-inspired algorithms such as the flower pollination algorithm and its variants to real-world problems.
The performance of algorithms is also related to their exploration and exploitation of the search space. These aspects need to be studied in detail, and statistical tests and measures may assist here as well.
Each statistical value or estimate gives some insight into the performance of an algorithm; for instance, the standard deviation reflects the consistency, and hence the credibility, of the algorithm. Many insights may be gained by delving deeper into the analysis of the flower pollination algorithm and the algorithms based on this basic strategy with the help of statistical mechanisms.
Statistical analysis of an algorithm gives us concrete results regarding its performance and also tells us which algorithm performs better under which set of conditions. An important point to note is that not all statistical tests can be applied in all situations; different tests are appropriate in different scenarios for a variety of reasons. A greater number of applicable tests may be used to reveal deeper insights, so that better versions of the algorithms may be designed to solve a wider range of problems.
References
1. X.S. Yang, Flower pollination algorithm for global optimization, in Unconventional Computa-
tion and Natural Computation. UCNC 2012, ed. by J. Durand-Lose, N. Jonoska. Lecture Notes
Counterfactual Causal Analysis on Structured Data
Swarna Kamal Paul, Tauseef Jamal Firdausi, Saikat Jana, Arunava Das,
and Piyush Nandi
1 Introduction
Data collected in a business enterprise tells the whole story. Insights gained by obtaining causal answers for business events help in devising better future strategies. However, causality is an elusive concept, and absolute causality in a chaotic world is
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 187
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_16
188 S. K. Paul et al.
2 Related Work
TETRAD [13] is a diverse collection of causal analysis methods for climate study which focuses heavily on the statistical behavior of the input data, allowing users to investigate elements such as causal graphs, causal effects, feature engineering and simulations to derive a number of causal inferences. Rulex Explainable AI [12] focuses on explaining complex machine learning models in plain language; its proprietary algorithms create predictive models in the form of first-order conditional logic rules. Causality can also be established by generating counterfactual examples, which in turn can be obtained by solving the optimization problem of minimizing the distance between the original and counterfactual samples [7]. PermuteAttack [9] uses a genetic algorithm to create counterfactuals, which can then be used to establish causality. Artelt [10] created a Python toolbox called ceml for generating counterfactuals; it can be used to explain machine learning models and to find causal relations between variables. Mothilal et al. [11] proposed a framework for generating counterfactuals that satisfy feasibility conditions based on constraints and diversity among counterfactuals.
In the context of Artificial Intelligence (AI), Explainable AI (XAI) [2] can be defined as a set of methods and techniques for explaining the outcome of a black-box ML model. Here, XAI is used to find and explain the causal influence of multiple independent variables on the response variable. The proposed method creates the best possible black-box model in terms of the accuracy of predicting the response variable from the causal variables. The same model is then used to find the causal influence of each causal variable on the response variable by generating counterfactuals through perturbation.
Counterfactual examples are samples that are minimally modified with respect to the original sample so as to alter the value predicted by a model. Thus, counterfactual explanations provide statements of the smallest changes required to alter a certain predicted value or decision. The majority of current well-known XAI methods are feature-attribution based [6, 12]. Wachter et al. [7] proposed that generating counterfactual examples can be posed as an optimization problem that minimizes the distance between the original and counterfactual samples.
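A minimal illustration of this idea (not the authors' actual procedure): among candidate points, pick the one closest to the original sample whose predicted class differs. The classifier, sample and search grid below are toy stand-ins for a black-box model.

```python
def predict(x):
    """Toy black-box classifier: class 1 when the score exceeds 5.0."""
    return 1 if x > 5.0 else 0

def counterfactual(x0, candidates):
    """Closest candidate whose predicted class differs from that of x0."""
    flipped = [c for c in candidates if predict(c) != predict(x0)]
    return min(flipped, key=lambda c: abs(c - x0)) if flipped else None

x0 = 3.0                                # predicted class 0
grid = [i / 10 for i in range(0, 101)]  # candidate search grid on [0, 10]
print(counterfactual(x0, grid))         # → 5.1, the smallest change that flips the class
```

Real counterfactual generators replace the grid search with an optimizer over the full feature space and add feasibility constraints, but the objective is the same: a class flip at minimal distance.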
There are several limitations in the existing methods of generating counterfactuals [8] for the purpose at hand. Thus, an algorithm named Genfact is proposed for generating counterfactuals; it is model agnostic and based on gradient-free optimization. It can generate multiple counterfactuals at once and can perform amortized inference [8], making the process fast. Given a dataset, it can find counterfactual pairs closest to each other, and these pairs may not exist in the original dataset. This is useful here because the dataset used for generating counterfactuals may not contain enough samples around the classification boundary, whereas the proposed method can generate samples around the boundary.
Algorithm 1 states the Genfact algorithm for generating counterfactuals. The algorithm works for both categorical and numerical values. If the response variable is numeric, it is divided into C classes by defining ranges for each class. The encoded feature data is clustered into K clusters in order to group nearest neighbors, which in turn serve as the initial population for the genetic algorithm. Each cluster is assigned a normalized diversity score proportional to the entropy of the predicted classes of all samples in the cluster; a higher diversity score signifies a better mixture of samples from different classes. The genetic algorithm is run on each cluster, in decreasing order of normalized diversity score, until 40% of the samples are covered. The crossover operation handles both categorical and numerical variables and adjusts them so as to avoid creating non-feasible samples. Mutated numerical feature values are bounded by the minimum and maximum values in the sample set within the cluster, and categorical values are shuffled among the values available in the samples within the cluster. In this way, the algorithm satisfies the actionability property mentioned in [8]. The final output consists of counterfactual pairs of samples. PermuteAttack [9] also uses a genetic algorithm to create counterfactuals; however, it cannot perform amortized inference and generates counterfactuals only for the input sample. Moreover, no separate handling of categorical and numerical values is mentioned.
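The diversity-score step can be sketched as follows, with hypothetical predicted-class labels per cluster; the normalization scheme used here is one plausible reading of "normalized diversity score".

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of the class distribution in a cluster."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

clusters = {
    "c1": [0, 0, 0, 0],  # pure cluster: entropy 0
    "c2": [0, 1, 0, 1],  # evenly mixed: entropy 1
    "c3": [0, 0, 0, 1],
}
raw = {k: entropy(v) for k, v in clusters.items()}
total = sum(raw.values()) or 1.0
scores = {k: v / total for k, v in raw.items()}  # normalize across clusters
order = sorted(scores, key=scores.get, reverse=True)
print(order)  # clusters in decreasing diversity: ['c2', 'c3', 'c1']
```

Processing the most mixed clusters first concentrates the genetic search where samples of different classes sit close together, i.e. near the classification boundary.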
Algorithm 1
offspringsize ← offspringsize + 1
return facts

calculate_fitness(facts)
for each sample in facts
    fitness ← minimum Euclidean distance from other samples in facts having a different predicted class
    counterfacts ← the other sample in facts having minimum distance from sample and a different predicted class from that of sample
return (fitness, counterfacts)
mean squared error of each estimation. Only the statements with the top 3 sample sizes are considered.
Filtering out relevant statements is followed by shortening the statement length if required. A statement containing a categorical feature may have a long list of values which lengthens the statement, so the list of values is filtered based on their frequency of occurrence in the original data: the top 5 unique values with the highest frequencies are kept and the other values are omitted.
To provide evidence for the explanations, a data graph is generated by applying the filter conditions obtained from the explanations to the actual dataset and thereby summarizing it. The data evidence justifies the explanations and at the same time allows users to gain a deeper understanding of the entity-relationship dynamics. The entity-feature map and the entity relationships of the data need to be supplied to the evidence-graph generation algorithm. The entity relationships are used to create edges among feature variables such that edges run only between features whose corresponding entities are connected; however, edges always run between the response variable and each of the feature variables. The following algorithm is used to generate the evidence graph.
Algorithm 2
Filter dataset based on the top n conditions generated by the explainer tree
For each column C in dataset
    if C is numeric
        divide the values of C into k ranges such that each bucket contains at least N/k samples, where N is the total number of samples
        add each range to the nodelist with sample size as node size
    else if C is categorical
        select the top k values of C based on sample size and add them to the nodelist with sample size as node size
For each column C in dataset
    if C is the response variable
        for each node of type C in nodelist
            find and add edges to the edgelist with respect to all other node types in nodelist
            set edgeweight to the number of observed samples for the relation
    else if C is a feature variable
        for each node of type C in nodelist
            find and add edges to the edgelist with respect to other node types in nodelist satisfying the entity relation with respect to C
            set edgeweight to the number of observed samples for the relation
Normalize node size and edge weight
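The node-construction step of the algorithm above can be sketched as follows; the column data, the value of k and the handling of the last (possibly larger) numeric bucket are illustrative assumptions.

```python
from collections import Counter

def numeric_nodes(values, k):
    """Split values into k ranges with roughly N/k samples each; last bucket takes the remainder."""
    s = sorted(values)
    n = len(s)
    size = n // k
    nodes = []
    for i in range(k):
        chunk = s[i * size: n if i == k - 1 else (i + 1) * size]
        nodes.append(((chunk[0], chunk[-1]), len(chunk)))  # ((range lo, hi), node size)
    return nodes

def categorical_nodes(values, k):
    """Top-k categories by frequency, with counts as node sizes."""
    return Counter(values).most_common(k)

impressions = [10, 25, 40, 55, 70, 85, 100, 115]       # hypothetical numeric column
interest = ["a", "b", "a", "c", "a", "b", "d", "a"]    # hypothetical categorical column
print(numeric_nodes(impressions, 2))   # → [((10, 55), 4), ((70, 115), 4)]
print(categorical_nodes(interest, 2))  # → [('a', 4), ('b', 2)]
```

The edge step then connects these nodes, weighting each edge by the number of co-occurring samples, before node sizes and edge weights are normalized.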
Table 1 Comparison of counterfactual generation algorithms

Algorithm | Runtime in seconds | Average distance | Entropy
DiCE      | 7675.48            | 0.029            | 0.848
Ceml      | 1297.47            | 255,153.44       | 0.9729
Genfact   | 2.172              | 0.5030           | 0.8831
4 Experimental Results
In a second set of experiments, the dataset was run through the proposed method with XGBoost [4] serving as the black-box model, which provided the causal explanations and the evidence data graph. The top 3 generated causal statements are given below. It is evident that Impressions affect Total_Conversion the most, with Interest and Spentperclick the second most influential factors. In general, the higher the Impressions, the higher the Total_Conversion.
Explanation 1: If Impressions > 920,683.0 & Spentperclick > 1.483 and <= 1.673, then Total_Conversion will be in the range 10.63–15.29 [sample size: 16.8%].
Explanation 2: If Impressions > 453,229.125 and <= 920,683.0 & Interest is 16, 10, 29, 27 or 15, then Total_Conversion will be in the range 3.35–7.13 [sample size: 28.9%].
Explanation 3: If Impressions <= 453,229.125 & Interest is 16, 10, 29, 27 or 15, then Total_Conversion will be in the range 1.66–3.20 [sample size: 42.8%].
Figure 1 illustrates different sections of the data graph generated as evidence. Figure 1a shows the data graph with respect to the nodes "Total_Conversion" and "Impressions": in general, higher "Total_Conversion" values are related to higher "Impressions." Figure 1b shows the relations between "Total_Conversion" and "Interest": lower "Total_Conversion" values mostly have strong relations with "interest 16," "interest 15" and "interest 10."
5 Conclusion
As claimed, the proposed method has been demonstrated to find and explain causal relations of a KPI with respect to an arbitrary set of feature variables. The performance superiority of the counterfactual generation algorithm has also been established, as it was able to generate high-quality counterfactuals in a very short time; its efficiency is measured by the Euclidean distance between generated counterfactual pairs and the entropy of the predicted classes of the counterfactuals. The current work can be extended to encompass causal analysis of time-series data generated from complex dynamical systems.
References
1. J. Pearl, D. Mackenzie, The Book of Why: The New Science of Cause and Effect (Basic Books,
2018)
2. R. Goebel, A. Chander, K. Holzinger, F. Lecue, Z. Akata, S. Stumpf, A. Holzinger, Explain-
able AI: the new 42? in International Cross-Domain Conference for Machine Learning and
Knowledge Extraction (Springer, Cham, 2018), pp. 295–303
3. R. Andersen, Modern Methods for Robust Regression, in Quantitative Applications in the
Social Sciences (Sage Publications, Los Angeles, CA, 2008), p. 152
4. T. Chen, C. Guestrin, Xgboost: a scalable tree boosting system, in Proceedings of the 22nd acm
sigkdd International Conference on Knowledge Discovery and Data Mining (2016), pp. 785–
794
5. https://www.kaggle.com/chrisbow/an-introduction-to-facebook-ad-analysis-using-r
6. R.K. Mothilal, D. Mahajan, C. Tan, A. Sharma, Towards unifying feature attribution and
counterfactual explanations: different means to the same end. arXiv preprint arXiv:2011.04917
(2020)
7. S. Wachter, B. Mittelstadt, C. Russell, Counterfactual explanations without opening the black
box: automated decisions and the GDPR. Harv. JL Tech. 31, 841 (2017)
8. S. Verma, J. Dickerson, K. Hines, Counterfactual Explanations for Machine Learning: A
Review. arXiv preprint arXiv:2010.10596 (2020)
9. M. Hashemi, A. Fathi, Permute Attack: Counterfactual Explanation of Machine Learning
Credit Scorecards. arXiv preprint arXiv:2008.10138 (2020)
10. A. Artelt, CEML-Counterfactuals for Explaining Machine Learning models-A Python Toolbox
(2019)
11. R.K. Mothilal, A. Sharma, C. Tan, Explaining machine learning classifiers through diverse
counterfactual explanations, in Proceedings of the 2020 Conference on Fairness, Account-
ability, and Transparency (2020), pp. 607–617
12. https://www.rulex.ai/rulex-explainable-ai-xai/
13. J.D. Ramsey, K. Zhang, M. Glymour, R.S. Romero, B. Huang, I. Ebert-Uphoff, C. Glymour,
TETRAD—a toolbox for causal discovery, in 8th International Workshop on Climate
Informatics (2018)
Crime Analysis Using Machine Learning
Abstract Crime eradication and prevention have been a major challenge for most
developed countries. This paper deals with the analysis of criminal records from
the San Francisco crime dataset on Kaggle. Multiple machine learning approaches
are implemented on the data and compared, and the accuracy and performance of
each model are evaluated. It is shown that Linear SVC achieves the best results of
all the models considered. The inclusion of these methodologies in the investigation
broadens the search and lessens the risks for the police.
1 Introduction
Crime eradication and prevention have been a major challenge even for most developed countries. The incorporation of technical knowledge into all areas of improvement is a winning strategy for any country. Here, we apply that idea to crime prediction and prevention using machine learning techniques, which forecast the future from past crime records and other parameters; this is very useful in suppressing crime in the localities. The use of machine learning for real-time problems is not new; it has been applied by various countries in various fields and has succeeded on multiple occasions (Fig. 1).
The analysis of crime data is also not a new innovation, with data analysis and prediction done at various stages of forecasting, but the use of machine learning techniques is a recent addition to the field and is very accurate in comparison to traditional data forecasting techniques. This process of using previous data to predict future crimes stimulates a procedure to lessen
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 197
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_17
198 S. R. C. M. Akuri et al.
the risk and loss to the people by incorporating these techniques into the crime domain [1–4]. We deduce that this can be further extended by incorporating the process into the investigation model and extending the scope of the investigation for the police. Hence, we propose this inclusion and, through comparative analysis, find the best approach for prediction (Table 1).
Earlier, crime departments used to keep tabs on suspects who were likely to commit crimes and on areas where crime happened often. Later, surveillance systems came into existence, which changed the scope of keeping high-crime areas in check and safeguarding the surroundings (Fig. 2).
2 Methodology
2.1 Classification
Architecture
grants the flexibility of having more variation in the trees and achieves more accuracy in the model (Fig. 3).
Linear SVC is a classifier which returns the best-fit hyperplane from the input data while considering the features selected from it. The method only supports and executes a linear kernel, which is one of its drawbacks, but it has more flexibility with respect to penalty values and loss functions, giving it scope to perform well in scenarios with large data.
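A minimal hinge-loss sketch in the spirit of Linear SVC (not the actual solver behind scikit-learn's implementation): subgradient descent on the regularized hinge loss finds a separating hyperplane w·x + b. The toy data and hyperparameters are illustrative, not tuned.

```python
def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=500):
    """SGD on the L2-regularized hinge loss; labels must be +1 / -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # inside the margin: hinge-loss subgradient step
                w = [wj + lr * (yi * xj - 2 * lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:           # outside the margin: only regularization shrinkage
                w = [wj - lr * 2 * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

X = [[1.0, 2.0], [2.0, 3.0], [5.0, 6.0], [6.0, 7.0]]  # linearly separable toy data
y = [-1, -1, 1, 1]
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])  # → [-1, -1, 1, 1]
```

The linear-kernel restriction mentioned above corresponds to the fact that only a hyperplane in the original feature space is fitted here; no kernel trick is applied.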
2.2 Regression
Linear regression is a supervised learning algorithm which predicts a target value from independent variables; the relationship between input and output is given by

y = \theta_1 + \theta_2 x

and the model is trained by minimizing the mean squared error, depicted as the cost function J:

J = \frac{1}{n} \sum_{i=1}^{n} (\mathrm{pred}_i - y_i)^2

We use the tuning algorithm RidgeCV to mitigate multicollinearity; ridge regression adds an L2 penalty to the least-squares objective:

\min_{\theta} \; \|Y - X\theta\|^2 + \lambda \|\theta\|^2
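For the one-feature case, the equations above have a closed-form solution; the sketch below fits θ1 (intercept) and θ2 (slope) by least squares and evaluates the cost J on illustrative data.

```python
def fit_simple_linear(xs, ys):
    """Closed-form least-squares estimates for y = theta1 + theta2 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    theta2 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
              / sum((x - mx) ** 2 for x in xs))
    theta1 = my - theta2 * mx
    return theta1, theta2

def cost_j(theta1, theta2, xs, ys):
    """Mean squared error of the fitted line, i.e. the cost function J."""
    n = len(xs)
    return sum((theta1 + theta2 * x - y) ** 2 for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # exactly y = 1 + 2x, so J should be 0
t1, t2 = fit_simple_linear(xs, ys)
print(round(t1, 6), round(t2, 6), round(cost_j(t1, t2, xs, ys), 6))  # → 1.0 2.0 0.0
```

Ridge regression differs only in adding λ·θ² to the objective, which shrinks the slope toward zero and stabilizes the fit when features are correlated.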
The data is pre-processed and converted to a comma-separated file for further handling. Inconsistent and missing data is then filled in, and normalization of the data takes place, which gives us the training set for further analysis.
On the training set, we apply the decision tree, GaussianNB and Linear SVC algorithms and compare their results with each other in terms of accuracy and performance. It is shown that Linear SVC achieves the best results of all the models considered. We deduce that this can be further extended by incorporating the process into the investigation model (Fig. 4).
After the comparative analysis of the classification methodologies, we move on to the regression analysis of the data, which incorporates two models: linear regression and RidgeCV. We found that the linear regression model has higher efficiency than the latter; it achieves the best results of the regression models considered (Tables 2 and 3).
Table 3 Regression algorithm results

Algorithm         | MSE    | MSE of training set | MSE mean
Linear regression | 0.0192 | 0.46                | 0.12
RidgeCv           | 0.0189 | 1.77                | 0.36
4 Conclusion
The incorporation of machine learning into real-time problems is not new; it has been used successfully by various countries in many fields. Hence, we propose the same inclusion here: multiple machine learning approaches are implemented and compared on the data, and Linear SVC achieves the best results of all the models considered. We deduce that this work can be further extended by incorporating the process into the investigation model, broadening the search and lessening the risks for the police. As a future extension, additional modules could be added to develop an application for the crime department.
References
1. F. Afroz, S. Rajashekara Murthy, M.L. Chayadevi, Crime analysis and prediction using data
mining—cap a survey
2. A.A. Shmais, R. Hani, in Data Mining for Fraud Detection, Prince Sultan University, Saudi
Arabia
3. K. Deepika, S. Vinod, Crime analysis in India using data mining techniques. Int. J. Eng. Technol.
7(2.6) (2018)
4. G. Borowik, Z.M. Wawrzyniak, P. Cichosz, Time series analysis for crime forecasting (European
Union, 2018)
Multi-model Neural Style Transfer
(MMNST) for Audio and Image
Abstract Neural style transfer (NST) was created to give a new look to images, audio and videos through optimization and manipulation techniques. This field has recently picked up pace amongst the various techniques that deal with neural networks, and it has emerged as one of the most efficient means of producing style transfer. To address the shortcomings of existing systems, a multi-model neural style transfer (MMNST) approach for image and audio is proposed. It focuses on two kinds of data: audio and image. The main objective of the proposed system is to create artistic imagery by separating and recombining image content and style. For audio style transfer, two inputs are broken down, optimized, enhanced and finally combined in a fulfilling manner. Specifically, local and global features can be transferred using both parametric and non-parametric neural style transfer algorithms, resulting in an outcome that coalesces equal portions of the content and style inputs. For experimentation, VGG-19 (CNN) and TensorFlow Lite models are used. The proposed model outperforms the existing models in terms of accuracy, execution speed and the total loss incurred during the process.
1 Introduction
Neural style transfer refers to a family of software algorithms which manipulate digital images, audio and videos so as to adopt the appearance and visual style of other data. These algorithms are characterized by their use of deep neural networks to achieve image, audio and video transformation and manipulation. A fundamental use of NST is the creation of artificial artwork from photographs, for instance
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 205
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_18
206 B. Vishal et al.
2 Literature Survey
Cheng et al. [1] proposed a "structure-preserving neural style transfer" model. It applies state-of-the-art methodologies prevalent in neural networks to the field of neural style transfer, focusing primarily on deep learning algorithms to achieve maximum efficiency in style transfer; the model is trained using stochastic gradient descent to minimize the loss functions. Although it uses state-of-the-art methodologies, the paper does not test them across the various models available for style transfer. The paper [2] primarily focuses on two measures: effectiveness (E), which measures the degree to which a given style has been imprinted on the content image, and coherence (C), which measures the extent to which the content of the original image is preserved.
Chen et al. [3] use a faster DCNN approach that divides the attributes present in
the given inputs (a divide-and-conquer method), caters to every single divided entity,
and finally produces a meaningful outcome by putting all the entities back together.
The SANET model used in [4] captures both local and global style patterns within a
single framework while preserving the originality of the content. Although this is a
sound method, the emergence of more capable and efficient architectures such as VGG-16,
VGG-19, and ResNet left the SANET model routine and mundane, with little room for
improvement and growth. The approach of [5] uses data augmentation strategies such as
data warping and oversampling to tackle the primary issue of overfitting. Even though
that paper effectively delivers image transformation, it does not address the
underlying principle at hand,
being style transfer. It still contributes to areas such as image manipulation and
geometric transformations, but it does not bode well for style transfer.
In [6] and [7], the content and style inputs are separated and trained, and an
optimization technique is finally used to coalesce them into an efficient audio
style transfer output.
The only drawback of this methodology is that the model used has since been overshadowed
by newer models that proved more efficient in the style transfer domain.
Luan et al. [8] and Jing et al. [9] extended the work of Gatys on style transfer.
These approaches applied deep learning methodologies to the VGG-16 model to yield a
well style-transferred output; however, with the arrival of many more efficient style
transfer neural network models, this method became dated.
To summarize, most of the practices/papers mentioned above use CNNs to achieve
neural style transfer. Although all of the above approaches do well in producing good
neural style transfer outcomes, none of the papers really discusses which model is more
accurate and efficient. We have therefore decided to test these methodologies in more
than one model and report the efficiencies of each, so that those who attempt neural
style transfer in the future will have all the data necessary to choose an efficient
model and produce accurate results [10, 11]. We also plan to employ video style
transfer in order to increase the scalability of our model.
3 Proposed Work
Two models are used for image style transfer in this paper: VGG-19 and TensorFlow Lite.
Content Loss

\[ L_{content}(\vec{p}, \vec{x}, l) = \frac{1}{2} \sum_{i,j} \left( F_{ij}^{l} - P_{ij}^{l} \right)^{2} \]

where
\vec{p} is the original (content) image,
\vec{x} is the generated image,
l is the layer,
F_{ij}^{l} is the activation of the ith filter at position j in the feature representation of \vec{x} in layer l, and
P_{ij}^{l} is the activation of the ith filter at position j in the feature representation of \vec{p} in layer l.

Style Loss (for layer l)

\[ E_{l} = \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left( G_{ij}^{l} - A_{ij}^{l} \right)^{2} \]

where
\vec{a} is the original (style) image,
\vec{x} is the generated image,
L is the total number of layers,
w_{l} is the weighting factor of each layer's contribution to the total loss,
E_{l} is the loss in layer l, and
G_{ij}^{l} and A_{ij}^{l} are the Gram matrices of the feature representations of \vec{x} and \vec{a} in layer l, with N_{l} filters of size M_{l}.
Total Loss
L = αL content + β L style
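The loss computations above can be sketched numerically. Below is a minimal NumPy version for illustration only: in the paper these quantities are computed over VGG feature maps, whereas the feature arrays here are arbitrary placeholders.

```python
import numpy as np

def content_loss(F, P):
    """L_content = 1/2 * sum_ij (F_ij - P_ij)^2 for one layer's feature maps."""
    return 0.5 * np.sum((F - P) ** 2)

def gram_matrix(F):
    """Gram matrix G_ij = sum_k F_ik F_jk; F has shape (filters, positions)."""
    return F @ F.T

def layer_style_loss(F_generated, F_style):
    """E_l = 1/(4 N_l^2 M_l^2) * sum_ij (G_ij - A_ij)^2."""
    N, M = F_generated.shape          # N_l filters, M_l spatial positions
    G = gram_matrix(F_generated)
    A = gram_matrix(F_style)
    return np.sum((G - A) ** 2) / (4.0 * N ** 2 * M ** 2)

def total_loss(F_c, P, style_pairs, weights, alpha=1.0, beta=1e3):
    """L = alpha * L_content + beta * sum_l w_l * E_l."""
    L_style = sum(w * layer_style_loss(Fg, Fs)
                  for w, (Fg, Fs) in zip(weights, style_pairs))
    return alpha * content_loss(F_c, P) + beta * L_style
```

When the generated image's features equal the content and style targets, both losses vanish, which is the fixed point the optimizer moves toward.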
The main objective here is to adapt the "style transfer" concept to the audio domain.
Specifically, the aim is to transfer the style of one audio clip, labelled the "style", to
a different audio clip, labelled the "content", and to synthesize a brand-new audio clip
with the overall characteristics of the "style" while remaining loyal to the "content".
Through this approach, we take a step towards understanding the features
of raw music audio signals, such as style, melody, rhythm, and tempo, and efficiently
produce style transfer without losing any of the aforementioned parameters. In this
method, VGG-19 (a CNN) is used to carry out audio style transfer. The VGG-19
model separates the attributes of the content and style inputs, optimizes them using
the parameters designed in this approach, and eventually concatenates
the inputs to produce the output required by the user. Alongside the required libraries,
two additional libraries, librosa and soundfile, were included to read and write the
audio (Fig. 5).
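Audio style transfer of this kind typically operates on a magnitude spectrogram rather than the raw waveform. As a rough sketch of the representation librosa's STFT would provide, the spectrogram can be computed with plain NumPy; the frame length, hop size, and 440 Hz test tone below are arbitrary choices for illustration, not the settings used in the paper.

```python
import numpy as np

def magnitude_spectrogram(signal, n_fft=512, hop=128):
    """Frame the signal, apply a Hann window, and take |rFFT| of each frame.
    This is the representation an STFT-based audio style transfer works on
    (librosa.stft would normally compute this)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # shape: (n_fft // 2 + 1 frequency bins, n_frames time frames)
    return np.abs(np.fft.rfft(frames, axis=1)).T

# A one-second 440 Hz test tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The energy of the tone concentrates in the frequency bin nearest 440 Hz, which is the kind of structure the style loss compares across content and style spectrograms.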
To increase the user's control over the outcome, an optimization technique is
introduced through which the output can be customized to the user's needs.
Three parameters, ALPHA, learning_rate, and iterations, were introduced to
enable the user to steer the outcome. For instance, a larger ALPHA,
Fig. 5 Architecture of this method. Input A is content audio, input B is style audio then, we initialize
the output with input A. To optimize the output, we use both content loss and style loss
for example, implies more content in the output, whereas ALPHA = 0 indicates no
content, reducing stylization to pure texture generation (Fig. 6).
4 Results
The approaches used in this experiment have been rewarding because they allowed
us to control the outcome of the process.
This approach also allowed us to restrict the losses incurred during the process of
audio style transfer. The execution speeds have been higher than those of previous
models such as VGG-16, ResNet, and SANet (Figs. 7 and 8).
The audio style transfer technique likewise uses parameters to control the kind of
output desired by the user. Using the parameters ALPHA, learning_rate, and
iterations, we can control how much of the content or style audio is used to obtain
the output. In this way, multiple outputs can be obtained for the same set of input
data on the same platform, giving rise to scalability (Figs. 9, 10, 15; see also
Figs. 11, 12, 13, 14).
5 Conclusion
Thus, the multi-model neural style transfer (MMNST) approach for style transfer
was designed with the VGG-19 and TensorFlow Lite models and has been successfully
executed. This approach is comparatively more accurate and efficient, and it also
keeps the losses incurred during the process to a minimum.
The objective of producing a style transfer outcome has thus been attained in an
efficient and fulfilling manner.
Fig. 13 Spectrograms depicting content (left), style (middle) and result (right)
Fig. 15 Alternate spectrograms of content (1st row), style (2nd row) and result (3rd row) audio
files
References
1. M.-M. Cheng, X.-C. Liu, J. Wang, S.-P. Lu, Y.-K. Lai, P.L. Rosin, Structure-preserving neural
style transfer, in IEEE Transactions on Image Processing, vol. 29 (2020)
2. M.-C. Yeh, S. Tang, A. Bhattad, C. Zou, D. Forsyth, Improving style transfer with calibrated
metrics, in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV) (2020)
3. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A.L. Yuille, DeepLab: semantic image
segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs (2016)
4. P. Rathi, P. Adarsh, M. Kumar, Deep learning approach for arbitrary image style fusion
and transformation using SANET model, in 2020 4th International Conference Trends in
Electronics and Informatics (ICOEI) (2020)
5. C. Khosla, B.S. Saini, Enhancing performance of deep learning models with different
data augmentation techniques: a survey, in 2020 International Conference on Intelligent
Engineering and Management (ICIEM) (2020)
6. E. Grinstein, N.Q.K. Duong, A. Ozerov, P. Perez, Audio style transfer, in ICASSP—IEEE
International Conference on Acoustics, Speech and Signal Processing (2018)
7. Z. Huang, S. Chen, B. Zhu, Deep learning for audio style transfer
8. F. Luan, S. Paris, E. Shechtman, Deep photo style transfer, in 2017 IEEE Conference on CVPR
(July 2017)
9. Y. Jing, Y. Yang, Z. Feng, J. Ye, Y. Yu, M. Song, Neural style transfer: a review. IEEE Trans.
Vis. Comp. Graphics 26(11) (2020)
10. P. Li, D. Zhang, L. Zhao, D. Xu, D. Lu, Style permutation for diversified arbitrary style transfer.
IEEE Access 8 (2020)
11. A.J. Champandard, Semantic style transfer and turning two-bit doodles into fine artworks, in
nucl.ai Conference (Mar 2016)
12. Y. Zhu, Y. Niu, F. Li, C. Zou, G. Shi, Channel-grouping based patch swap for arbitrary style
transfer, in 2020 IEEE International Conference on Image Processing (ICIP) (2020)
13. W. Ma, Z. Chen, C. Ji, Block shuffle: a method for high-resolution fast style transfer with
limited memory. IEEE Access 8 (2020)
14. A. Levin, D. Lischinski, Y. Weiss, A closed-form solution to natural image matting. IEEE
Trans. Pattern Anal. Mach. Intell. (2008)
15. M. Pasini, MelGAN-VC: voice conversion and audio style transfer on arbitrarily long samples
using Spectrograms (2019)
Forecasting of COVID-19 Using
Supervised Machine Learning Models
1 Introduction
Over the last decade, machine learning has established itself as a prominent field of
study by solving complex problems. ML has applications in many fields, as listed in
Table 1.
ML algorithms play a major role in processing complex datasets such as COVID-19
data. These algorithms range from rule-based (if-else) approaches to trial-and-error
learning. Forecasting is one of the most important applications of ML [1]. Many
forecasting algorithms have been used for predicting specific diseases, such as
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 219
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_19
220 Y. V. B. Reddy et al.
Table 1 Applications of machine learning
Business | Healthcare | Smart vehicle applications | Natural language processing
Sports | Entertainment | Image processing | Climate change
Robotics | Voicing | Stock market prediction | Disease prediction
coronary artery disease [2, 3], cardiovascular disease [3], breast cancer [4], and in
particular COVID-19 [5]. This study forecasts the COVID-19 outbreak to support early
responses; it aids decision-making and helps manage the disease effectively.
1.2 COVID-19
The aim of this study is to create a model for predicting the spread of COVID-19, a
disease caused by a new coronavirus, SARS-CoV-2. The full name of COVID-19 is
coronavirus disease 2019; it was formerly referred to as the "2019 novel coronavirus".
The virus was first identified in the Chinese city of Wuhan at the end of 2019. Its
symptoms are listed in Table 2.
Because of how it spreads and the threat it poses, almost every country has declared
either a partial or a complete lockdown. Some people exhibit severe symptoms, while
others show no symptoms at all. Medical researchers have been working on vaccines
for the virus; some countries succeeded in vaccine trials and rolled out vaccination
in their respective territories. Even vaccinated people can still be infected, and some
cases ultimately lead to death [6]. To contribute in this situation, various researchers
are studying different dimensions of the pandemic to help humanity.
We aimed to design a COVID-19 forecasting system to help with the global
humanitarian crisis. For the 20-day outlook, we consider three main variables [7, 8]:
I. The overall confirmed cases
II. The overall recoveries
III. The overall deaths.
Supervised machine learning models are used for this forecasting. They are
Table 2 Symptoms of COVID-19
Fever | Coughing | Shortness of breath
Trouble breathing | Headache | Sore throat
Loss of smell or taste | Tiredness | Loss of speech
1. Linear Regression
2. Support Vector Machine
3. Exponential Smoothing.
The learning models were trained using Johns Hopkins University's COVID-19
patient statistics dataset, which is available on GitHub. The dataset was pre-processed
and split into two subsets: training and testing. Performance was evaluated in terms
of the following metrics:
• R-square score
• Mean Square Error
• Mean Absolute Error
• Root Mean Square Error.
2.1 Dataset
The aim of this research is to provide better insight into detecting and diagnosing
COVID-19 based on the overall case counts. Several ML algorithms are utilized to
analyze these datasets. The algorithms are applied to the GitHub repository dataset
provided by the Johns Hopkins Whiting School of Engineering.
This dataset covers countries across the world and contains day-wise confirmed cases,
day-wise death cases, and day-wise recovery cases. All data comes from the regular case
report, which is revised once a day [7] (Figs. 1, 2, and 3).
See Figs. 1, 2, and 3.
Supervised machine learning models deal with labeled data, i.e., datasets that contain
both input and output parameters. Supervised models are of two types: (1) regression
and (2) classification. A regressor is a tool used to train a model by regression; the
trained model then makes predictions on new input data. For model development, the
learning methods may use regression or classification algorithms.
For this study, three types of regression models are used. They are
• Linear Regression (LR)
• Support Vector Machine (SVM)
• Exponential Smoothing (ES).
y = mx + c
To get a better regression result, the best values of m and c must be found. This is
posed as a minimization problem: the gap between the real and the predicted values
should be as small as possible.
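Fitting y = mx + c by least squares and extrapolating a 20-day forecast can be sketched as follows; the daily case counts here are synthetic, made up purely for illustration.

```python
import numpy as np

# Hypothetical cumulative daily case counts (synthetic, for illustration only)
days = np.arange(60)
cases = 120.0 * days + 500.0 + np.random.default_rng(0).normal(0, 50, 60)

# Least squares chooses m and c to minimize sum((y - (m*x + c))**2)
m, c = np.polyfit(days, cases, deg=1)

# Forecast the next 20 days by extrapolating the fitted line
future = np.arange(60, 80)
forecast = m * future + c
```

With noisy data generated around a slope of 120, the fitted m lands close to 120 and the forecast continues that trend.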
SVM is a supervised machine learning technique. Both regression and classification
can be accomplished with SVM, though it is most commonly used to solve
classification problems. In a multi-dimensional space, the SVM model separates
different groups with a hyperplane [8], which is constructed in an iterative process
that reduces the error. For regression, the algorithm fits a linear function; when
dealing with non-linear regression problems, it first maps the input vector (x) into an
n-dimensional feature space (z) and then applies linear regression in that mapped
space.
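The idea of mapping inputs into a feature space and then applying linear regression can be sketched with an explicit polynomial feature map. This is a simplification for illustration: a real SVR uses kernels and an ε-insensitive loss, and in practice one would call a library implementation rather than hand-roll it.

```python
import numpy as np

def feature_map(x, degree=3):
    """Explicit feature map phi(x) = [1, x, x^2, x^3]: a simple stand-in for
    the non-linear mapping into the feature space z described above."""
    return np.vstack([x ** d for d in range(degree + 1)]).T

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 100)
y = 1 + 3 * x - 2 * x ** 2 + x ** 3 + rng.normal(0, 0.05, 100)  # non-linear trend

# Ordinary least squares, but in the mapped feature space
Phi = feature_map(x)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```

Because the mapped model contains the plain linear model as a special case, its training error can only be lower, which is the benefit the text attributes to the feature-space mapping.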
Exponential smoothing (ES) smooths time series data using an exponential window
function and is a commonly used technique for analyzing time-series data.
Forecasting in this model uses data from previous periods, and the influence of past
observations diminishes exponentially as time passes. The current-time smoothed
value is given by s_t = α x_t + (1 − α) s_{t−1}, where x_t is the current observation
and 0 < α < 1 is the smoothing factor.
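The simple exponential smoothing recurrence s_t = α·x_t + (1 − α)·s_{t−1} can be sketched directly; the series and the choice α = 0.3 below are arbitrary, for illustration only.

```python
import numpy as np

def exponential_smoothing(x, alpha=0.3):
    """Simple exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}.
    Older observations are weighted by powers of (1 - alpha), so their
    influence decays exponentially with time."""
    s = np.empty_like(x, dtype=float)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s

series = np.array([10., 12., 13., 12., 15., 16., 18., 17., 20., 21.])
smoothed = exponential_smoothing(series)
one_step_forecast = smoothed[-1]   # simple ES issues a flat next-period forecast
```

Larger α tracks recent observations more closely; smaller α produces a smoother, slower-moving estimate.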
2.3.1 R2 Score
R-squared is a statistical measure that provides insight into the goodness of fit of a
model. It is expressed on a scale of 0–100% and is also called the coefficient of
determination (or, with multiple predictors, the coefficient of multiple determination).
The goodness of the trained model is judged by its R² value:
• 0% means the model explains none of the variability of the response data around its mean.
• 100% means the model explains all of the variability of the response data around its mean.
2.3.2 Mean Absolute Error
MAE is the average of the absolute differences between the model predictions and the
real results on test data. Its range is 0 to infinity. Lower values indicate more effective
learning models, which is why MAE is known as a negatively oriented score.
2.3.3 Mean Square Error
MSE squares the gap between the data points and the regression line. Since squaring
eliminates negative signs, it gives greater weight to larger differences. The lower the
mean squared error, the closer the model is to the best-fit line.
2.3.4 Root Mean Square Error
RMSE is the standard deviation of the prediction errors. These errors, the distances
between the best-fit line and the original data points, are also called residuals. RMSE
thus measures how tightly the individual data points cluster around the best-fit line.
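The four evaluation metrics can be sketched directly from their definitions:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    """Mean squared error: squaring weights large residuals more heavily."""
    return np.mean((y_true - y_pred) ** 2)

def rmse(y_true, y_pred):
    """Root mean squared error: standard deviation of the residuals."""
    return np.sqrt(mse(y_true, y_pred))
```

A perfect prediction gives R² = 1 and MAE = MSE = RMSE = 0; MAE, MSE, and RMSE are the negatively oriented scores (lower is better), while R² is positively oriented.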
3 Methodology
This research was conducted in accordance with the COVID-19 predictions. COVID-19
has proven to be a current and imminent danger to human and animal life, resulting in
tens of thousands of deaths every day. This project aims to produce forecasts of death
cases, recovery cases, and confirmed cases in order to help monitor the pandemic
situation. After the initial planning phase, the data collection is divided into two
subsets: a training set and a test set. The aim of this study is to make predictions;
LR, SVM, and ES are the models used in this research. The numbers of infections,
deaths, and recoveries for the next twenty days were examined. The analytical data
include summary tables for each daily time series, with the number of confirmed cases
in the days following the pandemic's spread, deaths, and recovery data. The collection
of global data on the daily number of cases, deaths, and recoveries was processed at
the start of this study (Figs. 4, 5, and 6).
See Fig. 4.
See Fig. 5.
See Fig. 6.
The information was gathered from a GitHub source. The data is then pre-processed
and divided into two sets: training and testing. The training set is used to train the
model, and the testing set is used to evaluate it. Evaluation parameters are used to
assess accuracy (Fig. 7).
Fig. 8 Accuracy
5 Results
5.2 Death Cases Forecasting Shown in Figs. 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, and 29
See Figs. 21, 22, 23, 24, 25, 26, 27, 28, and 29.
6 Conclusion
Using machine learning models, this study builds a forecast method for predicting
the number of cases affected by COVID-19. The datasets used in this analysis provide
information from regular reports on new confirmed cases, new recoveries, and new
death cases. As the death and confirmation rates rise day by day, the world's condition
is deteriorating, and the number of people who could be affected by COVID-19 in
various countries around the world is unknown. The aim of this analysis is to forecast
the number of confirmed cases, recovered cases, and deaths over the next 20 days. The
three machine learning models LR, SVM, and ES are used to make this forecast of the
number of new cases, recoveries, and deaths.
Fig. 20 Accuracy
Fig. 21 Accuracy
Fig. 24 ES forecasts
recovery cases for the next
20 days
References
1. G. Bontempi, S.B. Taieb, Y.-A. Le Borgne, Machine learning strategies for time series
forecasting, in Proceedings of European Business Intelligence Summer School (2012), pp. 62–77
2. P. Lapuerta, S.P. Azen, L. Labree, Use of neural networks in predicting the risk of coronary
artery disease. Comput. Biomed. Res. 28(1), 38–52 (1995)
3. K.M. Anderson, P.M. Odell, P.W. Wilson, W.B. Kannel, Cardiovascular disease risk profiles.
Amer. Heart J. 121(1), 293–298 (1991)
4. H. Asri, H. Mousannif, H.A. Moatassime, T. Noel, Using machine learning algorithms for breast
cancer risk prediction and diagnosis. Procedia Comput. Sci. 83, 1064–1069 (2016)
5. F. Petropoulos, S. Makridakis, Forecasting the novel coronavirus COVID-19. PLoS ONE 15(3)
(2020)
6. L. van der Hoek, K. Pyrc, M.F. Jebbink, W. Vermeulen-Oost, R.J. Berkhout, K.C. Wolthers et al.,
Identification of a new human Coronavirus. Nat. Med. 10(4), 368–373 (2004)
7. R.S.M. Lakshimi, B.T. Rao, M.R. Murty, Predictive time series analysis and forecasting of
COVID-19 dataset, in Recent Patents on Engineering (2021)
8. V. Bhateja, A. Gautam, A. Tiwari, S.C. Satapathy, N.G. Nhu, D.N. Le, Haralick features-based
classification of mammograms using SVM (2018)
Feature Extraction from Radiographic
Skin Cancer Data Using LRCS
1 Introduction
Skin cancer has emerged as one of the most dangerous types of cancer diagnosed in
human beings. By classification, skin cancer can be divided into several categories,
chiefly melanoma, basal cell carcinoma, and squamous cell carcinoma. Of these,
melanoma is the most dangerous as well as the most unpredictable.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 239
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_20
240 V. S. S. P. R. Gottumukkala et al.
Even though its occurrence is only 4% of all types of skin cancer, its fatality rate
crosses 70%. It may be noted that image processing is widely used for the diagnosis of
such cancer. Dermoscopy is a non-invasive analytical procedure in which oil-immersion
technology helps in the visual examination of skin surface structure [1].
The diagnostic accuracy of melanoma detection quite often depends on the human
factor of the dermatologist's training and assessment. It is pertinent to note that the
diagnosis of melanoma is relatively complex to work out in the early stages. Automated
diagnostic tools can hence be more effective; beyond current dermoscopic techniques
are computer-aided identifications of melanoma based on artificial neural
networks [1, 2].
It is quite distressing to note that melanoma incidence in the USA has risen over the
past three decades, with fatalities due to accelerated metastases. Invasive melanoma
is nearly 1% of all types of skin cancer; however, it is responsible for a major
proportion of deaths related to skin cancer. In the USA, as many as 91,270 people were
diagnosed with melanoma in 2018, linked to 9320 fatalities. In this context, it is
essential to develop technologies for detecting malignant melanoma in the early
stages, as even experienced dermatologists find it difficult to diagnose properly.
Research work has continued on various techniques since the late 1990s towards
automated analysis of dermoscopy images to assist physicians in enhancing clinical
accuracy [1, 2]. With the advent of computer vision, many computerized tools have
been developed to aid the doctor in early diagnosis of skin cancer. The current approach
incorporates key phases for the identification of skin lesions, covering comparison
and grouping [2, 3].
It is expected that system-based research could reduce the time of diagnosis and
improve the precision of detection. Given high visual diversity and limited expert
knowledge, skin diseases are quite challenging to diagnose reliably, particularly in
developing and underdeveloped countries with weak health infrastructure. It is worth
noting that early detection reduces the chance of fatalities, and environmental
deterioration generally aggravates the possibility of skin cancer. The general phases
of these diseases are as follows: STAGE one, in situ disease, survival 99.9%; STAGE
two, high-risk disease, survival 45–79%; STAGE three, regional metastasis, survival
24–30%; STAGE four, remote metastasis, survival 7–19% [1, 4].
2 Associated Mechanisms
The literature survey conducted for this paper included twelve research papers by
various authors, covering different implementation methods. The significant
contributions are summarized as follows.
Angunane et al. [1, 5] addressed artificial neural networks and the AI sub-fields
covering computer vision. Artificial neural networks (ANNs) are useful in radiology,
urology, cardiology, and oncology, and they can help in learning complex features.
The main parameters of melanoma are asymmetry, boundary, colour, diameter, etc.;
based on a MATLAB procedure, the ABCD rules for melanoma skin cancer can be
coupled with an ANN classifier, which has to be trained for this goal. The precision
achieved is around 96.9% in classification. Lee et al. [2, 6] used 88 images of
melanoma for analysis and suitable segmentation. The three steps of their method
correspond to the traditional measures of sensitivity, precision, and consistency. It may
be noted that, for segmentation, the manual and proposed segmentations are compared
against the ground truth. The three measures effected a significant change in the
process of diagnosis.
Celebi et al. [7, 8] built an open-source application for convolutional neural network
layers. With the availability of affordable graphics processing units, image analysis
(e.g., of diabetic retinopathy) has of late become an essential area of study. The authors
presented a short review of the exciting sub-fields of segmentation and classification,
which can serve as practical directions for researchers.
Sugar et al. [9, 10] used a texture-based feature extraction approach, achieving a high
level of precision with an ANN classifier using a multilayer perceptron (MLP). On a
melanoma set of 23 pictures, the results for four unseen images ranged from 80%
accuracy at the lowest to 88.8% at the highest.
Pham et al. [11, 12] evaluated the classification outcomes of six classifiers along with
seven attribute techniques and four data pre-processing steps on two of the most
extensive skin cancer datasets. The framework, based on linear normalization of the
input as the data pre-processing step, HSV as the feature extraction device, and a
balanced random forest as the classifier, achieved on the HAM10000 dataset 81.46%
AUC, 74.5% precision, and 90.09% at the specified percentage of 72.84.
It is worth noting that many works have discussed melanoma skin cancer detection
based on feature extraction techniques; segmentation and border identification
methods can be combined with ANNs and support vector machines, especially for
the identification of lesioned regions. The first of the five parts of this research
endeavour is the introduction, followed by the literature survey. The third part
discusses the feature extraction, whilst the fourth part elaborates on the results and
inference of the feature selection mechanisms. The last part covers the conclusion.
3 Proposed System
It is known that the root cause of skin cancer is the abnormal growth of cells on the
skin. The major types of skin cancer are basal cell carcinoma, squamous cell
carcinoma, and melanoma. If these are not detected early, they spread in such a way
that they become hard to treat, leading to fatalities. Detection is mainly done early by
visual diagnosis along with clinical screening, followed by dermoscopic analysis,
histopathological assessment, and biopsy. With the advent of artificial intelligence
and machine learning, building a machine for early detection of cancer has become
fully feasible. Due to the fine-grained differences in the appearance of skin lesions,
automated classification from images is quite difficult. For ideal operation on such
finely grained objects, the images are pre-processed and features are extracted based
on lopsidedness, rim irregularity, colour intensity, and span (LRCS). For a skin cancer
prediction algorithm, it is pertinent to note that spreading to different parts of the
body makes the disease very dangerous. The LRCS features are used for prediction,
and dermoscopic images are used for identifying tumours. To achieve performance on
a par with experts across the globe, such a system must include pre-processing and
segmentation before the extraction of features for skin cancer classification.
The dataset used is that of the International Skin Imaging Collaboration (ISIC),
containing images of melanoma skin cancer. The ISIC project aims to improve early
diagnosis across all categories. There are 2300 photographs, from which batches of
100–150 images each are obtained and processed.
This section discusses the feature extraction from the skin cancer images, which are
stored as training set data; the LRCS architecture is shown in Fig. 1. The first step is
to take the input images and apply segmentation techniques. The LRCS standard
measures (lopsidedness, rim irregularity, colour intensity, and span) are then used, in
which lopsidedness means an uneven area and rim irregularity refers to irregular
boundaries; the colour intensity and span of the area are the other two parameters.
The algorithm steps are as follows.
1. Apply the segmentation and measure its overlap with the ground truth G:

\[ A_{i} = \frac{|X_{i} \cap G|}{|X_{i} \cup G|}, \quad i = 1, 2 \qquad (1) \]

2. Assume B, p_B, and B^{*} are the segmented lesion, the centre of the rectangle
circumscribing B, and the region lopsided to B with respect to p_B. Then n_a is
defined as:

\[ n_{a} = \frac{1}{2} \cdot \frac{|B \setminus B^{*}| + |B^{*} \setminus B|}{|B|} \qquad (2) \]

3. Assume \hat{B} is the convex hull of the lesion B, and \partial B and \partial \hat{B}
are the boundaries of B and \hat{B}, respectively. Then:

\[ n_{b} = \frac{1}{|\partial B|} \sum_{x \in \partial B} \operatorname{dist}(x, \partial \hat{B}) \qquad (3) \]
The above algorithm determines the skin cancer image's attributes: it applies
segmentation to identify the affected area and finds the lopsidedness and rim
irregularity. It also obtains the affected skin area's RGB values and density and
stores the attributes.
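The overlap measure in Eq. (1) is the Jaccard index between a segmented region and the ground-truth mask. A minimal sketch on toy binary masks (the masks below are invented for illustration):

```python
import numpy as np

def segmentation_accuracy(mask, ground_truth):
    """A_i = |X_i ∩ G| / |X_i ∪ G|: overlap between a segmented region X_i
    and the ground-truth lesion mask G (Jaccard index), per Eq. (1)."""
    mask = mask.astype(bool)
    ground_truth = ground_truth.astype(bool)
    inter = np.logical_and(mask, ground_truth).sum()
    union = np.logical_or(mask, ground_truth).sum()
    return inter / union if union else 1.0

# Tiny illustrative masks (1 = lesion pixel)
pred = np.zeros((6, 6), dtype=int); pred[1:4, 1:4] = 1   # 9 predicted pixels
gt   = np.zeros((6, 6), dtype=int); gt[2:5, 2:5] = 1     # 9 ground-truth pixels
score = segmentation_accuracy(pred, gt)                  # 4 overlapping / 14 in union
```

A score of 1 means the segmentation matches the ground truth exactly; values near 0 mean almost no overlap.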
In this section, the LRCS algorithm is applied to skin-cancer-affected images to obtain
the attribute orientation and extract the specific features, using the techniques of
segmentation and colour detection; the affected area and the irregularity of the skin
cancer are determined with OpenCV. The affected area of skin cancer is identified by
applying the GT mask with the segmentation technique, as shown in Fig. 2, which
illustrates identifying the area of skin cancer by applying the normal mask, GT mask,
and predicted mask in the segmentation technique.
The second step is to determine the location affected by skin cancer using
segmentation, followed by applying lopsidedness, rim irregularity, colour intensity,
and span as the feature extraction techniques. In this step, we first find the asymmetry
of the area of the skin cancer; the corresponding diagrams are shown in Fig. 3a, b.
Second, we find the area density using segmentation with masks, shown in Fig. 3c,
and finally, we find the centre of the location, shown in Fig. 3d.
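One simple way to realize horizontal and vertical asymmetry on a binary lesion mask is to compare the mask with its mirror images. This is an illustrative stand-in for the paper's lopsidedness measure, not its exact formula.

```python
import numpy as np

def asymmetry(mask):
    """Horizontal and vertical asymmetry of a binary lesion mask:
    fraction of pixels that do not overlap with the mask's mirror image
    about the vertical / horizontal axis, normalized by the lesion area."""
    mask = mask.astype(bool)
    area = mask.sum()
    h = np.logical_xor(mask, mask[:, ::-1]).sum() / (2.0 * area)  # flip left-right
    v = np.logical_xor(mask, mask[::-1, :]).sum() / (2.0 * area)  # flip up-down
    return h, v

# Example: an off-centre rectangular "lesion" in an 8x8 image
lesion = np.zeros((8, 8)); lesion[1:6, 2:5] = 1
h_asym, v_asym = asymmetry(lesion)
```

A perfectly symmetric, centred lesion scores 0 on both axes; a lesion entirely on one side of the axis scores 1.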
After completion of the third and fourth steps, the values are obtained and the features
of the image are extracted: image id, area of the affected region, perimeter, maximum
diameter, minimum diameter, horizontal asymmetry, vertical asymmetry, max red, max
green, max blue, min red, min green, min blue, hue, saturation, and value, as listed
in Table 1.
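The colour portion of this feature vector (per-channel max/min and the HSV of the lesion's mean colour) can be sketched as follows; the field names are illustrative, not the paper's exact schema.

```python
import colorsys
import numpy as np

def colour_features(rgb_image, mask):
    """Per-channel max/min over the lesion region plus the HSV of the mean
    colour: the colour-intensity part of an LRCS-style feature vector."""
    pixels = rgb_image[mask.astype(bool)]   # shape: (n_lesion_pixels, 3)
    feats = {
        "max_red": int(pixels[:, 0].max()),   "min_red": int(pixels[:, 0].min()),
        "max_green": int(pixels[:, 1].max()), "min_green": int(pixels[:, 1].min()),
        "max_blue": int(pixels[:, 2].max()),  "min_blue": int(pixels[:, 2].min()),
    }
    mean_rgb = pixels.mean(axis=0) / 255.0   # normalize to [0, 1] for colorsys
    feats["hue"], feats["saturation"], feats["value"] = colorsys.rgb_to_hsv(*mean_rgb)
    return feats
```

Restricting the statistics to the masked lesion pixels keeps healthy surrounding skin from diluting the colour-intensity features.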
Table 2 indicates the 11 features of the image obtained by randomly applying
different area values to the dataset. Generally, the area of a skin cancer image gives
information about whether that particular part has skin cancer or not, and it is
determined using the 11 features mentioned in Table 2. Figure 4 plots these 11 features
for the randomly applied areas; the best feature values were obtained at area 152,221
and the worst at area 258,643. As indicated in the graph, each rectangular box is
shown in a different colour for the 11 features across the different areas applied to
the dataset (Fig. 4).
The perimeter indicates the extent of the skin cancer boundary, and different
perimeter values are obtained for different areas. Table 1 lists the 11 features of the
image with the different perimeter values obtained by randomly applying areas to
the dataset, and Fig. 5 plots them. The best feature values were observed at perimeter
9.66 and the worst at perimeter 30.73. The graph shows each rectangular box in a
different colour for the 11 features across the different perimeter values (Fig. 5).
Features of the skin cancer image were also extracted with the maximum diameter fixed at 600. The maximum and minimum diameters indicate the range and depth of the skin cancer. In this algorithm, a maximum diameter of 600 is used because it is the highest value for severe skin cancer, and a minimum diameter of 450 because it is the highest value for normal skin. Table 1 shows the 11 feature values for a maximum diameter of 600 with different area values, plotted in Fig. 6, and the 11 feature values for a minimum diameter of 450 with different area values, plotted in Fig. 7.
Feature Extraction from Radiographic Skin Cancer Data Using LRCS 245
Fig. 2 The normal mask, GT mask and predicted mask segmentation techniques applied to find the affected area
246 V. S. S. P. R. Gottumukkala et al.
Fig. 3 a, b Asymmetry of the skin cancer area; c area density found by applying the segmentation mask; d colour density and centre of the lesion location
The features extracted from the dataset demonstrate the effectiveness of the LRCS parameters. With these datasets, classification can conveniently determine the exact nature of the cancer, and the affected area, severity and depth of the cancer at a particular location can also be ascertained.
5 Conclusions
In skin cancer, early detection is the most critical factor for the health sector as well as for the patient, and object detection methods are suitable for diagnosing skin cancer characteristics. This paper has focussed mainly on extracting LRCS parameters as features to identify attributes such as maximum diameter, minimum diameter, horizontal asymmetry, vertical asymmetry, maximum green, maximum blue, maximum red and structure values. A customized algorithm has been developed for convenient extraction of the uneven affected skin area, and the colour density of the affected area has been extracted from around 2500 images; the resulting attribute values are stored as datasets. In future, classification of the feature-enhanced data can be carried out using a suitable classification algorithm.
Table 1 Extracted features of the skin cancer images in the dataset
Area Perimeter MaxDia MinDia h_asym v_asym MaxR MaxG MaxB MinR MinG MinB H S V
210,136 881.74 600 450 0.367 0.562 255 255 255 0 2 25 14 140 90
188,162 48.63 600 450 0.079 0.079 254 241 255 0 0 0 12 140 89
190,534 30.73 600 450 0.27 0.27 216 209 238 0 0 0 11 140 90
246,518 49.56 600 450 0.742 0.192 255 243 250 11 11 44 15 140 90
152,221 71.94 600 450 0.894 0.631 214 211 255 0 0 52 9 255 147
239,795 709.31 600 450 0.4 0.418 249 231 245 0 2 45 9 255 147
221,596 15.9 600 450 0.279 0.687 211 193 252 35 24 101 2 187 147
221,633 9.66 600 450 0.634 0.012 210 199 253 6 3 27 5 221 147
215,275 14.83 600 450 0.394 0.754 190 182 240 17 20 77 6 207 147
250,307 574.5 600 450 0.719 0.154 255 243 242 7 0 7 7 137 88
258,643 406.37 600 450 0.281 0.313 234 214 226 2 9 37 14 251 147
171,692 32.38 600 450 0.361 0.452 231 222 229 3 4 28 1 122 88
233,875 61.94 600 450 0.099 0.27 242 228 255 0 7 25 11 139 90
193,000 43.21 600 450 0.013 0.013 187 172 251 23 16 98 5 201 147
Table 2 Features of the image of skin cancer by variation of area into consideration
Area h_asym v_asym MaxR MaxG MaxB MinR MinG MinB H S V
210,136 0.367 0.562 255 255 255 0 2 25 14 140 90
188,162 0.079 0.079 254 241 255 0 0 0 12 140 89
190,534 0.27 0.27 216 209 238 0 0 0 11 140 90
246,518 0.742 0.192 255 243 250 11 11 44 15 140 90
152,221 0.894 0.631 214 211 255 0 0 52 9 255 147
239,795 0.4 0.418 249 231 245 0 2 45 9 255 147
221,596 0.279 0.687 211 193 252 35 24 101 2 187 147
221,633 0.634 0.012 210 199 253 6 3 27 5 221 147
215,275 0.394 0.754 190 182 240 17 20 77 6 207 147
250,307 0.719 0.154 255 243 242 7 0 7 7 137 88
258,643 0.281 0.313 234 214 226 2 9 37 14 251 147
171,692 0.361 0.452 231 222 229 3 4 28 1 122 88
233,875 0.099 0.27 242 228 255 0 7 25 11 139 90
193,000 0.013 0.013 187 172 251 23 16 98 5 201 147
Fig. 6 Comparison of features of the skin cancer image taking the maximum diameter as 600
Shared Filtering-Based Advice of Online Group Voting
1 Introduction
A user not only shares updates with direct friends in the form of text, images and video, but can also spread such updates easily to a much broader range of indirect friends, using the rich connectivity and worldwide reach of common online social networks (OSNs). Many OSNs now provide a social voting capability that enables users to share their views with peers, for example, liking or disliking different topics, from user statuses to profile photos, sports played, goods bought and websites visited. Going one step further, some OSNs empower users to run their own voting campaigns, with user-specified voting choices, on any issue of their preference. In practice, social voting often carries commercial value beyond enriching the immediate social experience: advertisers may launch votes for the brands to be
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 251
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_21
252 M. Kalyan and M. Sandeep
advertised, product managers may initiate market-research votes, and electronic commerce owners may strategically launch votes to draw more online buyers [1]. As social voting becomes more and more common, the problem of "information overload" arises: a user is quickly overwhelmed by the many votes conducted by direct and indirect peers. Presenting the "right votes" to the "right people" is crucial, and difficult, for optimizing the user experience and increasing user involvement in social voting. Recommendation systems (RSs) address information overload by recommending items that may interest users. In this article, we present our recent efforts to improve RSs for social voting, that is, to suggest interesting voting campaigns to users. Unlike usual recommendation items, such as books and films, social votes propagate along social connections: a user is more likely to be exposed to a vote if friends initiate, participate in or re-share it [2, 3]. The popularity of a vote in one's social neighbourhood is therefore heavily correlated with one's voting behaviour. Social propagation also carries social influence: if a friend participates in a vote, a user becomes more likely to vote as well. Because of social transmission and social influence, a user's voting actions are closely associated with those of social friends. Exploiting social trust information creates special obstacles and opportunities for RSs. Moreover, voting data are binary, without negative samples. This makes the development of RSs for social voting especially interesting. To overcome these challenges, we develop a range of novel RS models, including matrix factorization (MF) models and neural network (NN) models, to learn user interest while jointly mining user voting, user friendship and user-group affiliation information [4–6].
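As a minimal sketch of the MF side only (the paper's actual models additionally mine friendship and group-affiliation information, which is omitted here), plain matrix factorization on binary voting data with sampled negatives could look like the following; the dimensions, learning rate and toy trace are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mf(votes, n_users, n_items, k=8, lr=0.05, reg=0.01, epochs=200):
    """Matrix factorisation by SGD on (user, vote, label) triples.
    Label 1 means the user participated in the vote; 0 is a sampled negative."""
    P = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
    Q = 0.1 * rng.standard_normal((n_items, k))   # vote latent factors
    for _ in range(epochs):
        for u, i, r in votes:
            pu = P[u].copy()
            err = r - pu @ Q[i]                    # squared-loss gradient step
            P[u] += lr * (err * Q[i] - reg * pu)
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

# Tiny toy trace: users 0 and 1 both joined vote 0; user 2 joined vote 1
data = [(0, 0, 1), (1, 0, 1), (2, 1, 1), (0, 1, 0), (2, 0, 0)]
P, Q = train_mf(data, n_users=3, n_items=2)
print(P[1] @ Q[0], P[1] @ Q[1])   # user 1 should score vote 0 above vote 1
```

The predicted affinity of user u for vote i is the dot product P[u] @ Q[i]; a social-aware variant would add regularization terms tying friends' factors together.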
2 Problem Statement
3 Proposed Methodologies
Online social voting has not been studied much before our work. We build both MF-based and NN-based RS models. Experiments on real social voting traces reveal that information about both the social network and group affiliations can be used to greatly boost the accuracy of popularity-based voting recommendation [4]. Our studies of the NN-based models show that social network information dominates group affiliation information, and that social and group information is more valuable for cold users than for heavy users. We also demonstrate that simple path-based NN models outperform computation-intensive MF models for hot votings, while MF models can more effectively mine users' interests for non-hot votings.
4 Enhanced System
The admin logs in with a valid username and password. After a successful login, operations such as authorizing accounts, listing users and their authorization, viewing all friend requests and answers, adding mail, viewing all video posts, displaying all recommended posts, viewing all reviewed posts, viewing the collaborative filtering history, finding the top-K hit rate in a table and viewing all user search histories may be conducted. The admin can see every friend request and reply; for each request and reply, the tag ID, the requesting user's photo and name, the requested user's name, the status and the date and time are shown. The status is changed to approved if the recipient approves the request; otherwise it remains pending. The admin can also see all friends who come from the same website, with details including Request From, Website Request, Name Request and Website Request. The admin can view all posts shared by friends on the same and other network pages, with information such as the post image, cover, description and recommendation name. The admin attaches information such as the title, summary and posting picture [5]; details such as the title and summary are encrypted and saved in a folder. Any number of users can be supported. Before carrying out any activities, users must register; upon registration, the information is saved in the database. After successful registration, the user logs in with the approved username and password. Once login succeeds, operations such as viewing the profile, sending friend requests, finding friends, seeing all friends, searching posts, viewing the search history, viewing recommendations, viewing user interests in posts and viewing the top-K hit rate can be performed. Users browse the same website for other users and send friend requests across different websites; a user can find people on other pages only with permission to make friends [6], as shown in Fig. 1.
5 Conclusion
We have proposed a range of MF-based and NN-based RSs for online group voting. Through studies on real data, we found that social network information and group affiliation information can considerably increase the accuracy of popularity-based voting recommendation, particularly for cold users, and that social network information dominates group affiliation information within the NN-based approaches. This paper shows that social and group information is significantly more useful for increasing recommendation accuracy for cold users than for heavy users, because cold users are more likely to take part in popular votes. In our tests, simple path-based NN models outperform computation-intensive MF models for hot votings, although users' preferences for non-hot votings are best exploited by MF models. This paper is only our first step toward a comprehensive study of social voting recommendation. As immediate future work, we would like to research how voting content, particularly for cold votes, can be exploited for recommendation. Given the availability of multichannel information about social neighbourhoods and activities, we are also interested in building RSs for individual users.
References
1 Introduction
Effective keyword recommendation approaches are based on click information from database query logs, query session data or query topic templates. New keyword suggestions are chosen according to their semantic relevance to the original keyword query. The semantic relevance between two keyword queries can be determined from the overlap of their clicked URLs in a query log, from their proximity in a bipartite graph that links keyword queries to their clicked URLs in the query log, or from their similarity in a topic distribution space. However, none of the available methods provide location-aware query suggestions, in which the suggested keyword queries not only match the user's information
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 257
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_22
258 A. Sreelekha and P. Dileep
needs but also retrieve documents near the user's location. This necessity arises from the prevalence of spatial keyword search, which takes the user location and the supplied keyword query as arguments and returns spatially near and textually relevant objects. In 2011, Google handled an average of 4.7 billion queries daily, many of which had local intent and targeted website items (i.e., websites with text descriptions) or geo-documents (i.e., documents associated with geo-locations). In addition, 53% of Bing's mobile searches in 2011 were found to have a local intent. We propose a location-aware keyword query suggestion framework (LKS) to fill this gap. We illustrate the advantage of LKS with a toy example. Consider five geo-documents d1–d5, each linked to a location. Assume a user issues the keyword query kq = "seafood" at location λq. The documents d1–d3 (which contain "seafood") are far from λq, whereas documents d4 and d5, associated with the keyword "lobster", are both relevant to the user's original search intention and located nearby. Earlier keyword query suggestion models disregard the user location and therefore fail to retrieve the documents in the vicinity. Note that LKS has a different purpose and, therefore, differs from other location-aware recommendation approaches. The first challenge in our LKS framework is how to measure keyword query similarity accurately while capturing the spatial distance factor. Like previous query suggestion approaches, LKS uses a bipartite keyword-document graph (KD-graph for short) to connect keyword queries to their clicked documents, as shown in Fig. 1c. Contrary to all past methods, which neglect locations, LKS adjusts the weights of the KD-graph edges so that they capture both the semantic relevance between keyword queries and the spatial distance between the document locations and the query issuer's location λq. To find the set of m keyword queries with the highest semantic relevance to kq and spatial proximity to the user location, we use random walk with restart (RWR), starting from the query kq issued by the user. RWR on a KD-graph has been shown superior to alternative methods and is a common technique in many (location-independent) keyword recommendation studies.
2 Problem Statement
Despite its impact and significance, competitiveness assessment has been addressed by appropriate approaches only for small workloads. In this article, we propose a formal definition of the competition between two products, based on the market segments that both can cover. Our competitiveness assessment uses customer reviews, a rich source of information available in a variety of domains. We propose efficient approaches to evaluate competitiveness in large review datasets and to address the natural problem of identifying the top-k competitors of a given item. Finally, the accuracy of our findings and the scalability of our strategy were assessed with datasets from several domains; an experimental evaluation on real datasets validated the efficiency of our approach. We are given a set I of n products and a set of features. Given a single item i ∈ I, we want to identify the k items in I that maximize the competitiveness function CF with respect to i [1–3]. Furthermore, a naive MapReduce implementation would face the difficulty of moving everything into a single reducer to account for the aggregation in the computation.
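The top-k competitor selection can be sketched generically as below; the competitiveness function cf used here (feature-set overlap) is a stand-in assumption, not the paper's actual definition, and the item names are hypothetical.

```python
import heapq

def top_k_competitors(target, items, cf, k=2):
    """Return the k items with the highest competitiveness score against `target`.
    `cf` is any pairwise scoring function supplied by the caller."""
    return heapq.nlargest(k, (x for x in items if x != target),
                          key=lambda x: cf(target, x))

# Hypothetical scores: the overlap of the feature sets two products cover
features = {"a": {1, 2, 3}, "b": {2, 3}, "c": {3}, "d": {4}}
cf = lambda i, j: len(features[i] & features[j])
print(top_k_competitors("a", features, cf))   # ['b', 'c']
```

Using heapq.nlargest keeps the selection at O(n log k) rather than sorting all n candidates, which matters at the dataset scales the paper targets.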
3 Proposed Methodologies
4 Enhanced System
A correct username and password are needed for the admin. After successfully logging in, the admin can perform procedures including viewing and authorizing all users and their data; adding hotels (hotel name, location, area name, item name, item price, item description, item image, number of rooms available, room charge, distance from location) and adding malls (mall name, location, zone name, mall description, mall specialization, mall picture, distance from location); viewing all hotel information, comments and ratings; viewing all mall information; viewing all hotel booking and payment details; viewing the results chart of hotel and mall ratings; and viewing the top-K searched keywords. There can be any number of users. Before performing any activities, users must register and submit their position during registration. With a correct username, password and location, a user can log in after successful registration. After login is complete, the user can conduct several operations, such as viewing profile information, creating and managing an account, checking neighbouring hotels and malls from the current location on GMap, commenting, booking hotels and displaying the top-K searched keywords. The mall details comprise the mall name, location, area name, mall description, mall specialization, mall image and distance from the location, as shown in Fig. 1.
5 Conclusion
References
1 Introduction
Cloud computing is one of the fastest-growing technologies; it can provide different types of services, i.e., storage, compute and network, to all subscribed users in a seamless fashion by using virtualization. According to NIST [1], "cloud computing can be defined as on-demand network access to a shared pool of configurable computational resources." This paradigm mainly gives access to the resources in the cloud in the form of different types of services. A cloud computing architecture mainly needs an application that runs from a browser
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 263
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_23
264 S. Mangalampalli et al.
2 Related Works
these parameters. In this paper, we propose a prioritized load balancer for minimizing VM and data transfer costs in cloud computing [15–17].
The proposed prioritized load balancer maps tasks arriving at the cloud console to corresponding VMs based on the calculated task priorities. Task priority is calculated from the length of the task and the processing capacity it requires. Tasks arriving at the cloud console first reach the data center controller and are given as input to the prioritized load balancer, which calculates their priorities from task size and processing capacity. Initially, tasks are inserted into a queue; the load balancer then checks the length and runtime processing capacity of each task and assigns it to an appropriate VM based on the calculated priority.
If a task arrives with high priority based on its length and processing capacity, the load balancer looks for a VM suitable for that task; all tasks in the queue are assigned based on this priority calculation, and the process continues until every task in the queue is completed, as shown in Fig. 1.
Case 1: if a VM is found
a. The prioritized load balancer returns the corresponding ID of that VM to the DC controller.
b. The DC controller sends a request to the VM identified by the prioritized load balancer, using that ID.
c. The DC controller also notifies the prioritized load balancer about the VM allocation.
d. The prioritized load balancer updates the allocation of the VM.
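A rough sketch of the priority-based mapping follows; the concrete priority formula (task length divided by processing capacity, i.e., estimated runtime) is an assumption, since the paper only names the two factors, and the task and VM names are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    priority: float
    name: str = field(compare=False)

def make_task(name, length_mi, vm_mips):
    """Priority derived from task length (million instructions) and the
    processing capacity it runs on; the exact formula is an assumption."""
    return Task(priority=length_mi / vm_mips, name=name)

def dispatch(tasks, vms):
    """Pop the lowest-priority-value (shortest-runtime) task first and
    hand it to the VM with the smallest accumulated load."""
    heapq.heapify(tasks)
    load = {vm: 0.0 for vm in vms}
    plan = []
    while tasks:
        t = heapq.heappop(tasks)
        vm = min(load, key=load.get)       # least-loaded VM, as the DC controller would pick
        load[vm] += t.priority
        plan.append((t.name, vm))
    return plan

tasks = [make_task("t1", 4000, 1000), make_task("t2", 1000, 1000),
         make_task("t3", 2000, 1000)]
print(dispatch(tasks, ["vm0", "vm1"]))
```

In the simulated system this decision would be made by the data center controller after the load balancer returns the chosen VM's ID (Case 1 above).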
The simulation is carried out in the Cloud Analyst simulator [17], a GUI-based simulator built on CloudSim (Table 1).
In this paper, we have integrated the prioritized load balancer algorithm into Cloud Analyst by extending its data center class. The primary objective of the proposed algorithm is to map incoming jobs using task priorities and then assign an appropriate VM in the data center. We have addressed the metrics of response time, virtual machine cost and data transfer cost at the data centers.
The data center settings for this simulation are 4 GB of RAM, 100 GB of data storage, 4 CPUs per host and a processing capacity of 10 K MIPS. Initially, we analyzed the load balance of the VMs and the response time, and then determined the VM and data transfer costs in the corresponding data center.
Fig. 2 Load balancing and usage of VMs for different load balancers
In this algorithm, our primary objective is to balance the VMs by assigning tasks based on their priorities (Fig. 2), calculated from task length and processing capacity. In this simulation, we compared our proposed load balancer with the existing round-robin (RR) and throttled algorithms. Table 2 represents the usage of the different VMs under each load balancer.
After calculating VM usage, we calculated the response time, an important parameter for any scheduling and load balancing algorithm, as shown in Fig. 3. We compared our algorithm against the RR and throttled load balancers, and the simulation results revealed that the proposed algorithm outperforms the existing algorithms in terms of response time (Table 3).
Prioritized Load Balancer for Minimization … 269
Finally, we calculated the total VM cost and data transfer cost for the above 60-min simulation, along with the VM costs for the existing RR and throttled load balancers (Fig. 4). Table 4 represents the total cost of the data center, i.e., the sum of the data transfer cost and the VM cost.
References
10. J.P. Mapetu, Z.C. Buanga, L. Kong, Low-time complexity and low-cost binary particle swarm
optimization algorithm for task scheduling and load balancing in cloud computing. Appl. Intell.
49(9), 3308–3330 (2019)
11. M. Lagwal, N. Bhardwaj, Load balancing in cloud computing using genetic algorithm, in 2017
International Conference on Intelligent Computing and Control Systems (ICICCS) (IEEE,
2017)
12. S. Mohanty et al., A novel meta-heuristic approach for load balancing in cloud computing, in
Research Anthology on Architectures, Frameworks, and Integration Strategies for Distributed
and Cloud Computing (IGI Global, 2021), pp. 504–526
13. Y. Fahim et al., Load balancing in cloud computing using meta-heuristic algorithm. J. Inf.
Process. Syst. 14(3) (2018)
14. A. Ragmani et al., An improved hybrid fuzzy-ant colony algorithm applied to load balancing
in cloud computing environment. Proc. Comput. Sci. 151, 519–526 (2019)
15. B. Mallikarjuna, P. Venkata Krishna, A nature inspired bee colony optimization model for
improving load balancing in cloud computing. Int. J. Innov. Technol. Exp. Eng. 8, 51–54
(2018)
16. M. Lawanyashri, S. Subha, B. Balusamy, Energy-aware fruitfly optimisation algorithm for load
balancing in cloud computing environments. Int. J. Intell. Eng. Syst. 10(1), 75–85 (2017)
17. B. Wickremasinghe, R.N. Calheiros, R. Buyya, CloudAnalyst: a CloudSim-based visual modeller
for analysing cloud computing environments and applications, in 2010 24th IEEE International
Conference on Advanced Information Networking and Applications (IEEE, 2010)
Smart Underground Drainage Management System Using Internet of Things
1 Introduction
The drainage system performs a significant function in huge urban areas where millions of people live. The fundamental purpose of the drainage
K. V. M. Mohan (B)
Department of ECE, Teegala Krishna Reddy Engineering College, Hyderabad, Telangana, India
K. M. V. M. Kumar · S. Kodati
Department of CSE, Teegala Krishna Reddy Engineering College, Hyderabad, Telangana, India
G. Ravi
Department of CSE, MRCET, Hyderabad, Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 273
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_24
274 K. V. M. Mohan et al.
system is to keep the land dry by carrying away unused water and rain-water overflow. To maintain proper functionality, the drainage environment has to be observed. However, not every area has a drainage observing group, which leads to irregular checking of the drainage condition. This irregular monitoring contributes to blockages in the drainage, which in turn trigger flooding in the area. Since manual checking requires many people who can only file limited reports with low accuracy, it is not very effective either. The drainage checking system is not automated; consequently, whenever there is a blockage, it is hard to sort out its particular location. Likewise, early warnings about a blockage are not received, and it becomes very inconvenient to deal with the situation once the pipes are blocked completely. People face many problems and need support because of the leakages in the drainage lines [1].
The problems encountered with this type of drainage line can turn daily city life into serious trouble. Problems such as clogging due to waste, sudden rises in the water level and various polluting gases can arise if the best possible cleaning actions are not carried out in time. The current drainage system is not automatic, as it is difficult to tell whether there is a blockage at a particular position. In addition, various gases such as methane (CH4) and carbon monoxide (CO) can be generated in these drainage pipes from time to time; these are destructive and can cause serious problems if inhaled by humans in large quantities. Such problems are common among drainage workers and can lead to death. Likewise, no warning is received about clogging, rising levels of these gases or expanding water levels. Recognizing and repairing a blockage is therefore tedious and hectic [2].
Developing clean and smart urban areas with an automated monitoring system is the main aim of the system proposed in this paper, as shown in Fig. 1. The main operation of this system is to continuously monitor the unused water level, the rate of water flow and any leakages in the drainage channels, sending an alert message with the relevant information through GSM and the LCD. This system can give a low-cost and adaptable monitoring solution under all sensor conditions. In Fig. 2, the ARM7 is a family of processors widely used in embedded system applications [3, 4]. It is manufactured by Philips and comes with numerous integrated peripherals, making it an effective and reliable choice for beginners as well as high-end application developers.
The 16 × 2 display has 32 characters overall: 16 in one line and 16 in the second line. Each character is made up of 50 pixels, all of which must work together to show the character correctly; this function is controlled by a separate controller (HD44780) in the display unit (Fig. 3). Basically, the LCD is used to display the information obtained from the different sensors.
The water flow sensor has a plastic valve body, a water rotor and a hall-effect sensor. When water moves through the rotor, the rotor rotates, and its speed changes with the rate of water flow (Fig. 4). The hall-effect sensor produces a corresponding pulse signal as output. This sensor is well suited to detecting flow in a water container or coffee machine [5].
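For illustration, converting the hall-effect pulses to a flow rate might look like this; the K-factor of 7.5 (pulses per second per L/min) is the figure commonly quoted for hobby flow sensors such as the YF-S201 and is an assumption here, not a value from the paper.

```python
def flow_rate_lpm(pulse_count, seconds, k_factor=7.5):
    """Convert hall-effect pulses counted over `seconds` into litres per minute.
    Flow (L/min) = pulse frequency (Hz) / K-factor."""
    frequency_hz = pulse_count / seconds
    return frequency_hz / k_factor

print(flow_rate_lpm(450, 60))   # 450 pulses in one minute -> 1.0 L/min
```

On the ARM7, the pulse count would come from a timer/counter input; comparing successive readings lets the system flag a sudden drop in flow as a possible blockage.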
As shown in Fig. 5, the LM35 is a temperature sensor that produces an analog signal proportional to the instantaneous temperature [6]. The output voltage can easily be interpreted
Industries, fuel stations, etc. are the main sources from which various harmful gases are produced. Such gases are detected using the gas sensors in Fig. 6. These sensors can easily sense the H2S levels in the environment and can give an alert in the form of sound or visual information when incorporated into the system. They interface easily with the system and produce a fast response [8, 9]. Blockages that occur in the drainage channels are cleared and worked through manholes located at various places along the channels [10].
Fig. 7 GSM
2.5 GPS
2.6 GSM
The GSM module supports communication in the 900 MHz band. In India, the greater part of the mobile network providers operate in the 900 MHz band (Fig. 7).
3 Algorithm
• Power up the equipment
• Initialize the equipment modules
• Display "DRAINAGE MONITORING SYSTEM" on the LCD
• Read the sensor values using the microcontroller
• Display the temperature on the LCD by sensing it with the temperature sensor
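The steps above can be sketched as one pass of a monitoring loop; the sensor-reading, display, and SMS callables are hypothetical placeholders for the hardware-specific code:

```python
def run_drainage_monitor(read_sensors, display, send_sms, thresholds):
    """One pass of the monitoring loop outlined above.

    read_sensors, display and send_sms are hardware-specific callables
    (hypothetical placeholders); thresholds maps sensor name -> limit.
    Returns the list of sensors that exceeded their thresholds.
    """
    display("DRAINAGE MONITORING SYSTEM")
    values = read_sensors()  # e.g. {"temperature": 31, "gas": 120}
    display(f"TEMP: {values.get('temperature', '?')} C")
    exceeded = [name for name, v in values.items()
                if name in thresholds and v > thresholds[name]]
    if exceeded:
        send_sms("Drainage alert: " + ", ".join(exceeded))
    return exceeded
```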
4 Working of ARM7
The ARM processor is interfaced to the various sensors concerning water level,
water flow, and gas detection. The processor receives the signals from these sensors
and checks whether the detected values exceed their threshold levels. Using the
information taken from the sensors, the ARM7 monitors and controls the drainage
system, performs the required actions, and notifies the condition to the nearest
municipal corporation by sending a text SMS using GSM technology. This helps
locate water-level and drainage blockages easily so that further control actions can
be taken. The sensor values are monitored in real time by the ARM7 and continuously
updated to a Web server over IoT by interfacing the sensors to different ports of the
processor. The complete sensor information of the system is likewise displayed on
the 16 × 2 LCD.
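The SMS notification through a GSM modem is typically driven by standard AT commands; the sketch below only builds the command frames to write to the modem's serial port (the serial wiring itself is hardware specific and omitted):

```python
def gsm_sms_frames(number, text):
    """Build the AT-command frames to send one text SMS via a GSM modem.

    Returns the byte strings to write to the modem's serial port, in
    order.  AT+CMGF and AT+CMGS are standard GSM modem commands; the
    recipient number and message are supplied by the caller.
    """
    return [
        b"AT+CMGF=1\r",                           # select text mode
        b'AT+CMGS="' + number.encode() + b'"\r',  # recipient number
        text.encode() + b"\x1a",                  # body + Ctrl+Z terminator
    ]
```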
5 Results
in the program using the GSM module. This process was repeated for the high
water-level detection condition as well.
6 Conclusion
References
1. M.T. Lazarescu, Design of a WSN platform for long-term environmental monitoring for IoT
applications. IEEE Emerg. Sel. Topics Circuits Syst. 3(1), 45–54 (2013)
2. Y. Narale, A. Jogal, S.P. Bhosale, H. Chowdhary, Underground drainage monitoring system
using IoT. Int. J. Adv. Res. Ideas Inno. Technol. 4(1) (2018)
3. S. Rao, S.K. Muragesh, Automated Internet of Things for underground drainage and manhole
monitoring systems for metropolitan cities. Int. J. Inno. Sci. Eng. Technol. 2(4) (2015)
4. A. Suvarna, S.A. Shaik, Sonawane, Monitoring smart city application using Raspberry PI based
on IOT. Int. J. Adv. Res. Ideas Inno. Technol. 5(VIL) (2017)
5. G.A. Naidu, S. Kodati, J. Selvaraj, A review report of smart health care applications and benefits
using Internet of Things. Int. J. Recent Technol. Eng. (IJRTE) 8(3) (2019). ISSN: 2277-3878
6. S. Velliangiri, R. Sekar, P. Anbhazhagan, Using MLPA for smart mushroom farm monitoring
system based on IoT. Int. J. Netw. Virt. Organ. 22(4), 334–346 (2020)
7. S. Velliangiri, S.A. Kumar, P. Karthikeyan (eds.), in Internet of Things: Integration and Security
Challenges (CRC Press, 2020)
8. D.P. Rajan, D. Baswaraj, S. Velliangiri, P. Karthikeyan, Next generations data science applica-
tion and its platform, in 2020 International Conference on Smart Electronics and Communi-
cation (ICOSEC), Trichy, India, pp. 891–897 (2020). https://doi.org/10.1109/ICOSEC49089.
2020.9215245
9. S.H. Khan et al., Statistics-based gas sensor, in 2019 IEEE 32nd International Conference
on Micro Electro Mechanical Systems (MEMS), Seoul, Korea (South), pp. 137–140 (2019).
https://doi.org/10.1109/MEMSYS.2019.8870821
10. A. Baviskar, A. Mulla, A. Bhovad, J. Baviskar, GPS assisted standard positioning service for
navigation and tracking: review and implementation, in 2015 International Conference on
Pervasive Computing (ICPC), Pune, pp. 1–6 (2015)
IoT-based System for Health Monitoring
of Arrhythmia Patients Using Machine
Learning Classification Techniques
S. Kodati (B)
Department of CSE, Teegala Krishna Reddy Engineering College, Hyderabad, Telangana, India
K. P. Reddy
Department of CSE, CMR Institute of Technology (Autonomous), Hyderabad, Telangana, India
G. Ravi
Department of CSE, MRCET, Hyderabad, Telangana, India
N. Sreekanth
Department of CSE, BVRIT Hyderabad College of Engineering for Women, Hyderabad,
Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 283
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_25
284 S. Kodati et al.
1 Introduction
Healthcare plays a significant role in day-to-day life around the world. With
appropriate treatment, diagnosis and early-stage prevention of health diseases can
be achieved, and serious events such as stroke and heart attack are easier to prevent
when detected at an early stage. The burden on health systems needs to be reduced
in order to keep the level and quality of medical care high. Arrhythmia is a condition
in which the heart beats irregularly, and it is categorized among the pathologies of
the heart [1]. The condition of arrhythmia patients has to be continuously monitored,
as information about the heartbeat is necessary to diagnose the type of arrhythmia
and to follow the appropriate medical procedures. Usually, patients are diagnosed
using electrocardiogram (ECG) monitors available in clinics. This is where the
problem arises: to diagnose arrhythmia, the patient's ECG has to be monitored
continuously for a minimum of 24–48 h. Hence, wearable sensors have been used for
monitoring the patient's health condition in their typical environment. Interest in
wearable sensors has been increasing in recent years. Many studies have demonstrated
that recording and managing a patient's physiological information over a long
duration can be achieved with wearable sensor technologies [2].
A possible solution for reducing the burden on healthcare systems is the Internet
of Things (IoT). Such a healthcare system should incorporate data preprocessing,
analysis, and storage approaches, since an enormous quantity of data exists in the
present Internet world [3]. Healthcare environments built on wireless sensor
technology have been studied extensively in recent years [4]. Patients face uncertain
outcomes when heart problems and heart attacks occur, often as a direct consequence
of the lack of good therapeutic care at the critical moment. Monitoring helps both
elderly and young patients and keeps specialists, friends, and relatives informed.
Controlling patient-condition monitoring using sensor technology and the Internet,
and passing the information on to friends and family when a problem arises, is
therefore an idea that can achieve high success rates. Furthermore, ML processes
are being used for continuous improvement in various areas of the Internet of
Things (IoT) [5]. The main objectives of the work presented in this paper are an
IoT-based arrhythmia patient monitoring system with wearable sensors, classification
of arrhythmia using multiple machine learning algorithms, and data visualization
through the prediction of the classified data.
IoT-based System for Health Monitoring of Arrhythmia … 285
Another technology used for information processing is machine learning, which
implements algorithms that learn from datasets [10]. Based on earlier observations,
such algorithms build models of the data. Recognizing patterns in the data and
applying the learnt patterns are the main aims of machine learning. It is a
fundamental artificial intelligence methodology that extracts information during
data training [11]. These ML algorithms are useful for the difficult task of
distinguishing and extracting wide information
or record patterns [12]. In particular, this technique is well suited to medical and
healthcare purposes for diagnosing and identifying various diseases.
Figure 1 shows the framework of the presented IoT-based arrhythmia patient health
monitoring system using machine learning. The main objective of this health
monitoring system is to monitor the health condition of arrhythmia patients using
IoT technology and then classify the data sensed from the patients with the help of
machine learning techniques.
Fig. 1 Framework of the proposed system: sensor → controller → data transmission
(Wi-Fi module) → cloud storage → preprocessing → feature selection →
classification → data prediction → presence/absence of arrhythmia
Initially, wearable sensors are used in this system in the patients' environment for
continuous monitoring of their health condition, and the sensed data are transmitted
to the cloud. The transmitted data can be seen and monitored by the patients' family
members as well as by doctors. In the presented framework, the system continuously
monitors patients who are suspected to suffer from arrhythmia. Physiological
parameters such as heartbeat and body temperature can be monitored in this system
through the use of the respective wearable sensors. These parameters are handled by
the controller, which enables continuous monitoring. Real-time data monitoring,
including the patients' location, can be performed using the global system for mobile
communication (GSM) and global positioning system (GPS) technologies. The
sensed data transmitted to the cloud are then visible to the patients and doctors
through a software platform called "ThingSpeak."
The data transmission section makes use of a Wi-Fi module to obtain the data from
the different wearable sensors attached to the patients and then transmits the
obtained sensor data to the cloud. Secure transmission is used to protect the privacy
of the patient's data. The system is designed to effectively and feasibly store the
patient's data in the cloud for a long duration of time. Storing the patient's data
for a longer duration can help doctors and medical professionals in diagnosing
arrhythmia. The components used in this section provide access to the data
transmitted to the cloud through the ThingSpeak platform, which displays the
sensed physiological parameters, such as heartbeat and temperature, in graphical
form. The data transmission method used in this system provides improved
scalability and additional advantages such as on-demand accessibility of the data
to both doctors and patients.
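ThingSpeak channels are updated through a simple HTTP request to its REST `update` endpoint; the sketch below only builds that request URL (the field assignments, field1 = heartbeat and field2 = temperature, are illustrative assumptions, and `api_key` is the channel's write key):

```python
from urllib.parse import urlencode

def thingspeak_update_url(api_key, heartbeat, temperature):
    """Build the ThingSpeak 'update' request URL for two channel fields.

    Field numbering (field1 = heartbeat, field2 = temperature) is an
    assumption for illustration; the caller would issue an HTTP GET or
    POST to the returned URL to log one sample.
    """
    query = urlencode({"api_key": api_key,
                       "field1": heartbeat,
                       "field2": temperature})
    return "https://api.thingspeak.com/update?" + query
```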
After the data are transmitted to the cloud, they become accessible to medical
professionals and doctors through the platform, who use the sensed data to make
decisions in diagnosing the patient's health. If there are any abnormalities in the
sensed physiological data stored in the cloud, such as an irregular heartbeat or high
temperature, classification can be performed to make a specific decision on it. In
this system, sensor data regarding the heartbeat and body temperature of patients
suspected to suffer from arrhythmia are monitored and then stored in the cloud.
Machine learning techniques are used to classify the data if any abnormalities occur
in the sensed data and then predict whether the patient suffers from arrhythmia or
not. Multiple machine learning techniques are used in this system to perform the
classification on the data. Then, using the classified data, prediction of arrhythmia
can be accomplished. The overall classification and prediction process is explained
below.
The preprocessing step prepares the data for analysis before the actual processing
of the information. In this preprocessing framework, raw, noisy information is
turned into an error-free training dataset. Some machine learning models require
information in a predefined format; for example, the random forest algorithm does
not tolerate invalid (null) attributes, so such attributes must be handled in the
training dataset before the random forest calculation is conducted. The other main
objective of this preprocessing step is to organize the large collected dataset so that
more than one machine learning or deep learning classifier can be applied to a
single training dataset, with the best among them being chosen.
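A minimal stand-in for the null-attribute handling described above, replacing missing entries with the column mean (a real pipeline would likely use pandas or scikit-learn imputers):

```python
def impute_missing(rows, missing=None):
    """Replace missing attribute values with the column mean.

    rows is a list of equal-length numeric records; entries equal to
    `missing` are imputed so downstream learners such as random forest
    receive a dataset with no invalid attributes.
    """
    columns = list(zip(*rows))
    means = []
    for col in columns:
        present = [v for v in col if v is not missing]
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[j] if v is missing else v
             for j, v in enumerate(row)] for row in rows]
```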
In general, artificial intelligence techniques are used for diagnosis in healthcare
systems to classify patients' health data and predict diseases, most widely through
machine learning and deep learning techniques. Therefore, in this system, five
well-known machine learning classifiers are used to effectively classify the data;
they are discussed one by one below. Using such multiple machine learning
algorithms, the data can be classified and arrhythmia then predicted from the
classified data. In this paper, modeling and processing of the collected and stored
patient health data are carried out to predict arrhythmia using the multiple machine
learning algorithms. This multi-classification process is done through the
collaboration of IoT analytics and the cloud environment with automated analysis.
The machine learning classification algorithms used in this paper are the logistic
in the cloud, and then the best feature is selected in the feature selection step.
Classification of the dataset can be done by the multiple machine learning
algorithms discussed above, based on the nature of the available dataset. From the
resulting features, arrhythmia can be predicted by checking for its presence or
absence.
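The "train several classifiers and keep the best" step can be sketched generically; the toy classifiers in the example are placeholders, not the paper's five models:

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def select_best(classifiers, X_val, y_val):
    """Evaluate each trained classifier on a validation split and return
    the (name, accuracy) pair of the best one.

    classifiers maps name -> predict callable.  This is a generic sketch
    of multi-classifier selection; the names and data are placeholders.
    """
    scored = {name: accuracy(y_val, clf(X_val))
              for name, clf in classifiers.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]
```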
4 Results
The system developed in this paper initially uses IoT to sense data from the
patients via the wearable sensors; then, through the processor, real-time monitoring
data of the patients, such as heartbeat and body temperature, are recorded by the
ThingSpeak software. Figures 2 and 3 show the responses of the heartbeat and body
temperature sensors in ThingSpeak. Then, using this current monitored data along
with the raw datasets, heart diseases such as arrhythmia can be predicted by
applying the machine learning techniques.
The current system uses a raw ECG dataset for the result analysis and evaluation.
The dataset covers roughly 16 different arrhythmia types, with each ECG beat
recorded at the R-peak locations. The dataset contains 5 arrhythmia groups
comprising 22 beat types under the Association for the Advancement of Medical
Instrumentation (AAMI) standards, but an arrhythmia group with only ten ECG beat
types is used for the evaluation of the present framework. These ten ECG beat types,
with the number of samples used for training and testing, are shown in Table 1:
nodal escape beat (j), fusion of paced and normal beat (f), fusion of ventricular and
normal beat (F), ventricular flutter wave (!), paced beat (P), left bundle branch block
beat (L), right bundle branch block beat (R), premature ventricular contraction (V),
atrial premature contraction (A), and normal beat (N), illustrated in Fig. 4.
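The ten beat-type codes listed above can be gathered into a simple lookup table, for example:

```python
# Symbol -> description for the ten ECG beat types listed above
BEAT_TYPES = {
    "N": "normal beat",
    "A": "atrial premature contraction",
    "V": "premature ventricular contraction",
    "R": "right bundle branch block beat",
    "L": "left bundle branch block beat",
    "P": "paced beat",
    "!": "ventricular flutter wave",
    "F": "fusion of ventricular and normal beat",
    "f": "fusion of paced and normal beat",
    "j": "nodal escape beat",
}

def describe_beats(symbols):
    """Map a sequence of annotation symbols to readable labels."""
    return [BEAT_TYPES.get(s, "unknown") for s in symbols]
```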
After training and testing the dataset with the multiple machine learning models,
the performance of this early-stage arrhythmia prediction is evaluated by selecting
the best feature from the feature selection step and then classifying with it as the presence or
5 Conclusion
References
1. A.M. Rahmani, Z.N. Aghdam, M. Hosseinzadeh, The role of the Internet of Things in
healthcare: future trends and challenges. Comp. Methods Prog. Biomed., 105903 (2020)
2. T. Poongodi, P. Sanjeevikumar, B. Balamurugan, J. Holm-Nielsen, Internet of Things (IoT)
and eHealthcare system—a short review on challenges (2019)
3. M.G. Khari, Securing data in Internet of Things (IoT) using cryptography and steganography
techniques. IEEE Trans. Syst. 50(1), 73–80 (2019)
4. M.I. Khan, M.M. Alam, T. Pardy, A. Kuusik, Y.J.I.A. Le Moullec, H. Malik, A survey on the
roles of communication technologies in IoT-based personalized healthcare applications. IEEE
Access 6, 36611–36631 (2018)
5. S. Askar, S. Sulaiman, Investigation of the impact of DDoS attack on network efficiency of the
University of Zakho. J. Univ. Zakho 3(2), 275–280 (2015)
6. T. Ince, S. Kiranyaz, L. Eren, M. Askar, M. Gabbouj, Real-time motor fault detection by 1-D
convolutional neural networks. IEEE Trans. Ind. Electron. 63, 7067–7075 (2016)
7. M. Pellegrini, P. Pierleoni, L. Pernini, A. Belli, S. Valenti, L. Palma, A high reliability wearable
device for elderly fall detection. IEEE Sens. J. 15(8) (2015)
8. L. Palmerini, L. Cattelani, S. Bandinelli, F. Chesani, C. Becker, P. Palumbo, L. Chiari, FRAT-Up,
a rule-based system evaluating fall risk in the elderly, in 2014 Proceedings of IEEE 27th Inter-
national Symposium on Computer-Based Medical Systems, vol 204 (IEEE Computer Society,
2014), pp. 38–41
9. K. Ren, M. Li, W. Lou, S. Yu, Y. Zheng, Scalable and secure sharing of personal health records
in cloud computing using attribute-based encryption. IEEE Trans. Parallel Distrib. Syst. 24(1),
131–143 (2013)
10. H.-C. Shin, M.R. Orton, D.J. Collins, S.J. Doran, M.O. Leach, Stacked autoencoders for unsu-
pervised feature learning and multiple organ detection in a pilot study using 4D patient data.
IEEE Trans. Pattern Anal. Mach. Intell. 35(8) (2013)
11. T.C. Silva, L. Zhao, Network-based stochastic semisupervised learning. IEEE Trans. Neural
Netw. Learn. Syst. 23(3) (2012)
12. L. Atallah, B. Lo, R. Ali, R. King, G.Z. Yang, Real-time activity classification using ambient
and wearable sensors. IEEE Trans. Inf. Technol. Biomed. 13(6), 1031–1039 (2009)
EHR-Sec: A Blockchain Based Security
System for Electronic Health
1 Introduction
In recent years, technology has had a massive impact on the healthcare sector, in
areas such as disease prediction, digitization of health records, settlement of
insurance claims, etc. An Electronic Health Record (EHR) is a digitized version of a
patient's health record that contains information such as medical history in terms of
diagnosis, treatment, medical prescriptions, laboratory reports, demographic
information and insurance details [1]. The Ministry of Health and Family Welfare
(MoHFW) of the Government of India notified the Electronic Health Record (EHR)
Standards for India in 2013 and revised them in 2016 [2]. According to the EHR
Standards, 'Any person in India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 295
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_26
296 S. Deore et al.
can go to any health service provider, any diagnostic center or any pharmacy and
yet be able to access and have fully integrated and always available health records
in an electronic format.’ The standards lay down a framework for EHR systems to
have secure storage, increased availability, efficient data exchange and proper access
control mechanism [3]. EHR systems available today in India follow these stan-
dards to varying degrees of conformance. Efforts are being made by various State
as well as Union Governments to use advancements in Information and
Communication Technologies (ICT) in developing EHR systems. Also, in the private
sector, corporate hospitals like Max Health, Apollo, and Sankara Nethralaya have successfully
implemented EHR systems [4]. However, EHRs are rarely exchanged between
hospitals, resulting in poor interoperability among healthcare entities. As a result,
data sharing among healthcare entities is often inefficient and time-consuming.
EHR systems are also prone to security breaches [5, 6]. In 2019 alone, over 41
million health records were stolen, exposed or disclosed without permission, 195%
more than in 2018 [7]. Current EHR systems also suffer from a lack of total
availability, which makes it harder to deliver healthcare services in times of natural
disasters like earthquakes, floods, etc. Apart from the issues discussed above, such
as interoperability, privacy and availability, EHR systems fail at being patient-
centric. The EHR Standards for India make it imperative for healthcare entities to
store data on behalf of the patient, with ownership of the data residing with the
patient [2]. EHR systems with poorly designed interfaces have been shown to cause
time pressure and psychological stress among nurses, leading to inconsistent
healthcare delivery [8].
The integration of technology and digitization in healthcare is creating both
opportunities and challenges. The ease of handling patient data comes with the risk
of data compromise due to cyber-attacks [9, 10]. A data breach of an EHR can
impact the healthcare system as well as the availability of data [11].
Blockchain has shown promise with its decentralized peer-to-peer network that
provides an immutable and transparent ledger of transactions. It enables trust,
transparency and accountability to be established among the various stakeholders
of the Blockchain network.
In this paper, we propose EHR-Sec, a secure platform for preserving patient
privacy, increasing availability, and enabling efficient data sharing for enhanced
collaboration among healthcare providers, based on a permissioned Blockchain
network of participating healthcare entities. The EHR system works in a
permissioned Blockchain consortium of healthcare organizations, government
agencies, etc., based on the open-source Hyperledger Fabric platform hosted by the
Hyperledger project. The Hyperledger Fabric platform supports a permissioned,
membership-based modular architecture model with higher transaction throughput
than other Blockchain platforms, without the need for any cryptocurrency for
transaction execution [12].
EHR-Sec: A Blockchain Based Security System for Electronic Health 297
2 Background
Blockchain
Blockchain is a decentralized peer-to-peer network with an immutable distributed
ledger, replicated over multiple nodes, containing the transactions performed over
the network [13, 14]. Initially proposed as a framework for peer-to-peer transfer of
Bitcoin, it has since found use in many other non-financial use cases. Apart from
being decentralized, it is secure and resilient to node failures, resulting in high
availability of any Blockchain-based network, which makes it preferable for
developing applications in healthcare, supply chain management, governance, etc.
Blockchains can broadly be classified as public, private and permissioned. Bitcoin
uses a public Blockchain, where any peer can participate in the network without
revealing its identity, whereas a private Blockchain restricts entry of public peers to
the network without validation by the network operator. A permissioned Blockchain
allows a peer to join the network after verification, with designated permissions to
perform a specific set of activities on the network. The ledger is a chain of blocks
that contains transactions validated by nodes using a consensus protocol. The
consensus protocol ensures that all participating nodes have the same view of the
ledger. Blockchain is designed to be Byzantine Fault Tolerant (BFT), meaning that
even if some peers act against the network or fail, the network still reaches a final
consensus. This consensus is achieved by consensus algorithms such as the Proof
of Work (PoW) consensus used in the Bitcoin Blockchain, which requires nodes to
perform some computational 'work' to achieve consensus and prevent malicious
nodes from disturbing the integrity of the network [13]. Newer Blockchains like
Ethereum have introduced smart contracts: programs containing business logic that
run on the Blockchain and with which users can interact [14].
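The hash-chaining of blocks, which underlies the ledger's immutability, can be illustrated with a minimal sketch (a conceptual toy, not Fabric's or Bitcoin's actual block format):

```python
import hashlib
import json

def make_block(index, transactions, prev_hash):
    """Create a block whose hash covers its contents and the previous
    block's hash; this linkage is what makes the ledger tamper-evident."""
    header = {"index": index, "tx": transactions, "prev": prev_hash}
    digest = hashlib.sha256(
        json.dumps(header, sort_keys=True).encode()).hexdigest()
    return {**header, "hash": digest}

def valid_chain(chain):
    """Check that every block correctly links to its predecessor."""
    return all(block["prev"] == prev["hash"]
               for prev, block in zip(chain, chain[1:]))
```

Altering any stored field breaks the link to the next block, so tampering is detectable by re-walking the chain.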
Hyperledger Fabric
Hyperledger Fabric is an open-source permissioned Blockchain platform hosted
under the Hyperledger project. The platform provides a framework for developing
enterprise-grade applications for healthcare, supply chain management, governance,
etc. Hyperledger Fabric supports a modular architecture that comprises components
such as the ordering service, the Membership Service Provider (MSP), a peer-to-peer
gossip service, smart contracts, the ledger, and pluggable endorsement and
validation policies [12]. Hyperledger Fabric proposes the idea of organizations, also
known as members, with peers as their endpoints. Peers on the network host the
ledger and the smart contracts that query it. The ledger comprises a blockchain,
which contains the transactions executed on the network, and a world state,
implemented as a database that holds the current values of business attributes. The
idea that multiple organizations can come together and form a consortium enables
secure data sharing across them. The MSP is a trusted authority that provides
identities to actors on the network, which determine their access permissions to
information and resources. The Certificate Authority (CA) issues digital certificates
providing an identity to the actor for executing transactions on the network.
Chaincode holds the business logic that
enables application users to create new facts and query the ledger. Transaction
execution in Hyperledger Fabric follows the execute-order-validate architecture
[12]. The endorsers execute a transaction, after which the client, on collecting
enough endorsements, submits it to the ordering service. The ordering service then
establishes an order among the submitted transactions, leading to the creation of a
block, and sends it to the peers on the network. In the validation phase, every
committing peer validates the block against the endorsement policy. The valid
transactions are then applied to the world state, held in CouchDB or LevelDB.
Fig. 1 System architecture for EHR-Sec; sample network of four hospitals where actors interact
using a web-based UI that invokes API calls to the Blockchain network using SDK provided by the
Hyperledger Fabric project
registering the actors using the Certificate Authority (CA). These actors can interact
with the Blockchain network using a web-based application interface. The idea that
a patient reserves the right to grant access to his/her EHR to a doctor makes the
system patient-centric, providing ownership to the patient as per the guidelines
notified by the EHR Standards for India (2013).
However, in situations of emergency when a patient is incapable of providing
access to the consulting doctor, EHR-Sec makes necessary provisions for access
control. EHR storage in EHR-Sec follows a two-tier approach: EHR metadata and
data about access permissions are stored on-chain, and actual EHR data is stored
on an off-chain cloud server. The off-chain cloud server is chosen in accordance
with the Health Insurance Portability and Accountability Act of 1996 (HIPAA) [15].
The data stored in the cloud server is encrypted to prevent information leaking
through unauthorized access.
4 Component Specifications
Hospitals
Hospitals, which correspond to organizations in the Hyperledger Fabric network,
should ideally house two peers, as shown in Fig. 2. API calls made to the network
invoke the chaincode installed on the peers to query the ledger. CouchDB is chosen
as the ledger world-state database as it supports a rich set of queries and data values
modeled in JSON format [16]; the CouchDB is the on-chain data store. The hospital
admin has the responsibility of registering the doctors and patients specific to that
hospital. The digital identity stored in the wallet follows the X.509 standard and
exposes the public key that acts as an authentication anchor for transactions
executed by the actor.
Ordering Service
Hyperledger Fabric provides three types of ordering service: Solo, Kafka and
RAFT. In recent versions of Hyperledger Fabric 2.x, Solo and Kafka have been
deprecated [16]. RAFT is the ordering service recommended by Hyperledger Fabric
and is crash fault tolerant (CFT). RAFT follows a 'leader-and-follower' model in
which the dynamically elected leader forwards messages to the rest of the follower
nodes [17]. RAFT requires 2n + 1 nodes to tolerate n faults. Considering availability
of the EHR management system as a prime objective, without compromising
throughput, EHR-Sec uses a 5-node RAFT ordering service.
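The 2n + 1 relationship can be expressed directly; a 5-node cluster therefore tolerates 2 crashed orderers:

```python
def raft_fault_tolerance(nodes):
    """Number of crashed orderer nodes a RAFT cluster can survive:
    with 2n + 1 nodes it tolerates n faults, i.e. n = (nodes - 1) // 2."""
    return (nodes - 1) // 2
```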
The fact that a patient is in control of his/her data and can grant access to a
consulting doctor makes EHR-Sec patient-centric. EHR metadata and data related
to access permissions are stored on-chain. Doctors need to request a patient to gain
the ability to access, edit or update the patient's EHR; upon request, the patient can
grant the necessary access to the doctor. This access-granting transaction is
recorded on the Blockchain. The access-permission data consists of attributes such
as the Doctor-ID, the access type (read, update or append), the timestamp at which
access was granted, and the timestamp until which access is granted. In emergency
situations, when the patient is unable to provide access, the administrator of that
hospital can execute a transaction that temporarily
grants automatic access to the doctor. This is achieved using a Boolean state
variable whose value is 'false' in the normal state and 'true' in an emergency state.
EHR metadata also resides on-chain and gives information about the EHR data
stored on the off-chain cloud server; it includes attributes such as the patient's
unique identification number and the type and size of the EHR data stored. The
EHR Standards for India make it obligatory for patients to have a unique
identification, such as the UIDAI-issued Aadhaar number, which is used by EHR-Sec.
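The access-permission attributes described above could be modeled as a simple record; the field names here are illustrative, not the actual chaincode schema:

```python
from dataclasses import dataclass

@dataclass
class AccessPermission:
    """On-chain access-permission record with the attributes described
    above; field names are illustrative assumptions."""
    doctor_id: str
    access_type: str          # "read", "update" or "append"
    granted_at: int           # timestamp access was granted
    granted_until: int        # timestamp until which access is granted
    emergency: bool = False   # True when granted by the hospital admin

    def is_active(self, now):
        """Whether the grant is currently valid."""
        return self.granted_at <= now <= self.granted_until
```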
The highly sensitive EHR data, both when stored on the cloud server and when
transferred among actors on the network, must be encrypted to prevent information
leaking through unauthorized access. Encrypting the EHR data so that it can only
be decrypted by the actor to whom permission is granted requires cryptographic
techniques. EHR-Sec proposes a combination of symmetric-key and asymmetric-key
cryptography. The EHR data is encrypted using a symmetric key, which is itself
encrypted using the public key of the patient. This encrypted EHR data is then
stored on the cloud server. When a patient wants to access his/her EHR data, the
encrypted symmetric key is decrypted using the patient's private key, and the EHR
data is then decrypted using this recovered symmetric key. In cases where access is
to be given to different doctors, EHR-Sec proposes a proxy re-encryption scheme.
Proxy re-encryption allows a proxy to transform a ciphertext encrypted for one
party into a ciphertext that can be decrypted by another party, using a re-encryption
key. When a patient grants access to a doctor, the patient sends the proxy a
re-encryption key generated from the patient's private key and the doctor's public
key. The proxy uses this re-encryption key to re-encrypt the already encrypted
symmetric key, and the re-encrypted symmetric key is then decrypted with the
doctor's private key. EHR-Sec uses NuCypher for data sharing using proxy
re-encryption. NuCypher uses a decentralized network to delegate encryption and
decryption rights, thus splitting trust among multiple nodes, and allows a minimum
number of nodes to be specified for the provision of the decryption key [19, 20].
This way, data can only be decrypted by the actor who has been given access,
without revealing the private credentials of the patient; any other actor, malicious or
not, who has not been given access can never see the EHR data in plain form, as
shown in Fig. 3.
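The envelope (hybrid) encryption pattern described above can be illustrated with a deliberately toy cipher: XOR stands in for the real symmetric and asymmetric primitives and is NOT secure; a real deployment would use AES plus the patient's public key and proxy re-encryption as described:

```python
import secrets

def xor_bytes(data, key):
    """Toy stream cipher (XOR with a repeating key) standing in for a
    real cipher such as AES -- NOT secure, illustration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def envelope_encrypt(ehr_data, patient_key):
    """Encrypt the record with a fresh symmetric key, then wrap that key
    with the patient's key (the envelope pattern described above)."""
    sym_key = secrets.token_bytes(16)
    return xor_bytes(ehr_data, sym_key), xor_bytes(sym_key, patient_key)

def envelope_decrypt(ciphertext, wrapped_key, patient_key):
    """Unwrap the symmetric key with the patient's key, then decrypt."""
    sym_key = xor_bytes(wrapped_key, patient_key)
    return xor_bytes(ciphertext, sym_key)
```

Because only the small wrapped key depends on the patient's key, re-encrypting it for a doctor (as the proxy re-encryption step does) avoids re-encrypting the bulky EHR record itself.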
5 Conclusion
Blockchain technology, with its core features of secure data storage, immutability
of recorded transactions, fault tolerance, availability and transparency, enables
its use in developing healthcare-related applications. The proposed EHR
management system, EHR-Sec, ensures the privacy, security, interoperability and
availability of EHRs. Built as per the EHR Standards for India and HIPAA, EHR-Sec
can be implemented in a consortium of healthcare institutions and later extended to
other stakeholders in the ecosystem. Hospitals implementing this system for daily
EHR management can expect better collaboration, increased quality of healthcare
delivery and better decision making by healthcare professionals.
References
10. P.R. Yogesh, Backtracking tool root-tracker to identify true source of cyber crime. Proc.
Comput. Sci. 171, 1120–1128 (2020)
11. R.Y. Patil, S.R. Devane, Hash tree-based device fingerprinting technique for network forensic
investigation, in Advances in Electrical and Computer Technologies (Springer, Singapore,
2020), pp. 201–209
12. E. Androulaki, Hyperledger fabric: a distributed operating system for permissioned
blockchains, in EuroSys’18 (2018)
13. S. Nakamoto, Bitcoin P2P e-cash paper. https://bitcoin.org/en/bitcoin-paper
14. Ethereum whitepaper. https://ethereum.org/en/whitepaper/
15. J. Kulynych, D. Korn, The new HIPAA (health insurance portability and accountability act of
1996) medical privacy rule: help or hindrance for clinical research, 108, 912–914 (2003)
16. Hyperledger fabric documentation. https://hyperledger-fabric.readthedocs.io/en/release-2.2
17. D. Ongaro, J. Ousterhout, In search of an understandable consensus algorithm (extended
version), https://raft.github.io/raft.pdf
18. Why new off-chain storage is required in blockchains, in IBM Storage (IBM Corporation,
2018)
19. R. Gondhali, A. Patil, O. Kharatmol, R.Y. Patil, Electronic medical records using blockchain.
Int. J. Adv. Res. Sci. Commun. Technol. 5(5) (2020)
20. M. Egorov, M. Wilkison, D. Nunez, NuCypher KMS: decentralized key management system.
https://arxiv.org/abs/1707.06140
End-to-End Speaker Verification
for Short Utterances
Abstract Speaker verification is the process used to verify a speaker from his/her
voice characteristics. Given a speech segment as input and the target speaker data, the
system automatically determines whether the target speaker spoke the test segment.
There are many methods of biometric verification, such as fingerprints, iris scans,
signatures, etc. Among these, speech-based authentication is not as reliable as the
other methods. Hence, we would like to develop a reliable speaker verification model.
Recent advances in deep learning have facilitated the design of speaker verification
systems that directly input raw waveforms. Though developing a model with raw
waveforms is complex in speech processing, it yields an end-to-end system,
which reduces the time and power spent on feature extraction. To achieve
end-to-end speaker verification, we propose to use raw waveforms as input. The
development of such a system is possible without much domain knowledge of feature
extraction. Moreover, the availability of a large dataset eases the development of the
end-to-end system. The later part of the proposed system analyses the
model's performance on a short-utterance dataset to make the model more user-
friendly and reduce computational cost. Hence, we plan to analyse and improve
RawNet (Jung et al. in Proceedings of Interspeech, pp. 3583–3587, 2020 [1]) for
short utterances.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 305
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_27
306 S. Ranjana et al.
1 Introduction
Speech processing is the study of speech signals and their processing methods. Aspects
of speech processing include the acquisition, manipulation, storage, transfer and output of
speech signals. Subdomains of speech processing include speech coding, synthesis
[2], enhancement [3, 4], speech and speaker recognition, and speaker identification
and verification. The crucial aspects of a speaker verification system are feature
extraction and model building. Extraction of the features from the sound signal is
essential as speaker verification systems largely depend on speaker-specific char-
acteristics. MFCC [5] is a low-dimensional representation of the audio input. The
signal is divided into smaller frames, and corresponding frequencies are identified.
These are passed through a Mel filterbank to obtain a vector. The performance of
MFCCs degrades rapidly in real-world noise [6]: noise alters all MFCC coefficients
if even one frequency band is distorted. Moreover, the FFT yields only
frequency values; time information is lost. Linear frequency cepstral coef-
ficients (LFCC) consistently outperform MFCC, mainly due to their better perfor-
mance in female trials [7]. This can be explained by the relatively shorter vocal tract
in females and the resulting higher formant frequencies in speech. In the presence
of different forms of additive noise and reverberant conditions, Power Normalized
Cepstral Coefficients (PNCC) processing offers significant improvements in recogni-
tion performance compared to MFCC and PLP processing, with just a minor increase
in computational cost over traditional MFCC processing [8]. Another commonly used
feature extraction technique is the spectrogram [9]. Unlike MFCC, here the time domain is
represented by window numbers: a 2D matrix holds the frequency magnitudes
(columns) against time (rows, i.e. window number) for a given signal. A waveform-based
CNN that directly takes the raw speech signal as input has been used in many studies
to resolve this issue, such as speaker verification [1], speaker recognition and voice
activity detection.
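The spectrogram construction described above (windows as rows, frequency bins as columns) can be sketched in plain Python. The function name, window size and hop length are illustrative choices, and a real system would use an FFT with a windowing function rather than this naive DFT.

```python
import cmath
import math

def spectrogram(signal, win=8, hop=4):
    """Naive spectrogram: split the signal into overlapping windows and
    compute the magnitude DFT of each one. Rows are windows (time),
    columns are frequency bins 0 .. win/2."""
    frames = [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]
    spec = []
    for frame in frames:
        mags = [abs(sum(x * cmath.exp(-2j * math.pi * k * n / win)
                        for n, x in enumerate(frame)))
                for k in range(win // 2 + 1)]
        spec.append(mags)
    return spec
```

A pure sine with two cycles per window, for instance, concentrates its energy in bin 2 of every row.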
2 Related Work
on the length of the input utterance. Average pooling layers have been used [9] to
aggregate frame-level feature vectors to obtain a fixed-length utterance-level embed-
ding. The network is further trained for verification using the contrastive loss [9]
or other metric learning losses such as the triplet loss. Similarity metrics like the
cosine similarity [12] or PLDA are often adopted to generate a final pairwise score.
The literature survey consolidated in Table 1 shows that raw waveforms as input and
CNN-GRU architecture yield better performance for the speaker verification system.
Hence, we propose to develop an end-to-end speaker verification system using raw
waveforms [1] and for short utterances of raw waveforms [11].
3 Proposed System
In speaker verification systems, models are typically fed intermediate features like MFCCs
and spectrograms [9]. However, such input carries only limited spectral information, due
to the chosen filter bank type and magnitude compression, which constrains the model
architecture. A waveform-based CNN that directly takes the raw speech signal as input
has been used in studies of speaker verification [1], speaker recognition and voice
activity detection to resolve this issue.
SincNet processes raw audio samples and learns powerful features using a deep
learning model. The parametrized sinc functions, which replace the DNN's first
layer, implement band-pass filters and convolve the waveform to
extract low-level features. Because this layer has significantly fewer parameters, the
network learns more relevant features and converges faster.
SincNet uses a Softmax layer at the top, responsible for mapping the network's
final features into a multi-dimensional space corresponding to the various speakers. In
contrast to standard CNNs, which learn all filter elements, this method
learns only the low and high cutoff frequencies directly from the data, providing an
efficient way of generating a customized filter bank.
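A band-pass kernel of the kind SincNet learns can be sketched as the difference of two low-pass sinc filters, where only the two cutoff frequencies would be learnable. The filter length and cutoffs below are illustrative (frequencies are normalized to the sampling rate), and SincNet additionally applies a window to the truncated kernel, which this sketch omits.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sinc_bandpass(f_low, f_high, length=129):
    """Band-pass kernel as the difference of two low-pass sinc filters.
    Only f_low and f_high would be learned; the rest of the filter shape
    is fixed by this parametrization. Cutoffs are normalized to (0, 0.5)."""
    mid = length // 2
    return [2 * f_high * sinc(2 * f_high * (n - mid))
            - 2 * f_low * sinc(2 * f_low * (n - mid))
            for n in range(length)]
```

The centre tap equals `2 * (f_high - f_low)`, and the taps of a band-pass kernel sum to approximately zero (zero DC gain).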
RawNet is a speaker embedding extractor that takes raw waveforms as input and
produces speaker embeddings for speaker verification without using any prepro-
cessing techniques. RawNet adopts a convolutional neural network-gated recurrent
unit (CNN-GRU) architecture. The scale of a given feature map is modified using the
filter-wise feature map scaling (FMS) technique to derive more discriminative repre-
sentations. The front CNN layers comprise residual blocks followed by a max-
pooling layer, which extract frame-level features; a GRU layer then aggregates the
frame-level features into an utterance-level representation.
It uses neural networks to encode the speaker attributes of an utterance into a fixed-
length vector irrespective of the length of the utterance.
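The filter-wise scaling idea can be illustrated with a toy sketch: one sigmoid-squashed scalar per filter scales that filter's activations. In RawNet2 the scalar comes from a learned fully connected layer over the time-pooled feature map; here the time-averaged activation stands in for that learned projection, so this is an assumption-laden illustration rather than the actual method.

```python
import math

def fms(feature_map):
    """Toy filter-wise feature map scaling: derive one sigmoid-squashed
    scale per filter and multiply that filter's activations by it."""
    scaled = []
    for filt in feature_map:          # filt: one filter's activations over time
        # Stand-in for a learned fully connected projection:
        s = 1.0 / (1.0 + math.exp(-sum(filt) / len(filt)))
        scaled.append([s * v for v in filt])
    return scaled
```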
Verification by cosine similarity: the similarity of two vectors is calculated by measuring
the cosine of the angle between them. For a sample audio passed into the network,
its embedding vector is computed, and the cosine similarity between this vector and
the vector of the claimed speaker is calculated. The score lies between 0 and 1.
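The scoring step can be sketched directly. The function names and threshold value are illustrative; note that for general embeddings the cosine lies in [−1, 1], and it is restricted to [0, 1] only when the vectors have non-negative components.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def verify(test_emb, claimed_emb, threshold=0.8):
    """Accept the claimed identity when the score clears a threshold
    (the threshold value here is an arbitrary illustration)."""
    return cosine_similarity(test_emb, claimed_emb) >= threshold
```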
Figure 2 elaborates on the various layers in CNN-GRU architecture. Input and output
vector shapes of each layer are mentioned on the respective arrows.
Convolutional Neural Network (CNN): The layers of a CNN are organized in the dimen-
sions of width, height and depth. The neurons in one layer connect to only a small
portion of the neurons in the next layer rather than to all of them. The
output is reduced to a single vector of probability scores, organized along the depth
dimension. To classify an input, it moves through a series of
layers with filters, pooling, fully connected layers and a Softmax function.
Gated Recurrent Unit (GRU): GRU is an improved version of the conventional recur-
rent neural network. GRU uses an update gate and a reset gate to mitigate the vanishing
gradient problem. They can be trained to retain only the required information from the
past and discard information unrelated to the prediction.
Max Pooling: It selects the maximum element from each region of the feature map,
so the output is a feature map containing the most prominent features. Pooling reduces
the dimensions of feature maps, thereby reducing the number of learnable parameters
and the amount of computation performed in the network.
The model was trained on the complete VoxCeleb2 dataset for 8 epochs. VoxCeleb1
was used for validation and testing, after which an EER of 0.04 was observed.
We then used the complete VoxCeleb2 dataset for training and the short dataset generated
from VoxCeleb1 as testing data, obtaining an error rate of 12%, which is
greater than the error of long-utterance speaker verification. Comparing these
two models, we find that the longer utterances perform better in the situations given in Table
4.
Short Utterance Improvisation: We trained both models for 4 epochs using
less than 10% of the dataset. From the comparison shown in Table 5, we
can infer that training the model with short data and using time-augmented evaluation in
testing improves the accuracy of the short-utterance speaker verification system.
The reasons we infer for this change in performance are:
Training with short-utterance data: The model is able to learn well from the
features of shorter utterances, and the uniform length of the utterances further aids
learning. Gender differences are also less pronounced in shorter utterances.
Hence, this aids better performance.
Time-augmented evaluation: This method is used when testing the model. It
augments, or appends to, the short data in the testing phase to produce longer
data, which improves performance during testing.
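A minimal sketch of this evaluation-time augmentation, assuming the simplest variant in which the short utterance is tiled up to a target length (the function name and tiling strategy are illustrative):

```python
def time_augment(samples, target_len):
    """Tile a short utterance until it reaches target_len samples,
    so the network sees it as a longer utterance at test time."""
    reps = -(-target_len // len(samples))   # ceiling division
    return (samples * reps)[:target_len]
```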
6 Conclusion
Speaker verification using raw waveforms was proposed. Raw
waveforms were considered because they enable end-to-end system development.
RawNet [1] was taken as the baseline, and systems for both long-utterance and short-
utterance data were developed. In [1], an error rate of 3% was reported when trained
for 100 epochs with files in .m4a audio format. After implementing the same for
8 epochs, we report an error rate of 4.3%. Short-utterance data with a duration of 2 s
was generated from the existing VoxCeleb1 dataset. When trained on the
VoxCeleb2 dataset and tested on the VoxCeleb1 short dataset, an error of 12% was
reported. Through this analysis, we infer that long-utterance data performs
better than short-utterance data. Hence, to improve the performance on short-utterance
data, we implemented time-augmented evaluation: we
concatenate the smaller speech segments so that the system treats them as a
long utterance, which improved the accuracy of the system. Another
way to improve performance on short-utterance data is the removal of noisy and
silent parts of the dataset for better learning.
References
1. J.W. Jung, S.B. Kim, J.H. Kim, H.J. Shim, H.J. Yu, Improved RawNet with feature map scaling
for text-independent speaker verification using raw waveforms, in Proceedings of Interspeech
2020 (2020), pp. 3583–3587
2. B. Tarakeswara Rao, R.S.M.L. Patibandla, M.R. Murty, A comparative study on effective
approaches for unsupervised statistical machine translation, in Proceedings of AISC Springer
Conference, vol. 1076 (2020), pp. 895–905. Z. Michalewicz, Genetic Algorithms + Data
Structures = Evolution Programs, 3rd edn. (Springer, Berlin, Heidelberg, New York, 1996)
3. R. Zheng, B. Xu, S. Zhang, Text-independent speaker identification using GMM-UBM and
frame level likelihood normalization, in Proceedings of International Symposium on Chinese
Spoken Language Processing (Hong Kong, China, 2004), pp. 289–292
4. A.S.V. Praneel, T. Srinivasa Rao, M. Ramakrishna Murty, A survey on accelerating the classifier
training using various boosting schemes within cascades of boosted ensembles. Proc. Int. Conf.
Springer SIST Ser. 169, 809–825 (2019)
5. A. Poddar, M. Sahidullah, G. Saha, Speaker verification with short utterances: a review of
challenges, trends and opportunities. IET Biom. 7(2), 91–101 (2018)
Abstract In real life, most big datasets are skewed in nature. Domains like
medical diagnosis, bioinformatics, banking theft, natural disasters, network intru-
sion, oil-spill detection, instrument failure and army crimes have datasets which are
not always balanced; rather, imbalanced datasets are the common occurrence. This
leads to biased classification when the positive class instances are very few
compared with the negative class instances. Multi-class imbalanced big data learning is
a challenging research topic in big data analytics. In this review paper, we analyse
the conventional balancing techniques of data-level and algorithm-level approaches, and
the ensemble techniques which balance multi-class imbalanced datasets. Though these
existing ensemble techniques balance the dataset, they cannot be applied to streaming
data and have scalability issues. This paper analyses different techniques and their
accuracy levels with the aim of developing a novel ensemble technique to learn, balance
and pre-process big data streams for classification.
1 Introduction
In most real-world datasets, there are at least 1 million instances and 100 features,
without a single well-defined target class. Among these instances, interesting cases
have a frequency of less than 0.01 [1]. The impact on the standard methods of
classification and regression ranges from slight variation to serious challenges. The
accuracy and efficiency of existing methods are significantly affected by overfitting,
which creates a bias towards the class with more samples [2].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 315
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_28
316 R. Madhura Prabha and S. Sasikala
2 Literature Review
In big data mining, classification is the widespread category used to predict
and analyse values from the given classes. There are many standard classifiers,
such as decision trees, K-nearest neighbours (KNN) and support vector machines (SVM),
used for classification. However, many of these classifiers are not
capable of handling big datasets.
Devi et al. in [4] recommended a MapReduce framework based on the Bat Feature
Selection method, which is adaptable to high-dimensional data and leverages the
efficiency of parallel algorithms.
But in numerous real-world big data applications, there are many different classes, of
which a few have more samples while the others have only a few samples.
Classes with more samples are called majority classes, and classes with fewer
samples are called minority classes [5]. A dataset in which the minority classes have
far fewer samples than the majority classes is called imbalanced data. Since clas-
sifiers assume that the entire dataset is distributed evenly, they show biased results for
imbalanced big data classification with standard algorithms [6].
Class imbalance is a general problem across domains like the medical, banking and
fault-identification sectors. And the misprediction cost for minority classes is high
compared with majority classes in multi-class imbalanced datasets.
In supervised learning, classification depends upon class labels. But in imbal-
anced big data, the classifiers automatically favour the class labels which are high in
count, under the assumption of an equal spread of all classes. This produces
A Comprehensive Analysis on Multi-class Imbalanced Big Data … 317
2.2.1 Undersampling
2.2.2 Oversampling
Oversampling achieves balance by replicating the instances which are few
in count. In oversampling, we do not lose any data; instead, we duplicate the
existing minority class instances. The disadvantage is that the dataset size increases,
which needs more memory and computational time.
There are many ways to implement oversampling. The simplest technique is random
oversampling, which randomly chooses and duplicates some minority instances in the
training dataset. But the disadvantages of random oversampling are the overfitting
problem and a small decision region.
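Random oversampling as described above can be sketched as follows; the function name and the choice to balance every class up to the largest one are illustrative assumptions:

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority instances until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(v) for v in by_class.values())
    Xb, yb = [], []
    for label, items in by_class.items():
        balanced = items + [rng.choice(items) for _ in range(target - len(items))]
        Xb.extend(balanced)
        yb.extend([label] * len(balanced))
    return Xb, yb
```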
Park et al. in [9] suggested oversampling techniques to balance highway
traffic data to predict traffic accidents. Based on oversampling, the prediction system
analyses and pre-processes traffic big data to create a learning system. The balanced data
is classified into several groups, to which K-means cluster analysis is applied. Finally,
prediction is done by logistic regression. The results show that the aimed accuracy
was 42.71% and the actual accuracy was 80.56%. These analysis steps are completed
with the Hadoop framework. The disadvantage is that a well-organized standard should
be generated for the oversampling technique.
Chawla et al. in [10] presented SMOTE (synthetic minority oversampling tech-
nique), which creates synthetic instances along the paths joining a minority instance to
its k-nearest neighbours rather than duplicating the real instances.
SMOTE focuses on, and creates a bias towards, the class with fewer instances. This
oversampling method creates many different minority points near existing points.
By using SMOTE, a larger area is covered by the class which has fewer instances,
which makes for better prediction of hidden instances belonging to that
class.
While creating artificial instances, SMOTE does not consider the neighbouring
majority classes. This can increase class overlapping and produce more
outlier instances. Class overlapping has a strong correlation with class imbalance
[11]; therefore, imbalanced overlapping classes are more difficult to classify than
normal ones. SMOTE also raises minority instances near the boundary and causes overfitting.
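The SMOTE interpolation idea can be sketched for one-dimensional features; the real algorithm works on feature vectors with Euclidean nearest neighbours, and the function name and parameters here are illustrative:

```python
import random

def smote_1d(minority, k=2, n_new=5, seed=0):
    """Toy 1-D SMOTE: pick a minority sample, pick one of its k nearest
    minority neighbours, and emit a random point on the segment between
    them instead of duplicating an existing sample."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((m for m in minority if m != x),
                            key=lambda m: abs(m - x))[:k]
        nb = rng.choice(neighbours)
        synthetic.append(x + rng.random() * (nb - x))
    return synthetic
```

Every synthetic point lies between two existing minority points, which is exactly why SMOTE can increase overlap when majority samples sit in that region.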
Zhai et al. in [12] proposed an oversampling method, OSSLDDD-SMOTE (one-
sided selection link and dynamic distribution density SMOTE). This method deals
with noise instances through a hierarchical filtering mechanism, and SMOTE is applied
only to the minority instances near the classification boundary using the dynamic SDDL
(sequential distribution density link). The number of new instances generated for each
borderline minority instance depends upon the distribution density of that
particular instance. This approach eliminates the disadvantages of SMOTE.
But OSSLDDD-SMOTE produces unwanted synthetic samples around the
majority samples, and finding the borderline becomes a serious issue.
Hussein et al. in [13] proposed the advanced SMOTE (A-SMOTE) method, which
regulates the position of synthetic creation. First, synthetic instances are created
using the SMOTE method. Next, it removes the synthetic instances nearer to the majority
instances and the borderline. In experimental results, the A-SMOTE technique produces
a clear borderline between the two classes and also eliminates noise.
Generally, sampling methods need more computational time and memory space.
Since the original datasets are themselves overwhelming, sampling methods are not suitable
for domains which have growing datasets.
Ertekin et al. in [16] proposed SVM active learning, which selects informative
instances from a randomly picked smaller pool of instances. The instances
inside the margin area form this small instance pool. This method does not search the
whole dataset; instead, it queries the system. Thus, active learning achieves
a fast solution with competitive prediction performance and deals with unlabelled
instances.
Belarouci et al. in [17] proposed a cost-sensitive extension of the least mean square
(LMS) algorithm, which solves the imbalance issue by penalizing errors with
different weights for different instances. After balancing, different classification
techniques are applied. Experimental results show an improvement in classification
accuracy.
Hamidzadeh et al. in [18] suggested the Chaotic Krill Herd evolutionary algo-
rithm (CKHA), which examines both class spaces for sample reduction in binary-
class imbalanced datasets. All instances are measured using a combined weighted
multi-objective optimizer on the WDDS (weighted distance-based decision surface).
Using the CKH algorithm, fitness values are measured in the search space to identify
the instances which have the best fitness values. All instances which reduce accuracy
and geometric mean (Gmean) are removed from the original dataset. This method
controls imbalance and keeps instances of the class which has fewer examples.
Data-level and algorithm-level approaches can solve the binary-class imbalance
problem, but they cannot solve the multi-class imbalance problem.
Since the relationships among classes differ and boundaries can overlap, we need
an ensemble technique which combines both data and algorithm approaches.
Song et al. in [20] analysed the one-versus-one (OVO) decomposition scheme by
applying binary ensemble learning approaches. An m-class problem is split into
m(m − 1)/2 binary-class sub-datasets. For each pair of classes, a classifier is
trained, ignoring the samples which do not belong to those two classes. All
binary-class outputs are then aggregated to produce the multi-class result. The OVO
approach combined with SMOTE is an effective method for balancing multi-class
imbalanced problems. When making predictions, unlabelled samples are fed into the
models.
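The OVO decomposition and vote aggregation can be sketched as follows (classifier training is elided; the function names are illustrative):

```python
from collections import Counter
from itertools import combinations

def ovo_split(X, y):
    """Split an m-class dataset into m(m - 1)/2 binary sub-datasets,
    one per pair of classes; samples outside the pair are ignored."""
    subsets = {}
    for a, b in combinations(sorted(set(y)), 2):
        subsets[(a, b)] = [(xi, yi) for xi, yi in zip(X, y) if yi in (a, b)]
    return subsets

def ovo_vote(pairwise_predictions):
    """Aggregate the binary classifiers' outputs by majority vote."""
    return Counter(pairwise_predictions).most_common(1)[0][0]
```

With three classes, for example, `ovo_split` yields exactly three pairwise sub-datasets, and each unlabelled sample gets one prediction per pair before voting.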
Piri et al. in [21] proposed synthetic informative minority oversampling (SIMO),
which uses an SVM classifier. Since the instances nearer to the decision boundary
are more informative, these samples are over-sampled. Weighted SIMO (W-
SIMO) is also proposed, which differs from SIMO in that it oversamples only the
wrongly classified informative minority instances, to a higher degree. These
techniques emphasize the informative minority instances which are
usually mispredicted by standard classifiers.
Alam et al. in [2] proposed a recursive technique for multi-class imbalanced clas-
sification and also for regression problems. In this technique, the data imbalance
problem is transformed into multiple balanced problems. Partitioning and balancing
the data are applied recursively. Partitioning is implemented using balanced-distribution
and random-partitioning methods. An ensemble classifier is modelled, and an ensemble
rule selects one class. It further solves data imbalance in regression. This
technique is effective and improves performance.
Hassib et al. in [22] proposed a three-phase classification framework. The first
phase is feature selection. The second phase balances the dataset using LSH-SMOTE
(locality-sensitive hashing synthetic minority oversampling technique). Lastly, the resul-
tant dataset is given to the WOA + BRNN (bidirectional recurrent neural network)
algorithm for classification. This method increased the classification accuracy level.
Tsai et al. in [23] presented the cluster-based instance selection (CBIS) technique,
which combines clustering and instance selection. Clustering is applied to the majority
classes to split them into many subclasses; in each subclass, instance selection filters out
the irrelevant instances of that particular subclass and gives a label to each subclass. This
technique balances multi-class datasets.
Sleeman et al. in [30] created a compound framework on Apache Spark for multi-
class datasets. The instance-level difficulties of each class are analysed to find the learning
difficulties, and this information is embedded in common resampling algorithms, which
then balance the multiple classes. A new variant of SMOTE was applied
which removes the spatial constraint in distributed datasets. This method shows
that instance-level information is most important for creating training datasets for
multi-class imbalanced big data.
A multi-class dataset can be divided into multiple binary classes, and balancing can
be applied to each binary class. Thus, multi-class imbalance can be solved by solving
each binary class, as summarized in Table 1.
Samples were categorized into four types, namely safe samples, borderline
samples, rare samples and outliers, depending upon the classes of the neighbouring
samples. Random undersampling is used for safe samples, SMOTE for borderline
samples and br-SMOTE for rare samples in the training dataset with
binary classifiers [29].
It is time-consuming to identify the instance-level information of all instances in a big
data stream, and SMOTE needs more computational power and memory for
streaming data. To address these issues, we need to create an ensemble technique
which addresses multi-class imbalanced data in a distributed environment with a
high volume of data.
2.6 Solutions
The major issues in existing ensemble techniques are scalability, loss of important
data, memory issues, finding the borderline, learning instance-level information
and the computational time needed to create a training model.
The solution for imbalanced big data is an ensemble technique which combines
sampling with enhancements to the classification algorithms. A novel ensemble technique
should be developed to solve the scalability and memory issues by using a MapReduce
framework or Spark. The borderline-finding issue and high computational time can
be addressed by learning instance-level difficulties in a distributed environment with a
high volume of data.
References
1. Q. Yang, X. Wu, 10 challenging problems in data mining research. Int. J. Inf. Technol. Decis.
Mak. 5(04), 597–604 (2006)
2. T. Alam, C.F. Ahmed, S.A. Zahin, M.A.H. Khan, M.T. Islam, An effective recursive technique
for multi-class classification and regression for imbalanced data. IEEE Access 7, 127615–
127630 (2019)
3. H. He, E.A. Garcia, Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9),
1263–1284 (2009)
4. D.R. Devi, S. Sasikala, Feature selection and classification of big data using MapReduce
framework, in International Conference on Intelligent Computing, Information and Control
Systems (Springer, Cham, 2019), pp. 666–673
5. M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, F. Herrera, A review on ensembles for
the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Trans.
Syst. Man Cybernet. Part C Appl. Rev. 42(4), 463–484 (2011)
6. V. Ganganwar, An overview of classification algorithms for imbalanced datasets. Int. J. Emerg.
Technol. Adv. Eng. 2(4), 42–47 (2012)
7. J. Tanha, Y. Abdi, N. Samadi, N. Razzaghi, M. Asadpour, Boosting methods for multi-class
imbalanced data classification: an experimental review. J. Big Data 7(1), 1–47 (2020)
8. M.M. Rahman, D.N. Davis, Addressing the class imbalance problem in medical datasets. Int.
J. Mach. Learn. Comput. 3(2), 224 (2013)
9. S.H. Park, Y.G. Ha, Large imbalance data classification based on mapreduce for traffic accident
prediction, in 2014 Eighth İnternational Conference on Innovative Mobile and Internet Services
in Ubiquitous Computing (IEEE, 2014), pp. 45–49
10. N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE: synthetic minority over-
sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
11. R.C. Prati, G.E. Batista, M.C. Monard, Class imbalances versus class overlapping: an analysis
of a learning system behavior, in Mexican International Conference on Artificial Intelligence
(Springer, Berlin, Heidelberg, 2004), pp. 312–321
12. Y. Zhai, N. Ma, D. Ruan, B. An, An effective over-sampling method for imbalanced data sets
classification. Chin. J. Electron. 20(3), 489–494 (2011)
13. A.S. Hussein, T. Li, C.W. Yohannese, K. Bashir, A-SMOTE: a new preprocessing approach
for highly imbalanced datasets by improving SMOTE. Int. J. Comput. Intell. Syst. 12(2),
1412–1422 (2019)
14. L. Cao, H. Shen, Imbalanced data classification based on hybrid resampling and twin support
vector machine. Comput. Sci. Inf. Syst. 14(3), 579–595 (2017)
15. N. Junsomboon, T. Phienthrakul, Combining over-sampling and under-sampling techniques for
imbalance dataset, in Proceedings of the 9th International Conference on Machine Learning
and Computing (2017), pp. 243–247
16. S. Ertekin, J. Huang, L. Bottou, L. Giles, Learning on the border: active learning in imbal-
anced data classification, in Proceedings of the Sixteenth ACM Conference on Conference on
Information and Knowledge Management (2007), pp. 127–136
17. S. Belarouci, M.A. Chikh, Medical imbalanced data classification. Adv. Sci. Technol. Eng.
Syst. J. 2(3), 116–124 (2017)
18. J. Hamidzadeh, N. Kashefi, M. Moradi, Combined weighted multi-objective optimizer for
instance reduction in two-class imbalanced data problem. Eng. Appl. Artif. Intell. 90, 103500
(2020)
19. M.A. Febriantono, S.H. Pramono, R. Rahmadwati, G. Naghdy, Classification of multiclass
imbalanced data using cost-sensitive decision tree C5.0. IAES Int. J. Artif. Intell. 9(1), 65
(2020)
20. Y. Song, J. Zhang, H. Yan, Q. Li, Multi-class ımbalanced learning with one-versus-one decom-
position: an empirical study, in International Conference on Cloud Computing and Security
(Springer, Cham, 2018), pp. 617–628
21. S. Piri, D. Delen, T. Liu, A synthetic informative minority over-sampling (SIMO) algorithm
leveraging support vector machine to enhance learning from imbalanced datasets. Decis.
Support Syst. 106, 15–29 (2018)
22. E.M. Hassib, A.I. El-Desouky, L.M. Labib, E.S.M. El-Kenawy, WOA+ BRNN: an imbalanced
big data classification framework using Whale optimization and deep neural network. Soft
Comput. 24(8), 5573–5592 (2020)
23. C.F. Tsai, W.C. Lin, Y.H. Hu, G.T. Yao, Under-sampling class imbalanced datasets by
combining clustering analysis and instance selection. Inf. Sci. 477, 47–54 (2019)
24. N. Liu, X. Li, E. Qi, M. Xu, L. Li, B. Gao, A novel ensemble learning paradigm for medical
diagnosis with imbalanced data. IEEE Access 8, 171263–171280 (2020)
25. H. Jegierski, S. Saganowski, An “outside the box” solution for imbalanced data classification.
IEEE Access 8, 125191–125209 (2020)
26. M. Koziarski, M. Woźniak, B. Krawczyk, Combined cleaning and resampling algorithm for
multi-class imbalanced data with label noise. Knowl. Based Syst. 204, 106223 (2020)
27. M. Żak, M. Woźniak, Performance analysis of binarization strategies for multi-class imbalanced data classification, in International Conference on Computational Science (Springer, Cham, 2020), pp. 141–155
28. J. Wei, H. Huang, L. Yao, Y. Hu, Q. Fan, D. Huang, New imbalanced bearing fault diagnosis
method based on sample-characteristic oversampling technique (SCOTE) and multi-class LS-
SVM. Appl. Soft Comput. 101, 107043 (2021)
29. X. Gao, Y. He, M. Zhang, X. Diao, X. Jing, B. Ren, W. Ji, A multiclass classification using one-versus-all approach with the differential partition sampling ensemble. Eng. Appl. Artif. Intell. 97, 104034 (2021)
30. W.C. Sleeman IV, B. Krawczyk, Multi-class imbalanced big data classification on Spark.
Knowl. Based Syst. 212, 106598 (2021)
Efficient Recommender System for Kid’s
Hobby Using Machine Learning
1 Introduction
Nowadays, many applications are built for recommendation based on user preferences. In this paper, we look at the types of recommender systems, machine learning, and the support vector machine algorithm.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 327
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_29
328 S. S. Lunawat et al.
Recommendation systems [1] are widely used nowadays, as they underpin many applications. Recommendation systems are of three types, as shown in Fig. 1.
Collaborative filtering is a method for recommender systems that uses past information about users and items; the inputs are simply the historical data of user interactions with items [3]. The representation used is the matrix form in Table 1.
Fig. 1 Types of recommendation system [2]
2 Machine Learning
Machine learning (ML) [1] is a simulation model that allows computers to acquire knowledge from the real world, improving performance by training on new knowledge. Nowadays, ML algorithms are widely used in domains such as business and medicine. Learning is simply the acquisition of knowledge through experience [4].
Machine learning algorithms are classified based on learning as below:
3 Support Vector Machine

The support vector machine (SVM) has attracted many researchers to work on different applications. SVM is a supervised machine learning algorithm [5]. As shown in Fig. 2, SVM is used for classification and sometimes for regression. In the SVM algorithm, each data item is plotted as a point in an n-dimensional space, where n is the number of features and the value of each feature is a coordinate. Classification is then performed by finding the separating hyperplane.
As shown in Fig. 3, SVM relies on support vectors, the data points nearest to the hyperplane. The hyperplane is the linear boundary that separates the classes, and maximizing its margin builds confidence that the classification is correct.
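The decision rule described above can be sketched in a few lines of pure Python. The weight vector, bias, and test points below are hypothetical illustration values, not weights learned from the paper's dataset:

```python
def svm_predict(w, b, x):
    """Classify x by which side of the hyperplane w.x - b = 0 it falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - b
    return 1 if score >= 0 else -1

# Hypothetical separating line x1 + x2 = 3 in a 2-D feature space
w, b = [1.0, 1.0], 3.0
print(svm_predict(w, b, [4.0, 2.0]))  # point above the line -> +1
print(svm_predict(w, b, [0.5, 1.0]))  # point below the line -> -1
```

In a real SVM, w and b would be chosen to maximize the margin to the support vectors; here they are fixed only to illustrate the sign-of-decision-value rule.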
The paper is arranged as follows. Section 1 introduces the types of recommender systems, Sect. 2 explains machine learning, Sect. 3 covers the support vector machine, Sect. 4 reviews related work, Sect. 5 covers problem formulation, Sect. 6 presents the proposed system, Sect. 7 covers experimental results, and Sect. 8 concludes.
4 Related Work
SVM searches for this line by measuring the distances to the extreme points (support vectors) of the clusters of data points equidistant from the line of search; this distance must be maximized to find the hyperplane. SVM is used because its approach to classifying categories is different: as mentioned above, the extreme points are the least representative points of a cluster, i.e., its boundary points, and hence are nearest to the clusters of the other categories. For example, the data points in the arts category will have an extreme point that may be near the cluster of the sports category, and likewise for the other categories in the dataset.
Naive Bayes treats the data independently, i.e., it takes a more probabilistic view of the features (variables) of a particular category, so any future observation is classified by calculating the marginal likelihood of the point to be classified. The posterior probability of the point is calculated with respect to each category in the dataset, and comparing these probabilities results in the classification of the point into one of the categories (Table 2).
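The posterior comparison just described can be sketched directly. The categories match the paper's (arts, sports, academics), but the priors and per-feature likelihoods below are hypothetical illustration values:

```python
import math

def classify_nb(priors, likelihoods, features):
    """Pick the class with the highest posterior, compared in log space.

    posterior(class) is proportional to prior(class) * product of P(feature | class).
    """
    best_cls, best_logp = None, float("-inf")
    for cls, prior in priors.items():
        logp = math.log(prior)
        for f in features:
            logp += math.log(likelihoods[cls][f])
        if logp > best_logp:
            best_cls, best_logp = cls, logp
    return best_cls

# Hypothetical numbers for a kid who likes drawing and dislikes running
priors = {"arts": 0.3, "sports": 0.4, "academics": 0.3}
likelihoods = {
    "arts":      {"likes_drawing": 0.9, "dislikes_running": 0.7},
    "sports":    {"likes_drawing": 0.2, "dislikes_running": 0.1},
    "academics": {"likes_drawing": 0.4, "dislikes_running": 0.5},
}
print(classify_nb(priors, likelihoods, ["likes_drawing", "dislikes_running"]))
# -> "arts"
```

Working in log space avoids numeric underflow when many feature likelihoods are multiplied.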
Most recommender systems must consider many factors to build up accuracy. Very few works have demonstrated high accuracy, and researchers are constantly working to improve such systems [10]. The proposed system uses SVM as a standard machine learning technique. SVM separates the data into classifier classes using a hyperplane, and user preferences are classified according to the training set. Finally, the model is demonstrated on different preferences to prove its high accuracy [9].
5 Problem Formulation
The major problem nowadays is that parents, comparing their children with other kids, force them to take classes in subjects the children are not interested in. As a result, without interest, the children cannot prove themselves. We address this problem by giving parents a clear idea of how to select a hobby for their kids.
6 Proposed System
After collecting data from users through a survey form and preprocessing it, we had a set of observations belonging to different classes, namely arts, academics, and sports. The goal was to find a decision boundary (optimal line of separation) between them so that any future observation can be classified into a certain category.
We have built a website for the proposed system. Our dataset has 14 features based on general questions. The website takes preferences from a parent or user and recommends a suitable hobby. The questions asked are simple and based on assumptions about suitable classes. Figure 4 shows the flow graph of the proposed system. On our website, the user answers the questions asked; the preferences are taken as input to the SVM algorithm, which creates a hyperplane. The hyperplane, by maximizing the margin, classifies the different hobby classes. The output is a recommendation based on the preferences.
7 Results
Figure 5a, b shows the results of the proposed system.
Metric scores (continued)

Metric      SVM                  Naive Bayes
F1 score    0.887844740217096    0.8697022374398617
8 Conclusion
References
Abstract Due to widespread access to global positioning systems and automated road mapping on many smart devices, road network navigation services have become a core application. Path planning, a basic feature of road network navigation systems, identifies a path between a given place of start and a destination. Due to multiple complex situations, such as abrupt changes in moving direction, unpredictable traffic conditions, missing or unstable GPS signals, and so on, the effectiveness of the route planning function on roads is vital; the route planning service must be delivered promptly in these situations. In this article, we suggest a system for answering a new route planning query in real time by caching and re-using historically queried routes, namely path planning by caching (PPC). Unlike standard cache-based route planning schemes, where a queried path is used from the cache only when it fits the current request exactly, PPC uses partially matching queried paths to answer part(s) of a new query. Consequently, only the unmatched fragments of paths are calculated, and the total workload of the system can be reduced considerably. Extensive experiments on a real road network database demonstrate that our method outperforms advanced route planning strategies, reducing the computation latency by 32% on average.
1 Introduction
On-road path planning in mobile navigation services is a key feature that finds a route from a queried place to a destination. A route planning query may be made in different situations due to unpredictable factors, for example, abrupt changes in the direction of travel, unpredictable traffic patterns, or the loss of GPS signals. Route planning must be carried out in a timely manner in these situations [1]. When a huge number of route planning requests are sent to the server, e.g., during peak
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 337
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_30
338 S. M. Rao and M. Vijayakamal
time periods, the need for timeliness is even harder to meet. Since response time is crucial to user satisfaction with personal navigation systems, the server must handle high route-planning workloads effectively. To address this requirement, we propose a scheme that efficiently answers a new path-planning query by caching and reusing historically queried paths: path planning by caching (PPC). Unlike standard cache-based route planning schemes, which return a stored query only when it fully corresponds to a new query, PPC uses partially matching queried paths in the cache to answer part(s) of a new query. As a result, only the unmatched path segments must be calculated, and the machine workload is greatly reduced [2]. We propose the PPC system to respond efficiently to a new path-planning query using cached paths, avoiding time-consuming shortest-path computations, as described in Fig. 1. The system architecture includes three key components, shown in rectangular boxes. In contrast with a traditional route planning scheme (without using a cache), we save up to 32% of computation on average. We introduce the concept of a PPattern: a cached route that shares segments with other routes. Using PPatterns, PPC supports partial hits for a new query; our tests show that up to 92.14% of all cache hits on average are partial hits. A new probabilistic model is proposed to detect cached paths that are highly likely to be a PPattern for the new query, depending on road network consistency. Our tests show that these PPatterns save 31.69% of path-node retrievals on average, roughly tenfold the 3.04% saved by complete hits. Considering user preference for roads of different kinds, we have also created a new cache replacement mechanism [3]. For each query, a metric is allocated that accounts for both the road type and the popularity of the query. The findings reveal that our cache substitution strategy raises the cache hit ratio by 25.02% over state-of-the-art cache substitution policies.
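The partial-hit idea can be illustrated with a much-simplified sketch. The `PathCache` class and the node names below are illustrative assumptions; the real PPC system uses a probabilistic PPattern model rather than this linear scan:

```python
class PathCache:
    """Cache of previously answered paths, reused for full or partial hits."""

    def __init__(self):
        self.paths = []  # each entry is a node sequence from an answered query

    def lookup(self, s, d):
        # Partial hit: a cached path that visits s and later visits d already
        # contains the answer as a sub-path, even if (s, d) was never queried.
        for p in self.paths:
            if s in p and d in p and p.index(s) < p.index(d):
                return p[p.index(s):p.index(d) + 1]
        return None  # cache miss: the server must compute (part of) the path

    def add(self, path):
        self.paths.append(path)

cache = PathCache()
cache.add(["A", "B", "C", "D"])   # result of an earlier query A -> D
print(cache.lookup("B", "D"))     # partial hit: ['B', 'C', 'D']
print(cache.lookup("D", "A"))     # miss: direction does not match
```

This captures why partial hits dominate: one long cached route answers many shorter queries along it, while an exact-match cache would answer only the original query.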
Efficient Route Planning Supports Road Cache 339
2 Problem Statement
Route planning must be carried out promptly. When a huge number of route planning requests are sent to the server, e.g., during peak time periods, the need for timeliness is even harder to meet. Since response time is crucial to user satisfaction with personal navigation systems, the server must handle high route-planning workloads effectively. For structuring a vast road network model, Jung and Pramanik give the HiTi graph model. HiTi is designed to reduce the search space for computing the shortest path; it handles frequent road-weight updates and eliminates storage overhead [4]. In computing the shortest pathways, its calculation costs are greater than those of HEPV and Hub Indexing. Demiryurek et al. suggest the B-TDFP algorithm, using backward searches to decrease the search space for time-dependent fast routes. The plan uses a road hierarchy and adopts an area-level partitioning system. A cached query is returned only when it completely fits a new query, and the time complexity is high. The cache content may not be up to date with recent developments in the posted queries. The cost of cache construction is huge, since the system must measure the benefit values of all sub-paths of a complete queried path.
3 Proposed Methodologies
4 Enhanced System
The admin must log in using the correct username and password. After a successful login, operations such as viewing and permitting users can be performed, and places can be added to the data. The admin can list all added sites and their documents, ranked with the Dijkstra algorithm, along with pictures and distances; see all cache links for all cached sites; see all transactions and the cache-to-cache time delay; view the cache connect chart score; and view all chart place ranks. The Tweet server displays the data of all users and permits them to log in: username, address, e-mail ID, and mobile number, for example. The administrator adds places with information such as location, position title, location description, location uses, photographs, location document, and the distance to the place from a named center point. The administrator can see any cache connection, i.e., the keywords used more than once by users to search. The rank of caching links is shown along with the cache-link search sites (the number of times the keyword is searched from the cache). On searched sites, the current user can see all other comments. Comment information includes the commenter's name, the response, and the reply date, as in Fig. 1.
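The distance-based ranking above relies on shortest-path computation. A generic textbook Dijkstra implementation (not the system's actual code) looks like this:

```python
import heapq

def dijkstra(graph, src):
    """Shortest distances from src in a graph {node: [(neighbor, weight), ...]}."""
    dist = {src: 0}
    pq = [(0, src)]  # min-heap of (distance, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

g = {"A": [("B", 2), ("C", 5)], "B": [("C", 1)], "C": []}
print(dijkstra(g, "A"))  # {'A': 0, 'B': 2, 'C': 3}
```

The "stale entry" check replaces an explicit decrease-key operation, which Python's `heapq` does not provide.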
5 Conclusions
References
1 Introduction
Biological living systems such as Homo sapiens are endowed with the capability of associating one-dimensional, two-dimensional, or three-dimensional information with stored concepts: name, face, time, etc. This associative memory capability is routinely invoked in daily life. Hopfield attempted and succeeded in proposing a model of associative memory for storing and retrieving one-dimensional information. Such a 'memory model' is based on the notion of linear separability [10].
One of the important problems in designing an associative memory is the so-called programming problem, i.e., being able to store certain desired 1-D/2-D/3-D information as the 'desired memories.' It was realized by Hopfield that
G. Ramamurthy
Department of Computer Science and Engineering, Ecole Centrale School of Engineering,
Mahindra University, Bahadurpally, Hyderabad, Andhra Pradesh, India
e-mail: rama.murthy@mahindrauniversity.edu.in
T. J. Swamy (B)
Department of Electronics and Communication Engineering, Gokaraju Rangaraju Institute of
Engineering and Technology, Bachupally, Hyderabad, Andhra Pradesh, India
e-mail: jagan.tata@griet.ac.in
Y. Reddy
Department of Research and Development, Mahindra University, Bahadurpally, Hyderabad,
Andhra Pradesh, India
e-mail: yaminidhar.bhavanam@mahindrauniversity.edu.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 343
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_31
344 G. Ramamurthy et al.
In the serial mode, the above updation takes place (asynchronously) at only one neuron (say 'i'), whereas in the fully parallel mode, it takes place (synchronously) at all M nodes simultaneously. In the partial parallel modes of operation, state updation takes place at more than one node but strictly fewer than M neurons. Thus, denoting by V̄(n) the state vector of the HNN (whose components v1(n), v2(n), …, vM(n) form a {+1, −1} vector), in the fully parallel mode of operation we have V̄(n + 1) = Sign{W̄ V̄(n) − T̄}, where T̄ is the threshold vector.
Definition 1 The state vector Z̄ (a {+1, −1} vector) is called a 'stable state' if and only if
Z̄ = Sign{W̄ Z̄ − T̄}
Definition 2 The state vectors J̄, K̄ ({+1, −1} vectors) constitute a cycle of length 2 if and only if
J̄ = Sign{W̄ K̄ − T̄}
K̄ = Sign{W̄ J̄ − T̄}.
Programming Associative Memories 345
3 Programming Hopfield Associative Memory Using Linear Separability
4 Numerical Results
W = Σ_{i=1}^{4} λ_i f̄_i f̄_i^T

i.e. M = L.
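The outer-product synthesis can be checked numerically. The sketch below uses hypothetical orthogonal ±1 eigenvectors (rows of a 4×4 Hadamard matrix) and arbitrary positive eigenvalues; it verifies that each programmed vector is a stable state under zero thresholds:

```python
def synthesize_W(eigvals, eigvecs):
    """W = sum_i lambda_i * f_i * f_i^T, built as a list-of-lists matrix."""
    L = len(eigvecs[0])
    W = [[0.0] * L for _ in range(L)]
    for lam, f in zip(eigvals, eigvecs):
        for r in range(L):
            for c in range(L):
                W[r][c] += lam * f[r] * f[c]
    return W

def sign_vec(x):
    return [1 if v >= 0 else -1 for v in x]

# Mutually orthogonal +/-1 vectors (rows of a 4x4 Hadamard matrix)
f = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
W = synthesize_W([2.0, 1.0, 1.0, 0.5], f)

# Orthogonality gives W f_j = 4 * lambda_j * f_j, so Sign(W f_j) = f_j
for fj in f:
    Wf = [sum(W[r][c] * fj[c] for c in range(4)) for r in range(4)]
    assert sign_vec(Wf) == fj
print("all programmed memories are stable")
```

Because the eigenvectors are orthogonal with positive eigenvalues, each f̄_j satisfies Sign{W f̄_j} = f̄_j, which is exactly the stable-state condition of Definition 1.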
It is clear that the Hamming distance between the eigenvectors f̄_i (which are corners of the hypercube) is L/2. Figures 1 and 2 represent the dynamics of the HAM under all possible conditions.
(A) We conjecture that all corners of the hypercube that are at a Hamming distance less than L/2 from f̄_j lie in the domain of attraction of the desired memory f̄_j.
(B) If a spurious memory is reached (starting from some initial condition), it is decoded as the desired (programmed) memory closest to it in Hamming distance. This step requires at most M² distance comparisons. Such decoding ensures that the retrieval result is always a programmed memory.
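Step (B) can be sketched directly. The ±1 states below are hypothetical, chosen only to illustrate the nearest-memory decoding rule:

```python
def hamming(a, b):
    """Number of positions where two +/-1 vectors disagree."""
    return sum(x != y for x, y in zip(a, b))

def decode(state, memories):
    """Map a (possibly spurious) retrieved state to the nearest programmed memory."""
    return min(memories, key=lambda m: hamming(state, m))

memories = [[1, 1, 1, 1], [-1, -1, -1, -1]]
spurious = [1, 1, -1, 1]           # one bit away from the first memory
print(decode(spurious, memories))  # [1, 1, 1, 1]
```

With K programmed memories of length M, this scan costs K·M comparisons, within the M² bound stated above whenever K ≤ M.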
Since any two desired memories are at a Hamming distance of L/2, if the domains of attraction (like coding spheres) are disjoint, at least (L − 1)/4 errors can be corrected. We expect the domains of attraction to be disjoint with our synthesis procedure. We are currently attempting a proof of this conjecture (by capitalizing on the freedom in the choice of eigenvalues) [6].
We make the following conjecture: by capitalizing on the freedom in the choice of eigenvalues, the domains of attraction of the desired programmed memories (L of them) can always be ensured to be disjoint (thereby correcting (L − 1)/4 errors).
It is well known that the Hopfield associative memory (HAM) is based on the McCulloch–Pitts neuron (which uses linear separability as its basis). But in [1, 2, 7], we proposed various interesting associative memories based on spherical separability. Essentially, in such associative memories (with state space being the symmetric unit hypercube), the state updation at any neuron is performed as v_i(n + 1) = Sign{d(V̄(n), ū_i) − t_i}, where d(·, ·) is a suitably chosen distance measure such as the Hamming, Euclidean, or Mahalanobis distance.
Even in such associative memories, spurious memories can be retrieved or decoded
as the closest (in Hamming distance) programmed or desired memories.
5 Conclusions
References
1. G. Rama Murthy, G. Yaparla, R.P. Singh, Optimal spherical separability: artificial neural networks, in International Work-Conference on Artificial Neural Networks (Springer, Cham, 2017), pp. 327–338
2. Y. Ganesh, R.P. Singh, G. Rama Murthy, Pattern classification using quadratic neuron: an experimental study, in 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (IEEE, 2017), pp. 1–6
3. S.S. Haykin, Neural Networks and Learning Machines (2009)
4. G. Rama Murthy, M. Gabbouj, On the design of Hopfield neural networks: synthesis of Hopfield type associative memories, in 2015 International Joint Conference on Neural Networks (IJCNN) (IEEE, 2015), pp. 1–8
5. G. Rama Murthy, V. Devaki, Divya, Synthesis or programming of Hopfield associative memory, in Proceedings of International Conference on Machine Learning and Data Science (ICMLDIS2019), Dec 2019 (ACM Digital Library)
6. G. Ramamurthy, D. Praveen, Complex-valued neural associative memories on the complex hypercube, in Proceedings of the IEEE Conference on Cybernetics and Information Systems (CIS 2004), Singapore, Dec 2004
7. G. Rama Murthy, S.D. Munugoti, A. Rayala, Novel Ceiling Neuronal Model Artificial Neural Networks (2015)
8. J. Bruck, V.P. Roychowdhury, On the number of spurious memories in the Hopfield model (neural network). IEEE Trans. Inf. Theory 36(2), 393–397 (1990)
Abstract In this research paper, based on the concept of spherical separability (proposed by the authors), novel associative memories are proposed (with centering vectors being any corners of the unit hypercube). Convergence results are established. Also, hybrid associative memories that associate a 1-D stimulus with 2-D memory states (and a 2-D stimulus with 3-D stable states) are proposed.
1 Introduction
Biological memories are capable of associating names, faces, etc. with an input stimulus presented to the eyes/ears. In fact, humans invoke associative memory capabilities in an effortless manner. Hopfield attempted and succeeded in innovating an artificial neural network (ANN) model of associative memory. Such a model is based on the concept of "linear separability", utilized in the McCulloch–Pitts model of the neuron.
In [1–3], the authors proposed the concept of "spherical separability" of patterns. It is reasoned that linear separability implies spherical separability but not the other way around. Also, ANNs based on spherical separability were proposed in [4].
As a natural consequence, we innovated a model of associative memory based on
spherical separability which has strong connections to clustering approaches (i.e.
unsupervised learning approaches). Also, ANNs based on spherical separability are
G. Ramamurthy (B)
Department of Computer Science and Engineering, Ecole Centrale School of Engineering,
Mahindra University, Bahadurpally, Hyderabad, Andhra Pradesh, India
e-mail: rama.murthy@mahindrauniversity.edu.in
T. J. Swamy
Department of Electronics and Communication Engineering, Gokaraju Rangaraju Institute of
Engineering and Technology, Bahadurpally, Hyderabad, Andhra Pradesh, India
e-mail: jagan.tata@griet.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 351
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_32
352 G. Ramamurthy and T. J. Swamy
related to radial basis function neural networks (RBFNNs). In fact, the idea of having "clustering vectors" (utilized in the RBFNN hidden layer) finds its equivalent in ANNs based on spherical separability. In this research paper, we propose a spherical separability-based associative memory that is more general than that reported in [4].
This research paper is organized as follows. In Sect. 2, relevant research literature is reviewed. In Sect. 3, a novel spherical separability-based associative memory is proposed. In Sect. 4, simulation results confirming the convergence theorem are provided. In Sect. 5, hybrid associative memories based on linear separability are discussed. The research paper concludes in Sect. 6.
In an effort to understand the logical motivation for Hopfield neural network, the
authors were naturally led to proposing “spherical separability” as a basis for design
of novel associative memories. In some sense, the initial effort borrowed the ideas of
clustering algorithms such as “k-means” algorithm. Also, the relationship between
error correcting codes and ANN’s such as Hopfield associative memory provided the
initial logical basis for the conception of new ideas, discussed in this research paper.
Furthermore, RBFNNs and their operation consolidated our proposal for spherical
separability-based associative memories [5, 6]. In RBFNN’s, the distance/metric
chosen is the Euclidean distance (between centering vectors and input vectors) [7,
8]. In a well-defined sense, the concepts/ideas proposed in this research paper are
not totally incremental but highly innovative. We hope that the humble beginning
will lead to many research efforts on the theme pioneered in this research paper.
In [1, 2], the concept of spherical separability was introduced. We briefly explain the concept in the following discussion.
Definition: Patterns belonging to two classes in N-dimensional Euclidean space are said to be spherically separable if there exists an N-dimensional Euclidean hypersphere boundary which separates the two classes.
Definition: Patterns belonging to 'L' classes are spherically separable if every pair of them is spherically separable.
Note: In 2 dimensions, patterns belonging to 2 classes are spherically separable if there exists a circle which separates them.
Note: Suppose the patterns lie in a bounded region of N-dimensional Euclidean space. Then it readily follows that if the patterns are linearly separable, they are spherically separable, but not the other way around.
Novel Associative Memories Based on Spherical Separability 353
Consider a network of M artificial neurons whose state values are {+1, −1}. These neurons correspond to the vertices of a graph G = {V, E}, connected to each other by edges E whose weights are the synaptic weights. The graph is undirected, with symmetric edge weights between vertices: in terms of the synaptic weight matrix, W_ij = W_ji, i.e. W is a symmetric matrix. Let V̄(n) be the state of the ANN at time index 'n', i.e. V̄ = [v_1 v_2 ··· v_M] with v_i ∈ {+1, −1}. Thus, the state space of the ANN is the symmetric unit hypercube. In this associative memory model, each artificial neuron is associated with a centering vector ū_i, which lies on the symmetric unit hypercube. There is no external input to the ANN, and in the simplest possible architecture, the state updation takes place in the following manner based on the initial state vector V̄(0) [4].
where d_H(V̄(n), Ū_i) is the Hamming distance between the vectors V̄(n) and Ū_i lying on the symmetric unit hypercube. Thus, the above updation takes place at any neuronal node 'i'. Three modes of state updation are possible:
• Serial mode: At any given time 'n + 1', the state updation described in (1) takes place at exactly one neuron 'i'.
• Fully parallel mode: The state updation described in Eq. (1) takes place simultaneously at all the nodes at any given time 'n + 1'.
• Partial parallel modes: At any given time 'n + 1', the state updation in Eq. (1) takes place at more than one neuron, but strictly fewer than 'M' neurons.
Thus, in the fully parallel mode of operation, the state vector V̄(n + 1) becomes

V̄(n + 1) = [ Sign{d_H(V̄(n), Ū_1) − t_1}, Sign{d_H(V̄(n), Ū_2) − t_2}, …, Sign{d_H(V̄(n), Ū_M) − t_M} ]^T    (2)
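The fully parallel update of Eq. (2) can be sketched in a few lines. The centering vectors and thresholds below are hypothetical, and the convention Sign(0) = +1 is an assumption:

```python
def hamming(a, b):
    """Number of positions where two +/-1 vectors disagree."""
    return sum(x != y for x, y in zip(a, b))

def parallel_update(V, centers, thresholds):
    """Fully parallel mode: the i-th component is set from the Hamming
    distance between the current state and the centering vector u_i."""
    return [1 if hamming(V, centers[i]) - thresholds[i] >= 0 else -1
            for i in range(len(centers))]

# Hypothetical centering vectors u_1..u_3 and thresholds t_1..t_3
centers = [[1, 1, 1], [1, -1, -1], [-1, -1, 1]]
t = [1, 1, 1]
V = [1, 1, -1]
print(parallel_update(V, centers, t))  # [1, 1, 1]
```

A serial-mode step would instead recompute only one chosen component per time step, leaving the others unchanged.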
In the state space of the associative memory proposed (based on spherical separability), there are distinguished states called 'stable states'. Once the nonlinear dynamical system reaches a stable state, there is no further change of its state. Formally, we have the following definition.
Definition: Z̄ is a stable state if and only if

Z̄ = [ Sign{d_H(Z̄, ū_1) − t_1}, …, Sign{d_H(Z̄, ū_M) − t_M} ]^T    (3)
Note: The main difference compared to the research reported in [3] is that the centering vectors ū_i at the neurons need not be orthogonal. (In [4], only the case where the ū_i are necessarily orthogonal is considered.)
Note: The above associative memory is based on spherical separability, unlike the Hopfield associative memory, which is based on linear separability.
Conjecture: Patterns belonging to L classes (L > 2) which are not spherically separable in a lower-dimensional space can be rendered spherically separable in a higher-dimensional space (by a suitable projection approach).
The following theorem summarizes the dynamics of the nonlinear dynamical system proposed in Eq. (1).
Theorem: The dynamical system based on Eq. (1) always converges to a stable state in the serial mode, whereas in the fully parallel mode, either convergence to a stable state occurs or at most a cycle of length 2 is reached.
Proof: The theorem follows from the argument utilized in Theorem 6 of Ref. [9]. Details are omitted for brevity.
Figures 1 and 2 illustrate the fully parallel mode of operation with 4 and 8 neurons. Figures 3 and 4 illustrate the serial mode of operation with 4 and 8 neurons.
Ṽ(n + 1) = Sign(W̄ Ṽ(n) − T̄)

where Ṽ(0) = [V̄(0) | V̄(0) | ··· | V̄(0)], i.e. Ṽ(n + 1) is the state of the hybrid associative memory at time 'n + 1' (with V̄(0) being the stimulus column vector). It is a matrix of +1's and −1's. Also, T̄ is a matrix of threshold values, and W̄ is the synaptic weight matrix. Let us label such an associative memory as SAM-2.
In the spirit of the above idea, we now associate a two-dimensional stimulus, i.e. a two-dimensional input signal, with a three-dimensional memory state. The architecture of such an associative memory involves stacking SAM-1 ANNs, as depicted in Fig. 5.
At each level of the stack, we implement the state updation in the same manner, where W_k is the synaptic weight matrix at the kth level of the stack. Also, Ṽ(0) is the two-dimensional stimulus signal (i.e. an initial-condition matrix of +1's and −1's), and T̃_k is the threshold matrix at level 'k'. Let us label such a hybrid associative memory as SAM-2.
Note: Using the convergence theorem for the Hopfield associative memory, both SAM-1 and SAM-2 reach stable states (memories) in the serial mode.
The problem of retrieval of spurious memories arises when the initial condition is corrupted by noise (many errors), moving it from the domain of attraction of the associated desired memory state to that of a spurious memory state. In [4], one possible solution to the problem is presented: only the desired stable/memory states are retrieved, and a spurious memory/stable state is never retrieved. We now present another solution to the problem. This solution depends on the specification of the minimum and maximum number of errors that can be allowed to occur. In this approach, multiple initial conditions are simultaneously presented to the associative memory for retrieval. The set of initial conditions is determined by the minimum and maximum number of errors allowed by the specification.
Thus, in this approach, the initial condition is chosen to be a {+1, −1} matrix, and the associated associative memory is utilized to retrieve all related/associated stable/memory states.
Specifically, one distinguished choice of the initial-condition matrix is a Hadamard matrix (which necessarily means M mod 4 = 0). Thus, the initial-condition vectors are orthogonal and are at a Hamming distance of M/2 from one another. The corresponding memories that are retrieved are utilized to decide the programmed memory that is "most likely" to be the correct choice.
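The property used here, that distinct rows of a Hadamard matrix are mutually at Hamming distance M/2, can be checked with the standard Sylvester construction (a generic construction, not specific to this paper):

```python
def sylvester(k):
    """2^k x 2^k Hadamard matrix with +/-1 entries: H_{2n} = [[H, H], [H, -H]]."""
    H = [[1]]
    for _ in range(k):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

H = sylvester(2)   # M = 4
M = len(H)
for i in range(M):
    for j in range(i + 1, M):
        assert hamming(H[i], H[j]) == M // 2  # distinct rows at distance M/2
print("pairwise Hamming distance is M/2")
```

Row orthogonality forces exactly half the entries of any two distinct rows to disagree, which is why the initial-condition vectors end up equidistant at M/2.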
6 Conclusions
An Intelligent Fog-IoT-Based Disease
Diagnosis Healthcare System
Abstract Access to modern health services is a major problem, particularly in
remote regions that are not supplied with good-quality drugs and hospitals. IoT
is a major player in medical treatment, providing people with better clinical
services and assisting physicians and hospitals. In this paper, we present a novel
smart healthcare system based on advanced techniques such as IoT, which (i) offers
a platform for fog-assisted, IoT-enabled disease diagnostics; (ii) implements a
server-side health diagnosis module that produces a patient diagnosis outcome
(PDO); and (iii) monitors the severity of the disease through an alarm-generation
mechanism. The system is smart enough to serve as a clinical decision support
system that detects and analyzes patient data. It is a suitable alternative for
people living in rural areas: it can determine whether they have a major health
problem and help cure it by directing them to nearby hospitals. We also developed
a state-of-the-art IoT process-management tool that reports operating states and
facilitates better planning and efficient use of human and physical resources in
the healthcare process.
1 Introduction
By using the latest scientific and technological innovations, the health system
improves the well-being of a specific individual and the public while putting the
impact on public welfare. That is because proactive medicine normally predicts
diseases and abnormalities much earlier than the actual time that the problem
emerges, because it prevents deaths and injuries. The benefit of constructing an
IoT device is much less than medical and ambulance costs in the latter years. IoT
healthcare technologies will speed up the healthcare market in the next generation,
as its potential varies from clinical surveillance and diagnostic automation to many
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 359
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_33
360 C. K. Roy and R. Sadiwala
2 Related Work
Number of studies have dealt with different concerns relating to IoT in health care,
complex IoT healthcare technologies, IoT healthcare security problems and analysis.
In [10], the authors proposed a system for the tracking of diseases based on cloud-
centric IoT which predicts the condition with its seriousness. In addition, the authors
would discuss the following: They describe their primary concepts such that the
concept of data sciences can be analyzed to produce user-oriented health evaluations.
For the implementation case, an architectural prototype for intelligent student health
is designed. They measured the findings particularly during treatment with health
measurements. They then created a comprehensive student health dataset from UCI
data and medical sensors to assess students for various diseases. Diagnoses are
produced by different state-of-the-art algorithms, and the outcomes are evaluated
in terms of accuracy, sensitivity, specificity, and F-measure.
In [11], both psychological and physical health are considered. They use IoT-dependent
sensors on or inside the body. In addition, reactive healthcare infrastructure
can be turned into proactive and preventive healthcare services by leveraging
mobile computing technologies in IoT-based health systems. A smart student
m-healthcare monitoring system based on the IoT cloud is proposed here. This
system measures the magnitude of student diseases by estimating the level of the
illness from temporally aggregated medical and IoT measures. They developed an
architectural model for the intelligent student health system to efficiently
interpret the student health results. In their case study, 182 students are
simulated with a health dataset to establish applicable waterborne diseases.
This data is further evaluated to verify the model using a k-fold cross-validation
approach. They used pattern-based diagnostics with different classification
algorithms, and the outcomes are evaluated in terms of precision, sensitivity,
specificity, and response time. The experimental findings show that the decision
tree (C4.5) and k-nearest neighbor algorithms are more effective with respect to
these parameters than the other classifiers. By providing caregivers or doctors
with timely details, the suggested solution is useful for decision-making.
Finally, the presentation based on temporal granules yields effective diagnostic
outcomes for the proposed scheme.
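The k-fold cross-validation protocol used in [11] can be sketched as follows. This is a minimal stand-in: a 1-nearest-neighbour classifier on synthetic two-cluster data replaces their C4.5/k-NN models and the student health dataset, which are not available here:

```python
import numpy as np

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal folds for cross-validation."""
    return np.array_split(np.arange(n), k)

def cross_validate(X, y, predict_fn, k=5):
    """Return per-fold accuracy of predict_fn(train_X, train_y, test_X)."""
    accs = []
    for fold in k_fold_indices(len(X), k):
        mask = np.ones(len(X), dtype=bool)
        mask[fold] = False  # hold out this fold for testing
        preds = predict_fn(X[mask], y[mask], X[fold])
        accs.append(float(np.mean(preds == y[fold])))
    return accs

def one_nn(train_X, train_y, test_X):
    """1-nearest-neighbour prediction (stand-in for the k-NN classifier in [11])."""
    d = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(-1)
    return train_y[np.argmin(d, axis=1)]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (30, 3)), rng.normal(4, 1, (30, 3))])
y = np.array([0] * 30 + [1] * 30)
accs = cross_validate(X, y, one_nn, k=5)
```

Precision, sensitivity, and specificity would be computed per fold in the same loop from the confusion-matrix counts.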
In [12], the authors propose a cloud-centric, energy-efficient method for evaluating
drought and forecasting the present state of affairs. Based on the analysis of
data variability using the Bartlett test, the architecture specifies the active and
sleep intervals of the IoT sensors. With kernel principal component analysis (KPCA) at
the fog layer, the dimensionality of the data on drought-causing factors is reduced.
Drought severity is estimated at the cloud level by Naïve Bayes classification, and
drought is forecast using SARIMA models over various periods of time. Experimental
and performance studies show the feasibility of the proposed method for
the evaluation and estimation of drought with enhanced drought-causing attributes.
It also shows substantial savings in energy in comparison with other systems.
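A minimal sketch of the KPCA dimensionality-reduction step described in [12], implemented directly with NumPy. The RBF kernel, the γ value, and the synthetic "drought-attribute" data are illustrative assumptions, not details taken from that paper:

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=0.1):
    """Minimal RBF-kernel PCA: build the kernel matrix, double-center it,
    and project the points onto its top eigenvectors."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one  # center the kernel matrix
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    return Kc @ vecs[:, idx]                    # projections of the points

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))      # e.g. 50 readings of 6 drought-related attributes
Z = kernel_pca(X, n_components=2)  # reduced to 2 components at the fog layer
```

In a production setting, a library implementation such as scikit-learn's KernelPCA would normally be used instead of this hand-rolled version.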
In [13], the authors introduce an IoMT-based diagnostic model for health care
using smart techniques. The healthcare system for cardiomyopathy prediction based
on IoMT is established on a BBO-SVM model, which tunes the SVM parameters with
the BBO algorithm. The Statlog Heart Disease dataset is used to validate the
proposed model. The thorough experimental review showed that the proposed
BBO-SVM model performed outstandingly, achieving a maximal precision of 88.33%,
a recall of 77.60%, an accuracy of 89.26%, and an F-score of 87.96%.
In [14], the authors adopt a novel systematic method in the area of diabetes:
patients suffering from diabetes are predicted using the UCI repository dataset,
while medical sensors generate the associated patient information. To diagnose
the disease and its severity, they also propose a modern classification
algorithm, a fuzzy rule-based neural classifier. They performed tests with the
standard UCI repository dataset and actual health reports from different
clinics. The experimental results show that the suggested work improves on
current disease-prediction systems.
In [15], the authors presented a new classification algorithm based on a deep
neural network (DNN), named OGSO-DNN, for distributed healthcare systems. In
this analysis, cluster heads (CHs) among the IoT devices are selected with the
oppositional glowworm swarm optimization (OGSO) algorithm. The chosen CHs then
forward data to cloud servers, where the DNN-based classification procedure is
performed. They carried out a simulation study using data from the UCI repository
and IoT devices, generated from the student perspective, to predict the severity
of disease among students. By attaining an average sensitivity of 96.95%, a
specificity of 95.07%, a precision of 95.76%, and an F-score of 96.88%, the
proposed OGSO-DNN model outperformed the previous models.
There has been considerable prior effort to build a platform for communication
between medicine and IT, in particular IoT. Unfortunately, these approaches do
not apply strong notions from computing. Machine learning, for instance, is an
interesting area: if a medical specialist is unavailable, a computing system can
help diagnose the patient's problems, its expertise developed through machine
learning algorithms. Attractive related topics include learning techniques and
fuzzy neural systems. High-level features obtained during signal processing must
also train the expert system. The challenge this study addresses is therefore
how an expert system can be designed using machine learning methods in IoT. This
effort is carried out to monitor, predict, and diagnose major diseases by
developing a cloud- and IoT-oriented healthcare program. We develop a disease
detection system based on IoT and the cloud in this study.
This IoT system will also help keep patients persistently engaged and served by
allowing them to spend more time communicating with their physicians. The
proposal suggests a more intelligent system for monitoring patient health through
intelligent bio-sensors that capture patient health data in real time. A heartbeat
sensor, a blood pressure sensor, and a DS18B20 temperature sensor were connected
to the patient. This allows the physician to observe the patient from anywhere,
and even to admit the patient directly without leaving the hospital, as shown in
Fig. 2. Within the healthcare sector, IoT employs a long-term history of
continuous measurements to identify a disease. In a healthcare context, diagnosis
requires an aggregated collection of measurements for effective outcomes, which
cannot be achieved with a single clinic visit. IoT
PDO from the diagnostic module for the patient. If the probabilistic value of the
PDO instance is lower than the prefixed threshold, we record the health condition
of the individual as healthy. On the other hand, if the probabilistic value
exceeds the prefixed threshold, the patient's condition is recorded as unhealthy.
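The thresholding rule above can be written as a one-line decision function. The 0.5 value below is an assumed placeholder; the paper does not state its prefixed threshold:

```python
def classify_pdo(probability, threshold=0.5):
    """Record the patient as unhealthy when the diagnostic probability
    exceeds the prefixed threshold, and as healthy otherwise."""
    return "unhealthy" if probability > threshold else "healthy"

classify_pdo(0.2)  # → 'healthy'
classify_pdo(0.8)  # → 'unhealthy'
```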
The recorded sensor data are sent to the server, and the results are displayed in
the Arduino application and a web browser. The accuracy of the suggested scheme
is calculated by the following formula:
Accuracy = α(xi) / m    (1)
The accuracy of the proposed scheme is determined by (1), where α(xi) is the
number of correctly diagnosed instances in the experiment and m is the number of
tests. On this dataset, the average accuracy is 98%. The test results show that
intelligent and logical decision-making makes the sensor-based IoT device
effective and workable. The IoT approach increases device functionality and
performance. The percentage error is determined by formula (2), where the
accepted value is essential for computing the error of the experiments.
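Formulas (1) and (2) can be computed directly. Note that the exact form of the percentage-error formula (2) is not shown in the text, so the standard definition is assumed here:

```python
def accuracy(alpha, m):
    """Eq. (1): alpha correctly diagnosed instances out of m tests."""
    return alpha / m

def percent_error(accepted, measured):
    """Assumed standard form of Eq. (2): relative deviation from the accepted value."""
    return abs(accepted - measured) / accepted * 100

accuracy(98, 100)       # → 0.98, i.e. the 98% average reported
percent_error(100, 98)  # → 2.0
```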
sensor data, and the patient outcomes form three sections of the study. The data
gathered through the sensors are presented in Table 2 for 5 patients at different
intervals. The input data is obtained and calibrated; the second stage uses the
patient diagnostic outcome module to determine the state of the patient. The
tuned performance values of the input data are displayed in Table 3.
The data obtained by the temperature sensor, the pulse rate (PR) sensor, and
the blood pressure (BP) sensor are plotted in Figs. 3 and 4. In particular, we
can observe that for patient P5 all the data is abnormal.
The patient diagnostic result module makes the decision, and as seen in Fig. 4,
the accuracy of the decision is calculated. Table 3 shows that the accuracy of
the suggested method ranges from 94 to 100%, indicating that the proposed system
operates under the rules specified for patient care and management decision-making.
CakePHP: An open-source framework for rapid web application development in PHP.
It is free and built around the MVC design pattern, which enables users to
quickly and easily create PHP web applications with less code. CakePHP allows
one to separate the business logic from the data layer and the presentation
layer, as shown in Fig. 5.
In this work, the physician uses a CakePHP web page to view real-time clinical
data; to access a patient's historical health data, the patient's secret ID must
be entered, as shown in Fig. 5. In practice, VMware or Microsoft virtualization
is the choice of medical institutions that ultimately decide on a private,
public, or hybrid cloud solution. We usually advocate the Microsoft secured
cloud platform for intelligent designs that leverage Hyper-V and System Center
[Figs. 3 and 4: sensor values recorded for patients P1–P5]
Windows Server. This scalable solution meets the demands of most expanding busi-
nesses, simply powering cloud applications and/or providing cloud-based services
and operations. In particular, the Microsoft Azure cloud service provider offers
easy on-demand access to healthcare applications and data. Microsoft provides
the network, servers, and storage in a PaaS environment.
5 Summary/Conclusion
The diagnostic method can be made more accurate and efficient by integrating
medical devices into the IoT environment. We introduced smart devices, adapted
and optimized to configure themselves automatically for a wider elderly and
disabled community. The suggested method comprises body temperature, pulse, and
blood pressure sensors for determining the condition of the patient under
examination. The system uses its information base and the patient diagnosis
outcome for smart decision-making in health treatment, surveillance, and
management, in order to determine potential symptoms and remedies. Our healthcare
system architecture derives its conclusion from the evaluation of patients (PDOs)
based on medical and other sensor measurements. This formal model also includes
the main terminology and principles, the disease-diagnosis technique, and the
alarm-generation mechanism. Future research should develop an error-free tracking
and acquisition method in health care with high-quality bio-sensors; the health
data gathered may even be useful for predicting diseases in patients.
References
4. P. Parker, S. Banerjee, B. Korc-Grodzicki, Communicating with the Older Adult Cancer Patient
(2021). https://doi.org/10.1093/med/9780190097653.003.0085
5. C. Perissinotto, C. Zhang, T. Oseau, D. Balik, C. Sou, C. Burnight, K. Burnight, Feasibility of
a tablet designed for older adults to facilitate telemedicine visits. Innov. Aging 3, S975–S975
(2019). https://doi.org/10.1093/geroni/igz038.3534
6. M. Bansal, B. Gandhi, IoT & Big Data in Smart Healthcare (ECG Monitoring) (2019), pp. 390–
396. https://doi.org/10.1109/COMITCon.2019.8862197
7. C. Rajakumar, S. Radha, Smart Healthcare Use Cases and Applications (2020). https://doi.org/
10.1007/978-3-030-37526-3_8
8. A.D. Preetha, T.S. Pradeep Kumar, Leveraging fog computing for a secure and smart healthcare.
Int. J. Recent Technol. Eng. 8, 6117–6122 (2019). https://doi.org/10.35940/ijrte.B3864.078219
9. U. Ulusar, E. Turk, A. Oztas, A. Savli, G. Ogunc, M. Canpolat, IoT and Edge Computing as
a Tool for Bowel Activity Monitoring: From Hype to Reality (2019). https://doi.org/10.1007/
978-3-319-99061-3_8
10. P. Verma, S. Sood, Cloud-centric IoT based disease diagnosis healthcare framework. J. Parallel
Distrib. Comput. (2017). https://doi.org/10.1016/j.jpdc.2017.11.018
11. P. Verma, S. Sood, S. Kalra, Cloud-centric IoT based student healthcare monitoring framework.
J. Ambient Intell. Hum. Comput. 9 (2018). https://doi.org/10.1007/s12652-017-0520-6
12. S. Sood, Cloud-centric IoT-based green framework for smart drought prediction. IEEE Internet
Things J. 1111–1121 (2019). https://doi.org/10.1109/JIOT.2019.2951610
13. K. Kamarajugadda, M. Pavani, M. Raju, S. Kant, S. Thatavarti, IoMT with Cloud-Based Disease
Diagnosis Healthcare Framework for Heart Disease Prediction Using Simulated Annealing with
SVM (2021). https://doi.org/10.1007/978-3-030-52624-5_8
14. M.K. Priyan, L. Selvaraj, R. Varatharajan, G. Chandra Babu, P. Panchatcharam, Cloud and
IoT based disease prediction and diagnosis system for healthcare using Fuzzy neural classifier.
Future Gener. Comput. Syst. 86 (2018). https://doi.org/10.1016/j.future.2018.04.036
15. K. Praveen, P. Prathap, S. Dhanasekaran, I. Punithavathi, P. Duraipandy, I. Pustokhina, D.
Pustokhin, Deep learning based intelligent and sustainable smart healthcare application in
cloud-centric IoT. Comput. Mater. Contin. 66, 1987–2003 (2021). https://doi.org/10.32604/
cmc.2020.012398.R
Pre-processing of Linguistic Divergence
in English-Marathi Language Pair
in Machine Translation
1 Introduction
Machine translation (MT) explores the use of computers to translate text from one
natural language, known as the source language, to another, known as the target
language [1, 2]. Machine translation is not simply word-for-word substitution but
the application of complex linguistic knowledge: grammar, morphology, and
everything a human translator brings; all of these must be taken into
consideration. The major goals of machine translation are morphological analysis,
POS tagging, chunking, parsing, and word sense disambiguation [3]. MT uses
various approaches to translate the source language into the target language
between two language pairs [4, 5]. Divergence is a major complication in
translation between two language pairs. The variation that arises between
languages with respect to grammar is known as divergence. Divergence mainly
arises when translating from a natural language as a source language to the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 371
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_34
372 S. N. Maniyar et al.
2 Related Work
Jisha P. Jayan et al.: The authors study divergence in the Malayalam-Tamil
language pair. Divergence is reported mostly at the structural and lexical
levels and is resolved by using a bilingual dictionary and transfer grammar;
accuracy increases to a promising 65%. They discuss the types of divergence
related to translation based on Dorr's classification, finding semantic and
syntactic types, use a statistical machine translation technique, and apply
creation rules. The paper studies thematic, promotional, demotional, structural,
conflational, categorical, and lexical divergences [9].
Vimal Mishra et al.: This study holds that divergence issues in MT require
accurate classification and detection. The language divergences between English
and Sanskrit can be considered representative of the divergences between the
subject-verb-object and subject-object-verb classes of languages. The work
focuses on divergences related to conjunctions and particles, participles, and
gerunds using an artificial neural network approach, applying detection and
adaptation rules. They develop EST, a system combining ANN and rule-based models
[10].
Niladri Sekhar Dash et al.: This paper observes translation divergence in the
English-to-Bengali language pair and investigates how different linguistic and
extra-linguistic constraints can play decisive roles in translation, resulting
in divergences and other problems. The main objective of the paper is to examine
the types of divergence problem that arise in English-to-Bengali translation;
resolving them is a prerequisite for designing a robust machine translation
system. A rule-based machine translation system is used, focusing on syntactic
and lexical-semantic divergence [11].
Sreelekha S. et al.: In the present study, the authors describe ways to utilize
different lexical resources to improve the quality of a statistical machine
translation system. They augment the training corpus with lexical resources such
as the IndoWordnet semantic relation set, Kridanta pairs, function words, and
verb phrases. Their use of lexical resources focuses mainly on two approaches:
augmenting with various word forms and augmenting the parallel corpus with more
vocabulary. They analyze errors for both Marathi-to-Hindi and Hindi-to-Marathi
machine translation systems, evaluating with measures such as the BLEU score,
METEOR, TER, and fluency and adequacy via subjective evaluation [12].
Pitambar Behera et al.: This study distinguishes different types of divergence
based on the problems they concern: grammatical, communicative, cultural, and so
on. When two languages owe their origin to different language families,
grammatical divergences emerge. The research attempts to classify various types
of grammatical divergence, lexical-semantic and syntactic, and also helps to
identify and resolve divergent grammatical features in the Bhojpuri-English
language pair. As far as methodology is concerned, they adhere to Dorr's lexical
conceptual structure for the resolution of divergences. The study proves useful
for developing efficient MT systems if the mentioned features are incorporated,
considering the inherent structural constraints between source and target
languages [13].
Ritu Nidhi et al.: The authors note that Maithili is a less-resourced language
in terms of technology development. The paper is therefore an attempt to create
a general-purpose machine translation (MT) system for this language pair.
Divergence detection and handling, as a pre- or post-process, are critical in
automatic translation for producing comprehensible outputs. The authors focus
only on the lexical-semantic divergences and report the results of a single
training and testing run on MTHub. While reporting the progress of developing an
MT system for the English-Maithili pair, they present an account of identifying
and classifying MT divergences in the English-Maithili language pair [14].
R. Mahesh K. et al.: In the present study, they take Dorr's (1994) classification
of translation divergence and examine the implications of these divergences and
the translation patterns between Hindi and English in further detail. They
attempt to identify potential topics that fall under divergence but cannot,
directly or indirectly, be accounted for or accommodated within the existing
classification. They classify the divergence from Hindi to English and vice
versa and, on that basis, recommend an augmentation of the classification of
translation divergence; this is the objective of the study. The paper examines
the barriers to classifying translation divergence for MT between English and
Hindi [15].
3 Methodology of Pre-processing
In any machine translation system, this topic is much needed: to obtain a correct
translation, it is important to resolve the nature of the translational
divergence. Divergence can be seen at various levels. Here, we perform the
pre-processing steps for the ANN technique using the Python language. First, we
clean the text; then we tokenize the sentences; after sentence tokenization, we
remove all punctuation; next we remove stopwords and apply part-of-speech
tagging; and finally we parse the sentences, as shown in Fig. 1.
3.1 Database
English is the first language most often used for translation into other
languages. “The Indian constitution has 22 ‘scheduled’ or national languages and
almost 2000 dialects. Still, only about 5% of the world's population speaks
English as a first language.” English is very widely used in media, commerce,
science, technology, and education in India. In such a situation, there is
obviously a large market for translation between English and the various Indian
languages. The Marathi language is widely spoken in Maharashtra. It
Fig. 1 Pre-processing for the ANN technique: input sentences → tokenization → stopwords → parsing
For the data supplied by the organizers, we stick to their sentence segmentation
and tokenization. For further data, we use a trainable tokenizer that can easily
be adapted to a new language simply by providing a few instances of sentence and
token breaks. Tokenization is the process of identifying tokens/topics within
input sentences, and it helps to reduce the search space to a significant degree
[16]. A further advantage of tokenization is that it reduces the storage space
required to store the tokens identified from the input sentences, making
effective use of storage. The first step in text analysis and processing is to
split the text into sentences and words, a process called tokenization. This is
the first step for machine translation, and tokenizing a text makes further
analysis easier.
Input English = “Ram likes Sita, I likes sweets, I want sweets”
Input Marathi-p = “
”
Here, we take an English and a Marathi sentence, tokenize them, and show the
output in Fig. 2. Tokenization breaks the sentences into words.
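The tokenization step can be illustrated with a small stand-in for NLTK's word_tokenize: a regular-expression tokenizer that splits out words and punctuation (NLTK's own tokenizer handles many more cases, such as contractions):

```python
import re

def word_tokenize(text):
    """Simplified tokenizer: words (\\w+) and punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = word_tokenize("Ram likes Sita, I likes sweets, I want sweets")
# → ['Ram', 'likes', 'Sita', ',', 'I', 'likes', 'sweets', ',', 'I', 'want', 'sweets']
```

With NLTK installed, `nltk.word_tokenize` would be used in the same way on both the English and the Marathi input.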
When text is to be classified into different categories, stopwords are removed
from the given text so that focus can be given to the words that characterize
the meaning of the text, as in text classification. After applying stopword
removal, the time to train the model decreases and the dataset size also
decreases. Removing stopwords from the database can leave better, more
meaningful tokens and improve performance, since there are fewer tokens, and it
also helps increase classification accuracy. We use the NLTK library for
removing stopwords, as shown in Fig. 3.
Input English = “Ram likes Sita, I likes sweets, I want sweets”
Input Marathi-p = “
”
Here, we remove the stopwords from the database and show the output in Fig. 3:
we classify the database and remove the meaningless words, cleaning the database
with the NLTK library.
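A sketch of the stopword-removal step; the stopword set below is a tiny illustrative subset of NLTK's English stopword list (`nltk.corpus.stopwords.words('english')`), used so the example is self-contained:

```python
# tiny subset of NLTK's English stopword list, for illustration only
STOPWORDS = {"i", "a", "an", "the", "is", "are", "to", "of", "in", "and"}

def remove_stopwords(tokens):
    """Drop stopwords (case-insensitive) and punctuation-only tokens."""
    return [t for t in tokens if t.isalnum() and t.lower() not in STOPWORDS]

filtered = remove_stopwords(
    ['Ram', 'likes', 'Sita', ',', 'I', 'likes', 'sweets', ',', 'I', 'want', 'sweets'])
# → ['Ram', 'likes', 'Sita', 'likes', 'sweets', 'want', 'sweets']
```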
Tagging a sentence, in a broad sense, means indicating the verb, noun, etc.,
from the context of the sentence. Identification of POS tags is a problematic
process: generic part-of-speech tagging is not feasible manually, as some words
may have specific meanings depending on the structure of the sentence.
Converting the text into a list is an important step for tagging, since each
word in the list is looped over and evaluated for a particular tag, as in Fig. 4.
Here is an example using this text:
Input English = “Ram likes Sita, I likes sweets, I want sweets”
Input Marathi-p = “
”
Here, we apply POS tagging to both sentences and show the output in Fig. 4.
Tagging means labeling the words: part-of-speech tagging adds the part-of-speech
category to each word depending upon its context in the sentence; it is also
called morpho-syntactic tagging. Tagging is essential in machine translation to
follow the target language; this is the objective of our study. In particular,
in MT, only when the system knows the POS of the source-language text can it
translate into the target language without errors. POS tagging therefore plays
an important role in MT.
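A toy illustration of POS tagging. NLTK's `pos_tag` uses a trained statistical tagger; the hand-built lexicon lookup below is a stand-in that covers only this example sentence, with unknown words defaulting to the noun tag:

```python
# hand-built lexicon for the example sentence only (Penn Treebank-style tags)
LEXICON = {"ram": "NNP", "sita": "NNP", "i": "PRP",
           "likes": "VBZ", "want": "VBP", "sweets": "NNS"}

def pos_tag(tokens):
    """Look each word up in the lexicon; unknown words default to 'NN'."""
    return [(t, LEXICON.get(t.lower(), "NN")) for t in tokens if t.isalnum()]

pos_tag(["Ram", "likes", "sweets"])
# → [('Ram', 'NNP'), ('likes', 'VBZ'), ('sweets', 'NNS')]
```

A real system would instead call `nltk.pos_tag(tokens)`, which disambiguates tags from context rather than a fixed lexicon.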
5 Parsing
The parser serves two main purposes in the MT system. It performs the syntactic
analysis of the English sentence, giving the parse-tree structure of the
sentence via a context-free grammar, and it performs parts-of-speech (POS)
tagging of the English sentence, aligning English words with their corresponding
POS tags.
Input English = I like sweet
Input Marathi =
Here, we show the output of parsing the sentences in Fig. 5, using the NLTK
library. The parsing process is the first basic stage of the processing engine
and is important for top-down analysis. The sub-processes of parsing include the
input process, the sentence analyzer, morphological analysis, the EtranS
lexicon, and the parsing process itself.
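The parse-tree structure for "I like sweets" under a toy CFG (S → NP VP, VP → V NP) can be sketched with nested tuples. This hand-rolled function is an assumption for illustration; a real MT parser would run a chart parser (e.g. NLTK's ChartParser) over a full context-free grammar:

```python
def parse_svo(tokens):
    """Hand-rolled parse for a three-word subject-verb-object sentence
    under the toy grammar S -> NP VP, VP -> V NP."""
    subj, verb, obj = tokens
    return ("S", ("NP", subj), ("VP", ("V", verb), ("NP", obj)))

tree = parse_svo(["I", "like", "sweets"])
# → ('S', ('NP', 'I'), ('VP', ('V', 'like'), ('NP', 'sweets')))
```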
This divergence occurs owing to differences in the understanding of the
arguments of a verb. In the following example, the English sentence has the
nominative case on the pronominal and the accusative case on the other NP
(sweets), whereas in Marathi the NP (pasand) takes the nominative case and the
pronominal appears to carry the dative case.
6 Conclusion
This work explains the various standardized approaches in the field of machine
translation worldwide, especially in the context of Indian languages. We use the
word-tokenize method to split the sentences into words; the output of
tokenization supports understanding of the text in machine translation and can
also serve as input for further text-cleaning steps, such as punctuation removal
and numeric-character removal, before POS tagging. In the result, some words
have different meanings according to the structure of the sentences. We use the
parsing technique for the morphological analysis of words in the English and
Marathi sentences, to obtain the morphology of the English and Marathi words.
The morphological information of English is used in morphological synthesis of
the equivalent Marathi words. With the help of parsing, we identify the
structure of sentences and the divergences.
References
9. J.P. Jayan, E. Sherly, A Study on Divergence in Malayalam and Tamil Language in Machine
Translation Perceptive
10. V. Mishra, R.B. Mishra, Divergence Patterns Between English and Sanskrit Machine Transla-
tion (2014)
11. N.S. Dash, Linguistic divergences in English to Bengali translation. Int. J. Engl. Linguist. 3(1)
(2013). E-ISSN: 1923-8703
12. S. Sreelekha, R. Dabre, P. Bhattacharyya, Comparison of SMT and RBMT. The Requirement
of Hybridization for Marathi–Hindi MT (2019)
13. P. Behera, N. Maurya, V. Pandey, Dealing with linguistic divergences in English-Bhojpuri
machine translation, in Proceedings of the 6th Workshop on South and Southeast Asian Natural
Language Processing (2016)
14. R. Nidhi, T. Singh, Divergence identification and handling for English-Maithili machine.
Pramana Res. J. 9(2) (2019). ISSN: 2249-2976
15. R. Mahesh, K. Sinha, A. Thakur, Divergence Patterns in Machine Translation between Hindi
and English
16. V. Singh, B. Saini, An effective tokenization’s algorithm for information retrieval system, in
First International Conference on Data Mining (2014), pp. 109–119. https://doi.org/10.5121/
csit.2014.4910
Futuristic View of Internet of Things
and Applications with Prospective Scope
Abstract The IoT connects the physical and digital environments. Nowadays, one
of the purposes of the Internet is this movement. The IoT is a paradigm in which
common objects can be incorporated into systems for collecting, acquiring, and
managing data, allowing them to connect through the Internet to achieve any
purpose. The IoT will generally join everything in our world under a broad
framework. Each item will have a unique identifier and will be discoverable by
itself as well as through its interface and the Internet. RFID methods will form
the basis of the IoT. IoT devices will be pervasive, will be aware of the
surrounding environment, and will enable the development of knowledge in the
right way. The paper covers the current state of research, and the importance of
IoT is reflected in the mobility structure. The current research examines IoT
initiatives through a systematic review of academic papers and competent expert
discussions.
1 Introduction
The term IoT was coined by industry analysts but has become increasingly common over time. Some believe the Internet of Things will completely change how computer networks are used over the next 10 or 100 years, while others think the IoT is merely hype that will not noticeably affect the daily lives of most people.
Present Address:
M. Prasad · B. W. Agajyelew
School of C & I, CoET, Dilla University, Dilla, Ethiopia
T. K. Vijay (B)
Information Technology Department, CoET, Samara University, Semera, Ethiopia
M. Sreenivasu
Department of CSE, GIET College of Engineering, Rajamahendravaram, A.P., India
e-mail: msreenivasucse@giet.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 381
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_35
The IoT has become a steadily growing subject of discussion both inside and outside the workplace. The concept will affect not only the way we live but also how we work. The IoT creates the opportunity to investigate, collect, and exploit a growing range of data about the world. IoT devices are expected to be embedded in all kinds of powered household equipment, such as switches, bulbs, power plugs, and TVs, and to be able to communicate with the supply grid so that energy generation and consumption can be better balanced. The term "Internet of Things" is credited to Kevin Ashton, who envisioned the sensory system of the physical world extended onto the Internet. Things, the Internet, and connectivity are the three pillars of the IoT, which aims to close the gap between the physical world and the digital world through self-configuration and self-improvement [1, 4].
Anyone who asserts that the Internet has broadly improved society may be right, but a substantial further change still lies ahead of us. Several waves of networking are already behind us. The first was the Internet of computers, in which our servers and PCs were connected into a wider organizational infrastructure; the second was the Internet of mobile devices, when telephones and other portable equipment came online. The next phase of this progression is the IoT, which will be integrated and accessible everywhere in physical space [2, 3]. This change will be a significant extension of the Internet and will have an impact on every industry, as well as on our day-to-day lives.
1.1 History
the environment by exchanging data and information gathered from it [1–4]. Analysts have estimated that the IoT would reach almost 50 billion connected devices by 2020, a remarkable milestone, with the number continuing to grow year on year.
The IoT is disruptive both in engineering and in the literature, and its advancement relies on rapid progress in a range of important fields, from remote sensors to nanotechnology. One of the first Internet-connected appliances was a Coke machine at Carnegie Mellon University in the 1980s.
The basic idea of the IoT was introduced in a notable ITU report in 2005: real-world objects are represented on the Web, or, more precisely, physical objects are connected through the Internet [7]. The ITU articulated the concept of the IoT and organized the material into four key categories of tagging, sensing, thinking, and shrinking things. Moreover, Wikipedia also outlines the advantages of the IoT and suggests six considerations for intelligent engineering [6–8], including complex systems, size, time, and space. Consequently, model-driven and compositional techniques will be needed, alongside new methods prepared to handle the arbitrary assembly and unpredictable evolution of processes. In the IoT, the meaning of an event will not be based on a deterministic or syntactic model; it may instead depend on the context of the actual event.
A defining characteristic of the IoT is emergent structure. Whether in open or closed loops, over time the IoT must be regarded and engineered as a complex system, owing to the huge number of different links and interactions between autonomous actors, its capacity to integrate new potential actors, and its treatment of time. In an IoT composed of billions of parallel and concurrent events, time can no longer be used as a single, common, and precise measure; it instead depends on each entity (object, process, information system, and so on).
2 Uses of IoT
2.1 Health
The IoT is intended to improve individual quality of life by embedding technology in essential human activities. In that sense, the effort and burden of monitoring can be shifted from the human side to the machine side [16]. A main use of the IoT in medical services is in assisted-living environments. Sensors can be set up to monitor the equipment used by patients, and the information they collect is made accessible to professionals, relatives, and other stakeholders to improve treatment and speed up responses.
Utilization of IoT
(1) Smart water use
Smart cities must monitor water availability to guarantee adequate supply for the needs of residents and organizations. Remote sensor networks offer cities new ways to fully assess their water distribution systems and recognize worst-case scenarios [11–16]. Urban areas that track the flow of water through sensing technology obtain better returns on their infrastructure investments. Tokyo, for instance, expects to save $170 million yearly by detecting the effects of flooding early (LIBELIUM, 2013). Such a system can report the details of flow through pipelines and trigger precautionary steps if water use exceeds normal expectations. This allows a smart city to determine the location of failing pipelines and to prioritize remedial measures based on the scale of the potential water loss [20–23].
(2) Smart homes and work environments
Many electrical appliances surround us, for instance microwaves, refrigerators, heaters, air-conditioning systems, fans, and lights. Actuators and sensors can be introduced into these devices to use energy more efficiently and add extra comfort to everyday life. Such sensors can measure the outdoor temperature, determine which occupants are inside, and appropriately control the amount of heating, cooling, and light distribution. Doing this can help us decrease costs and increase energy savings [10].
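The occupancy-and-temperature logic described above can be sketched as a simple control loop. The readings, thresholds, and device names below are illustrative assumptions, not values from any specific product:

```python
# Illustrative sketch of occupancy-aware climate control (hypothetical thresholds).

def control_step(outdoor_temp_c: float, occupants: int,
                 comfort_low: float = 19.0, comfort_high: float = 25.0) -> dict:
    """Decide heater/cooler/light states from two sensor readings."""
    if occupants == 0:
        # Nobody present: switch everything off to save energy.
        return {"heater": False, "cooler": False, "lights": False}
    return {
        "heater": outdoor_temp_c < comfort_low,   # heat when it is cold outside
        "cooler": outdoor_temp_c > comfort_high,  # cool when it is hot outside
        "lights": True,                           # occupants are present
    }

if __name__ == "__main__":
    print(control_step(outdoor_temp_c=12.0, occupants=2))
    print(control_step(outdoor_temp_c=30.0, occupants=0))
```

In a real deployment the returned dictionary would drive the actuators mentioned in the text; here it simply makes the decision rule explicit.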
(3) Improved recreation facilities
Incorporating new capabilities, for example personal exercise profiles that can be loaded into a machine, can improve the experience of a fitness center: each user can be recognized by a unique identity, and the relevant profile is then applied automatically [23–25].
(4) Food storage
The food we eat needs to go through various stages before it arrives in the refrigerator; it passes through an intricate food chain of production, harvesting, shipping, and delivery. With suitable sensors, we can keep food from being spoiled by its surroundings by monitoring temperature, humidity, light, heat, and so on [10]. The sensors can measure these quantities directly and inform the person concerned, providing the awareness needed to avoid waste [24].
3 Challenges of IoT
used on or in the body to report physical condition [20]. Moreover, it is inevitable that when data are acquired from the real environment, they can readily be misused by an intruder [25].
Identity management of objects is viewed as one of the main difficulties in building the Internet of Things (IoT). Various identity-management approaches are available today for generating and validating identifiers with better information protection. Nevertheless, it has never been clarified which identity-based techniques are deliberately appropriate or how to apply them in an IoT environment. Moreover, most existing identity frameworks were designed for short-term, local use with narrow purposes. Consequently, a global trust framework for recognizing identities is essential [21].
The vast numbers of sensor-driven devices and actuators need clear techniques and rules for authentication before sensors are allowed to transmit their data. So far, the simplistic schemes in this field have paid little attention to context [23]. At present, providing sensor-level security requires computationally heavy mechanisms, which conflicts with the IoT goal of lightweight protocols [14].
The IoT will produce enormous amounts of data. Combining these data to provide greater insight requires powerful, wide-ranging privacy protection, since aggregated information can reveal an unexpectedly detailed user profile. Components that share user data can therefore endanger client privacy and create significant challenges in this regard [9].
As the technology matures, the number of users and devices, with a wide scope of communication patterns, keeps growing. The IoT must support interaction among an essentially unlimited number of things within its organizational structure. Consequently, the IoT needs mechanisms that control energy consumption while guaranteeing the safety of this enormous number of items [15].
Handling IoT scalability must be approached so that secure settings can still be built. A sound security architecture can be designed around well-defined safety requirements. For instance, access can be organized so that only specific groups of people can reach particular information, and degradation under attack can be contained [26]. It is therefore essential to equip the security architecture with appropriate mechanisms: depending on the circumstances, symmetric or asymmetric cryptographic credentials can provide a secure structure. Building this at scale is challenging, particularly given the enormous number of IoT devices involved [18–20].
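As a concrete illustration of the lightweight symmetric option mentioned above, a constrained device can authenticate its readings with an HMAC rather than a full asymmetric signature. The pre-shared key, message format, and function names below are illustrative assumptions, not part of any cited scheme:

```python
import hashlib
import hmac

# Hypothetical pre-shared key provisioned on both the sensor and the gateway.
SHARED_KEY = b"example-device-key"

def tag_reading(reading: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Sensor side: compute a short authentication tag for a reading."""
    return hmac.new(key, reading, hashlib.sha256).digest()

def verify_reading(reading: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Gateway side: constant-time check that the reading was not forged."""
    return hmac.compare_digest(tag_reading(reading, key), tag)

if __name__ == "__main__":
    msg = b"temp=21.5"
    t = tag_reading(msg)
    print(verify_reading(msg, t))           # authentic reading -> True
    print(verify_reading(b"temp=99.9", t))  # tampered reading -> False
```

HMAC needs only a hash function, which is why it is often the symmetric choice on devices too constrained for asymmetric certificates.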
The IoT will also create a data-sensitive market by providing relevant information from a variety of sources, helping to meet the needs of most consumers. Accordingly, providing strategies for verifying personal information will become a legal issue wherever it is combined with associated data. This goal should be met with lightweight protection measures, which are still considered experimental [20].
The impact of Internet development on the IoT is undeniable. How the Internet is used, and the frameworks for doing things online, are the two main components that affect the IoT. In any case, data security and protection play a key role in building it, and achieving security while keeping the Internet usable for everyday work will remain a difficulty in this field [12].
There must be a certain level of trust a person can have in the various parts of the IoT. Relying on machines alongside people who can actually be held accountable is highly regarded by analysts. Reliability can be viewed as a level of confidence placed in a specific service or object. Moreover, trust is not restricted to humans; it can be placed in structures or machines, for example Web pages, reflecting a level of trust within the computing community. From another standpoint, trust can be seen as a means by which we can be sure that the framework carries out its work properly and provides accurate results.
Yet another point of view treats trust as a way to deal with information. Cryptographic and protocol mechanisms are usually excellent choices for data verification, but for now we cannot apply these techniques to very small devices. As a result, we need other strategies for handling information with constrained tools; alternatively, if such guarantees are still required, we would have to change many current instruments [16].
Life expectancy for individual IoT items.
It is obvious that many objects in the IoT will have a short life and are not meant to last for many years. For example, services based on the User Datagram Protocol (UDP) can respond with more data than was requested [16], and because UDP is connectionless, the source address of a request can be spoofed. Similar experience with the Global System for Mobile Communications (GSM), wired equivalent privacy (WEP), and various other wireless protocols has shown that assuming long-lived security for such designs is wrong [17].
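The UDP behavior alluded to above can be demonstrated locally: the toy UDP service below answers each request with a payload many times larger than the query, which is exactly the property that, combined with spoofable source addresses, enables amplification abuse. The ports, payloads, and the 50x factor are illustrative assumptions:

```python
import socket

# Toy "amplifying" UDP service and client, both on localhost.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))           # let the OS pick a free port
server_addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", server_addr)     # no handshake: UDP is connectionless

query, addr = server.recvfrom(1024)     # server sees only a claimed source addr
server.sendto(query * 50, addr)         # reply is 50x the size of the query

reply, _ = client.recvfrom(65536)
print(len(query), len(reply))           # 4 bytes in, 200 bytes out
client.close()
server.close()
```

If `addr` had been forged, those 200 bytes would land on the victim, which is why UDP-based services need request validation or rate limiting.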
A clearer picture of our IoT future requires organizations to look at the key elements of the Internet of Things, i.e., equipment, systems, and services, so as to give developers the ability to deploy applications that can connect anything at the IoT level. In this paper, we introduced the IoT and summarized contextual investigations into it. With the various advances of the new Internet, the world is moving toward connectivity for anyone, anytime, anywhere. In the current scenario, "things" are essentially those gadgets that are addressable and programmable within the IoT. Some of those items will be available directly on the Internet, while others will be integrated into nearby networks behind gateways and addressing tools. New applications and services are constantly being created, and Internet content keeps evolving. Many researchers have proposed directions for the development of the IoT, yet much difficulty remains, and future work must target these problems. In particular, discovering and assembling the Web services required by a client is an important open issue in the IoT environment.
5 Conclusions
The IoT is bringing a sea of changes to our daily life, working to make our lives easier and more accessible through progress and diverse applications. There is an endless supply of IoT applications in all areas, including medicine, manufacturing, fabrication, transportation, training, management, mining, and the environment. Although the IoT has many advantages, there are still shortcomings in IoT management and in its rate of adoption. The main points of this paper are that (1) there is no universal definition of the IoT; however, (2) clear requirements exist in its application areas and in the development of the field. These areas will develop and influence human existence in unexpected ways over the next decade.
References
6. R. Aggarwal, M. Lal Das, RFID security in the context of “Internet of Things”, in First
International Conference on Security of Internet of Things, Kerala, 17–19 Aug 2012, pp. 51–56
7. M.-W. Ryu, J. Kim, S.-S. Lee, M.-H. Song, Survey on internet of things: toward case study.
Smart Comput. Rev. 2(3), 195–202 (2012)
8. E. Biddlecombe, UN Predicts “Internet of Things” (2009). Retrieved 6 July
9. D. Butler, 2020 computing: everything, everywhere. Nature 440, 402–405 (2006)
10. R. Parashar, A. Khan, Neha, A survey: the internet of things. Int. J. Tech. Res. Appl. 4(3),
251–257 (2016). e-ISSN: 2320-8163. www.ijtra.com
11. H. Yinghui, L. Guanyu, Descriptive models for Internet of Things, in IEEE International Conference on Intelligent Control and Information Processing, Dalian, China, 2010, pp. 483–486
12. Y. Bo, H. Guangwen, Application of RFID and Internet of Things in monitoring and anticoun-
terfeiting for products, in International Seminar on Business and Information, Wuhan, Hubei,
China, 2008, pp. 392–395
13. A. Grieco, E. Occhipinti, D. Colombini, Work postures and musculo-skeletal disorder in VDT
operators. Bollettino deOculistica Suppl. 7, 99–111 (1989)
14. K. Pahlavan, P. Krishnamurthy, A. Hatami, M. Ylianttila, J.P. Makela, R. Pichna, J. Vallstron,
Handoff in hybrid mobile data networks. Mob. Wirel. Commun. Summit 7, 43–47 (2007)
15. X.-Y. Chen, Z.-G. Jin, Research on key technology and applications for the Internet of Things.
Phys. Procedia 33, 561–566 (2012)
16. M. Chorost, The networked pill. MIT Technology Review (2008)
17. E. Zouganeli, I. Einar Svinnset, Connected objects and the internet of things—a paradigm shift, in International Conference on Photonics in Switching, Pisa, Italy, 2009, pp. 1–4
18. Z. Tongzhu, W. Xueping, C. Jiangwei, L. Xianghai, C. Pengfei, Automotive recycling infor-
mation management based on the internet of things and RFID technology, in IEEE Inter-
national Conference on Advanced Management Science (ICAMS), Changchun, China, 2010,
pp. 620–622
19. G. Gustavo, O. Mario, K. Carlos, Early infrastructure of an Internet of Things in Spaces (2008)
20. B. Gubbi, P. Marusic, Internet of Things (IoT): a vision, architectural elements, and future
directions. Future Gener. Comput. Syst. 29, 1645–1660 (2013)
21. M. Wu, T.-J. Lu, F.-Y. Ling, J. Sun, H.-Y. Du, Research on the architecture of Internet of
Things, in 2010 3rd International Conference on Advanced Computer Theory and Engineering
(ICACTE), Chengdu, 2010, pp. V5-484–V5-487. https://doi.org/10.1109/ICACTE.2010.557
9493
22. M. Weyrich, C. Ebert, Reference architectures for the internet of things. IEEE Softw. 33(1),
112–116 (2016)
23. J. Gubbi, R. Buyya, S. Marusic, M. Palaniswami, Internet of Things (IoT): a vision, architectural
elements, and future directions. Future Gener. Comput. Syst. 29(7), 1645–1660 (2013)
24. F. Bonomi, R. Milito, P. Natarajan, J. Zhu, Fog computing: a platform for internet of things and
analytics, in Big Data and Internet of Things: A Road Map for Smart Environments, pp. 169–186
(Springer, Berlin, Germany, 2014)
25. J. Sung, T. Sanchez Lopez, D. Kim, The EPC sensor network for RFID and WSN integration infrastructure, in Proceedings of IEEE PerComW'07, White Plains, NY, USA, March 2007
26. G. Broll, E. Rukzio, M. Paolucci, M. Wagner, A. Schmidt, H. Hussmann, PERCI: pervasive
service interaction with the internet of things. IEEE Internet Comput. 13(6), 74–81 (2009)
Identifying and Eliminating
the Misbehavior Nodes in the Wireless
Sensor Network
Abstract In recent years, advanced research in wireless sensor networks (WSNs) has made them a trending and emerging technology. Sensors can be used to monitor physical and environmental conditions, and they are also used in the manufacturing industry. Battery life and security are the two most significant problems and challenges in wireless sensor networks. Many algorithms have been developed to address these issues in various situations, but neither is fully resolved, due to factors such as unfiltered duplicate data, which wastes battery power and bandwidth. Some nodes in the network become selfish and fail to forward packets to neighboring nodes. These nodes cause network misbehavior, rendering the network partially inactive. Our proposed method removes misbehaving nodes from the network and checks for message duplication before data are sent, and the proposed algorithm satisfies both requirements.
1 Introduction
WSNs are self-organizing networks of spatially distributed nodes that periodically monitor physical or environmental conditions, such as temperature, sound, vibration, pressure, and motion, and cooperatively send their information through the network to a central location or sink, where the information can be viewed and
N. Selvaraj (B)
Research Scholar, Department of CINTEL, SRM Institute of Science and Technology,
Kattankulathur, Chennai, India
e-mail: ns2066@srmist.edu.in
E. S. Madhan
Assistant Professor, Department of CINTEL, SRM Institute of Science and Technology,
Kattankulathur, Chennai, India
e-mail: madhane2@srmist.edu.in
A. Kathirvel
Professor, Department of Computer Science and Engineering, NIOT research group, Karunya
Institute of Technology and Sciences, Coimbatore, TN, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 393
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_36
analyzed. The sink or base station serves as an interface: by issuing queries and collecting results from the sink, one can retrieve the required data from the network. A wireless sensor network typically contains a large number of sensor nodes, which communicate with one another using radio signals. A deployed sensor node is equipped with sensing and computation components and a power source.
The individual nodes in a wireless sensor network (WSN) are generally resource constrained: they have limited processing speed, storage capacity, and communication bandwidth. After deployment, the sensor nodes must self-organize into a suitable network structure, often communicating over multi-hop paths. Once installed, the sensors collect data and answer requests sent from a "control site", either to carry out specific instructions or to provide sensing samples. Sensor networks can work in either a continuous or an event-driven mode. The Global Positioning System (GPS) and local positioning algorithms can be used to obtain location and positioning information. Actuators can be added to sensor devices in hard-to-reach locations to allow them to "act" when certain conditions are met; such networks are then more precisely referred to as wireless sensor and actuator networks.
Wireless sensor networks (WSNs) enable new applications and require non-standard paradigms to meet several goals. Because of the need for low device complexity together with low energy consumption, a suitable trade-off between communication and signal/data processing capabilities must be found (e.g., for long network lifetime). This has demanded a massive research effort over the last decade, producing numerous proposals in the field. Currently, a large portion of WSN research has focused on the design of energy- and computationally efficient algorithms and protocols, with the application area limited to simple data-oriented monitoring and reporting. One proposal is a cable mode transition (CMT) algorithm, which selects a minimal number of active sensors to maintain K-coverage of a terrain as well as K-connectivity of the network. It assigns sleep intervals to the remaining sensors without affecting the network's coverage and connectivity requirements. Timely data collection is also important in delay-constrained networks.
Proposed network structures aim to reduce data-collection delays in hard-to-reach sensor networks, extending the network's lifetime. Researchers have considered relay nodes to reduce the network's geometric constraints and have used particle swarm optimization (PSO)-based computations to find the best sink location. For energy-efficient communication, a numerical solution has also been proposed for locating the ideal sink position so as to extend the network lifetime. Traditionally, studies of wireless sensor networks have focused on homogeneous sensor nodes. Now, however, researchers are concentrating their efforts on heterogeneous sensor networks, in which the nodes differ from one another, for example in energy consumption. New network structures with heterogeneous devices, and new advances in this direction, are removing current roadblocks and expanding the range of possible applications for WSNs, all of which are rapidly evolving.
Wireless sensor networks (WSNs) have recently attracted great interest, and it is no exaggeration to consider WSNs one of the most actively researched areas. Every day, a few new uses and business opportunities emerge. The WSN market is forecast to rise from $0.45 billion in 2012 to $2 billion in 2022. Figure 1 shows the observed rise in revenue from the WSN market for the period 2010–2014.
A WSN is a network of small devices, known as sensor nodes, that are spatially distributed and collaborate to deliver data aggregated from the sensed field over wireless links. The data collected by the various nodes are sent to a sink, which either uses the data locally or forwards them to other networks, such as the Internet. WSN technology has several advantages over traditional networking approaches, including lower cost, scalability, reliability, accuracy, flexibility, and ease of deployment, all of which favor its use in a variety of applications. As technology advances and sensors become smarter, smaller, and cheaper, billions of wireless sensors are being deployed in various applications.
Military, climate, clinical, and security uses are just a few of the potential application areas. Sensor nodes can be used in the military to detect, locate, and track enemy movements. When destructive events occur, sensor nodes can continuously sample the environment to identify problems early. From a clinical standpoint, sensor nodes can assist in monitoring a patient's health. In security, sensors can provide careful surveillance and expanded awareness of potential intruder attacks.
The monitored environment plays a significant role in determining the size, topology, and deployment strategy of the network. For example, if the monitored environment is a vast region that is difficult for people to reach, an ad hoc scattering of nodes is preferred over a planned deployment. Furthermore, outdoor conditions may require many nodes to cover a vast area, whereas indoor conditions require fewer nodes to form a network in a constrained space [1, 2].
A WSN also operates under several resource constraints: each node has a limited amount of energy, a short communication range, low transmission bandwidth, and limited processing power and storage. The goal of WSN research is to address these design and resource constraints by introducing new design concepts, improving existing protocols, and developing new algorithms. WSN is a promising technology with enormous potential to change our world if we can resolve the remaining research issues. In the WSN literature, there are several surveys on various research areas, for example routing, MAC protocols, congestion control, information fusion, power conservation, security, and applications. We should add that application-driven development of the technology tends to produce a silo approach.
A wireless sensor network (WSN) is made up of sensor nodes, or motes: devices with a processor, a radio interface, an analog-to-digital converter, sensors, memory, and a power supply. The processor processes the data and coordinates the components of the mote. Temperature, humidity, light, and other quantities can be detected by the sensors attached to the mote. Because of bandwidth and power constraints, motes mainly support low-bandwidth units with limited computational power and a limited sensing rate. Programs (the instructions the processor executes) and data (raw and processed sensor readings) are stored in memory. Motes are equipped with a low-rate (10–100 kbps), short-range (under 100 m) radio to communicate with one another. Because radio communication consumes the vast majority of the power, the radio should incorporate energy-saving communication techniques.
The common power source is batteries. Because motes may be deployed in remote and hostile environments, they should be low power and operate unattended to extend the network lifetime. Motes could, for example, be equipped with effective energy-harvesting mechanisms, such as solar cells, allowing them to be left unattended for long periods of time. On the other hand, planned deployment is helpful for limited coverage, where fewer nodes are placed at specific locations, with the benefit of lower network maintenance and management cost.
The remaining sections are organized as follows: Section 2 describes detailed literature from previous papers. Our proposed solution is illustrated in Sect. 3. Simulation results, obtained with the QualNet 5.02 simulator, are discussed in Sect. 4. Finally, we conclude our work in Sect. 5.
2 Related Works
Algorithms for detecting selfish nodes in a MANET have been developed in the
literature. To encourage packet forwarding without discrepancies, a fuzzy reputation
system is used to discipline nodes that behave selfishly [2]. Deering [3] proposes
a reputation-based algorithm in which each node is expected to keep track of all
other nodes and obtain reputation scores from a centralized node. In Fig. 1, Ballardie and Crowcroft [4] propose a scheme in which each node earns credits by forwarding packets of other nodes, allowing it to transmit its own packets. In addition, Zhou et al. [5] propose activity-based overhearing, which uses iterative and
unambiguous probing to detect the presence of selfish nodes in MANETs. To distin-
guish between trusted and selfish behavior in nodes, Jeong et al. [6] use a fuzzy-
based analyzer. To combat selfishness, the method incorporates trust and certificate
authority. In [7], the authors propose a collaborative watchdog for detecting selfish
nodes. Miner and Staddon [8] propose a two-tier acknowledgment scheme to iden-
tify misbehaving nodes and then inform the routing protocol to take routes without
misbehaving nodes in the future. Game theory is applied as a tool to encourage
cooperation in [9], and reputation is used to study node behavior.
Ad hoc On-demand Distance Vector (AODV) [10, 11] and DSR [12] are traditional MANET routing protocols that presume that all nodes throughout the system cooperate and agree to forward packets, implying that every device is truthful in its packet-forwarding actions. According to published findings [13, 14], nodes in a MANET tend to become selfish over time. Selfish devices are reluctant to devote resources such as battery charge, CPU time, and storage to the benefit of other nodes. Selfishness is frequently linked to the onset of resource scarcity, such as low battery power, and to the desire of nodes to save resources, such as electric power and storage, for their own consumption. As a consequence, a node in a MANET has a powerful incentive to be selfish. Marti et al. [15, 16] classified selfish node behavior into the following categories:
• Selfish nodes reduce the TTL value and slip packets or tamper with routing
protocol and reverse path packets.
• If a device does not pay attention to hello messages, the following conditions
apply: A selfish node could refuse to acknowledge hello messages, making it
impossible for nearby nodes to identify it and reverse packets.
There is no immune reaction to foreign bacteria in the gut or to the food we eat,
although both are foreign entities. The principle behind danger theory [17, 18]
stems from the fact that not everything foreign is harmful to the cell structure.
Instead of attacking everything foreign, it is better to measure the degree of
damage incurred by the cells, based on the distress signal the cells send in
response to the foreign entity. Danger theory advocates the principle of
non-isolation of a foreign entity until it is proved dangerous.
3 Proposed Method
Selfish nodes cause a faulty network because there is no guarantee that they will
not delay, drop, or tamper with packets, or deliver them out of order. To provide
that reliability, the protocols that put forward truthful communication over those
networks use a
398 S. Navaneethan et al.
G = [0 2 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 4 0 9 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 1 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 1 0 0 0 6 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 2 2 3 4 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 2 7 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 5 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 5 7 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 5 6 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 5 6 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 3 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 8 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]
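The shortest path over a weighted adjacency matrix such as G can be computed with Dijkstra's algorithm. The paper does not name its shortest-path routine, so the following is only an illustrative sketch, run on a small hypothetical matrix `G_demo` rather than the full 20-node G (a zero entry means no edge):

```python
import heapq

def dijkstra(G, src, dst):
    """Shortest path on a weighted adjacency matrix.

    G[u][v] > 0 is the edge weight from u to v; 0 means no edge.
    Returns (total_cost, path) or (float('inf'), []) if unreachable.
    """
    n = len(G)
    dist = [float('inf')] * n
    prev = [None] * n
    dist[src] = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue          # stale queue entry
        if u == dst:
            break
        for v in range(n):
            w = G[u][v]
            if w > 0 and d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (d + w, v))
    if dist[dst] == float('inf'):
        return float('inf'), []
    path, node = [], dst
    while node is not None:   # walk predecessors back to the source
        path.append(node)
        node = prev[node]
    return dist[dst], path[::-1]

# Small illustrative matrix (not the paper's G).
G_demo = [
    [0, 2, 0, 8],
    [0, 0, 4, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
cost, path = dijkstra(G_demo, 0, 3)
print(cost, path)  # 7 [0, 1, 2, 3] -- cheaper than the direct edge of 8
```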
After finding the shortest path, we apply the equation to every node on the path
between the source node and the destination node to detect the presence of selfish
nodes in the network. This process continues for every related node. From this
observation, we can identify a node as selfish or non-selfish by comparing its
behavior against a threshold value derived from the node's average retransmission
number and maximum average retransmission number, as given in the equation.
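The referenced equation is not reproduced in this text, so the sketch below only illustrates the described idea; the threshold formula (the mean of the average and maximum-average retransmission counts) is a hypothetical stand-in, not the paper's equation:

```python
def is_selfish(avg_retx, max_avg_retx, observed_retx):
    """Flag a node as selfish when its observed retransmission count
    exceeds a threshold built from its average and maximum-average
    retransmission numbers (the exact formula is a stand-in here)."""
    threshold = (avg_retx + max_avg_retx) / 2.0
    return observed_retx > threshold

# A node retransmitting far beyond its historical average is flagged.
print(is_selfish(avg_retx=3.0, max_avg_retx=5.0, observed_retx=6.0))  # True
print(is_selfish(avg_retx=3.0, max_avg_retx=5.0, observed_retx=2.0))  # False
```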
Each node in the network generates a token. The token consists of two fields; the
first is a flag bit, i.e., a status bit:
• The status bit has two states, a green flag and a red flag. The green flag
indicates that the path is valid.
• The red flag indicates that the path to the destination is not valid.
• After a valid path is identified, a second token is generated, only by the
source node; it contains the address and data fields.
• The source node monitors the intermediate address of each node that the packet
travels through until it reaches the destination.
• Once the packet reaches the destination, the token is released by the source.
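The two-token scheme above can be sketched with simple data structures; the field and function names here are illustrative, not taken from the paper:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StatusToken:
    # Status bit: True = green flag (valid path), False = red flag.
    green: bool

@dataclass
class DataToken:
    # Generated only by the source after a valid path is found.
    address: int                                       # destination address
    data: bytes                                        # payload
    visited: List[int] = field(default_factory=list)   # intermediate hops seen

def forward(token: DataToken, hop: int) -> None:
    """Source records each intermediate address the packet traverses."""
    token.visited.append(hop)

t = DataToken(address=9, data=b"hello")
for hop in (2, 5, 7):
    forward(t, hop)
print(t.visited)  # [2, 5, 7]
```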
5 Conclusion
The algorithm that we devised improves the network's detection rate. Because of
its non-cooperative attitude toward other nodes, selfish behavior causes network
failure and degrades overall network performance in wireless sensor networks. For
proper management of network data communication, timely detection of self-centered
nodes is a critical issue, as a selfish node's impact significantly disrupts the
network. To solve the problem of selfish nodes and their behavior in WSNs, we must
transform them into cooperative nodes that forward data packets to other nodes.
References
1. A. Kathirvel, R. Srinivasan, ETUS: enhanced triple umpiring system for security and robustness
of wireless mobile ad hoc networks. Int. J. Commun. Netw. Distrib. Syst. 7(1/2), 153–187 (2017)
2. A. Kathirvel, R. Srinivasan, ETUS: an enhanced triple umpiring system for security and
performance improvement of mobile ad hoc networks. Int. J. Netw. Manag. 21(5), 341–359
(2018)
3. S.E. Deering, Multicast routing in internetworks and extended LANs, in Proceedings of
the ACM SIGCOMM Symposium on Communication Architecture and Protocols, Aug 1988,
pp. 55–64
4. T. Ballardie, J. Crowcroft, Multicast-specific security threats and counter-measures, in Proceed-
ings of the Second Annual Network and Distributed System Security Symposium (NDSS ’95),
Feb 1995, pp. 2–16
5. Y. Zhou, X. Zhu, Y. Fang, MABS: multicast authentication based on batch signature. IEEE
Trans. Mob. Comput. 9(7), 982–993 (2010)
6. J. Jeong, Y. Park, Y. Cho, Efficient DoS resistant multicast authentication schemes, in Proceed-
ings of the International Conference on Computational Science and Its Applications, 2010,
pp. 353–362
7. A. Perrig, R. Canetti, J.D. Tygar, D. Song, Efficient authentication and signing of multicast
streams over lossy channels, in Proceedings of the IEEE Symposium on Security and Privacy
(SP ‘00), May 2000, pp. 56–75
8. S. Miner, J. Staddon, Graph-based authentication of digital streams, in Proceedings of the IEEE
Symposium on Security and Privacy (SP ‘01), May 2001, pp. 232–246
9. N. Koblitz, Elliptic curve cryptosystems. Math. Comput. 48, 203–209 (1987)
10. M. Kefayati, H.R. Rabiee, S.G. Miremadi, A. Khonsari, Misbehavior resilient multi-path data
transmission in mobile ad-hoc networks, in Proceedings of the fourth ACM Workshop Security
of Ad Hoc and Sensor Networks (SASN ’06), 2006
Abstract The main purpose of this research is to apply machine learning to plant
species identification in agricultural science. This discipline has so far
received less attention than other image-processing application domains. Various
plant species may bear extensive resemblance to one another, and distinguishing
them is time consuming. Plant species identification involves pre-processing,
segmentation, feature extraction, and classification. This paper proposes plant
species identification by image classification using AI and machine learning
techniques, which take account of a large amount of information in the form of
binary leaf pictures and features such as dimensions, thickness, and color to
identify the species of plants using various image classification
techniques/classifiers. The commonly used classifiers are linear, non-linear,
bagging, and boosting. The algorithms proposed are random forest, K-nearest
neighbor, support-vector machine, gradient boosting, and Naive Bayes for spot
checking. Linear discriminant analysis is performed, and graphs of accuracy and
loss versus classifier are plotted to improve the classification performance for a
particular species. Finally, the best model is selected for prediction after
standardization of the dataset.
1 Introduction
The world contains a very large number of plant species. In earlier days,
identification of plant species relied on experienced practitioners' peculiar
senses of touch and smell. Plant identification is not exclusively the job of
botanists and plant ecologists; it is required by, or useful for, large parts of
society, from professionals to the general public. However, identification of
plants by conventional means is difficult and time consuming.
Automatic plant image identification is the most promising solution toward bridging
the botanical taxonomic gap, which receives considerable attention in both botany and
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 405
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_37
406 E. Venkateswara Reddy et al.
2 Proposed Algorithm
The motive of the proposed work covers the acquisition of data, followed by
processing of the collected data, called pre-processing of the image, and a
further step of extracting different characteristics of the leaves for
identification of plant species [3]. Here, intelligent algorithms or classifiers
such as K-nearest neighbor, random forest, support-vector machine, and gradient
boosting come into play for classifying the plant species [4]. This proposal
briefly reviews the workflow of the applied machine learning techniques and
discusses the challenges of image-based plant identification. A data flow diagram
illustrates how the processes stream through the end-to-end framework; Fig. 1
shows the flow pattern for plant species recognition.
3 Methodology
Until recently, there was no particular standard leaf pattern for cataloging
plant species automatically. Extensive research led to the Flavia dataset, which
captures a huge collection of leaf images for plant species classification.
Each leaf photograph contains only one entity, the leaf [5]. Because leaves are
not perfectly flat, a photograph may contain a shadow beneath the leaf, which must
be removed before image segmentation [6, 7]. Initially, each color pixel of the
image was converted to a hue-saturation-value (HSV) representation. This step
serves as a path to reliable edge detection on the RGB leaf images, rather than
producing a final image from which to extract the characteristics. Processing then
involved converting the original images to grayscale, and the grayscale images
were in turn converted to binary images, as shown in Fig. 2. The binary images
remove any irregularity within the leaf outline and render the entire leaf as a
white patch. The segmentation process of the leaf is shown in Fig. 3.
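A minimal sketch of the grayscale-to-binary step described above, using NumPy with a simple mean threshold as a stand-in for the paper's unspecified thresholding method:

```python
import numpy as np

def binarize_leaf(rgb):
    """Convert an RGB leaf image (H, W, 3, uint8) to a binary mask.

    Grayscale via the standard luminance weights, then threshold at the
    image mean (a stand-in; Otsu or a fixed threshold could be used).
    Leaf pixels are assumed darker than the background, so the mask is
    True where gray falls below the threshold.
    """
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    return gray < gray.mean()

# Tiny synthetic image: one dark "leaf" pixel on a white background.
img = np.full((2, 2, 3), 255, dtype=np.uint8)
img[0, 0] = (30, 90, 30)   # dark green pixel
mask = binarize_leaf(img)
print(mask)  # True only at (0, 0)
```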
Every plant species retains distinctive attributes that make it unique. These
attributes or characteristics are categorized as dimensions [8], color, and
venation [9, 10]. Figure 4 shows the different kinds of features in a leaf image.
Leaf geometry gives rise to many dimensional variations. The ratio of a leaf's
length to its width is defined as the aspect ratio. Area is computed as the
product of the resolution of a pixel and the total count of pixels belonging to
the leaf. Rectangularity (R) measures how closely the leaf's shape approximates a
rectangle. The color features
include the mean and standard deviation of the pixel values. The leaf's veins are
likewise unique attributes of each plant species [11].
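The dimensional features described above can be computed from a binary leaf mask; this sketch assumes unit pixel resolution and defines rectangularity as leaf area over bounding-box area, one common convention that is not necessarily the paper's:

```python
import numpy as np

def leaf_features(mask):
    """Aspect ratio, area, and rectangularity of a binary leaf mask."""
    ys, xs = np.nonzero(mask)
    length = ys.max() - ys.min() + 1    # bounding-box height
    width = xs.max() - xs.min() + 1     # bounding-box width
    area = int(mask.sum())              # pixel count (unit pixel size)
    aspect_ratio = length / width
    rectangularity = area / (length * width)
    return aspect_ratio, area, rectangularity

# A solid 4x2 rectangle of leaf pixels fills its bounding box exactly,
# so its rectangularity is 1.0.
mask = np.zeros((6, 6), dtype=bool)
mask[1:5, 2:4] = True
ar, area, rect = leaf_features(mask)
print(ar, area, rect)  # 2.0 8 1.0
```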
3.4 Classification
Upon completion of leaf feature extraction, the evidence, called attribute
vectors, undergoes two further stages of scrutiny and appraisal before being
classified into its precise classes. The intelligent methods suggested as
classifiers are random forest, KNN, SVC, gradient boosting, and Naive Bayes for
spot checking. For image classification, support-vector machines and random forest
methods are proven [12]. Random forest is a classifier that builds a number of
decision trees on numerous subsets of the given dataset and takes the average to
improve the predictive accuracy on that dataset. The large number of trees in the
forest leads to greater accuracy and prevents the problem of overfitting, and it
takes very little time compared with other algorithms, even on an enormous dataset
[13]. K-nearest neighbor is one of the simplest machine learning algorithms; it is
based on a supervised learning technique and holds good for regression and
classification alike, although it is better suited to classification [14]. During
training, it merely stores the dataset, and as soon as it receives new data, it
assigns the data to the set it most resembles. Exceptionally large datasets can be
handled by the most widespread supervised learning algorithm, the support-vector
machine, which can also be used for regression problems [15]. Kernels are used to
map the high-dimensional data. The support-vector machine chooses the extreme
vectors (support vectors) that help create the hyperplane, which determines the
accuracy of the SVM [16, 17].
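Spot checking the named classifiers can be sketched with scikit-learn; the synthetic features below stand in for the extracted Flavia leaf-feature vectors, and no tuning is attempted:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the extracted leaf-feature vectors.
X, y = make_classification(n_samples=200, n_features=8, n_classes=3,
                           n_informative=5, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "SVC": SVC(),
    "GB": GradientBoostingClassifier(random_state=0),
    "NB": GaussianNB(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold accuracy
    print(f"{name}: {scores.mean():.3f}")
```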
Identification of plant species offers major benefits to a wide range of
stakeholders, from pharmaceutical laboratories and botanists to forestry services
and consumers. Ten different machine learning classification techniques were used
to assess the identification accuracy rate. To find the best classifier among all
the proposed techniques, a series of experiments was conducted on the dataset.
Table 1 compares test accuracy and log loss among the proposed models for species
identification, and Table 2 shows the precision, recall, and F1-score of the
proposed algorithms. The plots of classifier versus accuracy and classifier versus
log loss are shown in Figs. 5 and 6, respectively.
5 Conclusion
The proposed work for classifying plant species is supported by comparing various
algorithms. The random forest algorithm shows 98.9% precision for classification
and is the faster-learning model, its learning capability scaling in direct
proportion to the number of combined attributes.
References
Abstract The paper considers the inventory routing and storage problem and
suggests a satisfactory solution by finding dispatch quantities and vehicle route allo-
cations when the objective is to minimize transit cost, vehicle cost and storage cost.
To be specific, the problem can be categorized as a cyclic inventory routing problem
(CIRP) with homogeneous fleets. The approach mentioned here is a hybrid of vehicle
routing problem (VRP), graph-based clustering (GC) and mixed integer program-
ming model (MIP) to find a solution when the scale is large enough which makes it
difficult to solve it using exact methods. The VRP module is used to find the feasible
routes of the customers from the depot using metaheuristic approach. The GC module
is further used to decompose the route network into clusters using connected graph
networks, and eventually, a MIP model is used to select the routes, find the daily
dispatch of gases and thus also find the optimal storage required both at the depot
and the customers. The MIP formulation is designed in a way to reduce the solving
time complexity by converting the binary variables which are used in a traditional
formulation of the inventory routing problem to integer variables by decomposing the
constraint. The approach has been tested on a simulated business case that spans two
hundred customer locations, demand fulfilment for a week, and a homogeneous fleet
with a truck carrying capacity of four cylinders. The scaling studies have been done
on the GC module by analysing the time complexity and the optimization feasibility
with respect to the cluster size. The MIP approach is designed to solve the problem
to less than 1% MIP gap considering the number of customer locations.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 413
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_38
414 A. Kumar and A. Munagekar
decisions that need to be made at the time of setting up a distribution network are
the storage capacities at the depot, the storage capacities at the customer/retail
locations, and the last-mile fleet for fulfilment of the demand. The demand of the
customers can either be fixed (a constant rate for the entire horizon), follow a
fixed pattern over the given time horizon, or vary within the time horizon.
However, if the supplied product is flammable or otherwise needs special freight
requiring a homogeneous fleet, there is an additional capital cost, which is
incorporated in the formulation.
This paper focuses on the aspect of strategic decision making, i.e. to decide on the
vehicle fleet required, the storage required at the depot and the customer locations by
minimizing the travel cost, vehicle cost and storage cost and satisfying the demand
in the time horizon. The flammable liquid supply chain considered is a two-stage
distribution network consisting of a primary network which delivers the material
to intermediate storage units from the manufacturing unit. The secondary network
delivers the material from the depots to the customer locations. The focus of this
paper is on the secondary distribution network where the depot and the customers
mapped to that particular depot are considered. The CIRP was solved by [1] for the
case of a fixed demand rate over an infinite time horizon.
Reference [2] provides a variable neighbourhood descent heuristic for problems
with more than 20 customers, and a MIP model for smaller-scale problems, which
provides a good solution.
The objective of this paper is to solve the problem when the problem size is
large, i.e. more than 50 customer locations, each location has either a fixed or a
variable demand rate over a time frame greater than a week, and the route times
are more than a day.
The challenges are due to the nature of the problem: the complex formulation makes
it a difficult use case to solve with limited computational time and resources. In
particular, due to the size of the search space, the VRP, which uses a
metaheuristic method, does not guarantee optimality, since a metaheuristic is not
guaranteed to reach the global optimum [3]. In this paper, we try to solve the
problem while keeping in mind the scale and the limited computational capability,
to reach an agreeable solution that is useful to the business.
The VRP model is solved using [4]. Due to the size of the customer locations,
solving the VRP in a desired time is not realizable; hence, we used a GC-based method
to split the problem space into several clusters and solved the clusters independently.
We have developed a hybrid approach that solves the problem in three stages,
namely: vehicle routing problem (VRP), graph-based clustering (GC) and mixed
integer programming (MIP). The VRP module is used to find the feasible routes
of the customers from the depot using metaheuristic approach. The GC module is
further used to decompose the route network into clusters using connected graph
networks, and eventually, an MIP model is used to select the routes, find the daily
dispatch of gases and thus also find the optimal storage required both at the depot
and the customers.
Inventory, Storage and Routing Optimization … 415
2 Literature
Inventory routing problems are present in various domains and industries like gas
industries [5], crude oil refineries [6], cold storage food distribution, food and super-
market chains [7]. The cyclic inventory routing problem was first introduced by [8].
It was more like a strategic-level inventory routing problem whose objective was
to minimize the required fleet size over a very long period. Another example of
the application of the long-term IRPs is in the ship fleet sizing for a liner shipping
company with fixed long-term cargo contracts [9].
The basic inventory routing problem assumes that vehicles are available. The
objective considers a trade-off between inventory costs and transportation, without
taking fixed vehicle costs into account [10]. The integration of inventory and
distribution decisions is formulated and approached in different directions. Some
constraints considered are different inventory policies, time-window horizons, and
service restrictions. Reviews of the IRP literature over the past couple of
decades summarize these problems; some of these reviews are [11–15]. The IRP and
its variants are now well developed. The IRP may be further classified into
continuous-time models [2, 16, 17], most often with a constant demand rate over a
time period, and discrete-time models [18–20], with a fixed time period but
varying demand rates. The various graph-based clustering algorithms and machine
learning methods used for clustering are reviewed in [21, 22]; the methods
described use different ensemble approaches to optimize the unsupervised learning
methods. This paper combines the VRP, MIP, and a connected-graph approach to
formulate a modified IRP that sacrifices the global optimum but provides a very
good solution in a feasible amount of computation time in cases where quick
solutions are required.
3 Problem Description
4 Optimization Framework
Optimization Flow:
See Fig. 2.
The input parameters include the coordinate locations of the customers and depot,
demand of the customers, vehicle capacity, vehicle fixed cost, storage cost and transit
cost. The demand at the customers is provided in terms of the tonnes of liquid
fuel required, which is mapped to the number of cylinders required, as shown in
Fig. 2.
Distance matrix:
In the data pre-processing step, the data is prepared for the mathematical model.
The distance matrix is prepared using open source API for calculating the distance.
Finding the feasible routes using VRP (vehicle routing problem).
The input for the VRP is the daily demand at the customer locations and the
coordinates of the depot and the customer locations. Additionally, vehicle capacity
is also considered. The methodology used for the capacitated VRP is simulated
annealing using [23] as shown in Fig. 3. Any other metaheuristic, MIP formulation
or heuristic can be used as the problem is a standard problem with multiple approaches
available to solve the problem.
The heuristic used in Fig. 4 considers the vehicle capacity to be 4 cylinders;
hence, the iteration stops after i = 4. One assumption used for all the instances
is that a truck travels at most 450 km in one day. The idea here is to find the
routes such that the MIP model can select whichever route gives the best result.
Any alternate strategy to find the feasible set of routes can be used. The output
of the heuristic contains the feasible routes for the customer locations, as shown
in Fig. 5.
The output of the VRP is a set of feasible routes which can be represented as a graph
with connected nodes. If a customer does not have a route in which it is connected
to another customer, we can consider that particular customer independently. Thus,
the problem can be decomposed into smaller problems and these sub-problems can
be solved independently. Clustering using connected components of an undirected
graph [24–26] helps us in reducing the time complexity. A straightforward breadth
first search strategy can be implemented for finding the connected components of
the undirected graph. Output: Feasible routes connected to each customer as shown
in Figs. 6 and 7.
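The breadth-first-search decomposition into connected components can be sketched as follows; the route lists are illustrative data, not the paper's instance:

```python
from collections import deque, defaultdict

def connected_components(routes):
    """Group customers into clusters: two customers are connected when
    they appear together on at least one feasible route."""
    adj = defaultdict(set)
    for route in routes:
        for a in route:
            adj[a]            # ensure customers on singleton routes appear
            for b in route:
                if a != b:
                    adj[a].add(b)
    seen, components = set(), []
    for node in sorted(adj):
        if node in seen:
            continue
        comp, queue = [], deque([node])   # BFS from each unseen customer
        seen.add(node)
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(comp))
    return components

routes = [[1, 2], [2, 3], [4, 5], [6]]
print(connected_components(routes))  # [[1, 2, 3], [4, 5], [6]]
```

Each component can then be handed to the MIP stage as an independent sub-problem, which is what reduces the overall solving time.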
Fig. 6 GC structure
5 Problem Formulation
In this section, the mixed integer problem formulation is elaborated in detail. The
following notations are considered:
I: A vector containing the set of customers.
R: A vector containing the set of routes.
T: A vector containing the sequence (set) of days.
Decision Variables:
Subject to:

Q_{i,t}^{inv} ≤ Q_i^{min_inv}   ∀(i ∈ I, t ∈ T)   (3)

N_{r,t} ≤ ( Σ_{i∈r} p_{i,r,t}^{out} ) / Q^{vehicle_cap} + 0.99   ∀(r ∈ R, t ∈ T)   (4)

N_{r,t} ≥ ( Σ_{i∈r} p_{i,r,t}^{out} ) / Q^{vehicle_cap}   ∀(r ∈ R, t ∈ T)   (5)

Q_{i,r,t}^{out} = Q_{i,r,t+ΔT_r}^{in}   ∀(i ∈ I, r ∈ R)   (6)

Σ_{i∈I} Σ_{r∈R} Q_{i,r,t}^{out} ≤ Q_O^{st}   ∀(t ∈ T)   (7)

R_{i,r} ≥ ( Σ_{t∈T} p_{i,r,t}^{out} ) / M   ∀(i ∈ I, r ∈ R)   (8)

Q_{i,r,t}^{out}, Q_{i,r,t}^{in}, Q_{i,t}^{inv}, Q_O^{st}, Q_i^{min_inv}, N_{r,t} ≥ 0   ∀(i ∈ I, r ∈ R, t ∈ T)   (10)
The MIP model solves two main problems: the inventory problem, considering the
fixed demand rate at each customer location, and the minimization of resources,
i.e. the storage and the fleet. It considers the three objectives with equal
weightages (1): the transit cost, with the assumption that each kilometre costs 60
units; the surplus inventory cost; and the vehicle cost. The inventory balance is
achieved using (2). The minimum inventory constraint is added in (3). (4) and (5)
map the outflow to the number of vehicles, since the number of vehicles in this
case is an integer variable. The inflow and outflow balance is achieved using (6).
(7) maps the storage at the depot to the outflows and is thus used to find the
optimal storage capacity required at the depot. (8) converts the outflow of route
r into a binary value so that constraints can be added on it. (9) limits the
number of unique routes that the model is allowed to select for each customer.
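As described in the text, (4) and (5) linearize a ceiling: for a given outflow, the only integer N satisfying flow/cap ≤ N ≤ flow/cap + 0.99 is ceil(flow/cap). A quick numeric check of this construction (the variable names are illustrative):

```python
def vehicles_needed(flow, cap):
    """Integer N forced by constraints (4)-(5): the unique integer in
    [flow/cap, flow/cap + 0.99]."""
    lo = flow / cap
    hi = lo + 0.99
    candidates = [n for n in range(0, int(hi) + 1) if lo <= n <= hi]
    assert len(candidates) == 1, "constraints pin N uniquely"
    return candidates[0]

for flow in (0, 1, 4, 5, 8, 9):
    print(flow, vehicles_needed(flow, cap=4))
# 0 needs 0 trucks, 1-4 need 1, 5-8 need 2, 9 needs 3 = ceil(flow/4)
```

Minimization of the vehicle-cost term then drives N down onto this unique value, so no explicit ceiling function is needed in the MIP.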
6 Results
The data set used consists of random locations in the state of Madhya Pradesh,
India. The storage capacity, demand rate, and other parameters are also generated
randomly to test the mathematical model.
Table 1 shows the scaling of the model based on the number of customer locations
considered. The horizon length considered for all the instances is 30 days. Except
for the last instance of 100 customer locations, the MIP model was able to
converge successfully to a 0.01% gap. When the number of customers was 100, a time
limit of 1800 s was imposed, and the CPLEX solver was able to converge to a 0.2%
gap.
Figure 9 is a box plot of the average distance travelled by the vehicles each day.
Three instances were considered, with horizon lengths of 10, 20 and 30 days. The
distributions for the three instances are very similar, with medians of 297 km,
296 km and 296 km, respectively. This indicates that the MIP model generates a
similar pattern, given that the demand was cyclic in nature. Table 2 shows the
metrics of the MIP model output for horizon lengths of 10, 20 and 30 days, and we
can see that the metrics are very similar.
Figure 10 shows the box plot of the efficiency of vehicles on each route based on
the distance travelled (day-level utilization) and the number of cylinders
transported (capacity efficiency). Day-level utilization is considered because, in
most real-world scenarios, a truck is rented by the day irrespective of the
distance it travels; the idea is thus to minimize the idle time of the truck. The
equation for the efficiency of the trucks on the routes is:

Loading efficiency = (No. of cylinders) / (Vehicle capacity)   (14)

In (14), the denominator signifies the capacity of the vehicles, which is 4 in the
case of these instances. The idea here is to find the loading efficiency of the
vehicles at the time of dispatch.
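Equation (14) in code, with the capacity of 4 cylinders used in these instances as the default:

```python
def loading_efficiency(n_cylinders: int, vehicle_cap: int = 4) -> float:
    """Eq. (14): fraction of vehicle capacity used at dispatch."""
    return n_cylinders / vehicle_cap

# A truck dispatched with 3 of its 4 cylinder slots filled:
print(loading_efficiency(3))  # 0.75
```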
In Table 3, the objective function results are tabulated for a time interval of
300 s. The stopping criterion is 0.5% over 3 consecutive iterations, i.e., if the
objective function does not improve by more than 0.5% in any of 3 consecutive
iterations, the model is stopped.
7 Conclusion
In this paper, we have demonstrated one approach for solving inventory, storage
and routing optimization with a homogeneous fleet in the secondary distribution
network. The method uses a hybrid VRP, clustering and MIP approach. We have
demonstrated “satisfactory” optimality in the VRP stage by showing scaling-study
results. In the MIP stage, we have solved the problem to an MIP gap of less than
1%, depending on the number of customers. We have used a stopping criterion for
the VRP based on the improvement in the results, namely an improvement of less
than 0.5%. In addition, we have shown several reports related to vehicle capacity
utilization and day-level utilization. The method can be applied to distribution
networks where there are sources, depots and customer locations, and where demand,
inventory and transit costs are involved. Even though we have used a homogeneous
fleet, the method can easily be extended to a heterogeneous fleet. The VRP
formulation is solved using Google OR-Tools, the graph-based connected components
using Python, and the MIP using CPLEX.
Appendix
References
Abstract Much research has been carried out on cloud computing, and a key factor
in it is load balancing. Load balancing plays an important role in organizing work
allocation on servers, which involves cost, material, time duration, etc. A few
issues faced in load balancing are resource utilization, security, fault
tolerance, and many more. Many researchers have worked on different parameters of
load balancing, and similar results are found in different articles. This paper
presents a simulation of load-balancing algorithms that includes both static and
dynamic strategies, with three algorithms taken from each category. The static
algorithms are round robin, threshold, and randomized; the dynamic algorithms are
active clustering, honey bee, and Join-Idle-Queue. The simulation is performed in
a cloud simulator, and the parameters considered are resource utilization,
response time, and processing time along with overall data transfer cost. This
paper presents a comparative study of the three static and three dynamic
algorithms with their parameters and resulting outcomes. This work is still in
progress; it aims to analyze and evaluate the success of load balancing in cloud
computing in order to improve quality-of-service approaches.
1 Introduction
H. Durani (B)
RK University, Rajkot, Gujarat, India
MCA Department, B H Gardi College of Engineering and Technology, Rajkot, Gujarat, India
N. Bhatt
MCA Department, RK University, Rajkot, Gujarat, India
e-mail: nirav.bhatt@rku.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 429
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_39
430 H. Durani and N. Bhatt
every utilization premise (Brown 2017; Buyya et al. 2010). Even in 2020, the
pandemic situation gave a boost to IT companies, which led to widespread usage of
cloud environments.
Cloud services can be grouped in two different ways: first by the services
offered, and second by deployment, i.e., whether the cloud is public, private,
hybrid, or community based. Cloud computing moves both processing and data from
desktop and portable PCs to enormous server farms. Accordingly, this environment
provides a framework for affordable, on-demand access to computing resources, and
it likewise increases the availability of resources.
A major issue in the cloud environment is load balancing, which distributes load
across all the nodes in the cloud. Still, a major problem in load balancing is
that some nodes remain idle while others are never free. The key situation to
avoid is one in which some nodes are overloaded while a few sit idle. Thus, the
working principle of load balancing increases overall system performance along
with resource utilization.
2 Load Balancing
In a cloud environment, a key component is the load balancer, which divides work
equally among all nodes. The basic requirement of load balancing is to achieve
user satisfaction such that no node is overloaded or underloaded and the best
performance is delivered. Resource consumption can also be minimized through
proper use of load balancing [7], which brings the benefits of scalability, avoids
disruption, and reduces time.
2.2 Purposes
Static
In this algorithm, all the basic information is provided in advance, including
memory performance, processing power, and user data requirements. The main
disadvantage of this approach is that when there is a sudden failure, task
allocation cannot be adjusted. The best-known example of static load balancing is
the round robin algorithm. Even this scheme has many disadvantages, which led to
the weighted round robin algorithm, in which each server is allocated a weight;
the server with the highest weight is allocated more connections, which balances
the traffic.
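The weighted round robin described above can be sketched as a generator that yields each server in proportion to its weight; the server names and weights are illustrative:

```python
from itertools import cycle

def weighted_round_robin(servers):
    """Yield servers in proportion to their weights.

    servers: dict mapping server name -> integer weight; a server with
    weight 3 receives three connections for each one that a weight-1
    server receives.
    """
    expanded = [name for name, w in servers.items() for _ in range(w)]
    return cycle(expanded)

rr = weighted_round_robin({"s1": 3, "s2": 1})
assignments = [next(rr) for _ in range(8)]
print(assignments)  # ['s1', 's1', 's1', 's2', 's1', 's1', 's1', 's2']
```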
Dynamic
This class of algorithms makes load-adjustment choices based on the present state,
i.e., prior information is not needed. This overcomes the weaknesses of the static
approach. Dynamic algorithms are intricate; however, they yield improvements over
static algorithms. A few approaches are utilized in dynamic load-balancing
algorithms, characterized by parameters such as the exchange strategy, selection
strategy, location strategy, information strategy, load-estimation strategy,
process-transfer strategy, and task-priority strategy.
3 Techniques
The present techniques are classified into two parts static and dynamic [11]. In this
research paper, three static and dynamic load balancing algorithm are selected, and
simulation is performed according to taken parameters [3].
Following algorithms of static load balancing.
The working apparatus is like disk style. In this algorithm, process will perform
single as per time allocated to it. In this algorithm [8], all task are processed in
group. Process works until all tasks are completed. This algorithm [12] is use in
Web-like http request. In this algorithm, work is selected by VM and then assigning
request to virtual machine in circular order.
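The circular assignment described above can be sketched as follows; the request and VM names are hypothetical.

```python
def assign_round_robin(requests, vms):
    """Assign each request to a virtual machine in strict circular
    order: request i goes to VM (i mod number-of-VMs)."""
    return {req: vms[i % len(vms)] for i, req in enumerate(requests)}

# Hypothetical workload of five requests over three VMs.
mapping = assign_round_robin(
    [f"req-{i}" for i in range(5)], ["VM-0", "VM-1", "VM-2"]
)
```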
3.2 Randomized
Without using any data from the current or a previous phase, nodes are selected at random. Each node keeps its own record of its load, and nodes are chosen randomly at processing time to balance that load.
Evaluation and Comparison of Various Static and Dynamic Load … 433
The procedure first checks the size of the cycle and then tests the nodes, which are moved one after another into the VM; this record is maintained on a stack for further processing.
3.3 Threshold
In this algorithm, each node measures its own load. A load state is maintained with three levels: underloaded, medium, and overloaded. Two parameters, t_lower and t_upper, define the states by the following rules:
Underloaded: load < t_lower
Medium: t_lower ≤ load ≤ t_upper
Overloaded: load > t_upper
Load is adjusted by setting these limits. If the threshold parameter is set about 30% above the average value, the node is considered heavily loaded; setting the threshold about 70% above the average marks it lightly loaded. If the processor is not overloaded, the process is allocated locally; otherwise the load balancer distributes a portion of its work to the VM with the least work, so that the VMs end up equally loaded.
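The two-threshold classification and the local-allocation rule above can be sketched as follows; the threshold values and node loads are illustrative assumptions, not values from the paper.

```python
T_LOWER, T_UPPER = 0.3, 0.7   # illustrative limits, as fractions of capacity

def classify(load):
    """Two-threshold rule from the text: under / medium / overloaded."""
    if load < T_LOWER:
        return "underloaded"
    if load <= T_UPPER:
        return "medium"
    return "overloaded"

def place_task(node_loads):
    """Allocate locally when possible: prefer any non-overloaded node,
    falling back to the least-loaded node if all are overloaded."""
    ok = [n for n, load in node_loads.items() if classify(load) != "overloaded"]
    pool = ok if ok else list(node_loads)
    return min(pool, key=lambda n: node_loads[n])

cluster = {"n1": 0.9, "n2": 0.5, "n3": 0.1}   # hypothetical node loads
```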
The dynamic load balancing algorithms are as follows.
3.4 Active Clustering
Active clustering works on the principle of grouping similar nodes and working on the resulting groups. A set of processes is executed iteratively by every node in the network. Initially, any node can become an initiator; it chooses a neighbour of a different type from itself to act as the matchmaker node. The matchmaker node then forms a connection between those of its neighbours that are of the same type as the initiator, and finally removes the connection between itself and the initiator.
3.5 Honey Bee Foraging
This algorithm imitates honey bees, which search for food and then announce their findings by passing a message through a dance known as the waggle dance. In load balancing, the virtual machines behave like the bees: each server in the cluster has its own virtual queue, and this mechanism occupies servers for processes in the same way bees recruit foragers to food sources.
434 H. Durani and N. Bhatt
3.6 Join-Idle-Queue
4 Cloud Simulator
The tool used for simulation is CloudSim, in which models such as the user base, the data center, and the VM load balancer [4] can be used on a large cloud platform, as shown in Fig. 3.
The detailed configuration used in this paper relies on the VM load balancer to allocate the proper node in the data centers. It contains three main elements: the user base, the data centers (DCs), and the VM load balancer.
User Base
A user base is a collection of different clients, grouped by the unit allocated to their cell [6]. Its main goal is to avoid traffic when a new cycle starts.
Data Center
In Cloud Analyst, the component that controls everything is the data center (DC). It runs on the cloud simulator, where all entities are managed. User base (UB) requests are sent to the data center [14].
VM Load Balancer
In the virtual machines (VMs), everything is controlled by the data center for load balancing. The user base at its destination point checks which virtual machine a cloudlet [9] is allotted to, as shown in Figs. 4, 5, and 6.
The paper presents the simulation [10] of the static load balancing algorithms RR (round robin), R (randomized), and T (threshold). Performance was measured in terms of overall data transfer cost, response time, and processing time, as shown in Fig. 7.
In Tables 1 and 2, resource allocations of 5 DCs with 25 and 50 VMs have been taken for the overall comparison of the three chosen static and three dynamic algorithms. Across all configurations, the overall data transfer cost [15] remains the same, but response time and processing time vary.
Figure 8 shows the processing time taken by the three static algorithms: RR (round robin), R (randomized), and T (threshold).
Figure 9 shows the processing time taken by the three dynamic algorithms: AC (active clustering), HB (honey bee), and JIQ (Join-Idle-Queue).
7 Comparison
The different algorithms [13] proposed by various researchers in the preceding sections are compared in Table 3.
8 Conclusion
Simulation was performed on six load balancing algorithms, three static and three dynamic. Each algorithm was observed against criteria such as processing time, response time, and overall data transfer. In this paper, simulation was done for 5 DCs with 25 and 50 VMs, and the results are shown in the figures above. As a future enhancement, the work can be extended to larger data sets to improve overall response time and cost.
References
1. E. Choi, B.P. Rima, I. Lumb, A taxonomy and survey of cloud computing system, in
International Joint Conference on INC, IMS and IDC, Seoul, Korea, Aug 2009
2. I. Chana, N. Jain, Cloud load balancing techniques: a step towards green computing. IJCSI
(2012)
3. S. Kinger, S. Kaur, Review on load balancing techniques in cloud computing environment. Int.
J. Sci. Res. (2015)
4. H. Bhatt, H. Bheda, An overview of load balancing technique in cloud computing environment.
Int. J. Eng. Comput. Sci. (2019)
5. http://www.loadbalancing.org/
6. S. Gibbs, Cloud computing. Int. J. Innov. Res. Eng. Sci. (2012)
7. P.J. Patel, H.D. Patel, P.V. Patel, A survey on load balancing in cloud computing. IJERT (2012)
8. N. Pasha, A. Agarwal, R. Rastogi, Round robin approach for VM Load Balancing Algorithm
in cloud computing environment. Int. J. Adv. Res. Comput. Sci. Softw. Eng. (2014)
9. S. Wang, W. Liao, Towards a load balancing in a three level cloud computing network, in IEEE
International Conference and Computer Science and Information Technology, Sept 2016
10. B. Wickremasinghe, Cloud Analyst—a cloud sim based visual and modeler for analyzing cloud
computing environment and applications (IEEE, 2010)
11. XFZ, RXT, A load balancing strategy based on the combination of static and dynamic in
database technology and application (IEEE, 2010)
12. G.A. Chopra, Dynamic Round Robin for load balancing in cloud computing. Int. J. Comput.
Sci. Mob. Comput. (2013)
13. R. Dubey, R. Choubey, A survey on cloud computing security. Challenges and threats Int. J.
Comput. Sci. Eng. IJCSE (2011)
14. A. Singh, M. Korupolu, D. Mohapatra, Server storage virtualization: integration and load
balancing in data centers. J. Res. Dev. (2008)
15. S. Tayal, Task Scheduling Optimization for the cloud computing systems. IJAEST (2011)
Dielectric Resonator Antenna
with Hollow Cylinder for Wide
Bandwidth
Abstract This paper presents a stacked dielectric resonator antenna with a drilled hollow cylinder. The antenna has thirteen dielectric layers of different permittivity (Er1 = 12 and Er2 = 4.4) on an FR4 epoxy substrate with a dielectric constant of 4.4 and a thickness of 0.8 mm. A single hollow cylinder of 0.8 mm diameter is drilled at the bottom-right corner of the proposed design. The analysis is performed on the 3D EM High-Frequency Structure Simulator (HFSS). The wide improvement in the bandwidth of the DRA with a drilled hollow cylinder is obtained through proper excitation and selection of the resonator parameters based on a −10 dB reflection coefficient.
1 Introduction
Nowadays, antennas play a significant role in day-to-day life. Within this field, the dielectric resonator antenna is a category of antenna that supports transmission from microwave to millimetre-wave frequencies with low losses [1]. For numerous reasons, the dielectric resonator antenna has gained great importance in radio-frequency engineering [2]. Different antenna geometries are applied for different applications. Features such as high gain, wide bandwidth, high radiation efficiency, and low losses draw radio-frequency engineers towards DRAs [3]. A detailed study of dielectric resonator antennas (DRAs) was first carried out by Long et al. [4].
G. Kumar (B)
Guru Gobind Singh Indraprastha University, Delhi, India
R. S. Yaduvanshi
Netaji Subhas University of Technology, Delhi, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 441
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_40
442 G. Kumar and R. S. Yaduvanshi
Since then, much work has been done on DRAs with various geometries such as cylindrical, spherical, and rectangular [5]. The rectangular dielectric resonator antenna has several advantages over the spherical and cylindrical geometries [6], because the rectangular form can be modified to achieve additional benefits. For example, mode degeneration can be avoided by properly choosing the three dimensions of the antenna, whereas degenerate modes always exist in a spherical DRA [7]. Higher-order modes can be achieved with the same antenna dimensions, letting the antenna operate at the same frequency, unlike the hybrid modes that appear in a cylindrical DRA [8]. Advanced forms of DRA have been proposed to improve various antenna parameters. One idea is the stacked form of rectangular DRA [9], i.e. instead of using the whole volume of a single material, the bandwidth is enhanced by using materials of different permittivity or different antenna shapes [10]. One such method places rectangular slabs of different permittivity on top of each other, the stacked form of DRA [11–15].
2 Antenna Design
The design consists of stacked rectangular slabs of equal dimensions (except the topmost slab) placed one above another, with one hollow cylinder drilled at the bottom-right corner. Two different materials are placed alternately, so that consecutive slabs have different permittivity. The substrate is 50 mm long, 50 mm wide, and 1.6 mm high, made of FR4 epoxy with a permittivity of 4.4. Thirteen slabs are placed on one another. Each slab is 6 mm long, 6 mm wide, and 0.8 mm high, except the topmost slab, which is 0.4 mm high. The height and diameter of the hollow cylinder are 10 mm and 0.8 mm, respectively.
The complete height of the antenna is thus 10 mm, with FR4 epoxy slabs of permittivity 4.4 and TMM 13i slabs of permittivity 13 placed alternately, as shown in Fig. 1.
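As a quick check of the stated geometry, the slab heights given above can be summed; this is a sketch of the arithmetic only, using the dimensions quoted in the text.

```python
# Stack geometry as stated: 12 slabs of 0.8 mm plus a topmost slab
# of 0.4 mm should reproduce the quoted 10 mm overall height.
slab_heights_mm = [0.8] * 12 + [0.4]
total_height_mm = sum(slab_heights_mm)
```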
The antenna is designed to work at 13 GHz. In the design analysis, two slabs are removed at a time, one from the top and one from the bottom, air takes their place, and the antenna parameters are recorded. Next, the 3rd and 11th slabs from the bottom are removed, air is introduced in their place, and the variations in the parameters are recorded again. The same procedure is applied to the middle (7th) slab of the antenna.
The top view of the antenna, showing the position of the hollow cylinder drilled in the stacked dielectric antenna, is given in Fig. 2.
Dielectric Resonator Antenna with Hollow Cylinder … 443
The return loss (S11) indicates the amount of power delivered to the antenna, i.e. the impedance matching of the proposed antenna with respect to the source. The return loss plot for the proposed antenna is shown in Fig. 3. It includes the S11 curves for the corresponding slab-removal states: removal of slabs (1, 13), slabs (2, 12), slabs (3, 11), slabs (4, 10), slabs (5, 9), slabs (6, 8), slab (7) alone, and the case with all slabs present.
The antenna gain is the ratio of the power radiated in the direction of maximum radiation to the power radiated by a hypothetical lossless isotropic antenna. As shown in Fig. 4, the simulated total gain of the antenna is found to be 6.83 dB.
The radiation pattern of the stacked rectangular dielectric resonator antenna at 12.7 GHz is shown in Fig. 5.
In Fig. 6, rETotal shows the electric-field intensity at 12.7 GHz in three dimensions: red indicates the highest electric-field intensity, and blue the lowest.
4 Conclusion
The proposed antenna shows good results over the frequency range from 12.00 to 18.00 GHz. In this design, as slabs were replaced by air alongside the drilled hollow cylinder, a wide impedance bandwidth and high gain were obtained on removal of the middle (7th) slab. Thus, this antenna can potentially be used in
Acknowledgements We would like to thank Mr. Chandra Prakash, lab assistant of the microwave lab, AIACTR, GGSIPU, for granting liberal access to the lab and for being extremely cooperative and helpful throughout the research work.
References
1. S.M. Shum, K.M. Luk, Stacked annular ring dielectric resonator antenna excited by axi-
symmetric coaxial probe. IEEE Trans. Antennas Propag. 43(8), 889–892 (1995)
2. A. Petosa, A. Ittipiboon, Y.M.M. Antar, D. Roscoe, M. Cuhaci, Recent advances in dielectric
resonator antenna technology. IEEE Antennas Propag. Mag. 40(3), 35–48 (1998)
3. A. Petosa, Dielectric Resonator Antenna Handbook (Artech House, Norwood, MA, 2007)
4. G. Kumar, M. Singh, S. Ahlawat, R.S. Yaduvanshi, Design of stacked rectangular dielectric
resonator antenna for wideband applications. Wirel. Pers. Commun. 109(3), 1661–1672 (2019)
5. A. Petosa, A. Ittipiboon, Dielectric resonator antennas: a historical review and the current state
of the art. IEEE Antennas Propag. Mag. 52(5), 91–116 (2010)
6. A.A. Kishk, B. Ahn, D. Kajfez, Broadband stacked dielectric resonator antennas. Electron.
Lett. 25(18), 1232–1233 (1989)
7. K.M. Luk, K.W. Leung (eds.), Dielectric Resonator Antennas (Research Studies Press,
Baldock, England, 2003)
8. Y.-X. Guo, Y.-F. Ruan, X.-Q. Shi, Wide-band stacked double annular-ring dielectric resonator
antenna at the end-fire mode operation. IEEE Trans. Antennas Propag. 53(10), 3394–3397
(2005)
9. A.A. Kishk, et al., Numerical analysis of stacked dielectric resonator antennas excited by a
coaxial probe for wideband applications. IEEE Trans. Antennas Propag. 51(8), 1996–2006
(2003)
10. A. Sangiovanni, J.Y. Dauvignac, Ch. Pichot, Stacked dielectric resonator antenna for multifre-
quency operation. Microw. Opt. Technol. Lett. 18(4), 303–306 (1998)
11. K.M. Luk, K.W. Leung, K.Y. Chow, Bandwidth and gain enhancement of a dielectric resonator
antenna with the use of a stacking element. Microw. Opt. Technol. Lett. 14(4), 215–217 (1997)
12. Y. Ge, K.P. Esselle, T.S. Bird, A wideband probe-fed stacked dielectric resonator antenna.
Microw. Opt. Technol. Lett. 48(8), 1630–1633 (2006)
13. R.S. Yaduvanshi, H. Parthasarathy, Rectangular DRA Theory and Design (Springer, Berlin,
2016)
14. Y.M. Pan, S.Y. Zheng, A low-profile stacked dielectric resonator antenna with high-gain and
wide bandwidth. IEEE Antennas Wirel. Propag. Lett. 15, 68–71 (2015)
15. W.J. Sun, W.W. Yang, H. Tang, P. Chu, J.X. Chen, Stacked dielectric patch resonator antenna
with wide bandwidth and flat gain. J. Eng. 2018(6), 336–338 (2018)
Recent Techniques in Image Retrieval:
A Comprehensive Survey
Abstract In recent years of image processing, image retrieval (IR) has become a very popular, important, and rapidly developing area of research in multimedia technology. Image transactions in the digital world are increasing rapidly: most digital equipment generates images for various activities, creating massive picture archives. In recent years, a large amount of visual content from various fields, such as social media sites, medical images, and robotics, has been created and shared. Searching databases for similar content, i.e., content-based image retrieval (CBIR), is a long-established area of study, and real-time retrieval demands ever more effective and accurate methods. Among the numerous image retrieval methods, CBIR obtains low-level image characteristics such as color, shape, texture, and spatial position. We have carried out an extensive survey covering CBIR, various retrieval techniques, image attributes, and standard image datasets, aimed at providing a global view of the CBIR field.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 447
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_41
448 K. D. K. Ajay and V. Malleswara Rao
medical image search [5], and online marketplace shopping recommendation [6], among many others. CBIR can broadly be grouped into instance-level retrieval and category-level retrieval. In instance-level image retrieval, a query image of a specific object is given, and the goal is to find images of the same object or scene captured under different viewpoints, illumination conditions, or occlusions [7, 8].
Finding the desired image may require searching thousands, millions, or even billions of images. Searching efficiently is therefore as important as searching correctly, and continued effort has been devoted to both [7, 8]. Compact but rich feature representations are at the heart of CBIR, enabling accurate and efficient retrieval over large image collections.
Two fundamental difficulties arise in content-based image retrieval: the intent gap and the semantic gap. The intent gap is the user's difficulty in accurately expressing the desired visual content through the query at hand, such as a sketch map or an example image.
The semantic gap stems from the difficulty of describing a high-level semantic concept with low-level visual features. Extensive attempts from both academia and industry have been made to close these gaps. Content-based image retrieval uses an image's visual information, such as shape, color, spatial layout, and texture, to index the images.
In traditional content-based image retrieval systems (Fig. 1), multidimensional feature vectors are extracted for the database images, and their quality metrics are identified. A feature database stores the feature vectors of the database images. To retrieve images, the system is queried with sample data, which it translates into extracted features. Distances between these feature vectors are then measured, and the images are indexed to perform the retrieval process. The indexing scheme provides an efficient way of searching the image database. Recent retrieval methods use relevance feedback to obtain more efficient, improved, and meaningful retrieval results.
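The retrieval loop described above (extract feature vectors, measure distances, rank) can be sketched minimally as follows. The gray-level histogram feature and the toy image database are illustrative assumptions, not the systems surveyed here.

```python
import numpy as np

def extract_features(image):
    """Toy descriptor: a normalized 8-bin gray-level histogram.
    Real CBIR systems use richer color/texture/shape features."""
    hist, _ = np.histogram(image, bins=8, range=(0, 256))
    return hist / hist.sum()

def retrieve(query_image, database, top_k=3):
    """Rank database images by Euclidean distance between feature vectors."""
    q = extract_features(query_image)
    scored = [(name, float(np.linalg.norm(q - extract_features(img))))
              for name, img in database.items()]
    return sorted(scored, key=lambda item: item[1])[:top_k]

# Hypothetical database of uniform gray patches at distinct levels.
database = {f"img{i}": np.full((16, 16), i * 60) for i in range(5)}
results = retrieve(database["img2"], database)   # query with a stored image
```

Querying with an image already in the database returns that image first at distance zero, which is a quick sanity check for any such pipeline.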
Researchers have developed many image retrieval approaches; the most significant and widely used types are shown in Fig. 2.
Recent Techniques in Image Retrieval: A Comprehensive Survey 449
Image retrieval based on text is also called description-based image retrieval. Text-based image retrieval (TBIR) is used to retrieve, for a given multimedia query, XML documents that contain images, based on textual information. In TBIR, the visual content of images is represented by manually assigned tags/keywords, which addresses CBIR's restrictions. It lets a user state an information need as a textual query and find the required images based on the match between that query and the manual image annotations.
In content-based image retrieval, images are searched and retrieved using image characteristics, based on the similarity of their visual content to the query image. A feature extraction module extracts low-level image features from the images in the collection; commonly extracted features include color, texture, and shape.
Data fusion and machine learning algorithms are used in multimodal-fusion image retrieval. Data fusion, also termed evidence merging, is a method of integrating different sources of evidence. The chorus effect, the skimming effect, and the dark horse effect can be exploited by using several modalities [9].
The semantic gap in CBIR systems is the difference between the user's information need and the image representation; it arises largely from the limited precision of the image features used for retrieval. Relevance feedback is very useful in CBIR for reducing this gap. The fundamental idea behind relevance feedback is to incorporate the subjectivity of human perception into the query and to let users judge the retrieval results.
Color is an image's most common visual feature, and human eyes are more sensitive to color images than to gray-level images. The RGB color space uses RGB color intensities. Color features can be extracted by several methods, such as color moments, the auto-correlogram, and the color histogram. A color histogram is a representation of the color distribution in an image: the histogram of a gray image spans the gray-level range from 0 to 255, and to reduce the number of values the color spectrum is split into discrete intervals. Color moments are measures that distinguish images by their color characteristics; once computed, these moments provide a color-similarity measure between images, and the similarity values can then be compared against indexed image values in a database for tasks such as image retrieval. A color correlogram is a three-dimensional table, indexed by color and by the distance between pixels, expressing how the spatial correlation of colors changes with distance in an image; it can be used to differentiate one image from others in a database.
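The color histogram and the first color moments just described can be sketched as follows, assuming a single 0–255 channel; the uniform test patch is illustrative.

```python
import numpy as np

def color_histogram(channel, bins=8):
    """Quantize the 0-255 range into `bins` intervals, as described
    in the text, and return the normalized bin counts."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    return hist / channel.size

def color_moments(channel):
    """First three color moments: mean, standard deviation, and the
    signed cube root of the third central moment (a skewness measure)."""
    mean = channel.mean()
    std = channel.std()
    third = ((channel - mean) ** 3).mean()
    skew = np.sign(third) * abs(third) ** (1 / 3)
    return mean, std, skew

# Hypothetical single-channel image: a perfectly uniform gray patch.
flat = np.full((4, 4), 128)
h = color_histogram(flat)
m = color_moments(flat.astype(float))
```

For the uniform patch, all mass falls into one histogram bin and the deviation and skew moments vanish, as expected.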
Shape features in an image are used to describe object boundaries. They are represented as:
1. The exterior form derived from the boundary.
2. Region-based descriptions.
Fourier descriptors and moment-invariant methods are used to handle these two categories.
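A minimal sketch of boundary-based Fourier descriptors follows, assuming the boundary is sampled as complex points x + iy; the circular test curve is illustrative. Dropping the DC term gives translation invariance, and normalizing by the first harmonic gives scale invariance.

```python
import numpy as np

def fourier_descriptors(boundary, n_coeffs=6):
    """Magnitude-based Fourier descriptors of a closed boundary.

    `boundary` is a 1-D complex array of sampled boundary points.
    The DC coefficient is discarded (translation invariance) and the
    remaining magnitudes are scaled by the first harmonic's magnitude
    (scale invariance)."""
    coeffs = np.fft.fft(boundary)
    mags = np.abs(coeffs[1:n_coeffs + 1])
    return mags / mags[0]

# A circular boundary, and the same shape translated and scaled:
theta = np.linspace(0.0, 2.0 * np.pi, 32, endpoint=False)
curve = np.exp(1j * theta)
moved = 3.0 * curve + (5.0 + 2.0j)

d1 = fourier_descriptors(curve)
d2 = fourier_descriptors(moved)
```

The descriptors of the original and the translated/scaled curve agree, which is exactly the invariance that makes them useful for shape matching.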
452 K. D. K. Ajay and V. Malleswara Rao
The image texture feature describes periodic repetition around a part of, or across, a surface pattern. The texture factor captures the structure of the surface and its relation to the surrounding area. Texture features can be extracted using the gray-level co-occurrence matrix (GLCM), wavelets, the Fourier transform, entropy, and correlation methods. The Gabor and wavelet transforms give the statistical distribution of the image. Roughness, directionality, coarseness, regularity, line-likeness, and contrast are the six texture properties.
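The GLCM just mentioned can be sketched by hand for a single horizontal offset, with the contrast property computed from it; the two test patches are illustrative, and library implementations (e.g. in scikit-image) provide the general multi-offset version.

```python
import numpy as np

def glcm(image, levels):
    """Gray-level co-occurrence matrix for a (0, 1) horizontal offset:
    counts how often level i appears immediately left of level j,
    normalized to a joint probability table."""
    m = np.zeros((levels, levels), dtype=float)
    for i, j in zip(image[:, :-1].ravel(), image[:, 1:].ravel()):
        m[i, j] += 1
    return m / m.sum()

def contrast(p):
    """GLCM contrast property: sum over i, j of p(i, j) * (i - j)^2."""
    idx = np.arange(p.shape[0])
    return float((p * (idx[:, None] - idx[None, :]) ** 2).sum())

checker = np.indices((4, 4)).sum(0) % 2   # alternating 0/1 pattern
flat = np.zeros((4, 4), dtype=int)        # constant patch

c_checker = contrast(glcm(checker, levels=2))
c_flat = contrast(glcm(flat, levels=2))
```

A constant patch has zero contrast, while the maximally alternating pattern has the largest contrast for two levels, matching the intuition behind the texture property.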
Various types of datasets for experimenting with imaging systems are available online. This paper explores a total of eleven datasets, which are described in Table 1 [10–12].
5 Techniques of Evaluation
A CBIR system is evaluated in terms of both utility and efficiency; these characteristics mainly describe the precision and efficiency of image retrieval. Table 2 lists various evaluation techniques used in different CBIR systems.
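The precision-oriented measures referred to above commonly reduce to precision and recall over a retrieved set; a minimal sketch follows, with a hypothetical query result.

```python
def precision_recall(retrieved, relevant):
    """Standard retrieval metrics: precision is the fraction of
    retrieved images that are relevant; recall is the fraction of
    relevant images that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 images returned, 5 truly relevant in the dataset,
# of which 2 were actually retrieved.
p, r = precision_recall(["a", "b", "c", "d"], ["b", "c", "e", "f", "g"])
```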
6 Literature Survey
CBIR is the method of extracting and retrieving features based on characteristics such as form, texture, color, and shape. This section surveys the literature on different CBIR techniques. Content-based image retrieval (CBIR) is currently a challenging problem due to the large scale of image databases; identifying images, handling big image files, and keeping the total retrieval time low are all difficult. The survey mainly focuses on recently established CBIR systems that give accurate image retrieval results across many different applications. Each application involves different types of images with different characteristics, so it is important to choose the image features appropriate to a specific application and to compute the similarity between images accordingly. Table 3 summarizes some of the retrieval techniques [14, 15].
In general, image content is analyzed based on texture, shape, and color attributes, which are widely practised in image processing. The main observations found in this review are:
Table 3 Different methods of texture, color, and shape feature-based CBIR systems

| Author and year | Method and features | Future work |
|---|---|---|
| Lingadalli et al. [16] | GLCM for texture | Improve by using more attributes; the best result was obtained by combining multiple attributes |
| Shaker et al. [17] | Principal component analysis (PCA) with cloud | Improve by using more attributes |
| Singh et al. [18] | LBP for color images | Analyze the pixel color as a vector with m parts and construct a hyperplane |
| Qazanfari et al. [19] | Color difference histogram method | Weight the features and use relevance feedback to obtain a more efficient retrieval system |
| Du et al. [20] | Pulse-coupled neural network | Research increasingly complex schemes to decide the weight of the fusion-similarity part |
| Wei et al. [21] | Intensity variation descriptor | The proposed IVD method has potential applications and can be extended to texture recognition |
| Papushoy et al. [22] | Earth mover's distance (EMD) | Improve the recognition rate |
| Akram et al. [23] | Region-oriented segmentation of images | Enhance the efficiency of the particular system |
| Memon et al. [24] | Integrated region matching method | Robust object discovery in mixed-class datasets |
| Latif et al. [25] | Various types of color histogram | Classify the dataset via an unsupervised learning process |
| Singh et al. [26] | Bi-layer content-based image retrieval | Extend with convolutional-neural-network-based image features to further enhance Bi-CBIR performance |
• The methods used to obtain color characteristics are the color correlogram, color histogram, color-coherence vector, color moments, HSV histogram, HMMD color descriptor, etc. [13].
• For extracting texture features, the Haar wavelet transform, Gabor wavelet transform, discrete wavelet transform, GLCM [13], etc., are used.
• To extract shape features, Canny edge detection, the edge detection histogram, the edge-based histogram descriptor, etc. [13], are used.
• The Euclidean and Chi-square distances, wavelet decomposition, Naïve Bayes, and K-means clustering [13] are used to test similarity.
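The Euclidean and Chi-square similarity measures listed above can be sketched as follows; the example histograms are illustrative.

```python
import numpy as np

def euclidean(h1, h2):
    """Euclidean distance between two feature vectors."""
    return float(np.sqrt(((h1 - h2) ** 2).sum()))

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms; `eps` guards
    against division by zero in empty bins."""
    return float(0.5 * (((h1 - h2) ** 2) / (h1 + h2 + eps)).sum())

# Two hypothetical normalized 3-bin histograms.
a = np.array([0.5, 0.5, 0.0])
b = np.array([0.5, 0.25, 0.25])

e = euclidean(a, b)
c = chi_square(a, b)
```

Chi-square weights differences in sparsely populated bins more heavily than Euclidean distance, which is why it is often preferred for histogram features.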
7 Conclusion
In this paper, we studied the basic content-based IR model used to retrieve image features. We addressed three distinct basic features, various types of CBIR benchmark datasets and their characteristics, tools useful for feature extraction, and the methods used for it. Each approach aims to solve the current challenges facing image retrieval systems. Various factors influence a system's efficiency; to maximize it, the variables that positively affect the system can be combined.
References
14. G.S. Naveen Kumar, V.S.K. Reddy, Detection of shot boundaries and extraction of key frames
for video retrieval. Int. J. Knowl. Based Intell. Eng. Syst. 24(1), 11–17 (2020)
15. L.R. Nair, K. Subramaniam, G. Prasanna Venkatesan, A review on multiple approaches to
medical image retrieval system, in Intelligent Computing in Engineering, vol. 1125 (2020),
pp. 501–509
16. R.K. Lingadalli, N. Ramesh, Content based image retrieval using color shape and texture
features. IARJSET. 2(6) (2015)
17. S.H. Shaker, N.M. Khassaf, Methods of image retrieval based cloud. Int. J. Innov. Technol.
Explor. Eng. (IJITEE), 9(3), 2278–3075 (2020)
18. C. Singh, E. Walia, K. Kaur, Color texture description with novel local binary patterns for
effective image retrieval. Pattern Recogn. 76 (2018)
19. H. Qazanfari, H. Hassanpour, K. Qazanfari, Content-based image retrieval using HSV color
space features (2019)
20. A. Du, L. Wang, J. Qin, Image retrieval based on colour and improved NMI texture features.
Automatika 60, 491–499 (2019). https://doi.org/10.1080/00051144.2019.1645977
21. Z. Wei, G.H. Liu, Image retrieval using the intensity variation descriptor. Math. Probl. Eng.
(2020)
22. A. Papushoy, A.G. Bors, Content based image retrieval based on modelling human visual
attention, in Computer Analysis of Images and Patterns. CAIP 2015, Lecture Notes in Computer
Science, vol. 9256, ed. by G. Azzopardi, N. Petkov (Springer, Cham, 2015)
23. F. Akram, J.H. Kim, C.G. Lee, K.N. Choi, Segmentation of regions of interest using active
contours with SPF function. Comput. Math. Methods Med. 710326 (2015). https://doi.org/10.
1155/2015/710326
24. I. Memon, Q. Ali, N. Pirzada, A novel technique for region-based features similarity for content-
based image retrieval. Mehran Univ. Res. J. Eng. Technol. 37 (2017). https://doi.org/10.22581/
muet1982.1802.14
25. A. Latif, A. Rasheed, U. Sajid, A. Jameel, N. Ali, N.I. Ratyal, B. Zafar, S. Dar, M. Sajid, T.
Khalil, Content-based image retrieval and feature extraction: a comprehensive review. Math.
Probl. Eng. (2019)
26. S. Singh, S. Batra, An efficient bi-layer content based image retrieval system. Multimed. Tools
Appl. (2020)
Medical Image Fusion Based on Energy
Attribute and PA-PCNN in NSST
Domain
Abstract The medical image fusion framework holds a prominent place in identifying tumors, finding diseases, and treating disorders. Acquiring complementary data into a composite image, named multimodal image fusion, is an essential task. A new energy-attribute-based activity measure with parameter-adaptive PCNN (PA-PCNN) for merging medical modalities in the NSST domain is presented. First, NSST decomposition is applied to the input images; the low-pass sub-band coefficients are then selected using an energy-attribute function, and the band-pass sub-bands are selected using PA-PCNN. Finally, the inverse NSST of the fused coefficients provides the final fused image, which serves as a reference for diagnosticians in assessing the severity of a disorder and planning treatment. The proposed method proved its robustness in finding disorders in both quantitative and subjective assessments.
1 Introduction
Multimodal medical imaging is a research field that has been gaining great attention, especially because of its prominence in computer vision, disorder diagnosis, and medical image analysis [1]. Medical image fusion (MIF) addresses one of the biggest challenges in the biomedical field: how to optimally merge the information from multiple modalities, such as positron emission tomography (PET), computed tomography (CT), single-photon emission computed tomography (SPECT), and T1-/T2-weighted magnetic resonance imaging (MRI) [2]. The MIF process draws on several techniques and research areas,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 457
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_42
458 K. Vanitha et al.
with the goal of enabling accurate and efficient medical diagnosis and decision-making [3, 4]. In most cases, pixel intensities are merged directly into the composite image, so pixel-level fusion has been widely preferred for the MIF task. Among pixel-level approaches, multi-scale analysis transform (MSAT) methods are the best known and most reliable. Each MSAT scheme involves three steps: first, the source modalities are transformed into the corresponding MST domain; next, coefficients are selected using appropriate activity measures to form the composite representation; finally, the inverse MST is applied to the selected coefficients to reconstruct the output image. Common MST methods include the discrete wavelet transform (DWT) [5], curvelet transform (CVT) [6], non-subsampled contourlet transform (NSCT) [7], and non-subsampled shearlet transform (NSST) [8]. These tools can produce fused images with blocky artifacts, color distortion, and inconsistencies in some regions. To overcome these disadvantages, additional fusion measures have been introduced into MST schemes [9, 10], including spatial frequency (SF), modified spatial frequency (MSF), local variance (LV), directive contrast (DC), energy of image gradient (EIG), weighted local energy (WLE), and weighted sum-modified-Laplacian (WSML) [11]. However, these measures do not always give accurate results. Edge-preserving filtering (EPF)-based MSAT methods have therefore become a preferred choice. An EPF-MST decomposition splits an imaging modality into three layers, one base layer and two scale layers, which are merged with suitable strategies; the output image is then reconstructed by the inverse process. EPF methods generally use filters such as the bilateral (BF), Gaussian (GF), curvature (CF), and co-occurrence (CoF) filters [12–14].
The main contributions of the proposed MIF framework are as follows:
1. For an effective fusion task, a new energy attribute-based merging strategy is presented.
2. Applying the energy attribute function to merge the low-pass sub-bands preserves most of the source modalities' energy in the fused image.
3. Applying parameter-adaptive PCNN to merge the band-pass sub-bands extracts the structural details by properly estimating the prominence of the coefficients.
Experiments were carried out on images of diseases such as glioma, Alzheimer's disease, and metastatic bronchogenic carcinoma. The fused image can serve as a reference for diagnosticians when assessing the severity of a disorder and planning treatment, and both quantitative and subjective assessments confirm the robustness of the proposed method. Section 2 reviews the most prominent works on the MIF task. Section 3 contains the step-wise algorithm of the proposed NSST-EA-PAPCNN method. In Sect. 4, the proposal is evaluated using objective metrics and compared with other methods to analyze its performance. Finally, Sect. 5 presents the conclusions of this work.
2 Preliminaries
The NSST decomposition and the new energy attribute-based strategy for the MMIF scheme are explained here.
NSST, one of the MSATs, was introduced by Easley et al. It combines the non-subsampled pyramid transform with distinct shearing filters, which yields multi-scale and directional representations. Because of the superiority of its basis functions, NSST outperforms the most popular MSATs and is therefore frequently used in MMIF. For n = 2, the shearlet function satisfies the conditions given in [15].
For convenience, NSST and its inverse are represented by two functions:

{Ln, Hn} = nsst_de(I)  (1)

F = nsst_re(Ln, Hn)  (2)

where nsst_de(·) and nsst_re(·) denote NSST decomposition and reconstruction, respectively; I and F denote the input and fused images; and Ln and Hn denote the low-frequency and high-frequency sub-bands, respectively.
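Since NSST implementations are not part of standard libraries, the decompose-fuse-reconstruct pattern of Eqs. (1) and (2) can be illustrated with a toy two-band split (local mean plus residual) standing in for nsst_de and nsst_re; the split, the 3 × 3 window, and the averaging/max fusion rules below are illustrative assumptions, not the transform or rules used in this paper.

```python
def nsst_de(img):
    """Toy 'decomposition': per-pixel 3x3 local mean (low band) + residual (high band)."""
    rows, cols = len(img), len(img[0])
    low = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # 3x3 neighbourhood mean with edge clamping
            vals = [img[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            low[i][j] = sum(vals) / 9.0
    high = [[img[i][j] - low[i][j] for j in range(cols)] for i in range(rows)]
    return low, high

def nsst_re(low, high):
    """Inverse of the toy split: exact reconstruction by addition."""
    return [[l + h for l, h in zip(lr, hr)] for lr, hr in zip(low, high)]

def fuse(img1, img2):
    """Decompose both inputs, fuse each band, reconstruct (the Eq. (1)-(2) pattern)."""
    low1, high1 = nsst_de(img1)
    low2, high2 = nsst_de(img2)
    # Stand-in rules: average the low bands, keep the larger-magnitude high band.
    low_f = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(low1, low2)]
    high_f = [[a if abs(a) >= abs(b) else b for a, b in zip(r1, r2)]
              for r1, r2 in zip(high1, high2)]
    return nsst_re(low_f, high_f)
```

Note that the toy split reconstructs perfectly (low + high recovers the input), which is the property the real inverse NSST provides for the fused coefficients.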
The low-pass sub-bands are merged using the energy attribute (EA) function, calculated with the following steps [11]:
1. First, the mean (M) and median (med) values of the low-pass sub-bands LP1 and LP2 are computed, denoted M1, M2, med1, and med2, respectively.
2. The intrinsic property values of the corresponding bands are then given as:
3 Proposed Method
The steps of the proposed MMIF algorithm, whose block diagram is shown in Fig. 1, are:
1. Read the two aligned multimodal images (256 rows and columns) considered for the MMIF process.
2. Decompose each input using NSST, yielding sets of low-pass and band-pass sub-bands denoted {LP1, BP1} and {LP2, BP2}.
3. Merge the low-pass sub-bands using the new energy-based attribute EA given in Eq. ().
4. Merge the band-pass sub-bands using PA-PCNN (given mathematically in detail in [15, 16]):
BP_F(a, b) = BP1(a, b) if N1[T] ≥ N2[T]; otherwise BP_F(a, b) = BP2(a, b)  (8)
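Eq. (8) is a per-pixel selection: whichever source's PA-PCNN neuron has fired more times by the final iteration T contributes its band-pass coefficient. A sketch of just this selection step, with the firing-count maps N1 and N2 assumed to be supplied by the PA-PCNN of [15, 16]:

```python
def select_bandpass(bp1, bp2, n1, n2):
    """BP_F(a,b) = BP1(a,b) if N1[T](a,b) >= N2[T](a,b), else BP2(a,b)."""
    return [[b1 if f1 >= f2 else b2
             for b1, b2, f1, f2 in zip(row1, row2, fire1, fire2)]
            for row1, row2, fire1, fire2 in zip(bp1, bp2, n1, n2)]
```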
Fig. 1 Block diagram of the proposed method: NSST decomposes the source images into low-pass and band-pass sub-bands; the low-pass sub-bands are merged with the energy attribute (EA) rule and the band-pass sub-bands with PA-PCNN, and the inverse NSST produces the fused image IF(a, b)
4 Experimental Results
The experiments are carried out on image pairs affected by different conditions: MRI-SPECT and MRI-PET pairs for Alzheimer's disease, T1-/T2-weighted MRI pairs, and CT-MRI pairs, all taken from [17] and shown in Fig. 2. The algorithm was validated on a dataset with modalities covering several diseases, namely glioma, Alzheimer's disease, and metastatic bronchogenic carcinoma. Five methods, NSCT-SF-PCNN [18], NSCT-RPCNN [19], NSCT-IT2FS [20], LLF-IOI [21], and CSMCA [22], are considered for comparison, and the results of these schemes are shown in Figs. 3a–f, 4a–f, 5a–f and 6a–f. The metrics for each method are reported in Tables 1, 2, 3 and 4, with the best value of each metric shown in bold. Feature-based mutual information with no reference (NFMI) measures how effectively features are transferred into the composite image [23, 24]. Xydeas and Petrovic proposed QG to quantify edge preservation in the output image, and Zhao et al. and Liu et al. proposed QP to measure features using phase congruency. Normalized mutual information (QMI) measures the information the fused image carries over from the merged modalities, while QNCIE gives the degree of dependence of the output on the input modalities. QW measures similarity by considering structural details and loss of detail, and QCB is a visual perception-based metric that measures the retention of salience, contrast, etc. [25]. STD is the most familiar metric for assessing image quality; it is easy to compute, and higher values indicate better results [26].
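Two of the simpler measures can be sketched directly: STD as the standard deviation of the fused image, and the histogram-based mutual information between a source image and the fused image, which is the core ingredient of QMI. The bin count, the assumed 0-255 intensity range, and the omission of QMI's normalization are simplifying assumptions.

```python
import math

def std(img):
    """Standard deviation of all pixel intensities in an image (list of rows)."""
    px = [v for row in img for v in row]
    mean = sum(px) / len(px)
    return math.sqrt(sum((v - mean) ** 2 for v in px) / len(px))

def mutual_information(a, f, bins=8, lo=0.0, hi=255.0):
    """MI between images a and f from an 8-bin joint intensity histogram."""
    xa = [v for row in a for v in row]
    xf = [v for row in f for v in row]
    n = len(xa)

    def bucket(v):
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint = [[0] * bins for _ in range(bins)]
    for va, vf in zip(xa, xf):
        joint[bucket(va)][bucket(vf)] += 1
    pa = [sum(row) / n for row in joint]                                  # marginal of a
    pf = [sum(joint[i][j] for i in range(bins)) / n for j in range(bins)] # marginal of f
    mi = 0.0
    for i in range(bins):
        for j in range(bins):
            p = joint[i][j] / n
            if p > 0.0:
                mi += p * math.log2(p / (pa[i] * pf[j]))
    return mi
```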
The above results affirm that the fused image of SF-PCNN has poor quality because of low contrast. NSCT-RPCNN fails to extract sufficient detail from the source modalities, resulting in a lack of significant details. The edges of the modalities are not preserved, so the output image is blurred and not visually good. Noise-like artifacts are observed in the output of LLF-IOI, owing to its failure to preserve structural details. The CSMCA
Fig. 2 a, e CT and MRI, b, f CT and MR-T2, c, g MRI and PET, d, h MRI and SPECT
method generally performs well, but intensity inconsistencies are still observed in some regions of its output image. Our method gives accurate results, preserving most of the input modalities' details and energy while reducing noise artifacts.
STD 85.628, QMI 0.7051, QG 0.6412, and QW 0.8165 hold first place for our method, and the remaining metrics place almost second or third relative to the other methods.
Table 2 Objective assessment of distinct fusion schemes for MR-T1 and MR-T2
Metrics [18] [19] [20] [21] [22] Proposed
STD 73.74 0.305 0.229 80.89 69.37 75.345
NFMI 0.861 0.862 0.861 0.851 0.865 0.849
QNCIE 0.8091 0.809 0.8081 0.8093 0.808 0.8088
QMI 0.7781 0.807 0.7065 0.8114 0.759 0.7759
QG 0.385 0.058 0.3674 0.3617 0.744 0.6939
QP 0.2568 0.312 0.3165 0.1527 0.524 0.3264
QW 0.4998 0.506 0.9979 0.547 0.825 0.8294
QCB 0.6332 0.242 0.1771 0.6204 0.696 0.6754
Table 3 Objective assessment of distinct fusion schemes for MRI and PET
Metrics [18] [19] [20] [21] [22] Proposed
STD 0.4988 8.909 0.218 0.271 0.211 68.19
NFMI 0.778 0.806 0.865 0.834 0.851 0.852
QNCIE 0.804 0.806 0.807 0.807 0.804 0.808
QMI 0.4869 0.761 0.737 0.736 0.124 0.771
QG 0.4624 0.529 0.579 0.382 0.532 0.511
QP 0.1948 0.031 0.414 0.225 0.125 0.263
QW 0.9621 0.656 0.998 0.517 0.994 0.515
QCB 0.3995 0.618 0.145 0.656 0.667 0.681
poor-quality details and fails to provide a noise-free output. CSMCA fails to preserve edges and structures. Our method performs very well in preserving structural and edge information without any artifacts.
Only two values, STD 75.345 and QG 0.6939, hold first place for our method. Metrics such as QP and QW stand in second place, and the
Table 4 Objective assessment of distinct fusion schemes for MRI and SPECT
Metrics [18] [19] [20] [21] [22] Proposed
STD 0.4975 10.137 0.202 0.277 0.182 64.538
NFMI 0.8012 0.8094 0.870 0.823 0.856 0.8708
QNCIE 0.8047 0.8075 0.808 0.806 0.804 0.8088
QMI 0.6364 0.8109 0.733 0.705 0.102 0.804
QG 0.4906 0.4995 0.565 0.382 0.499 0.483
QP 0.284 0.0223 0.404 0.352 0.279 0.4203
QW 0.9713 0.5989 0.998 0.492 0.989 0.4915
QCB 0.431 0.6499 0.217 0.71 0.59 0.699
remaining metrics rank third. Nevertheless, the objective results are good and the subjective results very good, as the method achieves high robustness.
The above results affirm that the fused image of SF-PCNN contains good detail with little color distortion. In the NSCT-RPCNN output, details are not well extracted, so the quality is worst; this method does not work well for integrating color images. The NSCT-IT2FS method extracts a good amount of structural, spatial, and functional information from MRI and PET, but some white regions still show visual inconsistencies. LLF-IOI suffers from a color-fidelity issue, so its result is poor. CSMCA fails to avoid artifacts and color distortion. Our method achieves nearly the highest visual quality among the compared methods with respect to color preservation and low color distortion (Fig. 5a–h).
The values of our method, STD 68.19, QMI 0.771, QNCIE 0.808, and QCB 0.681, indicate that the fused image has good perceptual quality, a good amount of color information, and few artifacts with good visual consistency.
Figure 6 shows the MR and SPECT fusion results, where it is observed that SF-PCNN, NSCT-RPCNN, and NSCT-IT2FS do not preserve color fidelity well. The LLF-IOI method loses the prominent functional information of SPECT because it overly enhances the anatomical details of the MRI. CSMCA fails to give a fused image with color details and functional and anatomical information. Our method achieves great visual consistency in all regions, with good color preservation and almost no artifacts or color distortion.
All metrics except QMI, QG, and QW are bolded for the proposed method, with STD 64.538, NFMI 0.8708, QNCIE 0.8088, QP 0.4203, and QCB 0.699. These values show that our method ranks first on almost all metrics.
5 Conclusion
A new energy attribute-based activity measure and PA-PCNN for merging medical modalities in the NSST domain have been presented. First, NSST decomposition is applied to the input images; the low-pass sub-band coefficients are then fused using the energy attribute function, and the band-pass sub-bands are fused using PA-PCNN. Finally, the inverse NSST of the fused coefficients yields the final fused image, which can serve as a reference for diagnosticians when assessing the severity of a disorder and planning treatment. The proposed method proved its robustness in detecting disorders, giving good performance both quantitatively and subjectively.
Human and Animal Rights No animals/humans were used for studies that are the basis of this
research.
Availability of Data and Materials The authors confirm that the data supporting the findings of
this research are available within the article.
Funding None.
References
1. A.P. James, B.V. Dasarathy, Medical image fusion: a survey of the state of the art. Inf. Fusion
19(1), 4–19 (2014)
2. Fatma el-Zahra, Ahmed el-Gamal, Current trends in medical image registration and fusion.
Egypt. Inform. J. 17(1), 99–124 (2016)
3. S. Li, X. Kang, L. Fang, Pixel-level image fusion: a survey of the state of the art. Inf. Fusion
33(1), 100–112 (2017)
4. T. Tirupal, B. Chandra Mohan, S. Srinivas Kumar, Multimodal medical image fusion techniques—
a review. Curr. Signal Transduction Ther. (2020)
5. R. Vijayarajan, S. Muttan, Discrete wavelet transform based principal component averaging
fusion for medical images. AEU 69(6), 896–902 (2015)
6. R. Srivastava, O. Prakash, A. Khare, Local energy-based multimodal medical image fusion in
curvelet domain. IET Comput. Vis. 10(6), 513–527 (2016)
7. G. Bhatnagar, Q.M.J. Wu, Z. Liu, Directive contrast based multimodal medical image fusion
in NSCT domain. IEEE Trans. Multimedia 15(5), 1014–1024 (2013)
8. G. Guorong, X. Luping, F. Dongzhu, Multi-focus image fusion based on non-subsampled
Shearlet transform. IET Image Process. 7(6), 633–639 (2013)
9. V. Bhateja, H. Patel, A. Krishna, A. Sahu, A. Lay-Ekuakille, Multimodal medical image sensor
fusion framework using cascade of wavelet and contourlet transform domains. IEEE Sens. J.
15(12), 6783–6790 (2015)
Medical Image Fusion Based on Energy Attribute and PA-PCNN … 467
10. K. Vanitha, D. Satyanarayana, M.N. Giri Prasad, A new hybrid approach for multi-modal medical
image fusion. JARDCS 12(3), 221–230 (2018)
11. W. Huang, Z. Jing, Evaluation of focus measures in multi-focus image fusion. Pattern Recogn.
Lett. 28(4), 493–500 (2007)
12. D.P. Bavirisetti, R. Dhuli, Fusion of MRI and CT images using guided image filter and image
statistics. Int. J. Imaging Syst. Technol. 27(3), 227–237 (2017)
13. W. Tan, P. Xiang, J. Zhang, H. Zhou, H. Qin, Remote sensing image fusion via boundary
measured dual-channel pcnn in multiscale morphological gradient domain. IEEE Access 8,
42540–42549 (2020)
14. Z. Zhou, B. Wang, S. Li, M. Dong, Perceptual fusion of infrared and visible images through
a hybrid multi-scale decomposition with Gaussian and bilateral filters. Inf. Fusion 30, 15–26
(2016)
15. K. Vanitha, D. Satyanarayana, M.N. Giri Prasad, Medical image fusion algorithm based on
weighted local energy motivated PAPCNN in NSST domain. JARDCS 12(3), 960–967 (2020)
16. M. Yin, X. Liu, Y. Liu, X. Chen, Medical image fusion with parameter-adaptive pulse coupled
neural network in non subsampled shearlet transform domain. IEEE Trans. Instrum. Meas.
68(1), 49–64 (2018)
17. www.med.harvard.edu/AANLIB/
18. X.B. Qu, J.W. Yan, H.Z. Xiao, Z.Q. Zhu, Image fusion algorithm based on spatial frequency-
motivated pulse coupled neural networks in non subsampled contourlet transform domain. Acta
Autom. Sin. 34(12), 1508–1514 (2008)
19. S. Das, M.K. Kundu, A neuro-fuzzy approach for medical image fusion. IEEE Trans. Biomed.
Eng. 60(12), 3347–3353 (2013)
20. Y. Yang, Y. Que, S. Huang, P. Lin, Multimodalsensor medical image fusion based on type-2
fuzzy logic in NSCT domain. IEEE Sens. J. 16(10), 3735–3745 (2016)
21. J. Du, W. Li, B. Xiao, Anatomical-functional image fusion by information of interest in local
Laplacian filtering domain. IEEE Trans. Image Process. 26(12), 5855–5866 (2017)
22. Y. Liu et al., Medical image fusion via convolutional sparsity based morphological component
analysis. IEEE Signal Process. Lett. 26(3), 485–489 (2019)
23. C.S. Xydeas, V. Petrovic, Objective image fusion performance measure. Electron. Lett. 36(4),
308–309 (2000)
24. M.B.A. Haghighat, A. Aghagolzadeh, H. Seyedarabi, A non-reference image fusion metric
based on mutual information of image features. Comput. Elect. Eng. 37(5), 744–756 (2011)
25. Z. Liu et al., Objective assessment of multi-resolution image fusion algorithms for context
enhancement in night vision: a comparative study. IEEE Trans. Pattern Anal. Mach. Intell.
34(1), 94–107 (2012)
26. P. Jagalingam, A.K. Hegde, A review of quality metrics for fused image. Aquat. Procedia 4,
133–142 (2015)
Electrical Shift and Linear Trend
Artifacts Removal from Single Channel
EEG Using SWT-GSTV Model
Abstract Electrical shift and linear trend (ESLT) artifacts are often present in recorded electroencephalogram (EEG) signals due to electrode fluctuations or a sudden loss of skin contact, which adversely affects the accurate estimation of cerebral activity in brain-computer interfacing (BCI) applications. In this work, a novel model is proposed that combines the stationary wavelet transform (SWT) with a group sparsity total variation (GSTV) filter, denoted SWT-GSTV, to remove the ESLT artifacts. The SWT is used to decompose the contaminated single-channel EEG into several frequency bands. The contaminated sub-band signal is passed to the GSTV filter to estimate the artifact signal, which is then subtracted from the contaminated sub-band to give the filtered sub-band signal. The filtered sub-band is added back to the remaining SWT components to produce the final denoised EEG signal. MATLAB simulations demonstrated that the proposed method outperforms the existing methods, exhibiting a high CC, low RRMSE, and the least MAE in the α band.
1 Introduction
EEG signals are commonly used to analyze neurological diseases such as sleep disorders and epilepsy, and in BCI applications [1]. Recorded EEG signals often contain either physiological artifacts (motion, electrooculogram (EOG), electrocardiogram (ECG), etc.) or non-physiological artifacts (surrounding high-frequency noise, power-line interference, etc.) [2]. These artifacts adversely affect the accurate estimation of cerebral activity [3]. Several algorithms
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 469
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_43
470 S. K. Noorbasha and G. F. Sudha
[4–7] have been proposed in the literature to suppress physiological artifacts from contaminated EEG. However, few algorithms focus on removing non-physiological artifacts, specifically the electrical shift and linear trend (ESLT) [8]. ESLT artifacts may be caused by a sudden loss of electrode-skin contact, electrode fluctuations, or transient current drifts due to triggered effects [9–11]. In [9], FastICA, Infomax, and second-order blind identification (SOBI) were implemented for EEG denoising with satisfactory ESLT elimination. Following that, two fully automated methods for ESLT artifact removal were presented: automatic wavelet-ICA (AWICA) [10] and enhanced AWICA (EAWICA) [11].
Recently, a stationary wavelet transform (SWT)-kurtosis-based method for ESLT artifact removal was proposed in [12]. There, SWT with thresholding is used to remove the ESLT artifacts, and a kurtosis-based strategy selects the optimal SWT decomposition level to reach the artifact components. The limitation is that thresholding leads to some loss of the wanted EEG in the reconstructed signal because of its non-stationary behavior [4], which is not acceptable for biomedical applications such as BCI.
In this work, a novel model is proposed that combines the stationary wavelet transform (SWT) with a group sparsity total variation (GSTV) filter, denoted SWT-GSTV, to remove the ESLT artifacts. The SWT decomposes the contaminated single-channel EEG into several frequency bands. The contaminated sub-band signal is passed to the GSTV filter to estimate the artifact signal, which is subtracted from the contaminated sub-band to give the filtered sub-band signal. The filtered sub-band is then added back to the remaining SWT components to produce the final denoised EEG signal.
The rest of the paper is organized as follows. Section 2 briefly describes the methods and the databases used. Experimental results are discussed in Sect. 3, and Sect. 4 concludes the paper.
2.1 SWT
Ci,j = Σ_{k=1}^{N} y(k) φi,j(k)  (1)
The approximation coefficients Ai,j(k) and detail coefficients Di,j(k) are obtained with upsampled filters, where ↑2^{i−1}L1 = Li(k) and ↑2^{i−1}H1 = Hi(k) denote the oversampled low-pass filter coefficients Li−1(k) and the oversampled high-pass filter coefficients Hi−1(k), respectively.
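One SWT level can be sketched as follows: unlike the decimated DWT, the signal is convolved with the low- and high-pass kernels without downsampling, so the approximation and detail sequences keep the input length, in the spirit of Eq. (1). The Haar-like kernels and circular boundary handling are illustrative assumptions, not the filters of [14].

```python
def swt_level(y, low=(0.5, 0.5), high=(0.5, -0.5)):
    """One undecimated (a trous) wavelet level: both outputs keep len(y)."""
    n = len(y)
    approx = [sum(c * y[(k + t) % n] for t, c in enumerate(low)) for k in range(n)]
    detail = [sum(c * y[(k + t) % n] for t, c in enumerate(high)) for k in range(n)]
    return approx, detail
```

For these particular kernels, approx(k) + detail(k) reconstructs y(k) exactly, which mirrors how the filtered sub-band is later added back to the remaining components.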
2.2 GSTV
The total variation (TV) filter is an effective tool used in several applications such as decomposition, deconvolution, and denoising [15]. However, its drawback is that the filtered signals contain staircase components. To avoid this, the group sparsity (GS) technique is combined with the TV filter to yield the GSTV filter. From Fig. 1, consider the sixth-level approximation component A6(k) = g(k) + h(k), where g(k) is the unknown artifact signal and h(k) is the wanted signal. The estimation of the unknown artifact signal by the GSTV filter is given as in [16],
where λ and D denote the regularization parameter and the first-order difference matrix, of size (N − 1) × N, respectively. Let the group size of the sparsity be N points, with its vector R represented as in [16], where K is the index of the sparsity group. The sparsity function, in terms of the N-point vector, is given in [16] as
Φ(R) = Σ_K ( Σ_{n=0}^{N−1} |R(n + K)|² )^{1/2}  (7)
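As a minimal stand-in for the full GSTV filter, the sketch below solves the plain 1D TV problem (the group size N = 1 special case, i.e., minimizing 0.5·||y − x||² + λ·||Dx||₁) by projected gradient on its dual, in the spirit of Chambolle's algorithm [15]; the step size and iteration count are implementation assumptions.

```python
def tv_denoise(y, lam, iters=500, step=0.25):
    """Minimise 0.5*||y - x||^2 + lam * sum_i |x[i+1] - x[i]| via the dual."""
    n = len(y)
    z = [0.0] * (n - 1)  # dual variable, one entry per first difference

    def primal(z):
        # x = y - D^T z, where (Dx)_i = x[i+1] - x[i]
        x = list(y)
        x[0] += z[0]
        x[-1] -= z[-1]
        for i in range(1, n - 1):
            x[i] += z[i] - z[i - 1]
        return x

    for _ in range(iters):
        x = primal(z)
        # projected-gradient step; step <= 1/||D||^2 = 0.25 ensures convergence
        z = [max(-lam, min(lam, z[i] + step * (x[i + 1] - x[i])))
             for i in range(n - 1)]
    return primal(z)
```

Applied to the approximation sub-band A6, the smooth output would play the role of the estimated artifact g(k), with λ controlling how aggressively the drift is extracted.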
Then, the filtered sub-band signal X(k) is added back to the remaining decomposed components of the SWT, which produces the final denoised EEG signal x̃(k) as,
For experimentation, both simulated and real-time databases are considered, from the CHB-MIT scalp EEG database [17] and EEGLAB [18], respectively, with sampling frequencies of 256 Hz and 128 Hz.
1. Relative root-mean-square error (RRMSE) is defined as

RRMSE = RMS(a − x̃) / RMS(a)  (10)
where a and x̃ are clean EEG and denoised EEG signals, respectively.
2. Mean absolute error (MAE) is given as [5, 6]

MAE = ( Σ_{k=n}^{m} |p_c(k) − p_e(k)| ) / (m − n)  (11)

where p_c(k) denotes the power spectrum of the denoised EEG, p_e(k) the power spectrum of the contaminated EEG, and m − n the width of the frequency range.
This parameter should be as small as possible. To quantify how well the original EEG is restored in the denoised signal, the correlation coefficient (CC) between the clean EEG a(k) and the denoised EEG x̃(k) is also computed.
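The three measures translate into a few lines each; treating the band-limited average in Eq. (11) as a division by the number of spectral bins is an assumption about the intended averaging.

```python
import math

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def rrmse(clean, denoised):
    """Eq. (10): RMS of the residual relative to the RMS of the clean EEG."""
    return rms([a - b for a, b in zip(clean, denoised)]) / rms(clean)

def mae(p_clean, p_est, n, m):
    """Eq. (11)-style band-limited MAE over spectral bins n..m (inclusive);
    dividing by the bin count m - n + 1 is an assumption."""
    return sum(abs(p_clean[k] - p_est[k]) for k in range(n, m + 1)) / (m - n + 1)

def cc(x, y):
    """Pearson correlation coefficient between two signals."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den
```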
3 Results
The pure EEG, the ESLT artifact, and their sum, the contaminated EEG, are shown in Fig. 2a–c, respectively.
Simulations were carried out in MATLAB. The contaminated EEG is applied to the SWT stage of the proposed method with six-level decomposition. Six decomposition levels are used because most of the ESLT artifact appears in the sixth-level approximation sub-band, A6, compared with the other levels. For the GSTV filter, a group sparsity size of N = 3 and a regularization parameter of λ = 1.5 are chosen for effective performance.
The SWT detail and approximation sub-bands are shown in Fig. 3. The approximation sub-band A6 is applied to the GSTV filter to extract the ESLT artifact; the extracted artifact is shown in Fig. 4b. This artifact is subtracted from the approximation sub-band A6, which produces the residue of the wanted EEG component shown in Fig. 4c. Then, this residue of EEG is added
Fig. 4 a Approximation sub-band, A6 , b estimated ESLT artifact by GSTV filter, c residue of EEG,
and d denoised EEG
back to the remaining detail sub-bands of the SWT, which yields the final denoised EEG shown in Fig. 4d.
Both the proposed and existing methods were applied to ninety-five records of ESLT-contaminated EEG with varying SNR values. The performance measures RRMSE and CC were calculated, and the averaged measures with their standard deviations are plotted in Fig. 5.
Five real-time records of five seconds duration are shown in Fig. 6. As with the simulated data, the real-time records were applied to both the proposed and existing methods to obtain denoised signals. To evaluate the performance of these methods, power spectral density (PSD) plots were drawn for
Fig. 6 Real-time EEG signals. a Record 1, b Record 2, c Record 3, d Record 4, and e Record 5
Fig. 7 PSD plots of a interfered EEG (blue), b denoised EEG by EAWICA (black), c denoised
EEG by SWT-kurtosis (magenta), and d denoised EEG by proposed method (green)
the denoised signals, and the mean absolute error (MAE) was calculated in the α band (8–12 Hz). The PSD plots of the contaminated EEG and of the denoised EEGs from the proposed and existing methods for Record 1 are shown in Fig. 7.
From Table 1, the averaged MAE of the proposed method is the lowest compared with the existing methods. This means the proposed method recovers the α band (8–12 Hz) component in the denoised EEG more faithfully than the existing methods, which is crucial for BCI applications.
In the existing EAWICA [11] process, the DWT is first applied to decompose the contaminated signal, and the decomposed signals are then fed into ICA for artifact removal. The disadvantage of this approach is that it needs certain predefined artifact markers to distinguish the correct artifact. In the method of [12], SWT with thresholding is used to remove the ESLT artifacts, and a kurtosis-based strategy selects the optimal SWT decomposition level to reach the artifact components. Its limitation is that the thresholding causes some loss of the wanted EEG in the reconstructed signal due to its non-stationary behavior [4]. The proposed approach overcomes the drawbacks of the existing algorithms EAWICA [11] and SWT-kurtosis [12], improving the overall MAE by 0.1257 and 0.0717, respectively.
4 Conclusion
In this work, a novel SWT-GSTV model has been proposed to remove the ESLT artifact from contaminated EEG. The SWT decomposes the contaminated single-channel EEG into several frequency bands, and the contaminated sub-band signal is passed to the GSTV filter to estimate the artifact signal. The estimated artifact is subtracted from the contaminated sub-band to give the filtered sub-band signal, which is then added back to the remaining SWT components to produce the final denoised EEG. Simulation results demonstrate that the proposed method outperforms the existing methods with low RRMSE and MAE and high CC.
References
14. M. Meraha, T.A. Abdelmalika, B.H. Larbic, R-peaks detection based on stationary wavelet
transform. Comput. Methods Programs Biomed. 121(3), 149–160 (2015)
15. A. Chambolle, An algorithm for total variation minimization and applications. J. Math. Imaging
Vis. 20, 89–97 (2004)
16. I.W. Selesnick, P.-Y. Chen, Total variation denoising with overlapping group sparsity, in IEEE
ICASSP, May 26–31, 2013, Vancouver, Canada
17. A. Shoeb, Application of machine learning to epileptic seizure onset detection and treatment,
Ph.D. Thesis (2009)
18. A. Delorme, S. Makeig, EEGLAB: an open-source toolbox for analysis of single-trial EEG
dynamics including independent component analysis. J. Neurosci. Methods 134, 9–21 (2004).
Available: http://sccn.ucsd.edu/eeglab/
Forecasting Hourly Electrical Energy
Output of a Power Plant Using
Parametric Models
1 Introduction
In a combined cycle power plant, electricity is generated by gas and steam turbines. Such plants can generate up to 50% more electricity than traditional single-cycle plants [1, 2]. The electricity generated by a power plant fluctuates for a number of reasons, including environmental conditions. Traditional mathematical models require a large number of parameters to predict the actual system output [3, 4]. Instead of mathematical models, machine learning (ML) models can be used for better predictions even with few parameters [5].
The concept of ML is to let computers learn by adapting a model rather than acting according to a program written by a programmer [6]. As new data arrive, customized ML models can understand, modify, and improve themselves.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 479
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_44
480 Ch. V. Raghavendran et al.
2 Parametric Models
A learning model that summarizes data with a fixed-size set of parameters is termed a parametric model. Irrespective of the volume of data given to a parametric model, it does not change the number of parameters it requires. Examples of parametric learners include linear models such as linear regression, logistic regression, and the linear support vector machine (LSVM).
Linear regression determines a plane that minimizes the sum of squared errors (SSE) between the observed and predicted responses [9]. Here, we model linear regression to predict the target variable PE, starting with simple linear regression, which is used for forecasting a continuous outcome.
• Simple linear regression works with only one independent variable.
• Multiple linear regression works with more than one independent variable.
Linear regression calculates the target feature using

y = β0 + β1x1 + β2x2 + · · · + βnxn  (1)

and its polynomial extension fits powers of a single predictor:

y = β0 + β1x + β2x² + · · · + βkx^k + ε  (2)
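For the single-predictor case of Eq. (1), the least-squares coefficients have a closed form, sketched below; the data used here are placeholders, not the power-plant dataset.

```python
def fit_simple_linear(x, y):
    """Closed-form simple linear regression:
    beta1 = cov(x, y) / var(x); beta0 = mean(y) - beta1 * mean(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    beta1 = sxy / sxx
    beta0 = my - beta1 * mx
    return beta0, beta1

def predict(beta0, beta1, xs):
    return [beta0 + beta1 * v for v in xs]
```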
2.4 Metrics
Evaluation metrics for classification problems, such as accuracy, are not useful for regression problems, which need metrics that compare continuous values. Three common evaluation metrics for regression problems are as follows:
Mean square error (MSE) is the mean of the squared errors:

MSE = (1/n) Σ (y_actual − y_predicted)²  (3)
Root Mean Square Error—the square root of the MSE, and this states us how
intense the data is around the line of the best fit.
RMSE = √( Σ (yactual − ypredicted)² / n ) (4)
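Equations (3) and (4) can be computed directly; a small NumPy sketch with toy values (the sample numbers are illustrative, not from the dataset):

```python
import numpy as np

def mse(y_actual, y_predicted):
    """Mean squared error, as in Eq. (3)."""
    y_actual, y_predicted = np.asarray(y_actual), np.asarray(y_predicted)
    return np.mean((y_actual - y_predicted) ** 2)

def rmse(y_actual, y_predicted):
    """Root mean squared error, as in Eq. (4)."""
    return np.sqrt(mse(y_actual, y_predicted))

y_true = [450.0, 455.0, 460.0]
y_pred = [451.0, 453.0, 460.0]
print(mse(y_true, y_pred))   # (1 + 4 + 0) / 3 = 1.666...
print(rmse(y_true, y_pred))
```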
The dataset for this analysis is gathered from a combined cycle power plant. It is a
collection of 6 years of power plant data with 9568 records and has four
independent features and one target feature. The independent features are hourly
average temperature (AT), ambient pressure (AP), relative humidity (RH), and exhaust
vacuum (V); the target is the net hourly electrical energy output of the power plant (PE).
The descriptive information of the dataset is shown in Table 1. From Table 1,
it is evident that all the features are continuous, but their ranges vary. The data are
to be normalized to 0–1 before applying any machine learning model, to overcome
the variations in the mean and standard deviation.
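One common way to rescale features to 0–1 is column-wise min–max scaling; a minimal NumPy sketch (the paper does not state which scaler was used, and the toy values below merely mimic features with very different ranges, like AT versus AP):

```python
import numpy as np

def min_max_scale(X):
    """Rescale each column of X to the 0-1 range (column-wise min-max)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    return (X - col_min) / col_range

# Toy values with very different ranges (illustrative, not the real data).
X = np.array([[5.0, 1000.0],
              [15.0, 1010.0],
              [25.0, 1020.0]])
X_scaled = min_max_scale(X)
print(X_scaled)  # each column now spans exactly 0..1
```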
3.2 Visualization
Data visualization plays a vital role in analyzing datasets and understanding the
insights of the collected data. Various plots make the data easier to understand—
boxplot, scatter plot, distribution plot, heat map, correlation chart, pair plot, etc.
A boxplot is a standard way of presenting the spread of data based on a five-number
summary. The boxplots of all the continuous features are presented in Fig. 1.
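The five-number summary on which a boxplot is built can be computed directly; a short NumPy sketch (toy data, not the plant dataset):

```python
import numpy as np

def five_number_summary(x):
    """Minimum, Q1, median, Q3, maximum -- the basis of a boxplot."""
    x = np.asarray(x, dtype=float)
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    return (x.min(), q1, med, q3, x.max())

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
summary = five_number_summary(x)
print(summary)  # (1.0, 3.0, 5.0, 7.0, 9.0)
```

Points lying far outside the Q1–Q3 box (beyond the whiskers) are the outliers a boxplot highlights.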
A distribution plot is the most convenient way to present the univariate distribution
of a feature; it shows the data as a histogram and fits a kernel density estimate (KDE).
A histogram represents the distribution of data in the form of bins along
the range of the data. The KDE is useful for plotting the shape of a distribution: the
bell-like curve in the plot shows the density, which is a smoothed form of the histogram.
The y-axis is in terms of density, and the histogram is normalized by default so that
it has the same y-scale as the density plot. The distribution plots of the continuous
features in comparison with the target feature are shown in Fig. 2.
A scatter plot presents the relationship between two continuous features, illustrating
how one feature affects the other across the range of the dataset. The scatter plots of
all the features are presented in Fig. 3.
3.3 Overview
From the plots, it is evident that the data in all the features are almost equally
distributed except for feature AP, which has some outliers on the left side of its
plot in Fig. 1. Observing the fourth plot in Fig. 1, it is clear that the data for
the RH feature are skewed to the left. In the implementation part, we have
considered these observations.
Fig. 3 Regression plots for all features with target feature (PE)
Forecasting Hourly Electrical Energy Output of a Power Plant … 485
4 Implementation
The original dataset with dimension (9568, 5) is partitioned randomly into a train set
and a test set in a 75:25 ratio, so the train and test set dimensions become (7176,
5) and (2392, 5), respectively. The following commands partition the dataset
into four parts: x_train, y_train, x_test, and y_test. x_train and x_test contain
all the independent features of the train and test sets; y_train and y_test contain only
the dependent feature of the train and test sets.
from sklearn.model_selection import train_test_split
x = data[['AT', 'V', 'AP', 'RH']]
y = data['PE']
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1)
According to Eq. 1, the predicted feature for the dataset is calculated using the
following equation:
y = β0 + β1 × AT + β2 × V + β3 × AP + β4 × RH (6)
The β values are termed the model coefficients. These values are "learned" during
the model-fitting phase by means of the "least squares" criterion [10–12]. The
fitted model can then be used to make predictions. The intercept and coefficient
values of the fitted model are as follows:
Intercept = 447.06297098687327
Coefficients = [−1.97376045, −0.23229086, 0.0693515, −0.15806957]
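Plugging the reported intercept and slope values into Eq. (6) gives a prediction; a small sketch (the operating point AT/V/AP/RH below is an assumed, illustrative input, not a record from the dataset):

```python
import numpy as np

# Intercept and slope values reported for the fitted model (Eq. 6).
intercept = 447.06297098687327
coefs = np.array([-1.97376045, -0.23229086, 0.0693515, -0.15806957])  # AT, V, AP, RH

def predict_pe(at, v, ap, rh):
    """Predicted net hourly electrical energy output (PE) via Eq. (6)."""
    return intercept + coefs @ np.array([at, v, ap, rh])

# Illustrative operating point (values assumed, not taken from the paper).
pe = predict_pe(at=20.0, v=50.0, ap=1010.0, rh=70.0)
print(round(pe, 2))  # approximately 454.95
```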
The results in Table 2 show that the model gives approximately 93% accuracy
for train and test data and also in 10-fold cross-validation.
4.2.1 Normality
In a linear regression model, the errors or residuals should be normally distributed. If
the resulting plot is linear, the residuals are normally distributed. The plot in Fig. 4
is linear, which indicates that the residuals are distributed normally.
4.2.2 Linearity
4.2.3 Homoscedasticity
This checks whether the variability in the response variable is the same at all levels of
the explanatory variables; the residuals should have constant, identical variance.
If the variance is not constant across the error terms, there is a chance of heteroscedas-
ticity; non-constant variance across the error terms typically occurs because of the
existence of outliers. A funnel-shaped distribution indicates non-constant
variance, i.e., heteroscedasticity. Figure 6 shows the homoscedasticity plot between
the predicted values and the residuals. If there is no funnel-shaped distribution in the plot,
this is an indication of homoscedasticity.
4.2.4 Independence
For linear regression, it is required that the residuals have very slight or no auto-
correlation. Auto-correlation occurs when the residuals are dependent on each other:
the error(i + 1) term depends on the error(i) term, which indicates
that the current residual value depends on the previous residual value. The presence
of auto-correlation considerably decreases the R-square value and increases the error
of the model. We check auto-correlation using the ACF (auto-correlation function) plot
shown in Fig. 7.
4.2.5 Multicollinearity
Collinearity indicates whether two features are highly correlated and carry
related information about the variance in a dataset. A correlation matrix is used to
identify collinearity among features, but multicollinearity is more difficult to detect
because it emerges when three or more features in the dataset are highly correlated.
So, this check tells whether the independent features are highly correlated with
each other or not. The variance inflation factor (VIF) is a metric of collinearity among
predictor features in multiple regression. Figure 8 shows the initial VIF
of all the predictor features and the VIF after removing features with high VIF. Finally,
all four features are to be included in the model.
Fig. 8 a VIF of all features. b VIF values after removing V. c VIF values after removing AP
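The VIF of each predictor can be computed by regressing it on the remaining predictors; a pure-NumPy sketch on synthetic data (the correlated columns below are illustrative, not the plant features):

```python
import numpy as np

def vif(X):
    """Variance inflation factor per column: VIF_j = 1 / (1 - R_j^2),
    where R_j^2 comes from regressing column j on the other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, b, a + 0.1 * rng.normal(size=500)])  # col 2 ~ col 0
print(vif(X))  # first and third columns show inflated VIF; second stays near 1
```

A common rule of thumb treats VIF above about 5–10 as a sign of problematic collinearity.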
The support vector machine (SVM) is a tool used for both regression and classification.
SVM uses ML theory to maximize predictive accuracy while automatically avoiding
overfitting the data. Table 3 shows the values of the metrics.
5 Result Analysis
The values of the metrics obtained for the three models—linear regression, polynomial
regression, and linear support vector machine—are analyzed. Polynomial regression
yields lower values for MSE and RMSE compared with the other two models, but
for the R2 score, the three models give almost the same accuracy. The R2 score is
verified with a 10-fold cross-validation test and almost matches the train and test R2
score values, so the models are neither overfit nor underfit.
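The k-fold check described above can be sketched with a manual 10-fold split and ordinary least squares; a self-contained NumPy example on synthetic near-linear data (illustrative; not the power-plant dataset or the paper's exact procedure):

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def cv_r2_linear(X, y, k=10, seed=0):
    """Mean R^2 of ordinary least squares over k shuffled folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    scores = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        A_tr = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A_tr, y[train], rcond=None)
        A_te = np.column_stack([np.ones(len(fold)), X[fold]])
        scores.append(r2_score(y[fold], A_te @ beta))
    return float(np.mean(scores))

# Synthetic near-linear data with four predictors (illustrative only).
rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 4))
y = 450 + X @ np.array([-2.0, -0.2, 0.07, -0.16]) + 0.1 * rng.normal(size=1000)
print(cv_r2_linear(X, y))  # close to 1 for this low-noise data
```

A cross-validated R² close to the train/test R² is the signal used above to rule out over- and underfitting.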
6 Conclusion
In this paper, we have considered the combined cycle power plant dataset to predict
the hourly electrical energy output of a power plant using three parametric models—
linear regression, polynomial regression, and linear support vector machine—with
tenfold cross-validation. All three models give approximately 92–94%
accuracy; polynomial regression gives the highest accuracy for both train and test
data. We have also validated the linear regression with five assumptions—normality,
linearity, homoscedasticity, independence, and multicollinearity. This work can be
further extended by applying other parametric models and also non-parametric
models such as decision tree, random forest, and KNN.
References
1. L.X. Niu, X.J. Liu, Multivariable generalized predictive scheme for gas turbine control in
combined cycle power plant, in 2008 IEEE Conference on Cybernetics and Intelligent Systems
(2008), pp. 791–796. http://doi.org/10.1109/ICCIS.2008.4670947
2. V. Ramireddy, An overview of combined cycle power plant (2015). http://electricalengineer
ing-portal.com/an-overview-of-combined-cycle-power-plant
3. U. Kesgin, H. Heperkan, Simulation of thermodynamic systems using soft computing
techniques. Int. J. Energy Res. 29, 581–611 (2005)
4. A. Samani, Combined cycle power plant with indirect dry cooling tower forecasting using
artificial neural network. Decis. Sci. Lett. 7(2), 131–142 (2018)
5. S.J. Russell, P. Norvig, Artificial Intelligence: A Modern Approach. Manuf. Eng. 74, 111–113 (1995). http://
doi.org/10.1049/me:19950308
6. B. Lakshmi Sucharitha, C.V. Raghavendran, B. Venkataramana, Predicting the cost of pre-
owned cars using classification techniques in machine learning, in Advances in Computational
Intelligence and Informatics. ICACII 2019. Lecture Notes in Networks and Systems, vol. 119,
ed. by R. Chillarige, S. Distefano, S. Rawat (Springer, Berlin, 2020)
7. H. Moayedi, D. JahedArmaghani, Optimizing an ANN model with ICA for estimating bearing
capacity of driven pile in cohesionless soil. Eng. Comput. 34, 347–356 (2018). https://doi.org/
10.1007/s00366-017-0545-7
8. M. Khandelwal, A. Marto, S.A. Fatemi et al., Implementing an ANN model optimized by
genetic algorithm for estimating cohesion of limestone samples. Eng. Comput. 34, 307–317
(2018). https://doi.org/10.1007/s00366-017-0541-y
9. G. Naga Satish, Ch.V. Raghavendran, M.D. Sugnana Rao, Ch. Srinivasulu, House price predic-
tion using machine learning. Int. J. Innov. Technol. Exploring Eng. 8(9), 717–722 (2019). http://
doi.org/10.35940/ijitee.I7849.078919
10. C.V. Raghavendran, G.N. Satish, V. Krishna, S.M. Basha, Predicting rise and spread of COVID-
19 epidemic using time series forecasting models in machine learning. Int. J. Emerg. Technol.
11(4), 56–61 (2020)
11. Ch.V. Raghavendran, G. Naga Satish, T. Rama Reddy, B. Annapurna, Building time series
prognostic models to analyze the spread of COVID-19 pandemic. Int. J. Adv. Sci. Technol.
29(3), 13258 (2020). Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/
31524
12. K. Helini, K. Prathyusha, K. Sandhya Rani, Ch.V. Raghavendran, Predicting coronary heart
disease: a comparison between machine learning models. Int. J. Adv. Sci. Technol. 29(3),
12635–12643 (2020). Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/
30385
SOI FinFET-Based 6T SRAM Design
Abstract This paper describes the design and implementation of a 6T SRAM cell
using a sub-20 nm FinFET model; circuit performance figures such as read
0/1, write 0/1, and leakage power dissipation are evaluated, along with transistor
sizing for device stability. Owing to their exceptional characteristics, such as enhanced
channel controllability, high I ON /I OFF ratio, and diminished short-channel effects, fully
depleted SOI FinFET devices have been introduced as a promising nanoscale replacement
for traditional bulk CMOS devices. The read and write operations of the 6T SRAM are
confirmed by H-SPICE simulation.
1 Introduction
The continuous downscaling of bulk CMOS gives rise to major leakage, owing to limitations
in process technology and in the primary essential material. Contamination
in the semiconducting channel is the key impediment to CMOS-based design [1–3].
The device's best output can be accomplished by reducing the threshold voltage and
scaling down the supply voltage to improve leakage [4, 5]. The leading hindrance to
the downscaling of CMOS devices to 20 nm and lower nodes is the emergence of
second-order effects such as subthreshold leakage and short-channel effects, which
result in low throughput [6]. According to the International Technology Roadmap
for Semiconductors (ITRS), multi-gate MOS devices will serve to minimize leakage
and channel length [7].
CMOS IC technologies have been steadily downscaled into the sub-nanometer
region over the last three decades. The classical device structures are reaching
their scaling limits, and "end-of-roadmap" replacement devices are being considered for
study. Among these, multi-gate MOSFETs are extensively
studied in recent work [8, 9], such as the double-gate MOSFET, tri-gate MOSFETs
(also called FinFETs), and gate-all-around MOSFETs (surrounding-gate
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 491
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_45
492 V. Vijayalakshmi and B. Mohan Kumar Naik
MOSFETs) [10, 11]. The ITRS has identified the significance of these devices and
calls them advanced CMOS devices.
FinFET-based transistors have become a popular option and a viable alternative
to CMOS design technology as device scaling limits degrade [12–14].
Short-channel effects can be regulated in these device structures by restricting
off-state leakage currents. Furthermore, FinFETs are superior to classical devices
for limiting short-channel effects, and they have greater control over leakage
current and higher yield, among many other benefits that help address scaling
challenges [15]. When the cut-off voltage (V t ) is less than the applied supply
potential, all the controlling gates of the device drive electrons from the source
region to the drain region. The applied potential from all three gates shapes the
channel potential and diminishes drain-induced barrier lowering (DIBL), providing
improved swing for FinFET-based design.
The FinFET transistor has a better power-to-delay ratio. Memory requirements
have risen dramatically in many VLSI designs, from industrial applications to
consumer products, which emphasizes the importance of using nanometer technology
to improve memories on a single chip [16, 17]. Nanotechnology, especially in SRAM
cells, has a wide range of applications and has improved integrated memories. SRAM
cells are evolving as a critical circuit component in very large-scale integrated (VLSI)
circuits such as FPGAs and microprocessors, where SRAM-based memories
(also known as caches) influence the processor's area, timing, control, and
yield. As a result, SRAM is expected to occupy >90% of the die's surface
area.
This research paper is organized as follows. Section 2 gives a short description
of the FinFET device and its characteristics. Section 3
describes the circuit performance of the developed FinFET model; a basic inverter
circuit is modeled using the look-up table-based FinFET device characteristics.
Section 4 states the operation of the conventional FinFET-based 6T SRAM cell,
illustrating significant design constraints and the read/write functioning of a 6T SRAM
cell. Section 5 explains the inferences recorded during simulation, followed by the
conclusion of the research work in Sect. 6.
2 Device Model
Figure 1a, b shows the 3D structure and cross-sectional view of the FinFET. The FinFET
model structure consists of the following device parameters: channel length (Lg), fin
height (H Fin ), fin width (W Fin ), gate oxide thickness (t ox ), and source (N s ), drain (N d ),
and channel (N c ) doping concentrations, which are listed in Table 1. The FinFET
was originally known as a folded-channel MOSFET because of the wafer's short
vertical fins. In FinFETs, the gate width is normally twice the fin height. FinFETs are
the most cost-effective devices to use instead of CMOS at sub-20 nm technology nodes,
owing to their low processing costs. To reduce I OFF leakage, the geometric core
parameters of FinFETs are important.
Based on the model parameters explained above, the primary functioning of the
FinFET device is characterized in terms of its transfer characteristics. Figure 2 compares
the transfer characteristics of bulk FinFETs and SOI-based FinFETs, demonstrating
that SOI-based devices work better for future circuit applications, with lower off
current and a higher I ON /I OFF ratio. Another strength of FinFETs is the ability to
control the threshold voltage (V t ) through the high-k/metal gate stack. The current
flowing from the drain to the source is largely determined by the operation, temper-
ature, and voltage: as the temperature rises, the carrier mobility and threshold voltage
fall, leading to a reduction in drain current.
3 Inverter Design
In this section, an inverter circuit is implemented using the proposed FinFET model.
The circuit-level performance of the device is examined by evaluating
2D lookup tables of I DS and gm as functions of V GS and V DS , employing the
raw data acquired from the Silvaco TCAD ATLAS tool. Using this 2D lookup table,
the circuit performance is investigated with the Synopsys H-SPICE tool;
a Verilog-A model is used to define a netlist for the FinFET-based inverter as part
of designing semiconductor memories. The flowchart for device-circuit simulation
using H-SPICE is represented pictorially in Fig. 3.
The simulation of the inverter circuit is investigated using H-SPICE by Synopsys,
an optimized device-circuit simulator that is utilized to simulate the electrical
parameters of VLSI circuits in the transient/DC/AC domains. H-SPICE is well suited
for the speedy, accurate analysis of the performance of any VLSI circuit.
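The lookup-table approach above amounts to interpolating I_DS over a (V_GS, V_DS) grid; a pure-Python/NumPy bilinear-interpolation sketch (the grid and current values below are illustrative placeholders, not TCAD data):

```python
import numpy as np

def lut_ids(vgs, vds, vgs_grid, vds_grid, ids_table):
    """Bilinear interpolation of a drain-current lookup table, the same
    idea as the 2D I_DS(V_GS, V_DS) tables exported from TCAD for SPICE."""
    i = np.clip(np.searchsorted(vgs_grid, vgs) - 1, 0, len(vgs_grid) - 2)
    j = np.clip(np.searchsorted(vds_grid, vds) - 1, 0, len(vds_grid) - 2)
    tx = (vgs - vgs_grid[i]) / (vgs_grid[i + 1] - vgs_grid[i])
    ty = (vds - vds_grid[j]) / (vds_grid[j + 1] - vds_grid[j])
    c00, c10 = ids_table[i, j], ids_table[i + 1, j]
    c01, c11 = ids_table[i, j + 1], ids_table[i + 1, j + 1]
    return (c00 * (1 - tx) * (1 - ty) + c10 * tx * (1 - ty)
            + c01 * (1 - tx) * ty + c11 * tx * ty)

vgs_grid = np.array([0.0, 0.5, 1.0])
vds_grid = np.array([0.0, 0.5, 1.0])
ids_table = np.array([[0.0, 0.0, 0.0],    # rows: V_GS, cols: V_DS (uA, fake)
                      [0.0, 20.0, 25.0],
                      [0.0, 60.0, 80.0]])
ids = lut_ids(0.75, 0.75, vgs_grid, vds_grid, ids_table)
print(ids)  # average of the four surrounding table entries: 46.25
```

In the actual flow, H-SPICE evaluates such tables through the Verilog-A model rather than Python; this sketch only illustrates the interpolation idea.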
Finite element method-based numerical simulations, or TCAD simulation tools,
are beneficial for the design exploration of model-dependent circuits. The FinFET
device in this work is realized in 20 nm technology.
To design any digital circuit, a thorough understanding of simple CMOS inverter
operation and properties is required; it aids the development of digital logic and
semiconductor memories. A simple inverter circuit is the most crucial consideration
when evaluating any VLSI circuit: it is the fundamental circuit that is best suited to
examining the device's circuit efficiency for the specific technology node.
Figure 4 depicts a basic inverter circuit; the structure consists of a pMOS FinFET
at the top, with its source connected to V DD , and an nMOS FinFET at the bottom,
with its source connected to ground. The gate terminals of both transistors are
connected to V in , and the drains of both transistors are connected to the V out
terminal.
Fig. 4 A FinFET-based inverter circuit
The voltage transfer characteristic (VTC) of the digital inverter is measured as an
inverting function, which shows the exact switching between on and off states. The VTC
signifies that a lower input voltage results in a higher output voltage, and the slope of
the transition region indicates switching quality: steeper slopes produce more accurate
switching. Figure 5a shows the voltage transfer curve, or DC characteristics, of the
FinFET-based inverter. Figure 5b represents the input and output characteristics
of the proposed 20 nm FinFET-based inverter circuit: for a low input value, it reads
a high output value. The delay of the circuit is characterized by the rise time and fall
time of the pulse.
4 SRAM Design
A memory cell's primary function is to store a single bit of data using a pair
of cross-coupled inverters and a pair of access transistors. Since an inverter
is a fundamental component of every circuit simulation, SPICE modeling is used to
construct the SRAM cell, which is then simulated using the H-SPICE tool.
SRAM design methodology—The SRAM cell is a promising and reliable appli-
cation of FinFET-based architecture. Figure 6 shows the logical cell of the 6-transistor
SRAM architecture based on the developed FinFET model, where each of the gates
in the FinFET device is controlled independently. The 6T SRAM cell is composed of
two cross-coupled inverters (P1, D1 and P2, D2), each of which has its output fed
into the other; this loop is employed to maintain the states of the respective inverters.
The access transistors (A1 and A2), together with the word line (WL) and bit lines (BL),
are utilized to operate the write and read cycles of each cell. By holding the
word line low in the halt mode, the access transistors are turned off, and thus the inverter
outputs remain complementary.
The left inverter's pMOS transistor will be turned on, showing higher perfor-
mance, while the second inverter's pMOS is turned off. The word line drives the
controlling gates of the transistors that link the cell to the bit lines, and the SRAM
cell is detached from the bit lines by keeping the word line low. It is important to
size the transistors of each cell appropriately for the specifications, for better cell
activity.
This section covers the architecture and execution of the FinFET-based 6T SRAM cell.
The basic operation and implementation of the static RAM, as well as transistor
sizing for system stability, are discussed. Simulation is used to study and validate
the read and write operations.
Figure 6 shows how the devices are connected to form cross-coupled inverters
implementing a static RAM. The cell consists of two pull-up pMOS transistors, P1 and
P2, and two pull-down (drive) transistors, D1 and D2, forming the cross-coupled
inverters, followed by two access transistors, A1 and A2, connected between the nodes
that contain the stored bit and its complement and the bit lines. Table 2
shows the operation of the 6T SRAM.
Fig. 5 a VTC curve for inverter. b Input and output voltages for an inverter
Table 2 Operation of 6T SRAM
Read operation:
1. Charge bit and bit_b high
2. Let both bit and bit_b float
3. Pull up the word line
→ Bit will contain the data value
Write operation:
1. Charge bit high
2. Let bit float
3. Pull bit_b down to ground
4. Pull up the word line
→ Q holds a high value
connects the Q-node to V dd . As a result, P must be weaker than A for the write to be
successful.
These two conditions (D > A and A > P) give the basic essential relationships
between the transistor pairs for correct operation: the D transistor must be the strongest
and P the weakest, with the A transistor in between. The sizes used are D-8/2, A-4/2,
and P-3/3, as tabulated in Table 3.
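The sizing constraint can be sanity-checked with the quoted W/L ratios, using W/L as a first-order proxy for drive strength (an approximation: actual strength also depends on mobility and threshold voltage):

```python
# Relative drive strengths from the W/L sizing quoted above: D-8/2, A-4/2, P-3/3.
# First-order check that the read/write constraints D > A and A > P hold.
sizes = {"D": (8, 2), "A": (4, 2), "P": (3, 3)}
strength = {name: w / l for name, (w, l) in sizes.items()}
print(strength)  # {'D': 4.0, 'A': 2.0, 'P': 1.0}
assert strength["D"] > strength["A"] > strength["P"]
```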
When the word line is high, the nMOS access transistors conduct, connecting the
inverter nodes to the two vertical bit lines. During a read operation, the inverters
drive the value stored within the memory cell onto the bit line and its inverse onto
the complementary bit line; these signals produce the SRAM cell's output value.
During write operations, input drivers strongly drive the bit lines to write data into
the memory. Depending on the currently stored value, a momentary short-circuit
condition may occur, after which the SRAM value is overwritten.
5 Results
The Synopsys SPICE tool is used to evaluate the circuit performance of the 6T SRAM
cell using the previously modeled 20 nm LUT (lookup table)-based FinFET device
model. The model contains parameter values for 20 nm technology, including a
20 nm gate length, to synthesize the results for the 6T SRAM cell. The transistor
sizes for the SRAM cell have also been measured and are defined in Table 3.
Figure 7 shows that the write-HIGH operation is achieved by shifting q from LOW to
HIGH by flipping the bit. It also shows that around 0.3 ns after the word line stops
rising, q reaches a voltage within 10% of HIGH.
Similarly, Fig. 8 shows that as the write-LOW operation succeeds, the bit is flipped
and q changes from HIGH to LOW. Figure 8 is virtually similar to Fig. 7;
during the write LOW, q drops down to 0 V before the word line completes its rise.
Although it may seem that writing a LOW is quicker than writing a HIGH, that is
not the case: looking at qb, it takes only 3 ns for it to reach within 10% of
1 V after the word line completes rising. Whatever data value is being recorded, the
cell should be stable for the same amount of time.
To judge the completion of the read operations, it is necessary to examine the
bit lines and the q-nodes. For the operation shown in Fig. 9, since q is high, the
read should keep the bit high. In particular, the bit remains HIGH while bit_b drops
to LOW, as anticipated. Another crucial aspect of a good read is that the operation
does not flip the bit in an unintended manner; in this case, qb rises
marginally in voltage but does not exceed 0.3 V. The read was effective because the
bit stayed constant while the value was read.
Figure 10 shows the case where the bit line and q-nodes are reversed: q is LOW, and
the read should result in the bit dropping to LOW, i.e., to 0 V, while the bit stays
stable for a successful read operation.
It is important to note that the falling bit line for the read operation might not
fall within 10% of 0 V until around 3 ns after the word line has finished rising.
The FinFET-based 6T SRAM shows a standby leakage power consumption of
219 pW, which is lower than that of a conventional CMOS-based SRAM design.
Using FinFETs clearly outperforms the CMOS counterpart for the SRAM cell in
terms of reliability, power consumption, and robustness.
6 Conclusion
SOI-based FinFETs are being extensively used in VLSI circuits and semiconductor
memories. The basic device is modeled using the Silvaco TCAD ATLAS tool for
various performance characteristics. Based on the evaluated lookup table, circuit
characteristics are studied using an H-SPICE-based Verilog-A implementation. A fully
depleted SOI FinFET-based 6-transistor static RAM cell is designed and evaluated.
This research work demonstrates the necessary operations (read 0, read 1, write 0,
and write 1), and the leakage power is evaluated using H-SPICE. The improved data
stability and low leakage power show that FinFET-based SRAMs are promising
candidates for the implementation of semiconductor memories in microprocessors
and semiconductor applications.
Acknowledgements The authors wish to thank New Horizon College of Engineering, Bengaluru
for supporting this work.
References
1. T. Skotnicki, J.A. Hutchby, T.-J. King, F. Boeuf, The end of CMOS scaling toward the intro-
duction of new materials and structural changes to improve MOSFET performance. IEEE Circ.
Devices Mag. 21(1), 16–26 (2005)
2. A. Chin, S.R. McAlister, The power of functional scaling: beyond the power consumption
challenge and scaling roadmap. IEEE Circ. Devices Mag. 21(1), 27–35 (2005)
3. S.E. Thompson, R.S. Chau, T. Ghani, K. Mistry, S. Tyagi, M.T. Bohr, In search of “forever,”
continued transistor scaling one new material at a time. IEEE Trans. Semicond. Manuf. 26–35
(2005)
4. Y. Taur, CMOS scaling beyond 0.1 [mu] m: how far can it go?, in Proceedings of 1999
International Symposium on VLSI Technology, Systems, Applications (1999), pp. 6–9
5. D.J. Frank, R.H. Dennard, E. Nowak, P.M. Solomon, Y. Taur, W.H.-S. Philip, Device scaling
limits of Si MOSFET’s and their application dependencies, in Proceedings of IEEE, vol. 89
(2001), pp. 259–288
6. A. Keshavarzi et al., Leakage and process variation effects in current testing on future CMOS
circuits. IEEE Des. Test Comput. 19(5), 33 (2002)
7. International Technology Roadmap for Semiconductors (ITRS)
8. T. Mizuno, N. Sugiyama, T. Tezuka, T. Numata, S. Takagi, High performance CMOS operation
of strained-SOI MOSFET’s using thin film SiGe-on-insulator substrate, in 2002 Symposium
on VLSI Technology. Digest of Technical Papers (2002), pp. 106–107
9. R. Vaddi, R.P. Agarwal, S. Dasgupta, Compact modeling of a generic double-gate MOSFET
with gate S/D underlap for subthreshold operation. IEEE Trans. Electron Dev. 59(10) (2012)
10. S. Jha, S.K. Choudhary, Impact of device parameters on the threshold voltage of double-gate,
tri-gate and gate-all-around MOSFETs, in 2018 IEEE Electron Devices Kolkata Conference
(EDKCON)
11. R. Ramamurthy, N. Islam, et al., The tri-gate MOSFET: a new vertical power transistor in
4H-SiC. IEEE Electron Dev. Lett. 42(1)
12. E.J. Nowak et al., A functional FinFET-DGCMOS SRAM cell, in IEDM Technical Digest
(2002), pp. 411–414
13. S.S. Rathod, A.K. Saxena, S. Dasgupta, A proposed DG-FinFET based SRAM cell design with
RadHard capabilities. Microelectron. Reliab. 50(8), 1181–1188 (2010)
SOI FinFET-Based 6T SRAM Design 503
14. R.V. Joshi et al., FinFET SRAM for high-performance low-power applications, in ESSCIRC
(2004), pp. 211–214
15. M. Ishida et al., A novel 6T-SRAM cell technology designed with rectangular patterns scalable
beyond 0.18μm generation and desirable for ultra high speed operation, in IEDM Technical
Digest (1998), pp. 201–214
16. F. Moradi, G. Panagopoulos, G. Karakonstantis, D. Wisland, H. Mahmoodi, J.K. Madsen, K.
Roy, Multi-level word line driver for low power SRAMs in nano-scale CMOS technology, in
IEEE 29th International Conference on Computer Design (ICCD), 9–12 Oct 2011, pp. 326,
331
17. A.B. Sachid, C. Hu, Denser and more stable SRAM using FinFETs with multiple fin heights.
IEEE Trans. Electron Dev. 59, 2037–2041 (2012)
Cataract Detection Using Deep
Convolutional Neural Networks
Abstract A cataract is one of the leading causes of blindness globally, accounting
for more than 50% of blindness. Early detection and treatment of cataracts can mini-
mize the risk of blindness. We propose a cataract detection model using deep convo-
lutional neural networks (DCNN) based on the GoogLeNet architecture, i.e., the
inception module (award-winning architecture of ILSVRC 2014), which uses a
22-layer-deep network that makes it a highly reliable architecture. To achieve high
accuracy during training, we have used the deeper GoogLeNet architecture, which
comes under the category of CNNs. It uses convolutional, activation, fully connected,
SoftMax, inception, and max pooling layers for high accuracy and efficient training.
Our method achieved 86.9% overall training accuracy and 35.8% validation accuracy.
We trained this model using 66 images of three different categories—normal, severe,
and mild—and preprocessed them into 1452 image samples. This method is feasible
for application to the detection of many other diseases.
1 Introduction
A cataract is a cloudy area in the lens of the eye that results in a decrease
in vision. Cataracts develop slowly and can affect both eyes. Symptoms
of cataracts are murky colors, blurred vision, difficulty with bright lights, and
difficulty seeing at night. This may result in trouble with driving, reading,
and recognizing faces. Poor vision caused by cataracts may also result in an increased
risk of depression. Cataracts are responsible for a large share of blindness and visual
disability worldwide. A cataract grows slowly and eventually interferes with vision.
Cataracts may develop in both eyes, but they will not form at the same time. They
are common in older people.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 505
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_46
506 A. Jones et al.
Cataracts mostly occur in older people and result in a foggy view. They may also form
due to other genetic disorders, past eye surgeries, diabetes, etc. The following is an
enumeration of some short-term and long-term effects of cataracts. Short-term effects
include diplopia, or multiple images, in the beginning stage; a halo-like effect that
may develop around lights; near-vision blurring; and sensitivity to bright lights.
Some long-term effects: near vision becomes much poorer after a preliminary
improvement; vision becomes very blurry or unclear, which can affect activities of
daily living such as driving and reading; and colors appear much more illuminated
than before. In very rare cases, untreated cataracts can cause glaucoma or blindness.
2 GoogLeNet Architecture
3 Training Methodology
In this training process, we train the model by performing image augmentation first. Image augmentation helps us train the model effectively by preprocessing all the images present in the dataset and converting each image into
Cataract Detection Using Deep Convolutional Neural Networks 507
multiple images of different categories [3]. This process is carried out before implementing the GoogLeNet architecture for training. The steps involved in our training are image-to-array preprocessing, aspect-aware preprocessing (AAP), rotation, flipping, etc. After performing all these preprocessing operations, the dataset is ready to be trained, and the deep GoogLeNet architecture is implemented. The GoogLeNet architecture contains various layers such as the convolutional layer, activation layer, max pooling layer, fully connected layer, and SoftMax layer, as given below [4]. The preprocessed images are processed by each layer according to its predefined operation. Each layer is interconnected with the others, and the output from the previous layer is given as the input to the next layer [3]. Finally, after the training process, the model is tested using sample testing images of three different categories (normal, severe, and mild), and the prediction output is noted from the graph obtained.
We have imported the required libraries for training the model, such as NumPy, Matplotlib, OS, Adam, and SGD, and then continued the process [2]. Google Colaboratory is a user-friendly platform for developers: rather than requiring each package such as NumPy, pandas, and Matplotlib to be downloaded separately, Google Colab provides pre-installed packages.
Here, we convert a single image into multiple images of different categories by rotating, flipping, contrast adjustment, inverting [5], etc.
1. Rotating: Rotating the image to different angles of 20°, 40°, 60°, 80°, 100°, 120°, 140°, 160°, 180°, 200°, etc. Rotation is required to train the model to the extent that it detects the exact category of an image despite different angles and positions
2. Sharpness: The image is sharpened using Python code, and the edges are given a high concentration
3. Brightness: The hue and background contrast of the image are adjusted, and the brightness is increased
4. Flipping: Flipping the images in two different ways, horizontal flipping and vertical flipping
5. Finally, the max pooling, activation, and SoftMax layers are included and clustered together to produce the output.
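The augmentation steps above can be sketched as follows. This is an illustrative NumPy-only sketch, not the authors' pipeline: it covers flips, 90° rotations, and a brightness increase, while the paper additionally uses arbitrary-angle rotation, sharpening, and contrast adjustment.

```python
import numpy as np

def augment(img: np.ndarray) -> list[np.ndarray]:
    """Expand one image array into several augmented variants."""
    variants = [img]
    variants.append(np.fliplr(img))               # horizontal flip
    variants.append(np.flipud(img))               # vertical flip
    for k in (1, 2, 3):                           # 90°, 180°, 270° rotations
        variants.append(np.rot90(img, k))
    variants.append(np.clip(img * 1.5, 0, 255))   # simple brightness increase
    return variants

img = np.arange(16, dtype=np.float32).reshape(4, 4)  # stand-in for an eye image
samples = augment(img)
print(len(samples))   # 7 variants per input image
```

In this way, each of the 66 source images is multiplied into many training samples.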
During the preprocessing and training of images, the cataract area in the image should not be affected by operations such as cropping, flipping, rotating, and resizing. So, to maintain the original features of an image, we perform aspect-aware preprocessing. This preprocessing methodology is implemented with Python's OpenCV library. OpenCV (computer vision) is most commonly used for image processing operations and for developing image editing applications [6]. Using this preprocessing technique, an image's original features can be maintained accurately without losing the required details [6]. It is advisable to use this technique along with the image augmentation operations because its ultimate aim is to protect the images from losing their important details during the various augmentations.
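The paper performs aspect-aware preprocessing with OpenCV; the sketch below shows the same idea (scale the shorter side to the target size, then center-crop) without external libraries, using a hypothetical nearest-neighbour resize helper in place of `cv2.resize`.

```python
import numpy as np

def resize_nn(img: np.ndarray, h: int, w: int) -> np.ndarray:
    """Nearest-neighbour resize via index sampling (stand-in for cv2.resize)."""
    rows = (np.arange(h) * img.shape[0] / h).astype(int)
    cols = (np.arange(w) * img.shape[1] / w).astype(int)
    return img[rows][:, cols]

def aspect_aware_resize(img: np.ndarray, size: int) -> np.ndarray:
    """Scale the shorter side to `size`, then center-crop to size x size,
    so the image is never stretched and its features are preserved."""
    H, W = img.shape[:2]
    if W < H:
        new_w, new_h = size, int(H * size / W)
    else:
        new_h, new_w = size, int(W * size / H)
    img = resize_nn(img, new_h, new_w)
    top = (new_h - size) // 2
    left = (new_w - size) // 2
    return img[top:top + size, left:left + size]

tall = np.zeros((120, 80))                    # hypothetical tall eye image
print(aspect_aware_resize(tall, 64).shape)    # (64, 64)
```

Because the crop is centered, the distortion of resizing to a fixed shape is avoided.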
The pixels of each image are 8-bit data in the intensity range 0 to 255. If we feed these large values directly for training, the model will not train well [6]. So, to avoid this accuracy problem, we scale the pixel range from [0, 255] to [0, 1] using the image-to-array function from Python's TensorFlow library, so that training is more efficient. In the range [0, 1], pixel values become fractions such as 0.42 or 0.35 [2].
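The scaling step is a simple division; a minimal sketch (the paper does this via TensorFlow's image-to-array utilities):

```python
import numpy as np

# Scale 8-bit pixel intensities from [0, 255] down to [0, 1] before training.
pixels = np.array([0, 89, 107, 255], dtype=np.uint8)
scaled = pixels.astype(np.float32) / 255.0
print(scaled)   # values such as 0.35 and 0.42 in place of 89 and 107
```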
Global average pooling is used at the end of the architecture instead of fully connected (FC) layers. The idea of implementing these techniques is from the paper "Network In Network" [7].
Convolution means filtering: a single image is a combination of many pixels, and here we take a convolutional filter of size 3 × 3 or 5 × 5 and slide it over all areas of the image [8], extracting the features at each existing layer.
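The sliding-filter operation described above can be written out explicitly. This is an illustrative valid-padding, stride-1 convolution in NumPy, not the framework implementation GoogLeNet actually uses:

```python
import numpy as np

def conv2d(img: np.ndarray, kern: np.ndarray) -> np.ndarray:
    """Slide `kern` over `img` and take the weighted sum at every position."""
    kh, kw = kern.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kern)
    return out

img = np.ones((5, 5))
kern = np.ones((3, 3)) / 9.0        # simple 3 x 3 averaging filter
print(conv2d(img, kern).shape)      # (3, 3) feature map
```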
The activation layer applies nonlinear functions to the features taken from the convolution layer to increase the data complexity the model can represent; an activation layer is nothing but the output of a function [8]. You feed input to one of the activation functions and it gives one output, and this operation is called a layer, just like a function in mathematics, where you feed some matrix or values to a function and it gives an output.
The max pooling layer condenses the characteristics present in an area generated by a convolution layer, so that additional operations are performed on the condensed characteristics instead of the larger set produced by the convolution layer [9]. This makes the model more resilient to variations in the position of the characteristics in the input image. It calculates the maximum value for each patch of the feature map.
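As an illustration of taking the maximum over each patch (a sketch, not GoogLeNet's actual pooling code):

```python
import numpy as np

def max_pool(x: np.ndarray, k: int = 2) -> np.ndarray:
    """k x k max pooling with stride k: keep the maximum of each block."""
    h, w = x.shape[0] // k, x.shape[1] // k
    return x[:h * k, :w * k].reshape(h, k, w, k).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 5, 1],
                 [7, 2, 9, 8],
                 [0, 1, 3, 4]])
print(max_pool(fmap))   # [[6 5]
                        #  [7 9]]
```

Each 2 × 2 block collapses to its maximum, halving the feature map in both dimensions.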
From the overall features obtained, the categories normal, mild, and severe are clustered into their respective classes [5]. When an image is given, the model automatically detects the category of the image and matches it with the appropriate cluster.
Global average pooling was designed to substitute for fully connected layers in convolutional neural networks [10]. It reduces the number of parameters in a model, which controls overfitting; there are no parameters to optimize in global average pooling, and thus overfitting is avoided, as shown in Fig. 3.
Setting the number of weights to zero by moving from the fully connected (FC) layers to average pooling improves the accuracy by 0.6%. This idea is from Network in Network [7] and is less prone to overfitting.
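Global average pooling simply replaces each feature map with its mean, as this sketch shows (the 7 × 7 × 1024 shape is an assumption matching GoogLeNet's final feature maps):

```python
import numpy as np

# Global average pooling: collapse each H x W feature map to its mean,
# replacing the parameter-heavy fully connected layers.
features = np.random.rand(7, 7, 1024)   # H x W x channels, GoogLeNet-like
gap = features.mean(axis=(0, 1))        # one value per channel
print(gap.shape)                        # (1024,)
```

There is nothing to learn in this step, which is why it cannot overfit.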
There are some intermediate SoftMax branches in the middle; they are used during training only. These branches are auxiliary classifiers, as in Tables 1 and 2.
The label binarizer converts the names of the labels, such as normal, severe, and mild, into binary format so that the machine can understand them and we can get accurate output. As the machine cannot understand human language, it is advisable to convert the labels to binary [11].
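Label binarization maps each class name to a one-hot binary vector. A minimal dependency-free sketch of what a label binarizer (such as scikit-learn's `LabelBinarizer`) does:

```python
# Map the three class names to one-hot binary vectors.
labels = ["normal", "severe", "mild"]
classes = sorted(set(labels))             # ['mild', 'normal', 'severe']
onehot = {c: [int(c == k) for k in classes] for c in classes}
print(onehot["severe"])                   # [0, 0, 1]
```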
4 Related Works
This section describes the works and findings made so far related to machine learning models and algorithms and convolutional neural network-based architectures for cataract detection [13]. Using Google Colaboratory, Google's online development platform for programming purposes, and Python as the programming language, this paper contains research about various types of cataracts in the human eye, the causes of cataracts, and the collection of datasets from various sources. The method of cataract classification mainly comprises four parts, namely preprocessing, feature extraction, feature selection, and the classifier or model [14]. The primary purpose of
any classifier is achieved based on the input provided to it. Input for cataract clas-
sification is a dataset of eye images of three different classes called normal, mild
cataract, and severe cataract. For many years, research on Fundus image analysis
has been conducted. Fundus images are obtained by Fundus camera which clearly
distinguishes cataracts from the normal eye [1]. The five convolution layers in deep
learning are used to separate the features in Fundus images [15]. Some of the features
of eye images have been extracted such as texture, sketch, color, wavelet, acoustical,
spectral parameters, and so on [1]. One of the other methods for cataract classifica-
tion and detection is the GoogLeNet model. The collected dataset is preprocessed for
removing the noise in images. The training model is built using a convolutional neural
network. Transfer learning is used where a model developed here is used as the input
for another model [7]. In a few other models, image preprocessing is done with the help of the maximum entropy method. The features are collected automatically using Caffe. The extracted features must then be identified and compared; in this case, SSVM is used for classification. For the collected dataset, four different classifications are done, but SoftMax gives better accuracy [15].
5 Implementation Methodology
Fig. 5 Implementation methodology
The dataset images are grouped into three categories as normal, mild, and severe. Finally, the model is tested by giving input images of the three different categories.
6 Proposed Methodology
This proposed methodology gets the input from the user directly, detects the input image using the trained model, and returns the accurate output.
The image is given as the input; the trained model performs 2D convolution, 3 × 3 max pooling, flattening of pixels, aspect-aware preprocessing, etc., and returns the accurate result. The convolution operations are performed at the respective layers, and finally, 3 × 3 max pooling is performed [11]. The pixels in each image are flattened to reduce the size of the image, and dense operations are performed, as shown in Fig. 6. Finally, the output is obtained. The parameters used for evaluation are as follows.
6.1 Accuracy
Accuracy = (TP + TN) / (TP + TN + FP + FN)
6.2 Macro Average
The macro average takes the average of the precision and recall over the various classes. It can be used when we need to analyze the overall system performance on the given dataset, and it is useful when the classes vary in size.
6.3 Weighted Average
The weighted average, or micro average, sums the individual true positives (TP), false positives (FP), and false negatives (FN) for the different images in the dataset and uses them to obtain the statistics.
6.4 Precision
Precision = TP / (TP + FP)
6.5 Recall
Recall = TP / (TP + FN)
6.6 F1 Score
The F1 score is the harmonic mean of precision and recall, reaching its best value at 1 and its worst value at 0. The formula of the F1 score is
F1 = 2 × (Precision × Recall) / (Precision + Recall)
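The metrics above follow directly from the confusion-matrix counts. A short sketch with illustrative counts (not the paper's actual results):

```python
# Evaluation metrics from confusion-matrix counts (illustrative values).
tp, tn, fp, fn = 40, 30, 10, 20

accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)

print(accuracy, precision, round(recall, 3), round(f1, 3))
# 0.7 0.8 0.667 0.727
```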
6.7 Support
The support denotes the actual occurrences of each class in the given dataset. A structural weakness in our architecture may be due to imbalanced support; the support value indicates the need for rebalancing or resampling. Table 3 shows a comparison of the proposed methodology with the existing methodology.
7 Result
We have obtained the following results on running our program in Google Colaboratory. Our trained model gives an accuracy of around 89% as training accuracy and 86% as validation accuracy (val_acc).
Figure 7 shows the training results for cataract detection, which contain four curves represented by four different colors: training loss (train_loss), training accuracy (train_acc), validation loss (val_loss), and validation accuracy (val_acc). Table 4 shows the weighted average, accuracy, and macro average, and Table 5 shows the expected output versus the obtained output.
Table 5 Expected output versus obtained output
Expected output   Obtained output
Normal            Normal
Normal            Normal
Severe            Severe
Mild              Mild
Mild              Normal
8 Conclusion
Hence, this paper provides a simple diagnostic mechanism and reduces the need for high-end machinery. The detection work of ophthalmologists is made easier, and despite the limitation of the dataset, high accuracy and training results are obtained. The proposed methodology paves an easier path for cataract detection.
References
1. C. Szegedy et al., Going deeper with convolutions, in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA (2015), pp. 1–9. https://doi.org/10.1109/CVPR.2015.7298594
2. H. Li et al., Computerized systems for cataract grading, in 2009 2nd International Conference
on Biomedical Engineering and Informatics, Tianjin, China (2009), pp. 1–4. http://doi.org/10.
1109/BMEI.2009.5304895
3. N. Sokolova, M. Taschwer, S. Sarny, D. Putzgruber-Adamitsch, K. Schoeffmann, Pixel-based
iris and pupil segmentation in cataract surgery videos using mask R-CNN, in 2020 IEEE 17th
International Symposium on Biomedical Imaging Workshops (ISBI Workshops), Iowa City, IA,
USA (2020), pp. 1–4. http://doi.org/10.1109/ISBIWorkshops50223.2020.9153367
4. S. Kasiviswanathan, T.B. Vijayan, L. Simone, G. Dimauro, Semantic segmentation of conjunc-
tiva region for non-invasive anemia detection applications. Electronics 9, 1309 (2020). http://
doi.org/10.3390/electronics9081309
5. S. Kasiviswanathan, T.B. Vijayan, S. John, Ridge regression algorithm based noninvasive
anaemia screening using conjunctiva images. J. Ambient. Intell. Humaniz. Comput. (2020).
https://doi.org/10.1007/s12652-020-02618-3
6. S. Hu et al., Unified diagnosis framework for automated nuclear cataract grading based on
smartphone slit-lamp images. IEEE Access 8, 174169–174178 (2020). https://doi.org/10.1109/
ACCESS.2020.3025346
7. M. Lin, Q. Chen, S. Yan, Network in network. arXiv:1312.4400v3 [cs.NE] (2014)
8. M.K. Behera, S. Chakravarty, A. Gourav, S. Dash, Detection of nuclear cataract in retinal
fundus image using radial basis function based SVM, in 2020, Sixth International Conference
on Parallel, Distributed and Grid Computing (PDGC), Waknaghat, India (2020), pp. 278–281.
http://doi.org/10.1109/PDGC50313.2020.9315834
9. Y. Dong, Q. Zhang, Z. Qiao, J. Yang, Classification of cataract fundus image based on deep
learning, in 2017 IEEE International Conference on Imaging Systems and Techniques (IST),
Beijing, China (2017), pp. 1–5. http://doi.org/10.1109/IST.2017.8261463
10. M.T. Islam, S.A. Imran, A. Arefeen, M. Hasan, C. Shahnaz, Source and camera independent
ophthalmic disease recognition from fundus image using neural network, in 2019 IEEE Inter-
national Conference on Signal Processing, Information, Communication & Systems (SPIC-
SCON), Dhaka, Bangladesh (2019), pp. 59–63. http://doi.org/10.1109/SPICSCON48833.2019.
9065162
11. S. Sadasivam, S. Karthick Ramanathan, Effective watermarking of digital audio and image
using Matlab technique, in 2009 Second International Conference on Machine Vision. IEEE
(2009)
12. A.S.V. Ptraneel, T. Srinivasa Rao, M. Ramakrishna Murthy, A survey on accelerating the
classifier training using various boosting schemes within cascades of boosted ensembles, in
International Conference with Springer SIST Series, vol. 169 (2019), pp. 809–825
13. L. Zhang, J. Li, H. Han, B. Liu, J. Yang, Q. Wang, Automatic cataract detection and grading using deep convolutional neural network, in 2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC), Calabria, Italy (2017), pp. 60–65. https://doi.org/10.1109/ICNSC.2017.8000068
14. N. Hnoohom, A. Jitpattanakul, Comparison of ensemble learning algorithms for cataract detection from fundus images, in 2017 21st International Computer Science and Engineering Conference (ICSEC), Bangkok, Thailand (2017), pp. 1–5. https://doi.org/10.1109/ICSEC.2017.8443900
15. S. Bhat, S. Mosalagi, T. Balerao, P. Katkar, R. Pitale, Cataract eye prediction using machine learning. Int. J. Comput. Appl. 176(35) (2020). ISSN 0975-8887
Comparative Analysis of Body Biasing
Techniques for Digital Integrated
Circuits
Abstract In VLSI, sequential circuits depend upon the clock. High speed and low power consumption are the two major goals in every circuit design. Different biasing techniques are applied to shift registers, which are analyzed by calculating the power consumed and the delay of the circuit. In this paper, a 4-bit shift register designed using multiplexers has been extended into an 8-bit register. Four biasing techniques have been applied, namely standard biasing, VDD/2 biasing, 3VDD/4 and VDD/4 biasing, and gate level body biasing, to study their characteristics. The entire design and analysis of the shift register have been done using the Cadence Virtuoso tool in 180 nm technology. The circuit is designed to obtain 1.8 V (full swing voltage).
1 Introduction
This paper is broadly divided into two parts: the first deals with the design of the 4-bit and 8-bit registers, and the second deals with the biasing techniques applied to them. The Cadence Virtuoso tool has been used for the design and for the power and delay calculations. The circuits have been designed in 180 nm technology with 1.8 V applied as the VDD. The basic blocks are the inverter, 2 × 1 multiplexer, 4 × 1 multiplexer, 8 × 1 multiplexer, and D flip-flop. The multiplexer is designed with transmission gates [1], and the D flip-flop is also designed with multiplexers [2]. In shift registers, the data can be shifted or rotated by the required number of bits [3]. The difference between the proposed shift registers and a barrel shifter is that the proposed circuit depends upon the clock. In this project, four types of biasing techniques have been applied. Dynamically changing the transistor threshold voltage is called body
G. Srinivas Reddy
Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India
D. Khalandar Basha (B) · U. Somanaidu
Institute of Aeronautical Engineering, Hyderabad, Telangana, India
R. Raju
St. Peters Engineering College, Hyderabad, Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 521
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_47
biasing. The threshold voltage directly affects the power consumption and the delay of the circuit, so body biasing alters characteristics such as power consumption and delay. This technique therefore greatly helps in obtaining better characteristics for the circuit. The body biasing techniques applied are standard biasing, gate level body biasing, VDD/2 biasing, and 3VDD/4 and VDD/4 biasing.
2 Literature Survey
2.1 Multiplexer
A multiplexer is a combinational circuit which has 2^n inputs, n select lines, and a single output. A transmission gate-based multiplexer is shown in Fig. 1, and its truth table is shown in Table 1.
D flip-flop using 2 × 1 Mux
Figure 2 shows the D flip-flop built from two 2 × 1 multiplexers. When the clock goes low, it reads the input D through the first multiplexer and places it at the input of the second multiplexer. When the clock goes high, the second multiplexer transmits the input D to the output Q.
Fig. 1 Multiplexer
Every CMOS transistor has four terminals: source, drain, gate, and body. The voltage difference between the source and the body affects the threshold voltage of the transistor, and this threshold voltage is responsible for the amount of power the transistor consumes. Adjusting this threshold voltage dynamically to obtain better results is called body biasing. In our project, we apply several biasing techniques and calculate the power and speed [4].
The most common biasing technique is standard biasing, in which the body terminal of the PMOS is connected to VDD and that of the NMOS is connected to ground, as in Fig. 3.
Instead of standard biasing, the substrates of the NMOS and PMOS can be connected to VDD/2, as shown in Fig. 4.
In this biasing technique, the body terminals of the NMOS and PMOS are connected to VDD/4 and 3VDD/4, respectively, as shown in Fig. 5.
2.6 GLBB
GLBB stands for gate level body biasing [4]. Gate level biasing for a CMOS circuit is shown in Fig. 6.
It is a simple dynamic body biasing technique that results in high speed. The technique is mainly aimed at overcoming the disadvantages of DTMOS technology. It is fast, energy efficient in both the subthreshold and near-threshold regions, and maintains robustness against temperature and process variations [5]. The body bias generator (BBG) circuit manages the body voltage of the circuit. It is a push-pull amplifier which acts as a voltage follower; this voltage follower helps in decoupling the large body capacitances at the output node. When Vout is equal to 0 V, the BBG transfers a low voltage onto VB, thus preparing the pull-up section for faster logic switching. When Vout is equal to VDD, the BBG transfers a high voltage onto VB, thus preparing the pull-down network for faster logic switching [6]. The BBG also has a static current flowing through it, which causes static power consumption [7].
The main aim of designing this circuit is to develop a shift register that shifts/rotates the bits in a single cycle depending upon the clock pulse. For a 4-bit register, four 4 × 1 multiplexers and four D flip-flops are used, as shown in Fig. 7; Table 2 provides the functionality.
The circuit of the 4-bit shift register is extended to an 8-bit shift register, with three select lines used to perform eight operations. Table 3 provides the functionality for the 8-bit register shown in Fig. 8.
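As a behavioral illustration only (not the Cadence transistor-level design), the 4-bit register's four operations can be modeled in software; the select-code assignments below (00 reset, 01 load, 10 rotate left, 11 rotate right) are an assumption consistent with the waveform description given later.

```python
def shift_register(q, s1, s0, d=None):
    """One clock cycle of the 4-bit register, selected by (s1, s0)."""
    if (s1, s0) == (0, 0):
        return [0, 0, 0, 0]      # reset
    if (s1, s0) == (0, 1):
        return list(d)           # parallel load
    if (s1, s0) == (1, 0):
        return q[1:] + q[:1]     # rotate left by one bit
    return q[-1:] + q[:-1]       # rotate right by one bit

q = shift_register(None, 0, 1, d=[0, 1, 0, 1])   # load 0101
q = shift_register(q, 1, 0)                      # rotate left
print(q)   # [1, 0, 1, 0]
```

Each multiplexer in the hardware plays the role of the `if` chain above, steering the chosen next-state bit into its D flip-flop.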
The 4-bit and 8-bit registers are developed from several basic blocks. The two primary blocks are the 2 × 1 multiplexer and the inverter. With the help of the symbols created for these two blocks, every other circuit in this project has been designed, so any change made in these two circuits will be reflected in every circuit.
The schematic of the inverter with standard biasing and with VDD/2 biasing is shown in Fig. 9. The inverter with 3VDD/4 and VDD/4 biasing and with GLBB biasing is shown in Fig. 10. The schematic of the 2 × 1 multiplexer using VDD/2 biasing, standard biasing, 3VDD/4 and VDD/4 biasing, and GLBB biasing is shown in Figs. 11, 12, 13, and 14, respectively. The realization of the 4 × 1 multiplexer using 2 × 1 multiplexers is shown in Fig. 15, and the realization of the 8 × 1 multiplexer is shown in Fig. 16. The schematics of the 1-bit, 4-bit, and 8-bit registers are shown in Figs. 17, 18, and 19.
Fig. 9 a Inverter schematic with standard biasing. b Schematic of inverter with VDD/2 biasing
Fig. 10 a 3VDD/4, VDD/4 biased inverter schematic. b GLBB biased inverter schematic
Fig. 11 2 × 1 mux schematic using VDD/2 biasing
Fig. 13 2 × 1 mux using 3VDD/4 and VDD/4 biasing
The proposed techniques are implemented using the gpdk180 technology. The simulation was performed using the Cadence Spectre tool with a supply voltage of 1.8 V in 180 nm technology. The proposed 4-bit and 8-bit registers are tested, and their functionality is checked.
Fig. 14 Realization of 8 × 1
multiplexer
Then, applied different biasing techniques to them and tabulated the power consumed
and delay occurred.
For a sequential circuit, the output changes only at the positive edge of the clock pulse. In the output waveform at 10 ns, the input to S0 S1 is 0 0, so the reset action takes place and all the output states Q3, Q2, Q1, and Q0 are 0. At 30 ns, the input to S0 S1 is 0 1, so the load action takes place and the register reads the inputs provided to it, so Q3, Q2, Q1, and Q0 are 0, 1, 0, and 1, respectively. At 60 ns, the input to S0 S1 is 1 0, so the left shift operation takes place and the output states Q3, Q2, Q1, and Q0 are 1, 0, 1, and 0, respectively.
At 80 ns, the input to S0 S1 is 1 1, so the right shift operation takes place and the output states of Q3, Q2, Q1, and Q0 are 0, 1, 1, and 0. At 70 ns, another left shift operation takes place, so the states of Q3, Q2, Q1, and Q0 before 80 ns are 0, 1, 0, and 0. Similar operations are performed for the 8-bit register. The simulation results are
shown in Figs. 20 and 21. The power and delay calculations by applying different
types of biasing techniques are shown in Tables 4 and 5.
5 Conclusions
In this paper, a 4-bit register and an 8-bit register which can perform n-bit shifting and rotating operations depending upon the clock pulse have been proposed, and the results are simulated with the Cadence Spectre tool in 180 nm technology. Four biasing techniques have then been applied to them, and the power consumed and delay occurred were calculated. From the tabulated results, it is observed that different biasing techniques yield different characteristics for the circuit.
Table 4 Comparison of power consumed
Biasing technique     4-bit register   8-bit register
Standard              4.99 mW          8.415 mW
VDD/2                 12.95 µW         43.75 µW
3VDD/4 and VDD/4      2.214 mW         6.99 mW
GLBB                  2.328 µW         7.87 µW

Table 5 Comparison of delay
Biasing technique     4-bit register   8-bit register
Standard              60.35 ns         70.48 ns
VDD/2                 60.25 ns         70.42 ns
3VDD/4 and VDD/4      10.26 ns         70.45 ns
GLBB                  60.32 ns         30.39 ns
References
1. X. Chen, N.A. Touba, Fundamentals of CMOS Design, Electronic Design Automation (Morgan
Kaufmann, Burlington, 2009), pp. 39–95. ISBN 9780123743640
2. Shivali, S. Sharma, A. Dev, Energy efficient D flip-flop using MTCMOS technique with static
body biasing. Int. J. Recent Technol. Eng. (IJRTE) 8(1) (2019). ISSN: 2277-3878
3. M. Morris Mano, M.D. Ciletti, Digital Design, 6th edn. (Pearson, Los Angeles, 2018)
4. R. Taco, M. Lanuzza, D. Albano, Ultra-low-voltage self body biasing scheme and its application to basic arithmetic circuits. VLSI Design 2015, Article ID 540482, 10 pp. (2015). https://doi.org/10.1155/2015/540482
5. D. Khalandar Basha, A. Pulla Reddy, R. Raju, G. Srinivas Reddy, Gated body biased full adder.
Mater. Today Proc. 5(1), Part 1, pp. 673–679 (2018)
6. D. Khalandar Basha, S. Reddy, K. Aruna Manjusha, 2D symmetric 16*8 SRAM with reset. J.
Eng. Appl. Sci. 13(1), 58–63 (2018)
7. D. Khalandar Basha, B. Naresh, S. Rambabu, D. Nagaraju, Body biased high speed full adder,
in LNCS/LNAI/LNBI Proceedings (2017)
Optical Mark Recognition with Facial
Recognition System
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 535
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_48
536 R. Shah et al.
1 Introduction
Many educational institutions are concerned with student engagement, because student collaboration in class leads to more successful understanding, learning, and higher achievement [1]. Likewise, a greater level of interest in the curriculum is a major element for teachers in creating a favorable atmosphere for seriously participating students [2]. Estimating participation in a course on a daily basis is the most well-known tool for increasing participation. There are two common strategies for recording attendance. Some teachers call students' names and assign grades based on their presence or absence. Other teachers circulate an attendance slip for signing. After gathering the attendance information through either of these two strategies, educators manually enter the information into the current system. However, these non-automated methods are inaccurate, because both are tedious and prone to misrepresentation. The aim of this paper is to suggest a model that integrates attendance taking with an innovative framework for certain upgrades. An attendance management method has been developed based on facial recognition, optical mark recognition, and smart devices. A filtering system derived from the histogram of oriented gradients (HOG) algorithm is used. Face recognition is used to perform a continuous polling process, with data tracking and reporting. The information is stored on a cloud server and is available at any time from anywhere. On the other hand, the planned OMR method concentrates on the production of OMR for MCQs using a new methodology, opening the way for future studies into more effective OMR in terms of speed and accuracy. The following is a breakdown of the paper's structure. Section 2 contains a short review of the literature. The proposed system is introduced in Sect. 3, and the implementation and results are discussed in Sect. 4. The main results are discussed in the final part.
2 Literature Survey
Crime detectives now use face recognition as a valuable and routine forensic technique. In contrast to automatic face recognition, forensic face recognition is more difficult because it must account for facial images collected in less-than-ideal environments and because it carries a high degree of accountability for adhering to legal procedures. The effect of recent developments in automated face recognition on the forensic face recognition community has been discussed; improvements in forensic face recognition will address facial aging, facial marks, forensic sketch detection, face recognition in video, near-infrared face recognition, and the use of soft biometrics [3]. A facial image can be represented by a position in an input image multiplied by a number of facial images. In strategies based on similarity identification, the representative ability of a face database is measured by how the typical image data are chosen to allow
for potential model differences, and also by how many typical photographs or their local features are usable [4]. A collection of models can capture spatial and audio features in videos: convolutional neural networks, pre-trained on large face recognition datasets, capture the spatial features, and it has been shown that using strong industry-level face recognition networks improves emotion recognition accuracy [5]. A newer technique can handle thin papers and answer sheets with low printing accuracy; image scanning, tilt correction, scanning error correction, regional deformation correction, and mark recognition are among that system's key tools and implementations. By analyzing the results of a large number of questionnaires, this approach has proven to be reliable and efficient [6]. A mobile phone-based optical mark recognition (OMR) system checks user response sheets automatically; it makes use of prior knowledge of the OMR sheet layout, which helps it achieve high speed and accuracy [7].
A new approach to gender classification and facial expression recognition is based on the two expressions of anger and joy, together with geometric and appearance features. Human-computer interaction, driving safety, and other applications are among its uses. The most common algorithms for detecting facial expression and gender are principal component analysis, linear discriminant analysis, and local binary pattern algorithms [8]. A pose-invariant three-dimensional (3D) facial expression recognition technique uses distance vectors retrieved from 3D distributions of facial feature points to characterize universal facial expressions. A probabilistic neural network architecture is used as a classifier to recognize the facial expressions from a distance vector obtained from the 3D facial feature locations. Pain, disappointment, surprise, excitement, disgust, anxiety, and neutral facial expressions are effectively recognized [9].
3 Proposed System
The proposed system has two parts: the facial recognition system and the optical mark recognition system. The project is mainly based on Python and OpenCV.
Referring to Fig. 1, face detection is the first step in our process. Here, we locate the faces in a photograph. Most of this step is done by encoding a picture using the histogram of oriented gradients (HOG) algorithm to create a simplified version of the image. We then select the portion of the image that most closely resembles a generic HOG encoding of a face using this condensed image. This algorithm helps in capturing the basic structure of the face [10].
538 R. Shah et al.
The second step in our pipeline is face alignment. We use a method called face
landmark estimation to do this. The pose of the face is determined by finding the
main landmarks in the face. Once the landmarks are found, the algorithm uses them
to warp the image so that the eyes and mouth are aligned.
The third step in our pipeline is feature extraction. Here, we train a deep
convolutional neural network to generate an encoding for each face in an image.
Passing the aligned face image from the previous step through this neural network
lets us extract the features of the face, and these measurements are used to match
the data in the next step.
The final step is feature matching, in which we match the image against known
features. This is done with a machine learning classification algorithm, an SVM
classifier, which uses the database of known people to find the measurements closest
to our test image [11].
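The matching step above can be sketched with scikit-learn; everything here (the 128-dimensional encodings, the names, and the cluster spread) is synthetic stand-in data for illustration, not the paper's actual model:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the database of known people: one cluster of
# 128-dimensional face encodings per person (dimension is an assumption).
names = ["alice", "bob", "carol"]
centers = rng.normal(size=(3, 128))
X = np.vstack([c + 0.05 * rng.normal(size=(10, 128)) for c in centers])
y = np.repeat(names, 10)

# Train an SVM classifier on the known encodings, as the pipeline describes.
clf = SVC(kernel="linear").fit(X, y)

# A new encoding close to "bob"'s cluster should be matched to "bob".
probe = centers[1] + 0.05 * rng.normal(size=128)
print(clf.predict([probe])[0])
```

In a real deployment, the encodings would come from the CNN of the previous step rather than from random clusters.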
Figure 2 shows the flow of steps required for the OMR. Here, we scan the answer
sheets in optical form. We then check whether the QR code location has been scanned
properly; if not, we rotate the image so that it scans correctly.
Optical Mark Recognition with Facial Recognition System 539
After rotating the image, we locate the answer area using the machine learning
algorithm. We then find and sort the bubble areas, compare each bubble against the
right answer to compute the results, save that data, and display the result.
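The bubble-comparison step can be sketched as follows; the crop sizes, the darkness threshold, and the function name are illustrative assumptions, since the paper does not list its code:

```python
import numpy as np

def marked_option(bubble_crops, fill_threshold=0.5):
    """Return the index of the bubble with the most dark pixels.

    bubble_crops: list of 2-D grayscale arrays (0 = black, 255 = white),
    one per answer option, already located and sorted as in the pipeline.
    A bubble counts as marked only if enough of its area is dark.
    """
    fills = [np.mean(crop < 128) for crop in bubble_crops]
    best = int(np.argmax(fills))
    return best if fills[best] >= fill_threshold else None

# Toy example: option C (index 2) is filled in, the rest are blank.
blank = np.full((10, 10), 255, dtype=np.uint8)
filled = np.zeros((10, 10), dtype=np.uint8)
crops = [blank, blank, filled, blank]
answer = marked_option(crops)
print(answer)  # 2

# Scoring: compare the detected choice with the answer key.
score = int(answer == 2)
```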
Combining the two parts of the project, we obtain the student's attendance report
as well as the marks he/she obtained in the examination. This is a faster, cheaper,
and more accurate way to solve two problems at once [12].
Testing follows successful and logical coding. The testing phase can be divided
into three segments for the attendance system: face identification, face training, and
face recognition. Similarly, for the OMR part: scanned copy identification, checking
the bubble answers, and comparing them with the original answers [13].
Dataset used—we collected face-aligned images of our friends and classmates
and used them as our dataset, stored in JPEG format.
Figure 3 shows the sample dataset used.
As seen in Fig. 4, the machine is ready to identify a person's front face using the
trained files and OpenCV. It identifies a face appearing in the camera and names it
if the machine knows it; otherwise, it states unknown. Positive encoded output is
shown in Fig. 5.
After finishing with the facial recognition, the system will check the answer sheet
by scanning. The output oriented is similar to the way shown in Fig. 6.
As a result, we tested a total of 100 samples and obtained different results when
verifying the images at different distances. First, we scanned the image at a distance
of 1 ft, and facial detection was very quick, but as the distance increased, there was
a delay in the output. Beyond a distance of 5 ft, the system was unable to generate
output data. Table 1 shows the data collected from the various experiments performed.
The system also depends on the quality of the camera used. We tried an Apple
iPhone X and a MacBook Pro: the mobile phone could generate data only at small
distances, while the laptop, with a higher-efficiency camera, generated data up to a
maximum distance of 5 ft.
5 Conclusion
The purpose of this paper is to analyze a faster alternative to the traditional
systems used for attendance marking and examination checking. The paper has also
discussed vital topics such as the advantages, disadvantages, drawbacks, and
solutions for the OMRFRS when used in different environments. It has also shown
how two different technologies, optical mark recognition and facial recognition,
can be combined and used in a modified way. Nothing in the world exists without
flaws, and this system may also have a few drawbacks, but the outputs obtained
resolved most of the issues in the traditional approach to attendance marking and
answer checking and were quite satisfactory.
References
1. L. Stanca, The effects of attendance on academic performance: panel data evidence for
introductory microeconomics. J. Econ. Educ. 37(3), 251–266 (2006)
2. P.K. Pani, P. Kishore, Absenteeism and performance in a quantitative module: a quantile
regression analysis. J. Appl. Res. High. Educ. 8(3), 376–389 (2016)
3. A.K. Jain, B. Klare, U. Park, Face recognition: some challenges in forensics, in 2011 IEEE
International Conference on Automatic Face & Gesture Recognition (FG), Santa Barbara, CA,
USA (2011), pp. 726–733. http://doi.org/10.1109/FG.2011.5771338
4. S.Z. Li, J. Lu, Generalizing capacity of face database for face recognition, in Proceedings
Third IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan
(1998), pp. 402–406. http://doi.org/10.1109/AFGR.1998.670982
5. B. Knyazev, R. Shvetsov, N. Efremova, A. Kuharenko, Leveraging large face recognition data
for emotion classification, in 2018 13th IEEE International Conference on Automatic Face &
Gesture Recognition (FG 2018), Xi’an, China (2018), pp. 692–696. http://doi.org/10.1109/FG.
2018.00109
6. H. Deng, F. Wang, B. Liang, A low-cost OMR solution for educational applications, in 2008
IEEE International Symposium on Parallel and Distributed Processing with Applications,
Sydney, NSW, Australia (2008), pp. 967–970. http://doi.org/10.1109/ISPA.2008.130
7. R. Patel, S. Sanghavi, D. Gupta, M.S. Raval, CheckIt—a low cost mobile OMR system, in
TENCON 2015—2015 IEEE Region 10 Conference, Macao, China (2015), pp. 1–5. http://doi.
org/10.1109/TENCON.2015.7372983
8. A.V. Anusha, J.K. Jayasree, A. Bhaskar, R.P. Aneesh, Facial expression recognition and
gender classification using facial patches, in 2016 International Conference on Communication
Systems and Networks (ComNet), Thiruvananthapuram (2016), pp. 200–204. http://doi.org/10.
1109/CSN.2016.7824014
9. H. Soyel, H. Demirel, 3D facial expression recognition with geometrically localized facial
features, in 2008 23rd International Symposium on Computer and Information Sciences,
Istanbul, Turkey (2008), pp. 1–4. http://doi.org/10.1109/ISCIS.2008.4717898
10. P.N. Maraskolhe, A.S. Bhalchandra, Analysis of facial expression recognition using histogram
of oriented gradient (HOG), in 2019 3rd International conference on Electronics, Communi-
cation and Aerospace Technology (ICECA). http://doi.org/10.1109/ICECA.2019.8821814
11. M. Pantic, I. Patras, Dynamics of facial expression: recognition of facial actions and their
temporal segments from face profile image sequences. IEEE Trans. Syst. Man Cybern. Part B
(Cybernetics) 36(2), 433–449 (2006). http://doi.org/10.1109/TSMCB.2005.859075
12. K. Verma, A. Khunteta, Facial expression recognition using Gabor filter and multi-layer artifi-
cial neural network, in 2017 International Conference on Information, Communication, Instru-
mentation and Control (ICICIC), Indore (2017), pp. 1–5. http://doi.org/10.1109/ICOMICON.
2017.8279123
13. R. Samet, M. Tanriverdi, Face recognition-based mobile automatic classroom attendance
management system, in 2017 International Conference on Cyberworlds (CW), Chester (2017),
pp. 253–256. http://doi.org/10.1109/CW.2017.34
Evaluation of Antenna Control System
for Tracking Remote Sensing Satellites
Abstract Remote sensing is important for obtaining information associated with the
earth's resources and its environment. The tracking of LEO satellites is increasing
rapidly. At the ground station, we track multiple remote sensing satellites daily in
different modes of tracking and acquire data from them. This paper brings out
existing tracking techniques for LEO satellites at S and X bands. It presents the
evaluation of the response of the closed-loop servo system, treated as a second-order
system with autotrack error voltages as a step input to the antenna control system.
An earth station antenna has been established for acquiring payload data from low
earth orbit satellites in X-band (8.2–8.4 GHz) (Delbert D. Smith, Communication
via Satellite: A Vision in Retrospect; Lewis, Communications Services via
Satellite). The tracking mode considered is X-autotracking.
Keywords X-band · Autotrack step response · Low earth orbit (LEO) · Autotrack
error voltages
1 Introduction
We evaluate the second-order closed-loop servo system output with step inputs of
amplitude +0.14° and −0.14° (3 dB points off from peak/target) along both the AZ
and EL axes for X-band. It is important to measure the step response regularly for
every ground station antenna in order to verify the second-order system's
time-domain specifications [1–4].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 545
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_49
546 A. N. Satyanarayana et al.
2 Description
The autotrack step response is the time needed for the autotrack error voltages to
drive the antenna control system back to the initial angles (peak/target/boresight)
from which the step angle was initially applied.
The transient response specifications of a second-order system for a unit step input
are delay time (Td), rise time (Tr), peak time (Tp), maximum overshoot (Mp), and
settling time (Ts) [5–8]. In our ground station work, the emphasis is on rise time,
maximum overshoot, and bandwidth.
These specifications are shown graphically in Fig. 2.
In the Indian remote sensing satellite (IRS) ground station antenna servo control
system, we evaluate the rise time (tr), percentage overshoot, and settling time (ts).
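For reference, these specifications of an underdamped second-order system have standard closed forms and can be computed directly; the ζ and ωn values below are illustrative, not the servo loop's actual parameters:

```python
import math

def step_specs(zeta, wn):
    """Time-domain step-response specifications of an underdamped
    second-order system (0 < zeta < 1) with natural frequency wn (rad/s)."""
    wd = wn * math.sqrt(1.0 - zeta**2)          # damped natural frequency
    beta = math.acos(zeta)
    tr = (math.pi - beta) / wd                  # rise time (0-100%)
    tp = math.pi / wd                           # peak time
    mp = math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta**2))  # overshoot
    ts = 4.0 / (zeta * wn)                      # 2% settling time
    return tr, tp, mp, ts

# Illustrative servo loop: zeta = 0.5, wn = 1 rad/s (assumed values).
tr, tp, mp, ts = step_specs(0.5, 1.0)
print(f"rise {tr:.2f} s, peak {tp:.2f} s, "
      f"overshoot {100 * mp:.1f}%, settle {ts:.1f} s")
```

For ζ = 0.5 this gives roughly 16.3% overshoot, which is the kind of figure checked against the measured step response.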
Evaluation of Antenna Control System for Tracking … 547
3 Evaluation Procedure
• After acquiring the signal, a step angle of +0.14° (3 dB point from peak/target)
along the elevation axis is given as an input to the ground station antenna, and the
error voltage (the output/response) generated is shown in Fig. 6.
• After generating the error voltage, the ground station antenna mode is immediately
changed to X-autotrack to correct the error voltage generated.
• After acquiring the signal, a step angle of −0.14° (3 dB point from peak/target)
along the elevation axis is given as an input to the ground station antenna, and the
error voltage (which is the output/response) generated is shown in Fig. 7.
• After generating the error voltage, the ground station antenna mode is immediately
changed to X-autotrack to correct the error voltage generated.
The autotrack error voltage generated inside the closed-loop control system, when
the antenna has moved away from the target/boresight tower, approaches zero (i.e.,
the ground station antenna points back to the target/peak) after changing the antenna
mode to X-autotrack.
The step response results (shown in Figs. 4, 5, 6, and 7) for the X-band along
the azimuth and elevation axes should meet our desired time-domain specifications;
otherwise, the servo system behaves sluggishly.
Factors affecting tracking gradients: aging of components, variation in temperature,
and rain (which disturbs phasing) lead to changes in the gradients.
4 Results
The graphs plot error voltage (volts) against time (seconds). The following graphs
were plotted when the antenna mode was changed to X-autotrack. The error voltage
generated inside the system has characteristics similar to a second-order system's
response to a unit step input; the error is corrected and reaches a steady state
(approaches zero) in X-autotrack mode.
5 Conclusions
The antenna servo system specifications are met. Evaluating the autotrack step
response of a closed-loop servo control system is necessary to avoid sluggish
behavior of the servo system; otherwise, tracking accuracy degrades and real-time
data may be lost.
References
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 553
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_50
554 M. Chandrakala and P. Durga Devi
2 Recognition System
Face recognition algorithms are employed to distinguish a person's unique face and
compare it to images stored within the ORL database [16]. The primary emphasis of
this paper is on correctly recognizing test images from database training images.
2.1 Preprocessing
This paper focuses on accurately distinguishing test images from database training
images. To achieve high recognition accuracy, the facial images in the ORL dataset
were normalized to 50 × 50 pixels from 92 × 112 pixels. As shown in Fig. 1, the
ORL dataset facial images vary for illumination conditions, expressions, and facial
details.
Face detection and recognition depend primarily on feature extraction. The most
pertinent features were extracted from every face image.
The face image is divided into connected grids called cells in HOG feature extraction
[17]. Each cell contains pixels, and from the pixels, gradient magnitude and angle
are computed. Four cells were grouped to form a block with 50% overlap.
Gradients of the cell are calculated by overlaying 1D derivative mask filters
[−1, 0, 1] and [−1, 0, 1]T applied at pixel values located at coordinate points (r, s).
Luminance value at coordinate points (r, s) is represented as V (r, s). Gradients in
x-direction Vx (r, s) and y-direction Vy (r, s) are calculated as
Vx(r, s) = V(r + 1, s) − V(r − 1, s) (1)

Vy(r, s) = V(r, s + 1) − V(r, s − 1) (2)
Gradient angle θ(r, s) and magnitude M(r, s) have been used to generate the
histogram. The orientation range (0, π) is evenly divided into nine bins, and each
gradient is mapped into one of them. The gradient magnitudes are accumulated into
their angular bins.
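Eqs. (1)–(2) and the nine-bin accumulation can be sketched in NumPy; the toy cell below is an illustrative assumption, not an ORL image:

```python
import numpy as np

def cell_histogram(V, bins=9):
    """Gradient orientation histogram of one cell, following Eqs. (1)-(2):
    central differences along r and s, with gradient magnitudes accumulated
    into `bins` evenly spaced orientation bins over (0, pi)."""
    Vx = V[2:, 1:-1] - V[:-2, 1:-1]   # V(r+1, s) - V(r-1, s)
    Vy = V[1:-1, 2:] - V[1:-1, :-2]   # V(r, s+1) - V(r, s-1)
    mag = np.hypot(Vx, Vy)
    ang = np.arctan2(Vy, Vx) % np.pi  # unsigned orientation in [0, pi)
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    hist = np.zeros(bins)
    np.add.at(hist, idx.ravel(), mag.ravel())
    return hist

# Toy 10x10 cell with a vertical intensity ramp: all gradient energy
# lands in the first orientation bin.
cell = np.tile(np.arange(10.0)[:, None], (1, 10))
h = cell_histogram(cell)
print(h)
```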
For the current work, every face image in the ORL database is resized to 50 × 50
pixels. Each face image is divided into 10 × 10 pixel cells. A block is formed by
grouping four cells (2 × 2 cells) with 50% overlap, so there are four such blocks in
each row and column of the image. A feature vector is computed from each cell, and
the block histograms are concatenated to form the overall feature vector.
The image is divided into blocks by the LBP feature extractor [18]. Each block is 3
× 3 pixels in size. The local binary pattern of the central pixel is then calculated as
LBP_{S,R} = Σ_{s=0}^{S−1} P(R_s − R_C) 2^s (5)

P(R_s − R_C) = 1 if (R_s − R_C) ≥ 0, and 0 if (R_s − R_C) < 0 (6)
Face Recognition Using Cascading of HOG and LBP Feature … 557
Here, R_C is the central pixel value used as the threshold, and R_s is the gray value
of neighboring pixel s. The value of each neighboring pixel is compared to the value
of the center pixel: if (R_s − R_C) ≥ 0, the corresponding pixel is represented as
"1", otherwise "0". A histogram can be built from these binary values; the code of
the central pixel is obtained by converting the binary pattern to a decimal value.
An extension of the original LBP is the simple rotation-invariant variant, which can
reduce the size of the feature vector.
Patterns are classified into two types: uniform and non-uniform. A uniform
pattern is an LBP that contains at most two bitwise transitions (from 1 to 0 or 0 to 1).
A non-uniform pattern has more than two bitwise transitions. In the histogram, each
uniform pattern has its own independent bin, while all non-uniform patterns are
combined into a single bin. With an 8-pixel neighborhood, there are 256 possible
patterns, of which 58 are uniform; together with the single non-uniform bin, this
gives 59 distinct bins. For the current work, each face image in the ORL database
therefore has an LBP feature vector of length 59.
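Eqs. (5)–(6) applied to a single 3 × 3 block can be sketched directly; the clockwise neighbor ordering below is an assumption, as implementations vary:

```python
import numpy as np

def lbp_code(block):
    """LBP code of the central pixel of a 3x3 block per Eqs. (5)-(6):
    threshold the 8 neighbors against the center value and weight by 2**s.
    Neighbor ordering (clockwise from top-left) is an assumption here."""
    Rc = block[1, 1]
    neighbors = [block[0, 0], block[0, 1], block[0, 2], block[1, 2],
                 block[2, 2], block[2, 1], block[2, 0], block[1, 0]]
    return sum(int(Rs >= Rc) << s for s, Rs in enumerate(neighbors))

block = np.array([[6, 4, 7],
                  [3, 5, 2],
                  [8, 1, 5]])
print(lbp_code(block))  # 85: bits set where a neighbor >= the center 5
```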
Figure 2 depicts the proposed algorithm using a cascade of LBP and HOG features.
All ORL database face image samples are preprocessed by size normalization to
50 × 50 pixels, and the LBP and HOG features are extracted and cascaded. After
feature extraction, each face image is represented as a 1D feature vector. These
feature vectors are then split into training and testing sets, and a recognition
algorithm is used to test each face image. We evaluated various classifiers, such
as KNN, SVM, and RF, by training them on the face image database. The database
includes facial images of 40 different individuals, with ten images per person. The
model is trained on eight images per person and tested on the remaining two.
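The cascading itself is simply concatenation of each image's HOG and LBP vectors before classification. A sketch with synthetic per-person feature clusters standing in for the ORL images (the LBP length of 59 follows the text; the HOG length of 576 is an assumption):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)

n_people, per_person = 40, 10
hog_len, lbp_len = 576, 59   # LBP length from the text; HOG length assumed

# Synthetic per-person feature clusters standing in for the 400 ORL images.
labels = np.repeat(np.arange(n_people), per_person)
hog = rng.normal(size=(n_people, hog_len)).repeat(per_person, axis=0)
lbp = rng.normal(size=(n_people, lbp_len)).repeat(per_person, axis=0)

# Cascading = concatenating the two feature vectors per image.
X = np.hstack([hog, lbp]) + 0.1 * rng.normal(size=(400, hog_len + lbp_len))

# Eight training and two test images per person, as in the experiment.
Xtr, Xte, ytr, yte = train_test_split(X, labels, test_size=0.2,
                                      stratify=labels, random_state=0)
acc = KNeighborsClassifier(n_neighbors=1).fit(Xtr, ytr).score(Xte, yte)
print(acc)
```

On this cleanly separated synthetic data the accuracy is trivially high; the point is only the shape of the pipeline, not the reported ORL numbers.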
Experimental face recognition results were evaluated using the ORL face image
database. We utilized different classifiers, namely KNN, SVM, and RF, with
different feature extraction methods: LBP, HOG, and the combination of LBP and
HOG.
In this experiment, the extracted features were tested using KNN, SVM, and RF
classifiers. Grid search with cross-validation was used to select the hyperparameters
giving the highest accuracy. Based on HOG feature extraction, we experimentally
observed that KNN with k = 1, SVM with a linear kernel, and RF with 30 estimators
achieved the maximum recognition rate on the test set. The accuracy of the different
classifiers based on HOG features is compared with the training set size in Fig. 3;
the best recognition rate with HOG features, 92.58%, was achieved by the SVM
classifier.
Fig. 3 Accuracy of different classifiers based on HOG features versus training set size
Fig. 4 Accuracy of different classifiers based on LBP features versus training set size
For the test set based on LBP feature extraction, we found that KNN with k = 1,
SVM with an RBF kernel, and RF with 30 estimators gave the highest recognition
rate. Figure 4 compares the accuracy of the different classifiers with LBP features
for varying training set sizes. Based on LBP feature extraction, the best recognition
rate, 86.76%, was achieved by the KNN classifier.
The accuracy of various classifiers using a combination of LBP and HOG features
is compared to the size of the training set in Fig. 5. Both KNN and SVM achieved
the highest recognition rate of 93.75%. As shown in Figs. 3, 4, and 5, increasing the
training data sample size improves recognition accuracy.
The classification performance of the KNN, SVM, and RF classifiers for the various
feature extraction methods is shown in Table 1. We conclude that cascading HOG
and LBP feature extraction with KNN and SVM yields recognition rates significantly
higher than either HOG or LBP feature extraction alone.
Fig. 5 Accuracy of different classifiers based on cascading of LBP and HOG features versus
training set size
4 Conclusion
For face recognition, we used LBP, HOG, and a cascade of LBP and HOG
feature extraction methods together with KNN, SVM, and RF classifiers. According
to the experimental results, the recognition rate of the cascaded HOG and LBP
features with KNN and SVM is better than that of the HOG and LBP feature
extraction methods used individually. We also conclude experimentally that
recognition accuracy improves as the size of the training set grows. The proposed
classifiers improve recognition rates while also addressing issues with pose, scale,
expression, and illumination variation. According to recent research, hyper-spectral
or multi-spectral imaging systems may be the future of human face recognition
systems.
References
1. H.I. Dino, Facial expression classification based on SVM, KNN and MLP classifiers, in 2019
International Conference on Advanced Science and Engineering (2019), pp. 70–75
2. Mittal, S. Agarwal, M.J. Nigam, Real-time multiple face recognition: a deep learning approach,
in ACM International Conference Proceeding Series (2018), pp. 70–76
3. V.A. Aviral Joshi, H.M. Surana, H. Garg, K.N. Balasubramanya Murthy, S. Natarajan, Uncon-
strained face recognition using ASURF and cloud-forest classifier optimized with VLAD.
Procedia Comput. Sci. 143, 570–578 (2018)
4. C. Panjaitan, A. Silaban, M. Napitupulu, J.W. Simatupang, Comparison K-nearest neighbors
(K-NN) and artificial neural network (ANN) in real-time entrants recognition, in 2018 Inter-
national Seminar on Research of Information Technology and Intelligent Systems ISRITI 2018
(2018), pp. 1–4
5. M.J. Leo, S. Suchitra, SVM based expression-invariant 3D face recognition system. Procedia
Comput. Sci. 143, 619–625 (2018)
6. C. Eyupoglu, Implementation of color face recognition using PCA and k-NN classifier, in
Proceedings of 2016 IEEE North West Russia Section Young Researches in Electrical and
Electronic Engineering Conference EIConRusNW 2016 (2016), pp. 199–202
7. M. Ghorbani, A.T. Targhi, M.M. Dehshibi, HOG and LBP: towards a robust face recognition
system, in 10th International Conference on Digital Information Management ICDIM 2015,
no. Icdim (2016), pp. 138–141. http://doi.org/10.1109/ICDIM.2015.7381860
8. W.J. Pei, Y.L. Zhang, Y. Zhang, C.H. Zheng, Pedestrian detection based on HOG and LBP, in
Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), vol. 8588. LNCS (2014), pp. 715–720
9. J. Wei, Z. Jian-Qi, Z. Xiang, Face recognition method based on support vector machine and
particle swarm optimization. Expert Syst. Appl. 38(4), 4390–4393 (2011). https://doi.org/10.
1016/j.eswa.2010.09.108
10. X. Wei, G. Guo, H. Wang, H. Wan, A multiscale method for HOG-based face, vol. 1 (2015),
pp. 535–545. http://doi.org/10.1007/978-3-319-22879-2
11. K.J. Julina, S. Sharmila, Facial recognition using histogram of gradients and support vector
machines, in ICCCSP 2017
12. S.M. Bah, F. Ming, An improved face recognition algorithm and its application in attendance
management system. Array 5, 100014 (2020)
13. T. Ahonen, A. Hadid, M. Pietikäinen, Face recognition with local binary patterns, in LNCS
3021 (Springer, Berlin, 2004), pp. 469–481
14. M. Chandrakala, P. Durga Devi, Two-stage classifier for face recognition using HOG features.
Mater. Today Proc. (2021)
15. U. Jayaraman, P. Gupta, S. Gupta, Recent development in face recognition. Neurocomputing
408, 231–235 (2020). http://doi.org/10.1016/j.neucom.2019.08.110
16. http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
17. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in Proceedings
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR
2005, vol. I (2005), pp. 886–893. http://doi.org/10.1109/CVPR.2005.177
18. T. Ojala, M. Pietikäinen, T. Mäenpää, Gray scale and rotation invariant texture classification
with local binary patterns, in Lecture Notes in Computer Science (including Subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1842 (2000), pp. 404–
420
Design of Wideband Metamaterial
and Dielectric Resonator-Inspired Patch
Antenna
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 563
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_51
564 Ch. M. Kumar et al.
In this paper, initially, a circular patch with radius r1 = 3 mm and an outer square
ring with L = 16.5 mm and W = 20 mm is designed; it generates a triple band
covering the C, X, and Ku bands. A new metamaterial structure containing split-type
square and circular rings is then added at the bottom of an FR4 epoxy substrate of
thickness t = 1.6 mm. With this structure added to the antenna, we obtain a narrow
band in the S band and a wide band in the X band and the lower part of the Ku band.
Finally, a cylindrical resonator made of alumina ceramic with εr = 9.9 and
tan δ = 0.0001, with radii r = 7 mm and r = 9 mm, is placed on top of the patch.
With this arrangement, the resonant bands shift slightly.
Fig. 1 a Ant: 1 b Ant: 2 Top view c Ant: 2. Bottom view d E field of Ant: 3 e Dielectric resonator
antenna (Ant: 3) f Unit cell
Design of Wideband Metamaterial and Dielectric Resonator … 565
Ant: 1 Results
Ant: 1 shows good S11 in three bands: 5.65–7.55 GHz, 9.80–14.16 GHz, and
15.34–17.46 GHz. The lowest S11 values of −28.18 dB, −41.5 dB, and −25.24 dB
occur at 6.51, 12.06, and 15.77 GHz, as shown in Fig. 2a. Ant: 1 has a VSWR of
less than 1.68 in all resonating bands, especially from 5.74–7.38 GHz,
9.98–14.03 GHz, and 15.45–16.17 GHz; the VSWR is shown in Fig. 2b.
From Ant: 1, we obtained an isotropic radiation pattern with a gain of 0.3 dB at
6.415 GHz and gains greater than 3.5 dB at 12.067 and 15.772 GHz, as shown in
Fig. 3.
Fabricated Antenna Analysis (Ant: 2): The fabricated design is very compact, as
shown in Fig. 4.
From the HFSS simulation tool, we obtained a wide band at 9.86–14.82 GHz.
Experimentally, we obtained a wide band from 10.72 to 14.20 GHz, with a slight
difference from the simulation results, as shown in Fig. 5.
From Fig. 6, Ant: 2 has gains of 2.4, 4.2, and 4 dB at 3.035, 6.68, and 11.405 GHz.
μ = nz (3)

ε = n/z (4)
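Eqs. (3)–(4) recover the effective permeability and permittivity from the retrieved refractive index n and normalized impedance z; a minimal numeric check, with illustrative (assumed) values for one frequency point:

```python
# Eqs. (3)-(4): mu = n*z and eps = n/z, which imply eps*mu = n**2
# and mu/eps = z**2 as consistency checks on the retrieval.
# Illustrative (assumed) extracted values for one frequency point:
n = 2.1 - 0.3j   # effective refractive index
z = 0.8 + 0.1j   # normalized wave impedance

mu = n * z        # Eq. (3): effective permeability
eps = n / z       # Eq. (4): effective permittivity

print(mu, eps)
```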
2 Conclusion
The main purpose of the design was to obtain a wideband antenna. Here, the wide
bandwidth is obtained by using ring-type structures. The fabricated antenna
resonates in the X band and the lower part of the Ku band. We obtained peak gains
of 2.4, 4.2, and 4 dB at 3.035, 6.68, and 11.405 GHz. The average efficiency of the
antenna was >90%. The structure was fabricated and compared with the simulation
results. In addition, the antenna performance
References
Abstract The main aim of this article is to develop a basic framework for the
different steganography techniques used in real-time applications. Today, as
utilization of the Internet expands, security is gaining importance. Steganography
is a strategy for concealing confidential data behind an innocent cover file so that
its presence is not suspected. The most effective method to protect data is to utilize
the idea of steganography. Steganography is divided into a few types. In this study,
we examine the main sorts of steganography. Taking the cover file as the initial
step common to all categories, we discuss text, image, speech, and video
steganography. Text steganography is a process that uses a text cover; we cover
line shifting, word shifting, the syntactic strategy, and the preferred technique of
selective hiding. Speech steganography covers LSB coding, encoding segments,
spread spectrum, echo hiding techniques, and speech steganography using fast
Fourier transforms. Video steganography is used to hide information behind the
scenes of video frames. The security principles of steganography and cryptography
are also covered, along with the LSB strategy used to embed information. Video
steganography can hide a lot of information in a simple and effective manner, and
we utilize a wide range of steganographic techniques. The experimental evidence
demonstrates the adequacy of the hiding technique in speech steganography. In
this way, we are able to recover classified data, with slight quality degradation,
using the steganographic decoder concept.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 571
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_52
572 R. C. Rao et al.
1 Introduction
The name steganography is derived from the Greek words "stegos," meaning
cover, and "graphia," meaning writing; it is thus defined as "covered writing" [1].
Steganography is the art and science of embedding a secret message in a cover
message in such a way that no one other than the sender and the intended recipient
suspects the secret message in the cover data. There are many types of
steganographic techniques. In this paper, we discuss a few of them closely; a
simple classification of the different techniques involved in steganography is given.
In recent years, steganography has played a vital role in popular social network
communication channels such as Facebook, WhatsApp, and Twitter, and has larger
significance in terms of security, privacy, and embedding capacity parameters [2].
Its applications have also been extended to sharing medical data, banking
information, broadcasting, and military intelligence. Steganography is a widespread
hiding technique at present due to its distortion handling capacity, imperceptibility,
and enormous capacity to embed hidden information. Steganography is classified
into various types as shown in Fig. 1.
All steganographic approaches perform the same basic operation, as shown in Fig. 2.
The sender and receiver should agree on some basic communication protocols before
performing the secret communication, which broadly relate to:
• Cover sources
• Embedding and extraction algorithms
• Stego key sources to drive the embedding/extraction algorithms
• Message sources
• Selection of the channel to exchange the information.
The cover sources are usually digital speech with attributes such as speech format,
resolution, content type, and size. Secret communication between sender and
receiver is not possible if the cover object is completely fixed [3]. Nowadays,
steganography makes some fundamental assumptions about the cover source that
facilitate conventional analysis, for instance treating the cover source as a random
variable, which enables information-theoretic analysis. Here, we describe in detail
how steganography hides a message and how the secret message is recovered. The
following flowchart shows the system of steganographic techniques.
The flowchart above describes, in an easy-to-understand way, the concept of
steganography, which hides a secret message in cover data. In the pictorial
representation, the functions f(x), f(m), and f(k) are merged into a single function
f(x, m, k), where f(x) denotes the cover file, f(m) denotes the secret message to be
embedded, and f(k) denotes the (optional) key, called the stego-key, used as a
password to hide and un-hide the message. We now examine each step in detail,
so that the reader gets a complete overview of the idea of steganography and its
strategies. Consider a cover file c(x) without any secret data embedded in it. We
embed a secret message s(m) into c(x) using a steganographic encoder, producing
the "stego object" f(x, m, k). The resulting stego file remains similar to the original
cover file c(x), and no changes are visible. To recover the code, the stego object is
sent to a "steganographic decoder," which extracts the secret data with slight
degradation in quality.
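The encode/decode round trip described above can be sketched with LSB embedding in a grayscale cover array, a common concrete choice; the stego-key is omitted here and all names are illustrative:

```python
import numpy as np

def embed(cover, message):
    """Hide message bytes in the least significant bits of a flat copy
    of the cover image, prefixed by a 2-byte length header."""
    data = len(message).to_bytes(2, "big") + message
    bits = np.unpackbits(np.frombuffer(data, dtype=np.uint8))
    stego = cover.flatten().copy()
    stego[: bits.size] = (stego[: bits.size] & 0xFE) | bits
    return stego.reshape(cover.shape)

def extract(stego):
    """Recover the hidden bytes from the LSBs using the length header."""
    bits = stego.flatten() & 1
    n = int.from_bytes(np.packbits(bits[:16]).tobytes(), "big")
    return np.packbits(bits[16 : 16 + 8 * n]).tobytes()

cover = np.random.default_rng(2).integers(0, 256, (64, 64), dtype=np.uint8)
stego = embed(cover, b"secret")
print(extract(stego))  # b'secret'

# No pixel changes by more than 1 gray level, so the change is invisible.
assert np.max(np.abs(cover.astype(int) - stego.astype(int))) <= 1
```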
3 Literature Survey
Text steganography is a technique that uses text as the cover. We study some
basic steganographic techniques, namely the line shifting method, the word shifting
method, the syntactic method, and the selective hiding method.
In this method, the data are hidden by using a different line format in the text,
shifting a line upward slightly to conceal the message [1, 2]. The specific pattern
used to create the cover text is produced by shifting lines: bits such as 0, 1, and −1
can indicate unmoved, raised, and lowered lines [3, 4]. Determining whether a line
is shifted up or down is done by measuring the centroid distance between the marked
line and its control lines [5]. If the text is retyped or optical character recognition
(OCR) is used, the embedded information is corrupted. Additionally, the shifts can
be detected using special distance-measurement tools [6]. The main advantage is
that the message survives printing, and only the receiver can extract the secret
information. The disadvantage of this method is that when OCR is applied, the
hidden data are lost.
Word shifting method: here, the information is hidden by shifting words
horizontally, e.g., to the left or right to encode 0 or 1, respectively. For this purpose,
the document is coded separately. This method is applicable only to texts with
variable spacing between adjacent words [6]. Variable spaces in text documents are
commonly used to distribute white space when justifying text. To decode the
message, the original word positions must be known; the comparison can be
performed using a bitmap [7]. The advantage is that the shifts cannot be recognized
easily, because small changes in the gaps between words are normal when extra
spacing is used to fill a line. The disadvantages are that anyone who knows the
word-shifting method or algorithm can recover the message, and that retyping or
applying OCR destroys the stego message [8].
Selective hiding method: this hides the secret in the first (or another specific
position) characters of words; concatenating those characters recovers the content
[9, 10]. The main advantage of this method is its security and capacity, the aspects
of steganography that make it useful for the hidden exchange of information
through text documents and for establishing secret messages.
Basic Framework of Different Steganography Techniques … 575
3.2 Image Steganography
This secret messaging technique is called steganography; the name is derived from
the Greek steganos ("covered") and graphein ("writing"). In modern times,
steganography can be viewed as the art and science of communicating in a way that
conceals the existence of the communication in order to protect information.
Images are an excellent medium for hiding information because they offer a high
degree of redundancy: many bits provide greater accuracy than is needed for
displaying the object [11, 12]. The basic framework of image steganography is
presented in Fig. 3. Based on the embedding approach, steganographic strategies
are divided into the following domains.
3.2.1 Compression
When working with high-resolution images with high color depth, the raw file size
can be large and hard to transmit over a standard Internet connection. To address
this, compressed image formats have been developed which, as the name suggests,
compress the pixel data and keep file sizes small enough to transfer [12].
Spatial-domain strategies embed the private message or payload directly in the
pixel intensities; that is, they update the pixel data by overwriting selected bits.
Lossless image formats suit these methods, since compression will not alter the
embedded data. These methods must take the image format into account so that the
concealment does not leave detectable evidence [13, 14].
This method converts the private message or payload into a bit stream and writes it
into the LSB (8th bit) of some or all bytes within the image. Each change alters the
intensity value by at most ±1, which is hard for the human eye to detect. When a
24-bit image is used, a bit can be embedded in each of the red, green, and blue
components. With about 256 levels for each primary color, changing the LSBs of a
pixel brings only a slight change in color intensity [15].
To understand how steganography works for JPEG files, we must first understand
how the raw data are compressed by JPEG and then how data can be hidden in the
result.
The human eye is more sensitive to changes in the brightness (luminance) of a pixel
than to a change in color, and it perceives light and color relative to the surrounding
areas. The compression stage uses this fact and converts the image from RGB color
to the YCbCr representation, in which luminance is separated from color. In the
YCbCr representation, the Y component corresponds to luminance (black–white),
while the Cb (yellow–blue) and Cr (green–red) components carry the color. Some
color data are then discarded by halving the chroma samples in both the horizontal
and vertical directions, directly shrinking the stored color data.
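The color-space step described above can be sketched as follows, using the standard ITU-R BT.601 RGB-to-YCbCr conversion and 2×2 chroma subsampling; the function names are illustrative:

```python
# RGB -> YCbCr conversion (ITU-R BT.601 coefficients) followed by
# 2x2 chroma subsampling, the first steps of JPEG compression.

def rgb_to_ycbcr(r, g, b):
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample(plane):
    """Keep every second sample horizontally and vertically (4:2:0)."""
    return [row[::2] for row in plane[::2]]

y, cb, cr = rgb_to_ycbcr(255, 0, 0)   # pure red
print(round(y, 1), round(cb, 1), round(cr, 1))
```

A neutral gray or white pixel maps to Cb = Cr = 128, i.e., zero chroma, which is why discarding chroma resolution is far less visible than discarding luminance.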
This transform is regarded as the principal transform used to carry out Fourier
analysis in many working systems. In image processing, the samples can be the
pixel values in a sequence or a raster-scan segment of the picture [16].
The LSB algorithm is an easy way to insert data into a digital audio file [12].
Sampling followed by quantization converts the analog audio signal into a digital
binary sequence for each computer-generated audio file, and the message bits
replace the corresponding binary values [18]. We embed one bit of the binary
message in the LSB of each sample point. The great advantage of the LSB
algorithm is that it allows a large amount of data to be embedded in a message file;
the transfer rate is about 1 kbps. In other words, the LSB algorithm can simply be
described as the classic stego method used to hide the presence of private
information within a public cover (or message) file. For example, the decimal
number 493 is written in binary as 111101101. Reading the bits from right to left,
the LSB in this case is 1, as shown in Fig. 5. The LSB algorithm replaces the LSB
of each byte of the "carrier" data with a bit of the "secret" message.
The sender likewise performs the process of embedding the secret message into the
carrier file byte by byte. The receiver performs the extraction procedure by reading
the LSB of each received byte. So, how does a receiver recover the secret message?
The flow diagrams in Figs. 6 and 7 give a simple way to understand the extraction
of the secret message.
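The byte-wise extraction described above can be sketched as follows; the receiver is assumed to know the message length in bits, and all names are illustrative:

```python
# Receiver side of the LSB scheme: read the LSB of each received byte
# and reassemble those bits, MSB first, into the secret message.

def extract_lsb(stego_bytes, n_bits):
    bits = [b & 1 for b in stego_bytes[:n_bits]]
    out = bytearray()
    for i in range(0, n_bits, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

# Round trip with a trivial embedder:
carrier = list(range(16))
secret = b"Hi"
bits = [(c >> i) & 1 for c in secret for i in range(7, -1, -1)]
stego = [(v & 0xFE) | bit for v, bit in zip(carrier, bits)]
print(extract_lsb(stego, 16))  # b'Hi'
```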
In this concept, the confidential message is inserted into the baseband signal as an
"echo." Three parameters of the echo are considered: amplitude, decay rate, and
offset from the original signal. These are varied to represent the encoded secret
binary message, and they are kept below the threshold of the human auditory
system (HAS) so that the echo cannot easily be resolved. The original cover video
consists of frames represented by Ck(m, n), where k = 1, ..., N, N is the total
number of frames, and m, n are the row and column indices of the pixels. The
binary secret message, denoted Mk(m, n), is embedded into the cover video by
modulating it into a signal. The stego-video signal is represented as
"Wavelet transformation can be considered as converting the signal from the time
domain to the wavelet domain." This new domain uses more complex basis
functions called wavelets, mother wavelets, or analyzing wavelets [5]. Wavelet
analysis permits splitting the message signal into two complementary parts; this
process is called "decomposition." Features at the edges of the signal mostly appear
in the high-frequency part, so the message signal is passed through a series of
high-pass filters to analyze these high frequencies. Filters with various cutoffs are
used to analyze the signal at different resolutions [6]. The DWT process involves
choosing positions and scales based on powers of two, called dyadic scales and
positions. The mother wavelet is dilated by powers of two and translated by
integers. Specifically, for a function f (t) ∈ L2(R), the space of square-integrable
functions, the function ψ(t) is known as the mother wavelet, while the function
φ(t) is known as the scaling function.
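One level of the decomposition described above can be sketched with the simplest mother wavelet, the Haar wavelet: the scaling (low-pass) filter produces the approximation and the wavelet (high-pass) filter the detail. This pure-Python sketch is illustrative only:

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT of an even-length sequence: split into a
    low-frequency approximation and a high-frequency detail part."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse transform: the two parts reconstruct the signal exactly."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) * s, (a - d) * s]
    return out

x = [4.0, 6.0, 10.0, 12.0]
a, d = haar_dwt(x)
print(haar_idwt(a, d))  # recovers [4.0, 6.0, ...] up to rounding
```

The detail coefficients are near zero wherever the signal is smooth, which is why edges and embedded high-frequency content concentrate in the detail part.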
580 R. C. Rao et al.
The concept of video steganography is employed to hide information behind the
frames of a video. First, the information is encrypted using a cryptographic
algorithm; next, the encrypted data are embedded into the frames of the video. The
technique used to embed the information is LSB coding. Video steganography can
hide a great deal of data in a simple and efficient way, as shown in Fig. 8. Video
steganography uses a mix of both cryptography and steganography to hide the key
data behind the video clips, so it is effectively a double security system. The secret
data are first encrypted, and then the encrypted data are hidden behind the frames of
the video. Using a key known only to the sender and receiver, the sender encrypts
the information and sends it to the destination, where the receiver decrypts the
information using the identical key. Nobody can easily detect the secret information
hidden behind the video; only the change in size between the original video and the
stego video indicates that information is hidden behind it [20].
The hash-based least significant bit technique for video steganography inserts the
bits of a text file into the least significant bit positions of the RGB pixels of the
video, as selected by a hash function. It thus includes an encoding process and a
decoding process for hiding and extracting the message, respectively. First, the text
is embedded within the video using the steganographic tool; the resulting stego
video file is then fed to the steganographic tool again to decode the embedded data
[21].
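As a rough illustration of the hash-based selection idea, the sketch below uses a keyed hash of the embedding index to pick which RGB channel of a pixel receives the next secret bit. The specific hash (SHA-256 with a shared key) and all names are assumptions for illustration, not the scheme of [21]:

```python
import hashlib

def channel_for(index: int, key: bytes) -> int:
    """Hash the position with a shared key to choose a channel."""
    digest = hashlib.sha256(key + index.to_bytes(4, "big")).digest()
    return digest[0] % 3          # 0 = R, 1 = G, 2 = B

def embed_bit(pixel, index, key, bit):
    """Write one secret bit into the LSB of the hash-selected channel."""
    vals = list(pixel)
    ch = channel_for(index, key)
    vals[ch] = (vals[ch] & 0xFE) | bit
    return tuple(vals)

pix = embed_bit((10, 20, 30), index=0, key=b"shared", bit=1)
print(pix)
```

Because the channel sequence depends on the shared key, a receiver with the key can recompute the same positions, while an observer without it cannot tell which LSBs carry payload.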
Various steganographic methods have been proposed in the literature. A secured
hash-based LSB technique for image steganography has been implemented [22].
The basic requirements of hiding data in a cover file are explained in [23]. A
technique of data hiding for high-resolution video is proposed in [24]. Hiding data
using the motion-vector technique for moving objects is introduced in [25]; the
compressed video is used for data transmission since it can hold a large volume of
data. A stego machine that develops a steganographic application to hide text data
in a computer video file and to retrieve the hidden information is designed in [26].
A robust method of imperceptible audio, video, text, and image hiding is proposed
in [27]. A secure and robust algorithm is introduced in [28]. An improved least
significant bit (LSB)-based steganography technique for images imparting better
information security is presented in [29]. A compressed-video steganographic
scheme in which the data are hidden in the horizontal and vertical components of
the motion vectors is proposed in [30]. A system for data hiding uses AES
encryption for generating a secret hash function or key [31]. On this basis, a
hash-based least significant bit (LSB) technique has been proposed.
4 Conclusion
References
1. M.A. Saleh, Image steganography techniques—a review. Int. J. Adv. Res. Comput. Commun.
Eng. 7(9), 52–58 (2018). https://doi.org/10.17148/IJARCCE.2018.7910
2. S.H. Low, N.F. Maxemchuk, J.T. Brassil, L.O. Gorman, Document marking and identification
using both line and word shifting, in INFOCOM’95 Proceedings of the Fourteen Annual Joint
Conference of the IEEE Computer and Communication Societies (1995), pp. 853–860
3. S. Bhattacharyya, I. Banerjee, G. Sanyal, Data hiding through multi level steganography and
SSCE. J. Glob. Res. Comput. Sci. J. Sci. 2(2), 38–47 (2011). ISSN: 2229-371x
4. J.T. Brassil, S. Low, N.F. Maxemchuk, L. O’Gorman, Electronic marking and identification
techniques to discourage document copying. IEEE J. Sel. Areas Commun. 13(8), 1495–1504
(1995)
5. M.S. Shahreza, M.H.S. Shahreza, Text steganography in SMS, in 2007 International Confer-
ence on Convergence Information Technology (2007), pp. 2260–2265
6. W. Bender, D. Gruhl, N. Morimoto, A. Lu, Techniques for data hiding. IBM Syst. J. 35, 313–336
(1996)
7. F.A.P. Petitcolas, R.J. Anderson, M.G. Kuhn, Information hiding—a survey. Proc. IEEE 87,
1062–1078 (1999)
8. L.Y. Por, B. Delina, Information hiding—a new approach in text stegano, in 7th WSEAS
International Conference on Applied Computer and Applied Computational Science, 2008,
pp. 689–695.
9. L.Y. Por, T.F. Ang, B. Delina, WhiteSteg—a new scheme in information hiding using text
steganography. WSEAS Trans. Comput. 7(6), 735–745 (2008)
10. S. Changder, D. Ghosh, N.C. Debnath, Linguistic approach for text steganography through
Indian text, in 2010 2nd International Conference on Computer Technology and Development
(2010), pp. 318–322
11. S. Kurane, H. Harke, S. Kulkarni, Text and audio data hiding using LSB and DCT a review
approach, in National Conference on “Internet Things Towar a Smart Future” Recent Trends
in Electronics and Communication (2016)
12. E. Nandhini, M. Nivetha, S. Nirmala, R. Poornima, MLSB technique based 3D image
steganography using AES algorithm. J. Recent Res. Eng. Technol. 3(1), 2936 (2016)
13. J. Kour, D. Verma, Steganography techniques—a review paper. Int. J. Emerg. Res. Manag.
Technol. 9359(35), 2278–9359 (2014)
14. E.R. Harold, What is an Image (2006)
15. M. Hussain, M. Hussain, A survey of image steganography techniques. Int. J. Adv. Sci. Technol.
54, 113–124 (2013)
16. N. Hamid, R.B. Ahmad, Image Steganography Techniques: An Overview, vol. 6 (2012),
pp. 168–187
17. G.N. Kumar, V.S.K. Reddy, Extraction of key frames using rough set theory for video retrieval,
in International Conference on Soft Computing and Signal Processing (Springer, Singapore,
2019), pp. 761–768
18. C.A. PetrKlapetek, D. Nečas, Wavelet transform [Online] (2016)
19. S. Khosarvi, M.A. Dezfoli, M.H. Yektaie, A new steganography method based HIOP algorithm
and Strassen’s matrix multiplication. J. Glob. Res. Comput. Sci. 2(1) (2011)
20. G.J. Simmons, The prisoners’ problem and the subliminal channel, in Proceedings of Advances
in Cryptology (CRYPTO’83), pp. 51–67. J.F. Berglund, K.H. Hofmann, Compact Semitopo-
logicalsemigroups and Weakly Almost Periodic Functions. Lecture Notes in Mathematics, vol.
42 (Springer, Berlin, New York, 1967)
21. B. Dunbar, A Detailed look at Steganographic Techniques and Their use in an Open System
Environment (SANS Institute InfoSec Reading Room, 2002)
22. B. Lin, B. Nguyen, E.T. Olsen, in Orthogonal Wavelets and Signal Processing, ed. by P.M.
Clarkson, H. Stark. Signal Processing Methods for Audio, Images and Telecommunications
(Academic, London, 1995), pp. 1–70
23. S. Mallat, A Wavelet Tour of Signal Processing (Academic, San Diego, CA, 1998)
24. S Andreas, P.T. Ed, A. Venkatraman, Audio Signal Processing and Coding (Wiley-Interscience
Publication, USA, 2006). ISBN 978-0-471-79147-8, TK5102.92.S73
25. A. Kumar, R. Sharma, A secure image steganography based on RSA algorithm and hash LSB
technique. Int. J. Adv. Res. Comput. Sci. (2013)
26. K. Rabah, Steganography the art of hiding. Inf. Technol. J. 3(3), 245–269 (2004)
27. A.K. Bhaumik, M. Choi, R.J. Robles, M.O. Balitanas, Data hiding in video. Int. J. Database
Theory Appl. 2(2), 9–16 (2009)
28. P. Paulpandi, T. Meyyappan, Hiding messages using motion vector technique in video
steganography. Int. J. Eng. Trends Technol. 3(3), 361–365 (2012)
29. M. Ramalingam, Stego machine video steganography using modified LSB algorithm. World
Acad. Sci. Eng. Technol. 50, 497–500 (2011)
30. P. Bhautmage, A. Jeyakumar, A. Dahatonde, Advanced video steganography algorithm. Int. J.
Eng. Res. Appl. (IJERA) 3(1), 1641–1644 (2013)
31. G.S. Naveen Kumar, V.S.K. Reddy, An efficient approach for video retrieval by spatio-temporal
features. Int. J. Knowl. Based Intell. Eng. Syst. 23(4), 311–316 (2019)
Call Admission Control for Interactive
Multimedia Applications in 4G Networks
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 585
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_53
586 K. Keshav et al.
1 Introduction
Virtual reality (VR) and augmented reality (AR) are emerging interactive
multimedia applications. They are useful in gaming, remote healthcare (especially
during pandemic times), autonomous transport, remote educational training, the
tactile Internet, and industrial automation. However, these applications face
challenges over 4G LTE cellular networks, because VR applications have
significant throughput requirements for both uplink and downlink. First, more
capacity is required in both directions, although the requirement is asymmetric.
Second, low network latency is essential for an immersive experience, and a VR
application's uplink latency needs are more stringent than its downlink needs. The
third key requirement is a consistent experience of full immersion everywhere,
which requires consistent throughput even at the cell edge.
Call admission control is an essential tool in wireless networks: it ensures the
quality of experience (QoE) for various applications by keeping the load at the right
level. Call admission control is usually performed based on service-level
agreements; it regulates the incoming flows so that the network can ensure QoS for
both existing and new flows. A robust call admission control scheme is therefore
sought to address the traffic demand of interactive multimedia applications.
This research work aims to design solutions to the critical issues in 4G-based
integrated multimedia networks for interactive applications such as interactive
virtual reality. To solve these issues in the cellular network, the use of agent
technology is proposed. Agents can make decisions by themselves on behalf of the
user, migrate from node to node, and dynamically resolve issues occurring at
various network elements. Thus, a mobile agent-based framework provides an
efficient solution to multimedia network issues.
The rest of the paper is organized as follows. In Sect. 2, existing work on the call
admission issues arising in handling interactive multimedia is discussed. In Sect. 3,
call admission control issues in 4G networks are discussed. In Sect. 4, we propose
an agent-based call admission control scheme which effectively takes care of the
quality-of-service needs of interactive VR applications. In Sect. 5, the analytical
model of the proposed scheme is presented. The simulation environment and results
are discussed in Sects. 6 and 7. The conclusion is presented in Sect. 8, and finally
future work is presented in Sect. 9.
2 Existing Work
Call admission control is an important tool to meet the desired QoS for users in 4G
LTE cellular networks, and several important works address call admission control
issues in LTE networks. In [12], the authors propose call admission control to
improve parameters such as packet delay and packet dropping as well as the call
dropping probability. In [9], a two-tier call admission control scheme for 4G
networks is presented to take care of packet-level and call-level QoS considerations
across the network, i.e., both the wired and wireless parts. In [5], the author
presents a Markov chain-based performance model for a dynamic call admission
control scheme in cellular networks. In [2], a portion of the bandwidth used by
admitted non-real-time traffic is released, with the aim of reducing the call dropping
probability (by reducing the handover call drop probability) and increasing
bandwidth utilization to provide high QoS to real-time traffic. In [3], a call
admission control mechanism considering macro and small cells is proposed. In
[7], to avoid the starvation of best-effort traffic, a novel CAC scheme providing
effective use of network resources is presented; however, the authors do not discuss
the needs of interactive applications. A scheme involving an adaptive threshold
value, which adapts the network resources under heavy traffic intensity, is also
presented there. In [10], a channel-borrowing CAC scheme for two-tier
LTE/LTE-Advanced networks is proposed. Autonomous mobile agents have been used in
networks to solve various issues. A mobile agent consists of code, state, and attributes
and therefore mobile agents allow all network components to become intelligent
[14]. Few techniques on how best the base stations can allocate resources in the 5G
network are investigated in [11]. In [6], an overview of the security issues related to
mobile agent-based systems is presented. In [13], the author proposes a call
admission control scheme designed primarily with minimal energy consumption in
mind and to ensure an acceptable quality of experience for application requests. In [1], call
admission control is addressed for a heterogeneous cell, which involves both small
and large cells. New handover schemes which take care of movement across the
large cell and small cell are proposed. A Markov chain technique is used to calculate
call blocking probability of various subscriber requests as they move across different
types of cells. In [8], several schemes such as complete sharing and probabilistic
threshold policy are used for call admission control across 4G/5G networks. In [4],
a Markov Model of Slice Admission Control is presented.
3 Call Admission Control Issues in 4G Networks
The 4G network provides a call admission control mechanism through access
classes, which can be controlled using Allocation and Retention Priority (ARP)
parameters. Each bearer in the 4G network has an associated Priority Level (PL),
and fifteen levels of priority are provided. However, given the variety of
applications that have evolved in recent years, even fifteen priority levels in 4G
networks are not sufficient. Another crucial issue is handling call admission control
across multiple wireless access technologies, such as cellular and WiFi, based on
the user's location: not only handovers within the LTE network but also handovers
from other technologies should be accommodated in call admission control.
Therefore, 4G networks need to adapt their call admission control mechanisms for
interactive multimedia applications. There are many approaches, such as guard
channel, fractional guard channel, mobility-based, and price-based mechanisms,
which guide an application's admission control in 4G networks. 4G networks
provide means to perform call admission control based on different types of users;
however, considering the latency needs of interactive multimedia applications like
VR as an essential parameter, call admission control mechanisms need to be
redesigned.
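The guard-channel family mentioned above admits every request while occupancy is below a threshold; above it, low-priority requests are admitted only with some probability β (β = 0 recovers the classic guard channel, 0 < β < 1 the fractional variant). A minimal sketch, with illustrative names and values:

```python
import random

def admit(occupied, capacity, threshold, interactive, beta=0.0, rng=random):
    """Admission decision for one arriving request."""
    if occupied >= capacity:
        return False                  # no free slot at all
    if interactive or occupied < threshold:
        return True                   # priority class, or load below threshold
    return rng.random() < beta        # fractional guard-channel region

print(admit(occupied=8, capacity=10, threshold=6, interactive=False))  # False
print(admit(occupied=8, capacity=10, threshold=6, interactive=True))   # True
```

The reserved slots between the threshold and full capacity play the same role as the VR-only region above the threshold N in the model of Sect. 5.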
4 Proposed Scheme
In Fig. 1, the architecture of the proposed call admission control scheme using
mobile agents is presented. Static agents are deployed in 4G cells, the 4G home
subscriber server, and the media servers. Based on need, mobile agents are
deployed on other key components of the 4G network environment, for example,
the packet gateway, the serving gateway, and the UEs.
The proposed agent-based call admission control scheme has three essential steps.
In the first step, the call admission policy is decided. For this, the static agent in the
eNB dispatches mobile agents to the entities involved in the call admission control
mechanism in the 4G network. Initially, when the system is not under high load, all
types of traffic connections, both interactive VR and non-interactive VR, are
accepted. The policy defines the present value of the admission threshold to be
used. Algorithms 1, 2, and 3 give the steps of the proposed agent-based call
admission control scheme. In Algorithm 1, the call admission policy for 4G
networks is derived based on the usage of network resources, the history of past
application performance, and radio-level measurements during previous sessions;
the call admission threshold is increased or decreased accordingly to prioritize
admission control for interactive applications. In Algorithm 2, bearer allocation and
the actual admission of interactive applications are decided, including dropping
non-VR applications if necessary. In Algorithm 3, a scheme is proposed to adapt
the call admission control policy by changing the admission threshold based on
implicit user feedback.
9: Agents monitor user behavior like user aborting the session in between due to poor quality
10: if Quality is poor during the previous session then
11: MA will update PCRF and PCEF will be updated with new policy
12: Call admission threshold is increased so as not to allow non-interactive calls
13: else if User feedback (collected in a non-intrusive way) is satisfactory then
14: Call admission threshold is adapted to allow more non-interactive calls
15: eNB static agent, mobile agents across network elements P-GW and S-GW are updated
with new policy
16: end if
17: end while
18: End
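The feedback loop of Algorithm 3 can be sketched as follows. The quantity adapted here is the budget of non-interactive admissions (shrinking this budget corresponds to raising the admission threshold in the algorithm's wording); the step size, bounds, and function name are illustrative assumptions:

```python
def adapt_policy(noninteractive_budget, quality_poor, max_budget,
                 step=1, floor=0):
    """Return the new number of non-interactive calls that may be admitted."""
    if quality_poor:
        # Algorithm 3, step 12: restrict non-interactive admissions
        # to protect interactive VR sessions.
        return max(floor, noninteractive_budget - step)
    # Step 14: satisfactory feedback, admit more non-interactive calls.
    return min(max_budget, noninteractive_budget + step)

print(adapt_policy(5, quality_poor=True, max_budget=10))   # 4
print(adapt_policy(5, quality_poor=False, max_budget=10))  # 6
```

In the proposed scheme, the updated value would be pushed by the mobile agents to the eNB, P-GW, and S-GW as a new policy, as steps 11 and 15 describe.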
5 Analytical Model
An analytical model based on an M/M/1 system, with the extra capability of
accepting only interactive calls once system overload is detected, is proposed. In
the beginning, both interactive and non-interactive application requests are accepted
as they arrive. When the combined interactive and background load goes beyond a
threshold, the applications' delay also becomes very high. Specifically, when the
total incoming request rate becomes high, the static agent in the eNB, in
consultation with the mobile agents in the S-GW and P-GW, decides on the call
admission threshold. The threshold depends on the capacity presently available
across these nodes and the requirements of interactive VR applications. The mobile
agent-based system can thus decide on this threshold value as per the requirements
of VR applications before actual overload happens. Beyond the threshold, only
interactive applications are accepted on the path across the eNB, P-GW, and S-GW,
and non-interactive applications are redirected through other available links. Other
available links could be a device-to-device link available in the 4G network or
another radio access technology.
Figure 2 describes the analytical queuing model used for call admission control,
and Fig. 3 describes the state diagram for the proposed call admission control
model. It is assumed that each session of the same type of traffic has the same
requirements in terms of the number of slots occupied, so each interactive and
non-interactive session takes one slot. The scheduling period in LTE is one subframe.
Let the maximum number of non-interactive sessions be N_N and the number of
interactive sessions be N_I. Let the total capacity be C, with N_I + N_N < C. It is
assumed that the interactive application arrivals follow a Poisson process with rate
λ_I; similarly, the non-interactive arrivals follow a Poisson process with rate λ_N.
Let N be the threshold on the total request rate, inclusive of VR and non-VR
requests, beyond which further arriving non-VR requests are offloaded to a nearby
link such as a device-to-device link, as shown in Fig. 1. Note that N ≤ C.
In Fig. 3, the state diagram for the proposed call admission control model is
presented. p_k denotes the probability that there are k calls (data sessions) in the
system. Let p be the fraction of the total request rate contributed by VR application
requests.
p λ p_k = μ p_{k+1},  k ≥ N    (2)
So, with ρ = λ/μ,
p_k = ρ^k p_0,  0 ≤ k < N    (3)
and
p_{N+k} = ρ^N (pρ)^k p_0,  k ≥ 0    (4)
From the normalization, Σ_{k=0}^∞ p_k = 1. Now an important measure is P_rej,
which defines the fraction of the non-VR application request rate that is not
accepted and is offloaded to nearby links:
P_rej = Σ_{k=0}^∞ p_{N+k} = p_0 ρ^N / (1 − pρ)    (5)
Delay_I and Delay_N are the mean delays for the VR and non-VR request classes
of traffic. Using PASTA, we can calculate the delays as below:
Delay_N = p_0 [1 − ρ^{N+1} − (N+1) ρ^N (1 − ρ)] / (1 − ρ)² · (1/μ)    (6)
Delay_I = p_0 { [1 − ρ^{N+1} − (N+1) ρ^N (1 − ρ)] / (1 − ρ)² + N ρ^N / (1 − pρ) + ρ^N / (1 − pρ)² } · (1/μ)    (7)
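The stationary distribution and rejection probability can be checked numerically. The sketch below implements Eqs. (3)–(5) directly; the function names and the parameter values are illustrative assumptions, and the infinite tail is truncated, which is valid since pρ < 1:

```python
def stationary_probs(rho, p, N, kmax=2000):
    """Stationary distribution of the threshold model, Eqs. (3)-(4),
    truncating the geometric tail at kmax terms."""
    weights = [rho**k for k in range(N)]
    weights += [rho**N * (p * rho)**k for k in range(kmax)]
    p0 = 1.0 / sum(weights)
    return [p0 * w for w in weights], p0

def rejection_prob(rho, p, N):
    """Eq. (5): fraction of time arriving non-VR requests are offloaded."""
    _, p0 = stationary_probs(rho, p, N)
    return p0 * rho**N / (1 - p * rho)

probs, p0 = stationary_probs(rho=0.8, p=0.5, N=10)
print(round(sum(probs), 6))   # -> 1.0
print(round(rejection_prob(0.8, 0.5, 10), 4))
```

Summing the computed tail probabilities reproduces the closed form of Eq. (5), which is a quick consistency check on the derivation.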
6 Simulation Environment
In the simulation, we consider an LTE-Advanced network spanning a residential
society. A virtual reality server is installed near the base station, acting as an edge
server, and there is also a main server in the Internet domain. The VR application is
activated on the server and is accessible to all 100 UE nodes, which are spread
across the network. Bearer allocation follows a definite rule: non-interactive VR
applications have only a default bearer, while interactive VR users have both a
dedicated bearer and a default bearer.
7 Results
For 4G systems, it is observed that, since 4G capacities are not sufficient for
high-end VR applications, the system becomes overloaded. In the overloaded
system, when both interactive VR requests and non-interactive requests arrive,
non-VR requests are offloaded at an adaptive rate to the alternative links available.
If no alternate link is available, these non-VR applications can wait for some time,
and if resources are still not available, the non-interactive requests are dropped. The
delay performance of the interactive and non-interactive applications with respect
to an increasing arrival rate is shown in Fig. 4a. Here, we see that as the number of
users increases, interactive VR applications face the least delay because of the
proposed mobile agent-based scheme, while non-interactive VR applications
experience higher delay, as expected. Further, the amount of non-interactive traffic
being offloaded has been studied: as the incoming traffic load increases, a growing
portion of the non-interactive traffic is offloaded, as shown in Fig. 4b. Together, the
two graphs in Fig. 4a, b show that appropriately prioritizing interactive VR
applications after a specific load level, and prioritizing resources for interactive
multimedia across the 4G data path (eNodeB, S-GW, and P-GW), leads to the
desired performance for interactive VR applications.
8 Conclusion
4G LTE networks are resource constrained, especially when there is a large number
of users in an area. When the number of interactive and non-interactive VR users
increases, the system becomes overloaded. Whenever an overload condition exists,
the scheme of offloading non-VR traffic allows the system to give preferential call
admission to VR users. The proposed scheme provides improved performance in
4G networks, as interactive VR requests incur less delay compared to non-VR
requests. For dynamic call admission control, the rate at which non-VR traffic waits
adapts to the increased load to maintain overall system stability.
9 Future Works
4G networks do not provide a scalable solution for meeting needs such as
throughput or low delay for latency-dependent interactive applications when there
is a large number of users, and most 4G networks around the world are overloaded.
The call admission control mechanism needs to be extended across multiple types
of networks so that smooth VR performance is achieved when users move across a
non-compatible set of technologies. 5G is an upcoming technology that addresses
these scalability issues; however, 5G networks are still evolving and do not yet
support the efficient call admission control needed for interactive multimedia. As
5G networks support large bandwidth, an analytical model for 5G networks needs
to be developed considering an under-loaded system. In future, the present work
can be extended to 5G networks and integrated 4G/5G networks, and further to
integrated networks consisting of 3GPP-based 4G/5G networks as well as IEEE
WiFi networks.
AI-Based Pro Mode in Smartphone
Photography
Abstract The auto mode of smartphone cameras has a very limited scope. A person who knows how to use the manual mode can capture better pictures, but changing settings such as HDR, hybrid zoom, night mode, shutter speed, and ISO in manual mode takes time and the right knowledge to capture the best image, and one may miss the moment when it was actually the right time to take the picture. So, we have come up with an idea to capture best-quality images using AI, which is not too time-consuming and lessens the effort of the user. We aim to enhance the image quality based on the individual components of the image.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 597
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_54
598 P. Nagpal et al.
our database. We extract features from the newly captured image, find similar images in our database, and apply the features from those images to the new image for enhancement. We have tried and tested this algorithm and the associated code on a Samsung Galaxy S21 Ultra.
The authors propose a methodology to edit images using textual commands through a virtual agent. The main challenges in this sequential and interactive image production activity are as follows: (1) maintaining consistency, based on the context, between the images that are generated and the text descriptions provided; (2) step-wise modification of the generated image at the region level so that it looks consistent as a whole. To address these challenges, the authors propose a novel sequential attention generative adversarial network (SeqAttnGAN), which tracks and encodes the states of the previous sequential image and its textual description, and uses a GAN architecture to regenerate an enhanced version of the image that is both consistent and coherent with the preceding images and their descriptions [3].
Another work presents a new image editing approach with convolutional networks to automatically alter image content with a desired attribute while keeping the image photorealistic. The proposed approach effectively combines the strengths of two prominent image editing algorithms, conditional generative adversarial networks and deep feature interpolation, to be time-efficient, memory-efficient, and user-controllable [4, 5]. The authors also present an inverted deep convolutional network to facilitate the proposed image editing approach. Although the generated image is photorealistic and of good quality, the generator produces an entirely new image; in our approach, we enhance the original image quality instead.
3 Methodology Used
The first and foremost step is to build a database with images of appropriate NIQE values. Then, we perform image segmentation of the database images into components and analyze their contribution to the whole image. We follow the same process of image segmentation for test images. The next step is to compare the components of the test image with the corresponding entries in the database and apply the effects of the closest matching images to the components separately. Recursively, we add this image to the database and train the model again, as shown in Fig. 1.
Segmentation Pseudocode
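As a sketch of the segmentation step, the k-means clustering on pixel colors described in Sect. 3.3 can be written in plain numpy. The farthest-point initialization and the toy two-tone "image" are our own assumptions, not taken from the paper:

```python
import numpy as np

def kmeans_segment(pixels, k, iters=20):
    """Cluster pixel colors into k components (plain k-means)."""
    # farthest-point initialization keeps the starting centroids apart
    centroids = [pixels[0].astype(float)]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(pixels - c, axis=1) for c in centroids], axis=0)
        centroids.append(pixels[d.argmax()].astype(float))
    centroids = np.array(centroids)
    for _ in range(iters):
        # assign every pixel to its nearest centroid
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = pixels[labels == j].mean(axis=0)
    return labels, centroids

# toy "image": five dark pixels and five bright pixels
img = np.array([[10, 10, 10]] * 5 + [[240, 240, 240]] * 5, dtype=float)
labels, cents = kmeans_segment(img, k=2)
# contribution = fraction of total image area covered by each component
contribution = np.bincount(labels, minlength=2) / len(labels)
```

The `contribution` vector corresponds to the "percentage of the total area covered by each component" used in Sect. 3.2.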
Algorithm Flowchart
See Fig. 1.
3.2 Modules
• Segmentation of the images that will be used to create the database to get the
different components in the image and their corresponding contribution to the
whole image [6]. Here, contribution means the percentage of the total area of the
image covered by each component.
• A database of images will be created for different components. The database will
consist of the contribution of the components and their properties [7].
• The NIQE algorithm will be used to verify the quality of the components of the images used to create the database; only images under an appropriate value (the lower the value, the better the image quality) will be chosen. The captured test image will go through the same process of segmentation, and the components will be extracted.
• These components will be classified into the various categories of the database and
they will be matched with the closest entries of the same individual components
in the database. Some components from the database which are the closest to the
components of the test image and have a good NIQE score will be selected.
• The image properties of the components chosen from the database will be applied
on the individual test components to enhance them. These properties will go
through scaling and normalization [8].
• Individual components will be enhanced and now the overall quality of the image
will be validated through NIQE score and this image will be entered into the
database again to improve the database.
3.3 Steps
Image segmentation means the segregation of an image into various groups. This way, we will have a set of images with good image quality, verified by the NIQE score, and the number of images in the database will increase for every category. The segmented parts will be used as individual images for the database. The steps involved in image segmentation are shown in Fig. 2.
There are several methods to deal with this, and one of the most popular is the k-means clustering algorithm. K-means is an unsupervised algorithm used to differentiate the region of interest from the background of the image. It groups, or separates, the given data into k clusters based on the k centroids.
To perform k-means, we first convert the image from RGB to HSV. The red, green, and blue components of an object's color in a digital image are all correlated with the amount of light striking the object and with one another, so image descriptions in terms of these components make discrimination of objects a tedious process. Descriptions in terms of hue and lightness along with chroma or saturation are often more familiar.
$$
H :=
\begin{cases}
0^{\circ}, & \text{if } \mathrm{MAX} = \mathrm{MIN} \Leftrightarrow R = G = B \\
60^{\circ} \cdot \left[0 + (G - B)/(\mathrm{MAX} - \mathrm{MIN})\right], & \text{if } \mathrm{MAX} = R \\
60^{\circ} \cdot \left[2 + (B - R)/(\mathrm{MAX} - \mathrm{MIN})\right], & \text{if } \mathrm{MAX} = G \\
60^{\circ} \cdot \left[4 + (R - G)/(\mathrm{MAX} - \mathrm{MIN})\right], & \text{if } \mathrm{MAX} = B
\end{cases}
\tag{1}
$$

$$V := \mathrm{MAX}.$$
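Equation (1) can be transcribed directly into code. The final mod-360 wrap is our addition (a standard convention for the MAX = R case, where G < B yields a negative angle); the equation as printed does not state it:

```python
def rgb_to_hv(r, g, b):
    """Hue in degrees and value, per Eq. (1)."""
    mx, mn = max(r, g, b), min(r, g, b)
    if mx == mn:                # R = G = B: achromatic, hue set to 0
        h = 0.0
    elif mx == r:
        h = 60.0 * (0 + (g - b) / (mx - mn))
    elif mx == g:
        h = 60.0 * (2 + (b - r) / (mx - mn))
    else:                       # mx == b
        h = 60.0 * (4 + (r - g) / (mx - mn))
    return h % 360.0, mx        # wrap negative hues into [0, 360)
```

For example, pure red maps to hue 0, green to 120, and blue to 240, matching the usual HSV color wheel.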
3.3.2 Database
enhancing the user images [10, 11]. Image characteristics are explained in detail
in step 6.
The same procedure that was done in step 1 with the database images will be done with the user image to get its individual components. Once we have these individual components, they will be classified into the various categories, and then content-based image retrieval will be performed on them to get the closest matching images from the database. These closest matching images will be used to apply properties to the components to enhance them. The next step explains content-based image retrieval.
• Searching: This step takes the user image, segments it, and stores histograms of the segments (the same steps as done with the database images while storing); it then compares these histograms with the ones in the database and returns the closest matching images.
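The searching step can be sketched as a nearest-histogram lookup. The per-channel histogram and the L1 distance below are illustrative choices, not necessarily the ones used by the authors:

```python
import numpy as np

def color_hist(img, bins=8):
    """Flattened per-channel histogram, normalized to sum to 1."""
    h = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(h).astype(float)
    return h / h.sum()

def closest_match(query, database):
    """Return the index of the database image with the nearest histogram (L1)."""
    q = color_hist(query)
    dists = [np.abs(q - color_hist(img)).sum() for img in database]
    return int(np.argmin(dists))
```

A dark query image is then matched to the dark entry of a database rather than the bright one, since their histograms concentrate in the same bins.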
Figure 4 shows steps 1 and 2, and Fig. 5 shows steps 3 and 4. Figure 6 shows the
output histogram for one of the components of the input image.
The test image from the user is broken down into segments, and the closest matching images, found by comparing the histograms, are returned from the database.
• Contrast: Contrast is the difference between the darkest and brightest areas or
maximum and minimum pixel intensities of the image. It will make the shadows
darker and highlights brighter.
Brightness
See Fig. 7.
Saturation
See Fig. 8.
Contrast
One method to calculate the contrast of an image is to use its maximum and minimum lightness values [14, 15]. The following steps show how contrast enhancement is performed for the new user image:
(1) Convert the RGB values of all pixels of the matching image to HSL (as shown previously).
(2) Calculate the minimum and maximum lightness values of the closest matching image, L2max and L2min, and of the user image, L1max and L1min.
(3) Shift all the lightness values of the user image by subtracting L1min from each of them.
(4) Scale all the lightness values by multiplying them by
Therefore,
This means that on increasing the contrast, the lower-range values decrease and the higher-range values increase, hence increasing the contrast, as shown in Fig. 10.
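The scaling factor itself did not survive in the text. Assuming steps (3) and (4) map the user range [L1min, L1max] onto the match range [L2min, L2max] (the final offset to L2min is also our assumption), a sketch is:

```python
def match_contrast(l_user, l2_min, l2_max):
    """Map user-image lightness values onto the matching image's range."""
    l1_min, l1_max = min(l_user), max(l_user)
    scale = (l2_max - l2_min) / (l1_max - l1_min)   # assumed scale factor
    # step 3: shift by L1min; step 4: scale; then offset to L2min
    return [l2_min + (v - l1_min) * scale for v in l_user]
```

For instance, lightness values [20, 50, 80] stretched onto a target range of [0, 120] become [0, 60, 120]: low values move down and high values move up, which is exactly the contrast increase described above.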
These images do not have any characteristics, so they cannot be sent to the database.
4 Results
Figure 12 shows a sample output of this algorithm with each component enhanced
stepwise.
5 Future Scope
There is much scope for future work with the increasing advancement in technology. In future, we may be able to generate scenery and views that are beyond the reach of the lens, including images that even ultra-wide camera lenses fail to capture. Some emerging generative models can help capture such extravagant photos through mobile photography.
References
1. C. Guo, C. Li, J. Guo, C.C. Loy, J. Hou, S. Kwong, R. Cong, Zero-reference deep curve
estimation for low-light image enhancement, in Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (2020), pp. 1780–1789
2. W. Yang, S. Wang, Y. Fang, Y. Wang, J. Liu, From fidelity to perceptual quality: a semi-
supervised approach for low-light image enhancement, in Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition (2020), pp. 3063–3072
3. M. Wang, Z. Tian, W. Gui, X. Zhang, W. Wang, Low-light image enhancement based on
nonsubsampled shearlet transform. IEEE Access 8, 63162–63174 (2020)
4. P. Zhuang, X. Ding, Underwater image enhancement using an edge-preserving filtering Retinex
algorithm. Multimed. Tools Appl., 1–21 (2020)
5. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image
recognition, v1. arXiv preprint arXiv:1409.1556 (2014)
6. S. Kosugi, T. Yamasaki, Unpaired image enhancement featuring reinforcement-learning-
controlled image editing software. Proc. AAAI Conf. Artif. Intell. 34(07) 11296–11303
(2020)
7. R. Hummel, Image enhancement by histogram transformation. Comput. Graph. Image Process.
6(2), 184–195, 19
8. M.S. Hitam, E.A. Awalludin, W.N.J.H.W. Yussof, Z. Bachok, Mixture contrast limited adaptive
histogram equalization for underwater image enhancement, in Proceedings of the International
Conference on Computer Applications Technology (ICCAT ) (2013), pp. 1–5
9. R. Singh, M. Biswas, Adaptive histogram equalization based fusion technique for hazy
underwater image enhancement, in Proceedings of the IEEE International Conference on
Computational Intelligence and Computing Research (ICCIC), Dec 2016, pp. 1–5
10. S.B. Rana, S.B. Rana, A review of medical image enhancement techniques for image
processing. Int. J. Curr. Eng. Technol. 5(2), 1282–1286 (2015)
11. S. Mahajan, R. Dogra, A review on image enhancement techniques. Int. J. Eng. Innov. Technol.
(IJEIT) 4(11) (2015)
12. Y. Kinoshita, H. Kiya, Hue-correction scheme based on constant-hue plane for deep-learning-
based color-image enhancement. IEEE Access 8, 9540–9550 (2020)
13. G. Hou, Z. Pan, B. Huang, G. Wang, X. Luan, Hue preserving-based approach for underwater
colour image enhancement. IET Image Process. 12(2), 292–298 (2018)
14. W. Xiong, D. Liu, X. Shen, C. Fang, J. Luo, Unsupervised real-world low-light image
enhancement with decoupled networks. arXiv preprint arXiv:2005.02818 (2020)
15. B. Xiao, H. Tang, Y. Jiang, W. Li, G. Wang, Brightness and contrast controllable image
enhancement based on histogram specification. Neurocomputing 275, 2798–2809 (2018)
A ML-Based Model to Quantify Ambient
Air Pollutants
Vijay A. Kanade
Abstract Today, air pollution has become a serious problem for humanity. Air pollution is estimated to kill about seven million people every year globally. Although there has been significant progress in fighting air pollution in recent decades, large masses do not seem to benefit from it, as the number of fatalities due to air pollution continues to pile up. Considering these implications, this research proposes a novel solution that helps determine air pollution in a user's vicinity just by clicking pictures of the surroundings. These images are processed by employing machine learning (ML) techniques. The method tracks air pollution by analyzing the color pattern of the captured environmental images.
1 Introduction
Today, air pollution is considered one of the world's largest environmental health threats. According to a study, global deaths due to air pollution in 2020 stood at 6.6 million [1]. WHO data says 9 out of 10 people breathe polluted air that contains air pollutants above threshold limits [2].
Air pollution has had severe health and climatic impacts. Along with outdoor pollution, indoor (household) air pollution has also majorly affected the global health index. Due to air pollution, there has been a sharp spike in fatalities from heart failures, brain strokes, chronic respiratory infections, lung diseases, and many other conditions.
According to WHO data, ambient air pollution causes nearly 4.2 million deaths
per year. This is due to the fact that about 91% of global population lives in areas
where air quality index (AQI) exceeds the limits set by WHO [2]. The source of
outdoor pollution includes vehicles, power generation units, construction industries,
and manufacturing units (i.e., factories).
V. A. Kanade (B)
Kolhapur, India
On the other hand, it has been identified that household (indoor) pollution has also served as a key driver of premature deaths in developing countries. In households, burning dung, wood, and coal in stoves produces various pollutants and emits fine particles that can damage health. These include particulate matter (PM), methane, carbon monoxide, polyaromatic hydrocarbons (PAH), and volatile organic compounds (VOC) [2]. All these factors contribute to respiratory illness, eye irritation, and even cancer.
Earthly climatic conditions and ecosystems are intrinsically coupled with air quality. The drivers of air pollution have also been identified as key sources of greenhouse gases. Hence, considering the devastating impact that air pollution can have on life on earth, strategizing policies to reduce air pollution can offer both climatic and health benefits. Solutions that tackle air pollution can potentially lower the diseases linked to it and also aid in mitigating long-term climate change.
2 Proposed Model
2.1 Algorithm
3 Experiment
The teachable machine tool uses a transfer learning technique [7]. Transfer learning is a deep learning technique that focuses on reusing a pre-trained model by storing the knowledge gained from a solved problem and applying it to a new, related problem; one example is applying knowledge gained from a car recognition problem to a new truck recognition problem. In comparison to traditional ML methods, transfer learning can train deep neural networks with comparatively little data.
Figure 1 highlights the basic difference between traditional ML and transfer
learning.
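The freeze-the-backbone idea can be illustrated with a toy numpy model. The fixed random projection standing in for a pre-trained feature extractor, and the logistic head, are purely illustrative assumptions, not the actual teachable machine internals:

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for a frozen, pre-trained backbone: a fixed ReLU projection
W_backbone = rng.normal(size=(8, 4))
def features(x):
    return np.maximum(x @ W_backbone, 0.0)   # never updated below

def train_head(X, y, lr=0.1, epochs=200):
    """Train only the new last layer (the transfer-learning step)."""
    w = np.zeros(4)
    F = features(X)                           # backbone outputs, fixed
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(F @ w)))    # sigmoid head
        w -= lr * F.T @ (p - y) / len(y)      # gradient step on head only
    return w

X = rng.normal(size=(40, 8))
y = (X[:, 0] > 0).astype(float)               # two toy classes
w_head = train_head(X, y)
```

Only the small head is trained, which is why the approach needs comparatively little data: the backbone's weights are reused as-is.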
Thus, in our experiment, the teachable machine tool employs a pre-trained neural network. Here, we created our own classes for less ambient pollution and more outdoor pollution, respectively. As we created these classes, they became part of the last layer of the already existing neural net. Hence, the uploaded images in the respective classes were learning from the pre-trained MobileNet models existing in the neural network of teachable machine [8]. This same neural net helped segregate the real-time images in our experiment.
Location of study (captured images):
Latitude—16° 43′ 23.7″ N, Longitude—74° 14′ 06.5″ E.
Epoch*—Each sample in the training dataset has been fed through the training model at least once. For example, if epochs are set to 50, the model will work through the entire training dataset 50 times.
Batch size*—The set of samples used in one iteration of training. For example, if you have 50 images and choose a batch size of 16, the data will be split into ⌈50/16⌉ = 4 batches (three full batches of 16 and a final batch of 2). Once all 4 batches have been fed through the model, exactly one epoch will be complete.
Learning rate*—Defines the rate at which the model learns.
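The batch bookkeeping above amounts to a ceiling division, which can be checked directly:

```python
import math

def batches_per_epoch(num_samples, batch_size):
    # the final batch may hold fewer than batch_size samples
    return math.ceil(num_samples / batch_size)

def total_steps(num_samples, batch_size, epochs):
    """Number of training iterations for the given epoch count."""
    return epochs * batches_per_epoch(num_samples, batch_size)

# 50 images, batch size 16: three full batches of 16 plus one batch of 2
n_batches = batches_per_epoch(50, 16)
```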
3.2 Results
3.3 Screenshot
Result for Image-1 can be seen in Fig. 3 that shows a screenshot of the tool used.
3.4 Accuracy
In addition, we also analyzed the graphs to check how well the trained model worked.
Below are the graphical images that disclose the accuracy and loss per epoch for
the trained model. Figure 4 depicts accuracy per epoch for ‘Image-1’ used in our
experiment.
Fig. 3 Teachable machine (ML tool) displaying the result for uploaded Image-1
Here, accuracy is the percentage of classifications that a model gets right during training. That is, if a model correctly classifies 60 samples out of 100, the accuracy is 60/100 = 0.6. If the trained model's predictions are exact, the accuracy is one; otherwise, the accuracy is lower than one.
3.5 Loss
Here, loss is a measure for evaluating how well a model has learned to predict the right classifications for a given set of samples. If the model's predictions are accurate, the loss is zero; otherwise, the loss is greater than zero. Figure 5 depicts loss per epoch for 'Image-1' used in our experiment.
Consider an example where there are two models A and B. Model A predicts a
right classification for a sample but is only 50% confident of that prediction. Model
B predicts the right classification for the same sample but is 90% confident of that
prediction. In this case, both models have the same accuracy, but model B has lower
loss value.
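If the loss is taken to be cross-entropy (an assumption; the tool does not name its loss function), the A-versus-B example works out as:

```python
import math

def cross_entropy(p_correct):
    """Negative log-likelihood assigned to the true class."""
    return -math.log(p_correct)

loss_a = cross_entropy(0.50)   # model A: correct, but only 50% confident
loss_b = cross_entropy(0.90)   # model B: correct and 90% confident
```

Both models classify the sample correctly (same accuracy), but model B's loss is about 0.105 versus model A's 0.693, so B has the lower loss.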
Air pollutants are primarily monitored using analytical instruments such as optical and chemical analyzers. Gas chromatographs and mass spectrometers are other monitoring tools, but due to their complexity and high cost they are not widely used. Generally, air pollutant analyzers are complex and expensive, with a single instrument costing from £5000 to tens of thousands of pounds. Additionally, traditional air quality monitoring systems are voluminous in nature [9].
Further, traditional air quality monitoring units are static in nature, i.e., installed in defined areas (e.g., detecting stations) [9]. The proposed model, however, is not static but mobile from the outset, as any handheld device with the right ML software can perform the required image processing. With traditional systems, detecting pollution at a user-defined area can pose a challenge, as stations have fixed positions, whereas the newly proposed model can traverse any geography to detect air pollution even in the remotest locations.
Besides, the data accessible to the common masses today is only a numeric AQI value. AQI is an index that reports air quality and measures how air pollution affects one's health over a time period. Figure 2 shows the AQI parameter observed on iPhones today.
Having said that, the proposed research model is much more advanced than a simple AQI, as users can get an idea of the air composition around any area only by clicking pictures of the surroundings.
5 Conclusion
This research paper discloses an effective ML-based method to detect ambient air pollution. The process involves capturing images of the surroundings, analyzing the captured images using ML techniques, and determining the severity of pollution. The research is designed with the aim of making the solution accessible to a common person possessing any handheld device that can click pictures and process images via ML-based software. The research paves the way for a future age of image processing with applications in various fields such as healthcare, tech, astronomy, and many others.
The innovative model harnesses the natural properties of light and chemical prop-
erties of the air pollutants to localize them in an environment. This reduces the
dependency on any external technological domain as seen in traditional systems.
6 Future Work
In the proposed work, we have used the Web-based tool for verifying the research.
However, in future, we intend to develop an app corresponding to the tool so that the
proposed model is accessible to any layman.
Currently, we have only proposed the idea of detecting the severity of ambient air pollution based on image processing. However, in future, we intend to extend the model so that it can identify specific air pollutants (i.e., VOCs, NOx, SO2, CO, PM, methane) just by analyzing the captured images. Figure 6 depicts a futuristic simulation view of the proposed model.
Acknowledgements I would like to extend my sincere gratitude to Dr. A. S. Kanade for his
relentless support during my research work.
Conflict of Interest The authors declare that they have no conflict of interest.
References
1. J. Davidson, Air pollution responsible for over 6.6 million deaths worldwide in 2020, study
finds, 21 Oct 2020
2. Air Pollution, WHO, https://www.who.int/health-topics/air-pollution#tab=tab_1
3. V.A. Kanade, A bio-inspired unsupervised algorithm for deploying [BoT]: symbiotic intelli-
gence, in IOT’18: Proceedings of the 8th International Conference on the Internet of Things,
Oct 2018, pp. 1–5. Article No.: 24
4. D. Igoe, A. Parisi, B. Carter, Characterization of a smartphone camera's response to ultraviolet A radiation, in Photochemistry and Photobiology. © 2012 The American Society of Photobiology
5. Teachable Machine, https://teachablemachine.withgoogle.com/
6. TensorFlow.js, https://www.tensorflow.org/js
7. M. Satish, P. Srinivasa Rao, M. Ramakrishna Murty, Identification of natural disaster affected
area using twitter, in International Conference and Publish the Proceedings in AISC Springer
ICETC-2019 (Osmania University, Hyderabad, 2019), pp. 792–801
8. Googlecreativelab/teachablemachine-community, https://github.com/googlecreativelab/teacha
blemachine-community/
9. Ultrasonic wind sensors and weather stations for air quality monitoring and analysis, http://www.
gillinstruments.com/applications/government-and-emergency/air-quality-monitoring.html
Multimodal Biometric System Using
Undecimated Dual-Tree Complex
Wavelet Transform
1 Introduction
Biometrics technology has gained wide acceptance in society [7]. Even though unimodal biometrics are acceptable in constrained environments, they suffer from several limitations, such as a lack of large population coverage and easy vulnerability to noise. Multibiometric systems use information from multiple modalities to overcome these limitations and provide numerous benefits [5]. Due to the combination of modalities, better performance is expected in terms of biometric accuracy. Multibiometric systems fuse information at various phases, namely sensor-based, feature-based, score-based, and decision-based fusion. Feature-level fusion has advantages, as the features are preserved until classification. Also, there is a need for a good feature extraction technique that captures discriminating features.
N. Harivinod (B)
St. Joseph Engineering College, Mangaluru, Dakshina Kannada, Karnataka, India
B. H. Shekar
Mangalore University, Mangalagangothri, Dakshina Kannada, Karnataka, India
2 Methodology
The design of the proposed system is given in Fig. 1. In our method, histogram-based features are computed for small blocks, capturing the salient features within each local region. This helps in obtaining invariant features even when there exist intra-class variations. These features are concatenated to give the global descriptor. For biometric images acquired from different traits, we retrieve the region of interest, apply the UDTCWT to these images, and compute the coefficient matrix. Two types of coefficient matrices are computed: the local UDTCWT phase pattern (LUPP) and the global UDTCWT phase pattern (GUPP). Based on the LUPP and GUPP of both modalities, the feature descriptor is formed.
2.1 UDTCWT
Since its inception, the discrete wavelet transform (DWT) has been used with great success across diverse image processing applications. The undecimated DWT (UDWT) was proposed independently by a number of researchers under different names [3, 12, 15]. In the UDWT, downsampling is removed from each phase of the DWT; this makes the UDWT shift invariant, since the shift variance of the DWT is mainly caused by downsampling. The scaling and wavelet filters of an orthonormal DWT are denoted h ∈ l²(Z) and g ∈ l²(Z), respectively. The undecimated wavelet filter at scale s + 1 is defined recursively as
$$
g^{(s+1)}[m] = \left(g^{(s)} \uparrow 2\right)[m] =
\begin{cases}
g^{(s)}\!\left[\tfrac{m}{2}\right], & \text{if } m \text{ is even} \\
0, & \text{if } m \text{ is odd}
\end{cases}
\tag{1}
$$
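Equation (1) is the à-trous zero-interleaving of the filter taps; for a finite-length filter it can be sketched as:

```python
def upsample2(g):
    """Eq. (1): g_(s+1)[m] = g_(s)[m/2] for even m, 0 for odd m."""
    out = [0.0] * (2 * len(g))
    out[::2] = g            # even positions take the original taps
    return out              # odd positions stay zero
```

For example, the two-tap filter [1, 2] becomes [1, 0, 2, 0] at the next scale, which is what makes the transform shift invariant: no samples are discarded, the filter is dilated instead.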
The feature descriptor formation is inspired by the work of Zhang et al. [16] using Gabor wavelets. They suggested the global Gabor phase pattern (GGPP) and the local Gabor phase pattern (LGPP) for face image representation. They computed the Gabor wavelet coefficients at 8 orientations and 5 scales, resulting in 40 complex Gabor wavelet coefficient matrices. These complex coefficients are decomposed into real and imaginary parts. Using these coefficient matrices, they computed 80 LGPPs and 10 GGPPs. In our work, we have designed the global UDTCWT phase pattern (GUPP) and the local UDTCWT phase pattern (LUPP). For a given image, UDTCWT coefficients are computed as mentioned in Sect. 3.1 at 6 orientations and 4 scales to obtain 24 complex UDTCWT coefficient images. By separating the real and imaginary parts, 48 UDTCWT coefficient images are obtained. From these coefficients, the GUPP and LUPP coefficient images are computed.
Fig. 3 Illustration of computation of the LUPP (left) and GUPP (right) at pixel P in an image of
particular scale and orientation
Using the six different orientation coefficient images at a particular scale, a GUPP
coefficient image is computed. Thus, for a given image, eight GUPP coefficient
images are computed using real and imaginary parts in four scales. The illustration
of GUPP image computation at location P(1, 1) of a sample image of size 3 × 3 is
shown in Fig. 3.
Let C_{s,d} be the coefficient image at scale s and orientation d, and consider these coefficient matrices at the six orientations for a particular scale s. Let GUPP_s be the GUPP coefficient image at scale s. The value of GUPP_s at location (x, y) is formed by concatenating the bits obtained from the six orientation coefficient matrices at (x, y). The MSB and LSB are assumed to be zero, to form an 8-bit vector; the decimal value of this byte becomes the GUPP_s image value at (x, y). This computation is extended to all pixel locations (x, y). Thus, for a given image, we compute eight GUPP images: four each for the real and imaginary coefficient matrices at the various scales. The order of the GUPP is the same as that of the coefficient image. Mathematically, this procedure is formulated as
$$
\mathrm{GUPP}_s(x, y) = \sum_{d=1}^{6} u\!\left(C_{s,d}(x, y)\right) \cdot 2^{d}, \quad s = 1, 2, 3, 4
\tag{4}
$$
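At a single pixel, Eq. (4) packs six orientation bits into one byte. Taking u(·) to be the unit step on the coefficient sign (our reading of the text; the paper does not define u explicitly), the computation is:

```python
def gupp_value(coeffs):
    """Eq. (4) at one pixel: six orientation bits in positions 1..6."""
    assert len(coeffs) == 6
    # u(.) assumed to be the unit step: 1 for non-negative coefficients
    return sum((1 if c >= 0 else 0) << d for d, c in enumerate(coeffs, start=1))
```

Because the bits occupy positions 1 through 6, the byte's MSB and LSB are always zero, as the text states: all-positive coefficients give 2 + 4 + 8 + 16 + 32 + 64 = 126, the maximum possible value.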
The global descriptor for an image using the 48 LUPPs and 8 GUPPs is formed as follows: (a) compute the spatial histogram in each block by partitioning the GUPPs and LUPPs into blocks; (b) form the global descriptor by concatenating these histograms. UDTCWT responses for a face image at different scales and directions are shown in Fig. 4, and GUPP responses are shown in Fig. 5.
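The block-histogram concatenation in steps (a) and (b) can be sketched as follows; the 2×2 grid and 16 bins are illustrative parameters, not the ones used in the paper:

```python
import numpy as np

def global_descriptor(pattern_img, grid=(2, 2), bins=16):
    """Concatenate per-block histograms of a pattern image (values 0..255)."""
    h, w = pattern_img.shape
    bh, bw = h // grid[0], w // grid[1]
    hists = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = pattern_img[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            hist, _ = np.histogram(block, bins=bins, range=(0, 256))
            hists.append(hist)
    return np.concatenate(hists)
```

Each block contributes its own histogram, so local structure is preserved in the final vector; for the full method this would be repeated over all 56 pattern images and the results concatenated.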
Further, we applied various classification techniques, namely logistic regression, linear discriminant analysis, kernel Fisher analysis, and K-nearest neighbor with K = 1, 3, 5. Kernel Fisher analysis is found to give the best recognition accuracy; details are reported in the following sections.
Fig. 4 (Left) UDTCWT real coefficient image responses for a face image at six directions at
a particular scale (scale = 4). The colors indicate the range of values that the image responses
correspond to. (Right) First two rows show UDTCWT real coefficient image responses at six
directions for two different scales of a face image. Last two rows show that of UDTCWT imaginary
coefficient image responses for two scales
Fig. 5 (Left) First row shows GUPP real coefficient image responses at four scales of a face image.
Second row shows that of GUPP imaginary coefficient image response. (Right) First row shows
GUPP real coefficient image responses at four scales of a face image. Second row shows that of
GUPP imaginary coefficient image response
4 Experimental Results
The implementation is carried out using MATLAB (R2018a). The MEPCO, PolyU, and CASIA biometric datasets are used for the experiments. The MEPCO biometric database [9] provides both face and iris images from the same person; in the face images, the subject is captured under different illuminations and expressions. The PolyU palmprint database [17] contains images of 386 individuals, with 7752 grayscale images in total, collected in two sessions. The CASIA face image database (CASIA-FaceV5) [2] contains 2500 color facial images of 500 subjects. Typical intra-class variations include illumination, pose, expression, eyeglasses, and imaging distance.
The identification results of the proposed method on the MEPCO dataset are presented in Table 1, with unimodal and bimodal results given for face and iris. To test robustness for the face modality, we used non-frontal testing samples with varying pose and images with spectacles. We also compared the results across various classifiers, viz. linear discriminant analysis (LDA), logistic regression (LR), and K-nearest neighbors (KNN); KNN experiments are conducted with K = 1, 3, 5. It is observed that UDTCWT features give good recognition results. The results are shown in Table 2.
Table 2 Comparison of UDTCWT feature recognition accuracy (%) using various classifiers on the MEPCO dataset

Classifier             | Face  | Iris  | Face and iris
LR                     | 92.75 | 98.26 | 100.0
LDA                    | 61.06 | 96.52 | 100.0
KNN with K = 1         | 90.24 | 98.26 | 100.0
KNN with K = 3         | 81.64 | 93.04 | 98.26
KNN with K = 5         | 79.80 | 88.69 | 90.43
Kernel Fisher analysis | 98.57 | 99.24 | 100.0
Table 3 Comparison of verification rate or genuine acceptance rate (GAR, %) of multimodal biometrics on the MEPCO dataset using UDTCWT features

Features        | GAR at 1% FAR | GAR at 0.1% FAR | GAR at 0.001% FAR
Gray features   | 32.71         | 22.17           | 17.34
Gabor features  | 80.87         | 49.13           | 37.39
UDTCWT features | 100.00        | 100.00          | 100.00
The recognition results of the proposed method on the PolyU palmprint and CASIA face datasets are presented in Table 4. Unimodal and bimodal results are given for face and palmprint. One can observe that in the bimodal experiments with gray or Gabor features, the recognition accuracy decreases. This is because Gabor wavelets represent the palmprint efficiently but fail to represent the face; hence, combining the two deteriorates the results. The results are also compared across various classifiers, viz. linear discriminant analysis (LDA), logistic regression (LR), and K-nearest neighbors (KNN). KNN experiments are conducted with K = 1, 3, 5. It is observed that UDTCWT features provide good identification results. The results are shown in Table 5. The recognition accuracy is obtained using K-fold
Fig. 6 ROC for verification experiments on MEPCO dataset. ROC for experiments on PolyU
palmprint and CASIA face dataset
Table 4 Comparison of recognition accuracy (%) on the PolyU palmprint and CASIA face datasets using UDTCWT features

Features used      | Face  | Palmprint | Face and palmprint
Gray values        | 42.25 | 72.17     | 59.56
Gabor coefficients | 44.32 | 91.28     | 64.75
UDTCWT features    | 73.75 | 97.25     | 99.45
cross-validation with K = 5. This gives a good measure of how well the algorithm is trained on the given data and how it performs on unseen data. Verification experiments are also conducted; Table 6 summarizes the results. The receiver operating characteristic curves for these bimodal experiments are given in Fig. 6.
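The K-fold evaluation protocol used for Tables 2 and 5 can be sketched as follows. This is an illustrative Python stand-in, not the authors' MATLAB pipeline: the feature vectors, subject labels, and the 1-nearest-neighbor classifier below are synthetic assumptions, used only to show the 5-fold loop.

```python
import numpy as np

def kfold_1nn_accuracy(X, y, k_folds=5, seed=0):
    """Mean accuracy of a 1-nearest-neighbor classifier under
    K-fold cross-validation (Euclidean distance)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    accs = []
    for fold in np.array_split(idx, k_folds):
        train = np.setdiff1d(idx, fold)
        # distance from every test sample to every training sample
        d = np.linalg.norm(X[fold][:, None, :] - X[train][None, :, :], axis=2)
        pred = y[train][d.argmin(axis=1)]  # label of the closest training sample
        accs.append(np.mean(pred == y[fold]))
    return float(np.mean(accs))

# Synthetic stand-in data: 5 "subjects", 10 samples each, well separated
y = np.repeat(np.arange(5), 10)
X = y[:, None] * 10.0 + np.random.default_rng(1).normal(0, 0.1, size=(50, 4))
print(kfold_1nn_accuracy(X, y))  # well-separated classes give perfect accuracy
```

Each fold serves once as the unseen test set while the remaining folds train the classifier, which is exactly the generalization measure described above.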
630 N. Harivinod and B. H. Shekar
Table 6 Comparison of verification rate or genuine acceptance rate (GAR, %) of multimodal biometrics by combining the PolyU palmprint and CASIA face datasets using UDTCWT features

Features        | GAR at 1% FAR | GAR at 0.1% FAR | GAR at 0.001% FAR
Gray features   | 80.50         | 66.75           | 57.25
Gabor features  | 76.00         | 56.25           | 42.50
UDTCWT features | 100.0         | 100.0           | 99.50
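GAR-at-FAR operating points like those in Tables 3 and 6 come from sweeping a decision threshold over genuine and impostor match scores. A minimal sketch, assuming higher scores mean a better match; the score distributions below are synthetic, not scores from the paper's system:

```python
import numpy as np

def gar_at_far(genuine, impostor, far):
    """Genuine acceptance rate at a target false acceptance rate:
    pick the score threshold at which the given fraction of impostor
    scores is accepted, then measure the accepted fraction of genuine
    scores at that same threshold."""
    threshold = np.quantile(impostor, 1.0 - far)  # accept when score >= threshold
    return float(np.mean(np.asarray(genuine) >= threshold))

# Synthetic, fully separated score distributions
impostor = np.linspace(0.0, 0.4, 200)
genuine = np.linspace(0.6, 1.0, 200)
print(gar_at_far(genuine, impostor, far=0.01))  # separable scores give GAR = 1.0
```

Repeating this at FAR = 1%, 0.1%, and 0.001% produces one row of the table; sweeping the threshold continuously traces the ROC curve of Fig. 6.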
5 Conclusion
The UDTCWT features for face-iris and face-palmprint multibiometrics are discussed in this paper. Experiments are reported for both recognition and verification. From the experimental results, we conclude that combining multimodal biometrics using the local and global features of UDTCWT gives better results.
References
1. N. Anantrasirichai, J. Burn, D.R. Bull, Robust texture features for blurred images using undec-
imated dual-tree complex wavelets (2014), pp. 5696–5700
2. CASIA, CASIA face dataset (2020). http://biometrics.idealtest.org/
3. P. Dutilleux, An implementation of the “algorithme à trous” to compute the wavelet transform,
in Wavelets (1990), pp. 298–304
4. A. Ellmauthaler, E.A. da Silva, C.L. Pagliari, S.R. Neves, Infrared-visible image fusion using
the undecimated wavelet transform with spectral factorization and target extraction (2012), pp.
2661–2664
5. G. Goswami, P. Mittal, A. Majumdar, M. Vatsa, R. Singh, Group sparse representation based
classification for multifeature multimodal biometrics. Inf. Fusion 32, 3–12 (2016)
6. P. Hill, A. Achim, D. Bull, The undecimated dual tree complex wavelet transform and its
application to bivariate image denoising using a cauchy model (2012), pp. 1205–1208
7. A.K. Jain, R. Bolle, S. Pankanti, Biometrics: Personal Identification in Networked Society, vol.
479 (Springer Science and Business Media, 2006)
8. S.K. Kanagala, G. Sreenivasulu, Landsat 8: UDTCWT-based denoising and yield estimation (2018), pp. 1036–1040
9. MEPCO: MEPCO biometric database (2021). http://biometric.mepcoeng.ac.in/mepcobiodb/
index.html
10. P. Niu, X. Shen, T. Wei, H. Yang, X. Wang, Blind image watermark decoder in UDTCWT
domain using Weibull mixtures-based vector HMT. IEEE Access 8, 46624–46641 (2020)
11. D. Rajesh, B. Shekar, Undecimated dual tree complex wavelet transform based face recognition,
pp. 720–726 (2016)
12. O. Rockinger, Pixel-level fusion of image sequences using wavelet frames, in Proceedings of
the 16th Leeds Annual Statistical Research Workshop (1996), pp. 149–154
13. I.W. Selesnick, R.G. Baraniuk, N.G. Kingsbury, The dual-tree complex wavelet transform. IEEE Sig. Process. Mag. 22(6), 123–151 (2005)
14. B.H. Shekar, P. Rathnakara Shetty, M. Sharmila Kumari, L. Mestetsky, Action recognition using
undecimated dual tree complex wavelet transform from depth motion maps/depth sequences,
in International Archives of the Photogrammetry, Remote Sensing and Spatial Information
Sciences (2019)
15. M. Unser, Texture classification and segmentation using wavelet frames. IEEE Trans. Image Process. 4(11), 1549–1560 (1995)
16. B. Zhang, S. Shan, X. Chen, W. Gao, Histogram of gabor phase patterns (HGPP): a novel
object representation approach for face recognition. IEEE Trans. Image Process. 16(1), 57–68
(2007)
17. D. Zhang, W.K. Kong, J. You, M. Wong, Online palmprint identification. IEEE Trans. Pattern
Anal. Mach. Intell. 25(9), 1041–1050 (2003)
Design of Modified Dual-Coupled Linear
Congruential Generator Method
Architecture for Pseudorandom Bit
Generation
1 Introduction
Protecting data in various applications over the internet requires security and privacy, especially in IoT applications, where a pseudorandom bit generator (PRBG) is an essential component for maintaining user privacy. Among the different PRBG methods, the linear feedback shift register (LFSR), linear congruential generator (LCG), coupled linear congruential generator (CLCG), and dual-coupled LCG are very popular. LFSRs and LCGs are mathematically well understood and have low implementation complexity. However, because the generated sequence depends on linear equations, these PRBGs fail randomness tests and are cryptographically insecure [1]. Coupling two LCGs, as in the CLCG method, makes the generator more secure than a single LCG [2]. The dual CLCG method involves four LCGs and two inequality comparisons to generate a pseudorandom binary sequence; it produces a one-bit random output only when the inequality conditions hold, and hence it is unable to generate a pseudorandom bit at every iteration.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 633
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_57
634 C. Heera and V. Shankar
2 Related Study
For additional security, two CLCGs are again coupled in the dual CLCG method [6], whose outputs act as inputs to a tri-state buffer. However, the dual CLCG method has some drawbacks. To overcome them, a modified method is proposed in which one XOR gate replaces the tri-state buffer, controller, and memory units.
Design of Modified Dual-Coupled Linear Congruential … 635
Security and privacy over the internet are the most sensitive and primary objectives for safeguarding information in numerous Internet-of-Things (IoT) applications. The many devices connected to the internet generate huge amounts of data, which can lead to user privacy problems [7, 8]. There are also various security problems in designing the IoT, whose purpose is to connect people to things and things to things over the internet [9, 10].
3 Existing System
Z_i = 1, if x_{i+1} > y_{i+1} and p_{i+1} > q_{i+1}; Z_i = 0, if x_{i+1} < y_{i+1} and p_{i+1} < q_{i+1}  (8)

Z_i = B_i if C_i = 0  (9)

where

B_i = 1 if x_{i+1} > y_{i+1}, else 0; and C_i = 1 if p_{i+1} > q_{i+1}, else 0  (10)
where
B_i = 1 if x_{i+1} > y_{i+1}, else 0; and C_i = 1 if p_{i+1} > q_{i+1}, else 0  (16)
1. Input
   n (positive integer), m = 2^n.
2. Initialization
   b1, b2, b3, b4 < m, such that each is relatively prime with m.
   a1, a2, a3, a4 < m, such that a1 − 1, a2 − 1, a3 − 1, and a4 − 1 are divisible by 4.
   Initial seeds x_0, y_0, p_0, and q_0 < m.
3. Output Z_i
   (a) For i = 0 to k
   (b) Compute x_{i+1}, y_{i+1}, p_{i+1}, and q_{i+1} using the following equations:
       • x_{i+1} = (a1 * x_i + b1) mod 2^n
       • y_{i+1} = (a2 * y_i + b2) mod 2^n
       • p_{i+1} = (a3 * p_i + b3) mod 2^n
       • q_{i+1} = (a4 * q_i + b4) mod 2^n
   (c) If x_{i+1} > y_{i+1}, then B_i = 1, else B_i = 0
   (d) If p_{i+1} > q_{i+1}, then C_i = 1, else C_i = 0
   (e) Z_i = (B_i + C_i) mod 2
   (f) Return Z_i.
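The steps above translate directly into code. The sketch below is an illustrative Python model of the bit-generation loop (the actual design in the paper is a Verilog-HDL architecture); the multipliers a = (5, 9, 13, 17), increments b = (3, 7, 11, 15), and seeds are arbitrary choices satisfying the stated constraints for n = 8, not values taken from the paper.

```python
def modified_dual_clcg(n, a, b, seeds, k):
    """Modified dual-CLCG: four LCGs x, y, p, q modulo 2**n; each output
    bit is B XOR C, where B = (x > y) and C = (p > q)."""
    m = 1 << n
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    x, y, p, q = seeds
    bits = []
    for _ in range(k):
        x = (a1 * x + b1) % m
        y = (a2 * y + b2) % m
        p = (a3 * p + b3) % m
        q = (a4 * q + b4) % m
        B = 1 if x > y else 0
        C = 1 if p > q else 0
        bits.append((B + C) % 2)  # step (e): equivalent to B XOR C
    return bits

# n = 8: each b_i is odd (hence coprime with 2^8), each a_i - 1 divisible by 4
bits = modified_dual_clcg(8, (5, 9, 13, 17), (3, 7, 11, 15), (1, 2, 3, 4), 16)
print(bits)
```

Note that one output bit is produced at every iteration, which is exactly the advantage the XOR of B and C gives over the original dual CLCG's conditional output.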
• Pseudorandom bit generators are used in the following cryptographic algorithms, which require the 8-bit vector sequence:
Exhaustive search
Time-memory trade-offs
Counter mode of operation
• Pseudorandom bits are also used in signal integrity tests in transceivers, for example loopback tests that send random data patterns from the transmitter and receive the same at the receiver.
5 Simulation Results
Table 1 Measured parameters for existing and modified architectures

Parameter                | Existing dual-CLCG | Modified dual-CLCG
Number of LUTs           | 192                | 131
Number of DFFs           | 121                | 96
Initial clock latency    | 256 [in Tclk]      | 8 [in Tclk]
Output-to-output latency | 2 [in Tclk]        | 2 [in Tclk]
Maximum frequency        | 220.11 MHz         | 220.11 MHz
Power at max frequency   | 27.501 mW          | 27.501 mW
Power/frequency          | 0.1249 mW/MHz      | 0.1249 mW/MHz
Delay                    | 4.632 ns           | 4.363 ns
See Table 1.
6 Conclusion
The existing dual CLCG method architecture has several drawbacks: large flip-flop usage, a high initial clock latency of 2^n for an n-bit architecture, failure to attain the maximum-length sequence of 2^n, and inability to produce a pseudorandom bit at every iteration. From Table 1, we can conclude that the proposed modified dual CLCG method architecture overcomes all these drawbacks. The existing and proposed architectures are designed in Verilog-HDL using the Microsemi Libero tool.
References
1. J. Stern, Secret linear congruential generators are not cryptographically secure, in Proceedings
28th Annual Symposium on Foundations of Computer Science, Oct 1987, pp. 421–426.
2. R.S. Katti, R.G. Kavasseri, Secure pseudo-random bit sequence generation using
coupled linear congruential generators, in Proceedings IEEE International Symposium on
Circuits and Systems (ISCAS), Seattle, WA, USA, May 2008, pp. 2929–2932
3. R. Ostrovsky, Foundations of Cryptography (Lecture Notes) (UCLA, Los Angeles, CA, USA,
2010)
4. O. Goldreich, Foundations of Cryptography (Cambridge University Press, New York, NY,
USA, 2004)
5. R.S. Katti, R.G. Kavasseri, V. Sai, Pseudorandom bit generation using coupled congruential
generators. IEEE Trans. Circuits Syst. II, Exp. Briefs. 57(3), 203–207 (2010)
6. A.K. Panda, K.C. Ray, Modified dual-CLCG method and its VLSI architecture for pseudo-
random bit generation. IEEE Trans. Circuits Syst. I: Regular Pap. 66(3) (2019)
7. J. Zhou, Z. Cao, X. Dong, A.V. Vasilakos, Security and privacy for cloud-based IoT: challenges.
IEEE Commun. Mag. 55(1), 26–33 (2017)
8. Q. Zhang, L.T. Yang, Z. Chen, Privacy preserving deep computation model on cloud for big
data feature learning. IEEE Trans. Comput. 65(5), 1351–1362 (2016)
9. E. Fernandes, A. Rahmati, K. Eykholt, A. Prakash, Internet of Things security research: a
rehash of old ideas or new intellectual challenges? IEEE Secur. Privacy 15(4), 79–84 (2017)
10. M. Frustaci, P. Pace, G. Aloi, G. Fortino, Evaluating critical security issues of the IoT world:
present and future challenges. IEEE Internet Things J. 5(4), 2483–2495 (2018)
11. S.-W. Cheng, A high-speed magnitude comparator with small transistor count, in Proceedings
ICECS, vol. 3, 2003, pp. 1168–1171
12. T. Kim, W. Jao, S. Tjiang, Circuit optimization using carry-save adder cells. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 17(10), 974–984 (1998)
Performance Analysis of PAPR and BER
in FBMC-OQAM with Low-complexity
Using Modified Fast Convolution
Abstract The asynchronous waveform with an ultra-low side lobe, fast convolution multi-carrier (FCMC), has appeared to be a promising technique for future wireless communications. Limiting computational complexity offers benefits such as lower energy usage, faster processing, and lower latency. These benefits have become more significant with 5G, where the most important characteristics are minimal latency, robustness, and high speed. The fast convolution (FC) filter architecture uses only filter values obtained in the frequency domain, since we confine the filter to the frequency domain; this completely removes the traditional polyphase operation of the filter. We introduce the FBMC/OQAM framework utilizing modified fast convolution and test the system's performance on a communication channel. The proposed system is compared with a traditional FBMC/OQAM polyphase system, and we observe that our modified fast convolution filter outperforms FBMC's polyphase design in terms of complementary cumulative distribution function (CCDF), computational complexity, and BER metrics.
1 Introduction
OFDM is a multi-carrier communication system that splits the available spectrum into multiple carriers, each of which is modulated by a low-rate data source. OFDM is similar to FDMA in that it segments the available bandwidth into several channels that can be allocated to users; by spacing the channels much more closely together, however, OFDM utilizes the bandwidth much more effectively. This is accomplished by making the carriers mutually orthogonal, eliminating interference between the closely spaced narrow carriers. OFDM may be considered a transmission strategy. The technique is called OFDM when used in a wireless environment and discrete multi-tone (DMT) in a wired environment, such as asymmetric digital subscriber lines (ADSLs).
644 D. R. Prasad et al.
In OFDM, the carriers are orthogonal; this does not hold for DMT [1]. Despite several benefits relative to traditional OFDM systems, FBMC still has several unresolved problems that limit its use in realistic applications. In this study, the primary objective is to resolve some core challenges of FBMC systems for future wireless networks. FBMC-OQAM has many desirable features, such as excellent frequency localization, very low side lobe power spectral density, and robustness to time-varying channels and carrier frequency offsets. With these features, FBMC-OQAM is more appropriate than the orthogonal frequency division multiplexing (OFDM) wireless communication system for 5G and future generations, in particular for asynchronous applications. However, as a multi-carrier system, FBMC-OQAM suffers from a high peak-to-average power ratio (PAPR).
New strategies for minimizing PAPR are important. Notable approaches include galactic swarm optimization [2], optimal filter design based on GFDM [3], suboptimal partial transmit sequences [4], and custom conic optimized iterative adaptive clipping and filtering [5], while other PAPR reduction approaches for OFDM also exist. The fast convolution filtering design uses the frequency response samples of the filter, i.e., FFT samples, for filtering. The sampling factor is regulated during fast convolution by selecting the required input and output block lengths. FBMC/OQAM and FMT waveform generation using fast convolution (FC) has been discussed explicitly in terms of architecture performance, namely reduced spectral leakage, better filter bank stability, and computational complexity savings in the FBMC/OQAM filter bank framework. However, few authors have measured the system's overall performance in the presence of noise on communication links. This paper describes the FBMC/OQAM method with fast convolution. The BER performance in an AWGN channel and the system PAPR are measured and compared with the traditional FBMC/OQAM polyphase system. BER performance and PAPR are shown to be better in the case of the fast convolution method.
2 Related Work
We mention here a few significant and recent papers along with their critical contributions. In [6], the authors state that channel estimation remains an open problem for both inter-carrier and inter-symbol interference in the multi-carrier filter bank with offset quadrature amplitude modulation (QAM). To increase channel estimation efficiency in preamble-based FBMC/OQAM systems, adopting the commonly employed model of the analysis filter bank (AFB) response, they propose a recursive least squares multi-tap channel estimation (RLS-MTCE) framework that minimizes noise effects. Using the new AFB signal model, the authors also design multi-tap channel estimation schemes, referred to as LS-MTCE and MMSE-MTCE for least squares and minimum mean square error (MMSE), respectively, which correctly account for the interference in order to
Performance Analysis of PAPR and BER … 645
increase channel estimation efficiency. Compared with current systems, the proposed counterparts typically attain minimal error cost with a reasonable rise in computational complexity.
In [7], the authors show that the efficiency of an FBMC-OQAM massive MIMO uplink system is technically characterized by the average mean squared error (MSE) at the output of three distinct forms of linear receivers, i.e., zero forcing (ZF), LMMSE, and matched filter (MF). Random matrix theory asymptotically characterizes the MSE performance of these receivers as the number of base station (BS) antennas N and the number of users K increase while keeping the ratio N/K finite. The expressions obtained allow several inferences to be made, some of them already noted but not technically demonstrated in the literature. First, because of the channel hardening effect, the MSE becomes uniform throughout the frequency band. Second, they illustrate good user synchronization in a massive MIMO environment. In conclusion, the various MSE components, such as noise, inter-user interference (IUI), and distortion from channel frequency selectivity, become negligible at large N/K ratios if the users are well synchronized. In previous research, this effect was recognized as "self-equalization."
In [8], the authors implement a multi-carrier filter bank (FBMC) framework to address these drawbacks, with prototype pulse-shaping filters designed to satisfy system specifications. Given its significant impact on achieved performance, filter selection is vital for the FBMC/offset quadrature amplitude modulation (OQAM) system. To improve system performance, new prototype pulse-shaping filters for FBMC/OQAM systems are suggested. Several prototypes, such as the raised cosine pulse (RCP), root-raised cosine (RRC), PHYDYAS, and Hermite pulse-shaping filters, are evaluated, particularly with respect to timing offset (TO). Since multiple input multiple output (MIMO) is a more prominent FBMC-related problem, the authors propose a new method using Walsh-Hadamard (WH) code-based block spreading in MIMO FBMC/OQAM and explore how the suggested pulse-shaping filters integrate with MIMO systems. The suggested filters are seen to be appropriate candidates for FBMC/OQAM systems.
In [9], the authors suggest the Bayesian compressive sensing (BCS) method for the FBMC/OQAM multiple input multiple output (MIMO) scenario to estimate the channel efficiently. They propose an iterative fast Bayesian matching pursuit algorithm for high-quality channel estimation, providing the first statistical data for the sparse channel model in Bayesian channel estimation. They use the BCS channel estimation technique to predict the channel impulse response efficiently; then, by optimizing the iterative termination conditions, an updated FBMP algorithm is suggested. The simulation findings show that the proposed method outperforms the traditional compressive sensing technique in mean square error (MSE) and bit error rate (BER).
In [10], in 5G connectivity research, a major concern is expressed regarding the multi-carrier filter bank with offset quadrature amplitude modulation (FBMC-OQAM), which has the intrinsic limitation of a high peak-to-average power ratio (PAPR). Because of the conflicting FBMC-OQAM framework, the procedures suggested for OFDM, explicitly the conventional partial transmission sequence (PTS), are inadequate for FBMC-OQAM. They suggest an updated PTS-based method that uses phase rotation factors to optimize only the phase of the sparse PTS (S-PTS) signal. Theoretical and simulation findings show that the suggested S-PTS scheme offers PAPR reduction performance with substantially less computational complexity.
This section introduces the typical configuration of the FBMC framework with the filter bank or transmultiplexer (TMUX) structure for synthesis and analysis. Figure 1 displays the TMUX configuration of the FBMC framework. The complex input symbols of the FBMC framework are as discussed in [11] and represented in mathematical form as

c_{mn} = a_{mn} + j b_{mn},  (1)

In Eq. (1), the terms a_{mn} and b_{mn} denote the real and imaginary parts of the complex symbol on the nth subcarrier of the mth data block of the system. These components are equally spaced in the time domain by T/2, where T is the symbol duration. N parallel symbols are then passed to N prototype filters. The signal s(t) of the M data blocks may then be written with the FBMC-OQAM signals [12].
s(t) = \sum_{m=1}^{M} \sum_{n=1}^{N} s_{mn}(t) = \sum_{m=1}^{M} \sum_{n=1}^{N} \left[ a_{mn}\, p(t - mT) + j b_{mn}\, p\left(t - mT - \frac{T}{2}\right) \right] e^{j n \vartheta t},  (2)
In Eq. (2), the term p(t) signifies a PHYDYAS prototype filter as adopted by [13] with U = 4, and \vartheta_t is equivalent to 2\pi t / T + \pi / 2. The resultant impulse response of the term p(t) is stated as
p(t) = 1 + 2 \sum_{i=1}^{U-1} G_i \cos\left( \frac{2 \pi i t}{U T} \right)  (3)

where G_1 = 0.97196, G_2 = \sqrt{2}/2, and G_3 = 0.235147.
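As a quick numerical check of Eq. (3), the PHYDYAS impulse response can be evaluated directly; with U = 4 and the G_i above, p(t) is even-symmetric, peaks at p(0) = 1 + 2(G_1 + G_2 + G_3), and decays to essentially zero at the edges t = ±UT/2. The sketch below only evaluates the formula as written (T = 1 is an arbitrary normalization):

```python
import numpy as np

G = [0.97196, np.sqrt(2) / 2, 0.235147]  # G_1, G_2, G_3 from Eq. (3)
U, T = 4, 1.0                            # overlap factor and symbol duration

def p(t):
    """PHYDYAS prototype impulse response of Eq. (3)."""
    return 1.0 + 2.0 * sum(Gi * np.cos(2 * np.pi * i * t / (U * T))
                           for i, Gi in enumerate(G, start=1))

t = np.linspace(-U * T / 2, U * T / 2, 257)
print(round(p(0.0), 4))   # peak value 1 + 2*(G1 + G2 + G3)
print(round(p(U * T / 2), 4))  # near-zero at the filter edge
```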
As per the Nyquist theorem, all the signals of s(t) are sampled at the rate T/K, which results in an approximation of the true PAPR value of the discrete-time signals. The discrete-time signals may then be represented as

s(k) = \sum_{m=1}^{M} \sum_{n=1}^{N} \left[ a_{mn}\, p(k - mK) + j b_{mn}\, p\left(k - mK - \frac{K}{2}\right) \right] e^{j n \vartheta k}, \quad 0 \le k \le L_p  (4)

d_{k,n} = \begin{cases} \mathrm{Re}(C_{k,n}), & k \text{ even} \\ \mathrm{Im}(C_{k,n}), & k \text{ odd} \end{cases}  (5)

d_{k,2n+1} = \begin{cases} \mathrm{Re}(C_{k,n}), & k \text{ even} \\ \mathrm{Im}(C_{k,n}), & k \text{ odd} \end{cases}  (6)
In Eq. (4), the term L_p signifies the prototype filter's discrete-time length, and k signifies the corresponding time index. These real-valued symbols are multiplied in number before filtering. OQAM post-processing, the last phase of the FBMC-OQAM method, extracts the output from a specific analysis filter bank, integrating two signals at once, namely the real component of each OQAM symbol. The analysis filters h_k(m) and g_k(m) are extracted from the N-length prototype filter h(m) [14] as

h_k(m) = h(m) \exp\left( j \frac{2 \pi k}{M} \left( m - \frac{N - 1}{2} \right) \right)  (7)

g_k(m) = h_k^{*}(N - 1 - m)  (8)

where m = 0, 1, …, N − 1.
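Eqs. (7) and (8) say that each analysis filter is a complex modulation of the single prototype, with g_k obtained by conjugate time reversal. A minimal numerical sketch; the Hanning window here is an arbitrary stand-in prototype, not the PHYDYAS filter:

```python
import numpy as np

def analysis_filters(h, M):
    """Modulated filter bank of Eq. (7):
    h_k(m) = h(m) * exp(j * 2*pi*k/M * (m - (N-1)/2)),
    and Eq. (8): g_k(m) = conj(h_k(N-1-m))."""
    N = len(h)
    m = np.arange(N)
    hk = np.array([h * np.exp(1j * 2 * np.pi * k / M * (m - (N - 1) / 2))
                   for k in range(M)])
    gk = np.conj(hk[:, ::-1])  # time reversal plus conjugation, per Eq. (8)
    return hk, gk

h = np.hanning(32)  # stand-in length-N prototype h(m)
hk, gk = analysis_filters(h, M=8)
```

Note that the k = 0 filter is the prototype itself, since the modulating exponential reduces to 1.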
When convolution is applied by leveraging the FFT, it is cyclic convolution. For the FFT, we insert zeros so that the filter and signal sequences have the same duration. If the FFT of the signal x(n) is multiplied by the FFT of the filter h(n), the product is the FFT of the output y(n). However, the y(n) produced by an inverse FFT has the same length as the input. This is not entirely straightforward, since the output of a length-L block filtered by a length-M filter has length L + M − 1. That implies that the output blocks must be overlapped and added, not merely concatenated. The second issue is that the overlapping stages involve non-cyclic convolution, whereas FFT convolution is cyclic. Conveniently, appending L − 1 zeros to the impulse response and M − 1 zeros to each input block makes both FFTs of length M + L − 1. Then there is no aliasing, and the cyclic convolution produces the same result as the ideal non-cyclic convolution. The arithmetic savings can be significant when FIR digital filtering is implemented with this convolution. There are two limitations, though. The use of blocks introduces a one-block delay: none of the first output block can be determined until the first input block is available. The second limitation is the handling and sorting of blocks.
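The zero-padding and overlap-add procedure just described can be sketched in a few lines. This is a generic illustration of FFT block convolution, not the paper's FC filter bank; the block length L and the test filter are arbitrary choices:

```python
import numpy as np

def overlap_add_filter(x, h, L):
    """FIR filtering by FFT blocks: pad each length-L input block and the
    length-M filter to L + M - 1 samples so the cyclic convolution equals
    the linear one, then overlap and add the block outputs."""
    M = len(h)
    nfft = L + M - 1
    H = np.fft.rfft(h, nfft)                # filter padded with L - 1 zeros
    y = np.zeros(len(x) + M - 1)
    for start in range(0, len(x), L):
        block = x[start:start + L]          # may be shorter at the end
        yb = np.fft.irfft(np.fft.rfft(block, nfft) * H, nfft)
        y[start:start + len(block) + M - 1] += yb[:len(block) + M - 1]
    return y

x = np.arange(10.0)
h = np.array([1.0, 2.0, 3.0])
# matches the direct linear convolution exactly
assert np.allclose(overlap_add_filter(x, h, L=4), np.convolve(x, h))
```

The one-block delay mentioned above is visible in the loop: no part of an output block exists until its input block has fully arrived.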
This problem is also mitigated by the continuing reduction in memory costs. An alternative to overlap-add may be created if the output, rather than the input, is segmented first. Examining the computation of an output block shows that it requires not only the corresponding input block but also part of the previous input block; for each output block, we require an M + L − 1 portion of the input. The last part of the previous block is therefore saved and combined with the new input block before filtering by h(n). Figure 2a shows the overall block diagram of the fast convolution filter bank. A filter that uses the fast convolution operation applies the required weights in the FFT frequency domain rather than a time-domain impulse response: the FFT of the input signal block is multiplied by the FFT filter weights, and the IFFT is used to generate a time-domain output signal [15]. This study investigates opportunities for reducing the complexity of FC-based waveforms, remembering that the basic notion of FC is the efficient implementation of high-order linear filters via frequency-domain processing.
Fig. 2 Overall block diagram of filter banks a fast convolution, b polyphase synthesis
We place particular emphasis on circumstances in which only a small portion of the bandwidth is active. By adding a set of virtual (i.e., not carrying any data) symbols to the start and end of each packet and by intelligently selecting these symbols, we show that FBMC-OQAM ramp-up and ramp-down tails may be reduced until they are unimportant and can therefore be discarded. This reduces the signal length of each FBMC-OQAM packet so that its bandwidth efficiency increases, i.e., the same data are delivered in a shorter time. By choosing the input and output block lengths in the multi-rate configuration, the proposed filter is incorporated with an embedded fast convolution operation to improve system performance, as shown in Fig. 2b.
The signal C_ph(m) = exp(j 2\pi m k_o L_{s,k} / L_k), where m is the block index and k_o is the center frequency of the kth band-pass filter, is applied to each input block of the kth channel. This is required to keep the processing of consecutive blocks continuous. Figure 3 shows the ordering of the FFT-domain weights for the FBMC/OQAM waveforms of neighboring channels. The neighboring channels overlap over half of the bandwidth to accommodate the sub-channels, since the OQAM preprocessing triggers an up-sampling factor of two.
The FFT weights are made up only of pass-bands and transition bands. Therefore, the FC filter is a linear periodically time-varying (LPTV) system. We built the FFT filter weights to minimize the impact of cyclic distortion and to mitigate interference in the stop-band area via an optimization problem. Given the L FFT-domain weights w_k, the corresponding impulse response h(n) is found with the N-point IFFT of w_k. In fact, we may combine both functionalities in filter-based deployments. First, the waveforms created for transmission are placed in a spectrum whose unused sections, intended for dynamic and fragmented utilization, must not be disturbed by further processing. Second, on the receiver side, the filter bank processing may eliminate the interference from the unused portions of the permitted spectrum. We note that, in an evaluation of the computational complexity of a fast convolution filter bank (FCFB) implementation of FBMC/OQAM, the structure contains long FFT/IFFT transforms of length N, short transforms of length L, and L − 1 nontrivial complex multiplications for each subchannel.
h̃
time-varying impulsive response (η) can now be modeled as described in [16]. With
n
h̃ H̃
N-point FFT of (η) for measure frequency response (ω) for each n. The total
n n
stop-band region interference is measured as
2
Is (ωi ) = H̃n (ω1 ) L s for ω1 ∈ s (9)
n
g(n) = h_0 + 2 \sum_{l=1}^{K-1} h_l \cos\left( \frac{2 \pi l n}{N} \right), \quad 0 \le n \le N  (11)
where K is the overlapping factor and N is the filter length, N = KM. A 128-channel FBMC/OQAM framework has been analyzed in terms of computational complexity. For 72 active subcarriers, the FC-based FBMC/OQAM complexity with architecture parameters N = 1024, L = 16, and L_s = 10 is 40 multiplications per detected symbol, against 44 and 56 multiplications for the detection of an FBMC/OQAM polyphase symbol on the transmitter and receiver sides, respectively. The complexity gain of FC-based FBMC/OQAM over FBMC/OQAM polyphase grows as fewer subcarriers are involved.
In this section, the performance metrics are computed in MATLAB. For deployment, a 64-channel FBMC-OQAM framework is considered. Choosing N = 2048 and L = 64 gives a sampling ratio R = 32. The filter FFT weights are also configured to eliminate cyclic distortion. Figure 4a shows the BER comparison for the proposed fast convolution filter (FC-FBMC-OQAM) against the classical polyphase filter in the presence of an AWGN channel. It is evident that the proposed fast convolution filter bank maintains a lower bit error rate (BER) than the classical polyphase filter across all SNR values.
Fig. 4 Performance comparison of two filters: a BER comparison for proposed fast convolution
filter (FC-FBMC-OQAM) with classical polyphase filter, b PAPR comparison for proposed fast
convolution filter (FC-FBMC-OQAM) with classical polyphase filter
Figure 4b shows the PAPR comparison for the proposed fast convolution filter (FC-FBMC-OQAM) against the classical polyphase filter in the presence of an AWGN channel. It is evident that the proposed fast convolution filter bank maintains a better CCDF than the classical polyphase filter across all PAPR values. It is also shown that the PAPR of the proposed scheme extends only up to 12 dB, whereas in the FBMC/OQAM polyphase it extends to 16 dB. Table 1 demonstrates the computational complexity of the proposed fast convolution filter (FC-FBMC-OQAM) against the classical polyphase filter scheme.
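The PAPR and CCDF quantities compared in Fig. 4b are simple to compute from blocks of the baseband signal. A generic sketch; the single-tone test signal below is just an assumption to check the 0 dB constant-envelope baseline, not an FBMC waveform:

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of one complex baseband block, in dB."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

def ccdf(paprs, thresholds_db):
    """CCDF curve: Pr(PAPR > threshold) for each threshold in dB."""
    paprs = np.asarray(paprs)
    return np.array([(paprs > t).mean() for t in thresholds_db])

tone = np.exp(1j * 2 * np.pi * 0.1 * np.arange(64))  # constant envelope
print(papr_db(tone))                   # constant-envelope signal: 0 dB
print(ccdf([2.0, 5.0, 9.0], [0.0, 4.0, 8.0]))
```

Computing `papr_db` per transmitted block and passing the collection to `ccdf` over a grid of thresholds yields a curve of the kind plotted in Fig. 4b.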
From Fig. 5, it is evident that the number of additions and multiplications required for our proposed system (FC-FBMC-OQAM) is directly proportional to the number of active subcarriers. For our proposed scheme (FC-FBMC-OQAM), it is observed that with 43 active subcarriers the computational complexity is lower than that of the classical polyphase filter with 73 active subcarriers.
5 Conclusions
In this study, given the high PAPR of the FBMC-OQAM signal, the need to investigate adequate PAPR reduction schemes is critical. To date, several improved techniques have been suggested to decrease FBMC-OQAM PAPR values in compliance
Fig. 5 Graph showing computation complexity for proposed fast convolution filter (FC-FBMC-
OQAM) with classical polyphase filter
Sign Language Recognition Using
Convolution Neural Network
Abstract Sign language is one of the media used to communicate with deaf and mute people; usually, it is not known to ordinary people. It therefore becomes a challenging task to establish communication between ordinary people and hearing-impaired persons. Many tools have been developed to help them, but unfortunately they do not produce accurate results. To interact with them, various finger gestures are used; the designed model then converts those gestures into words or alphabets of a specific language. The proposed model helps reduce the gap between ordinary people and hearing-impaired persons. In our proposed sign language recognition algorithm, we focus on a deep convolution neural network to produce better accuracy.
1 Introduction
To interact with any person, we require some language in the form of a textual, vocal, or visual representation. But for persons with hearing impairments, communication is a very tedious task. Communicating with them requires a suitable visual medium, i.e., sign language [1]. It is very useful for persons who have difficulties with speaking or hearing. Sign language is a popular communication medium that uses various means, such as hand motions, facial expressions, and body movements, to express something. Several popular sign languages already exist in the world, each with its own functionality and limitations [2, 3]. Some of the popular sign languages are Indian Sign Language (ISL), Polish Sign Language, American Sign Language (ASL), etc.; like spoken languages, they vary with geographical conditions. Because of this, every sign language has some limitations.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 655
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_59
656 V. Sannareddy et al.
Many software tools and packages have also been developed to teach and understand sign language, but their usage is limited because they do not produce accurate results. To overcome these limitations, we propose an algorithm to interact with a hearing-impaired person accurately by recognizing and understanding the signs correctly. We considered Indian Sign Language (ISL) to check the efficiency of the proposed algorithm.
Figure 1 shows the hand gestures of ISL. Many algorithms have already been proposed to understand ASL, but ISL is completely different from ASL: ASL uses only one hand to form the signs, whereas ISL uses two hands.
2 Related Work
Ansari and Harit [4] have made significant research contributions toward categorizing Indian Sign Language gestures accurately. They categorized a total of 140 different samples comprising alphabets, numbers, and various movements. To detect body parts such as the hand, they used a traditional unsupervised learning technique, the K-means clustering algorithm. They also used a Gaussian distribution to extract the features required to train the data set, reaching an accuracy of 90%.
Deora and Bajaj [5] adopted the PCA algorithm to recognize sign language gestures. They also used artificial neural networks for more efficient recognition. They considered a very small data set, and the produced results were not satisfactory. However, compared to neural networks, PCA produced more accurate results.
Zhang et al. [6] adopted convolution neural networks to perform sign language recognition. To perform the process efficiently, they established a two-step process: feature extraction, followed by classification. They used a CNN to extract the features and an artificial neural network to classify them. In the proposed algorithm, they used Italian sign gestures from 27 subjects. The CNN adopted a max-pooling technique to extract the features, which were forwarded to the ANN. This model produced a highest accuracy of 91.7%.
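The two CNN building blocks discussed above, convolutional feature extraction and max pooling, can be sketched in plain NumPy. This is an illustrative toy, not the network used in [6] or in our model.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (no padding), the basic feature-extraction step."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % size, :w - w % size]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.arange(16, dtype=float).reshape(4, 4)
features = max_pool(conv2d(img, np.ones((3, 3))))
print(features)  # [[90.]]
```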
3 Methodology
In the proposed algorithm, we adopted the approach described in Fig. 2, which contains the steps to recognize the various signs of ISL.
The data set consists of a total of 4800 sign images of the English alphabets in ISL, with 26 different class labels. The data set is shown in Fig. 1. At the pre-processing stage, the images are resized to 640 × 480 pixels. Feature extraction and classification are performed after normalizing the images. Figure 3 shows the workflow diagram of the proposed model, which performs the following steps.
1. Image Acquisition
2. Image pre-processing
3. Feature extraction
4. Apply CNN
5. Classification.
Convolution Neural Network Architecture is shown in Fig. 4.
4 Implementation
Image acquisition is the process of capturing photographic images, such as the internal structure of an object. The term is often assumed to include the compression, storage, printing, and display of such images. To acquire frames in real time, we use various built-in functions of the OpenCV library. The following Python code captures images in real time.
import cv2

# Open the default camera (device index 0) and read one frame
cap = cv2.VideoCapture(0)
ret, img = cap.read()
Figure 5 provides an example of acquiring frames in real time.
Feature extraction is the most important step; using it, we create a model for sign recognition. To extract relevant features from the images, we adopted the CNN technique, which contains more than one convolution layer. To extract the topological properties of an image, we adopted a feed-forward network.
On the selected features, popular classification techniques such as SVM, Random Forest, and KNN are applied to design the model using the training data set. After fitting, the model predicts the values for the test data set.
Program Code
The following Python function is used to predict the hand gestures of ISL. We used the OpenCV and TensorFlow frameworks to design the proposed model. Initially, an integer is assigned to every alphabet.
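A minimal sketch of this label assignment, and of mapping a 26-way network output back to a letter, is given below; the names and score vector are hypothetical placeholders, not the actual program code.

```python
import string

# Assign an integer label to each of the 26 ISL alphabet classes (A -> 0 ... Z -> 25).
LABELS = {letter: idx for idx, letter in enumerate(string.ascii_uppercase)}
INDEX_TO_LETTER = {idx: letter for letter, idx in LABELS.items()}

def decode_prediction(scores):
    """Map a 26-way score vector (e.g. a softmax output) to its predicted letter."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return INDEX_TO_LETTER[best]

# Example: a hypothetical score vector peaking at index 2 decodes to "C".
scores = [0.0] * 26
scores[2] = 0.9
print(decode_prediction(scores))  # C
```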
The proposed method outputs the alphabet corresponding to the captured hand gesture. Various deep learning algorithms can be used to predict this output, but accuracy differs from one algorithm to another. Here, we used a deep convolution neural network to predict the output more accurately. As shown in Fig. 7, if there is no hand in front of the camera, there is no output; if a symbol matching the data set appears in front of the camera, as shown in Fig. 8, the corresponding alphabet is displayed.
In our proposed work, a convolution neural network and the Adam optimizer with a learning rate of 0.01 and a dropout of 0.25 are used. The accuracy of the proposed system is shown in Fig. 9.
Fig. 9 Accuracy of
proposed model
5 Conclusion
In this paper, a sign language recognition system using the convolution neural network algorithm is proposed after researching various algorithms. Although the proposed task is to recognize only the alphabets, it is 98.74% accurate, which is higher than existing systems. The proposed system is currently suitable only for ISL alphabet signs; it does not handle numbers, body movements, sentences, or facial expressions. In the future, it will be extended to work with different forms of signs, and hybrid clustering or classification techniques can be adopted to improve the accuracy further.
References
1. A.K. Sahoo, G.S. Mishra, K.K. Ravulakollu, Sign language recognition: state of the art. ARPN
J. Eng. Appl. Sci. (2014)
2. J. Singha, K. Das, Indian sign language recognition using Eigen value weighted Euclidean
distance based classification technique. Int. J. Adv. Comput. Sci. Appl. 4, 2 (2013); N.V. Tavari,
A.V. Deorankar, Indian sign language recognition based on histograms of oriented gradient. Int.
J. Comput. Sci. Inf. Technol. 5(3), 3657–3660 (2014)
3. NMANIVAS. Gesture recognition system. https://github.com/nmanivas/Gesture-Recognition-
System
4. Z. A. Ansari, G. Harit, Nearest neighbour classification of Indian sign language gestures using
kinect camera. Sadhana 41(2), 161–182 (2016)
5. D. Deora, N. Bajaj, Indian sign language recognition, in 2012 1st International Conference
Emerging Technology Trends in Electronics, Communication and Networking (ET2ECN) (2012).
https://doi.org/10.1109/ET2ECN.2012.6470093
6. C. Zhang et al., Multi-modality American sign language recognition, in 2016 IEEE International
Conference on Image Processing (ICIP) (2016). https://doi.org/10.1109/ICIP.2016.7532886
Key-based Obfuscation of Digital Design
for Hardware Security
1 Introduction
The number of fabless semiconductor companies, which outsource their work to IC foundries or manufacturing plants to avoid the cost of maintaining a fabrication facility, is increasing. Ensuring device security during design and manufacturing plays a vital role in the modern product development life cycle. There has been significant investigation into the security of such systems, with the prime focus on protection of the system rather than improvements in functionality. As various companies located in multiple countries are involved, this has become a daunting task. Many issues like counterfeiting, piracy, unauthorized overproduction, etc., have
H. Nazeem (B)
G. Narayanmma Institute of Technology and Science (for Women), Hyderabad, Telangana
500104, India
D. Amuru
CVEST, International Institute of Information Technology, Hyderabad, Telangana 500032, India
e-mail: deepthi.amuru@gnits.in; deepthi.amuru@research.iiit.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 663
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_60
2 Related Work
Many hardware obfuscation processes have been developed in recent years for sequential and combinational circuits. A few publications call this technique obfuscation, while others refer to it as encryption, owing to the use of a key to lock the circuit rather than the older notion of concealing the functionality. The actual idea of obfuscation is design security using keys, which prevents unauthorized usage.
A few early logic obfuscation schemes were implemented in [3, 4]. These two techniques use finite state machine (FSM) insertion, referred to as an obfuscating FSM. The obfuscating FSM takes key bits as input to select a state of the FSM. Only when the correct state is activated can the circuit function correctly. However, as per [5], one can trace the obfuscating FSM, remove it from the design, and copy the circuit. The limitations of logic-based encryption techniques using XNOR/XOR, OR/AND, and multiplexers have been uncovered by recent work using SAT solver-based tools [6]. However, this approach has its own drawback of high overheads.
The implementation presented in [7, 8] combines the obfuscating FSM with physically unclonable functions (PUFs) and creates circuits whose states depend on the PUF output. This gives each IC a unique signature, termed IC metering. After the chips are manufactured, each individual IC is tested to collect the information necessary to unlock it. By combining the collected information with the knowledge that only the design house possesses, only the design house can unlock each IC.
In hardware obfuscation using multiplexers, two-input multiplexers are inserted into the design to introduce the key bits. The correct and incorrect signals are mapped to the multiplexer inputs, and the select line is driven by a key bit. The output of the multiplexer, which is either the correct signal or the obfuscated signal depending on the key input, is given to the next stage. There are different types of obfuscation: fixed obfuscation, time-varying obfuscation, and dynamic obfuscation. To understand these, we use control signals. For example, consider C1 and C2 as the correct control signals and their complements C1' and C2' as the incorrect control signals. The two key gates with select lines K[0] and K[1] are driven by these control signals as inputs, and their outputs are S1 and S2.
In the fixed obfuscation technique, the output is proper for the correct key input (which is fixed) and always improper for an incorrect key input. Let us assume the correct key combination is "01." When {K[0], K[1]} = {0, 1}, we get a valid signal combination at S1 and S2, as shown in Fig. 1.
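A behavioral sketch of this fixed key gate follows. The design itself is written in Verilog; Python is used here purely for illustration, and the signal values are made up.

```python
def key_gate(correct_signal, incorrect_signal, key_bit, correct_key_bit):
    """2:1 mux model: pass the correct signal only for the right key bit."""
    return correct_signal if key_bit == correct_key_bit else incorrect_signal

# Correct key combination "01" for {K[0], K[1]}; signal values are illustrative.
C1, C2 = 1, 0           # correct control signals
C1_bar, C2_bar = 0, 1   # their complements (incorrect signals)

def outputs(k0, k1):
    s1 = key_gate(C1, C1_bar, k0, correct_key_bit=0)
    s2 = key_gate(C2, C2_bar, k1, correct_key_bit=1)
    return s1, s2

print(outputs(0, 1))  # (1, 0): the valid combination (C1, C2)
print(outputs(1, 0))  # (0, 1): the obfuscated combination
```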
In the dynamic obfuscation approach, the output signal depends not only on the key input but also on a trigger value. The outputs are mapped to C * T, where C is the control signal and T is the trigger signal. In this technique, an incorrect key value for K[0] and K[1] will select the obfuscated signals C1 * T1 and C2 * T2, respectively, at S1 and S2. Here, T1 and T2 are trigger signals. This is represented by a function G, as shown in Fig. 2. In the time-varying type of obfuscation, the trigger signal is periodic.
Generally, architectures are divided into a control path and a data path to ease the optimization and testing of designs [9]. Correct operation depends on both the control flow and the data path, and the information derived from them is most important to the system. So, we introduce multiplexers controlled by key bits at these critical links for obfuscation. In this paper, the digital design used for obfuscation is the Fast Fourier Transform (FFT). The FFT is a critical unit in most signal processing applications. We consider an 8-point DIT FFT and demonstrate adding obfuscation after stage 1, as the output of stage 2 depends on the output of stage 1.
We introduce trigger circuits into the design to convert the fixed-mode obfuscation into dynamic-mode obfuscation. The inserted trigger circuits generate signals that trigger rarely and randomly.
In this scheme, the control signals are obfuscated with the help of trigger signals using a trigger combination circuit. The obfuscated signals and the correct control signals are then given as inputs to the multiplexers. Figure 5 shows a multiplexer connected to one of the control signals. To implement dynamic obfuscation on the FFT design, we replace the key gates in the fixed obfuscation with the basic dynamic obfuscation unit. As shown in Fig. 6, the basic dynamic obfuscation unit has three inputs: the trigger signal from the trigger combination circuit, the actual input signal to be obfuscated, and the key. The basic dynamic obfuscation unit instantiates a 2:1 mux whose inputs are the actual signal and the modified signal (generated by fusing the input and trigger signals). Depending on the key value, it selects the actual signal or the modified signal to drive the output.
For example, consider the case where the correct key value is 1 and the incorrect key value is 0. It must be noted that for the correct key value, the circuit gives the correct functionality; for the incorrect key, the circuit gives the correct output when the trigger is low and an incorrect output when the trigger is high. The trigger input to the circuit goes high rarely and randomly. This results in stronger obfuscation even with shorter keys.
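The behavior just described can be modeled as below. The "fusion" of the input and trigger signals is assumed here to be an XOR, which is an illustrative choice rather than the paper's exact circuit.

```python
def dynamic_obfuscation_unit(signal, trigger, key, correct_key=1):
    """2:1 mux model: the correct key passes the signal through; an incorrect
    key passes the trigger-modified signal (fusion assumed to be XOR here)."""
    modified = signal ^ trigger
    return signal if key == correct_key else modified

# With the incorrect key, the output is correct only while the trigger is low.
for trigger in (0, 1):
    print(dynamic_obfuscation_unit(1, trigger, key=0))  # prints 1, then 0
```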
5 Results
The obfuscation techniques discussed above are simulated in Xilinx ISE 14.7 to verify their logic. The hardware description language used for the design is Verilog. Test benches have also been created to test the design logic. The simulation results for the fixed and dynamic obfuscated FFT designs, using both correct and incorrect keys, are shown below.
Only when the input sequence is x = {1, 1, 1, 1, 1, 1, 1, 0} and the key input is key = 4'b1101 do we get the correct output {7, −0.707 − j0.707, −j, 0.707 − j0.707, 1, 0.707 + j0.707, j, −0.707 + j0.707}, as shown in Fig. 7. For other (incorrect) key values, the output is incorrect.
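The expected unobfuscated output can be reproduced with NumPy's FFT, confirming the reference values above:

```python
import numpy as np

# 8-point FFT of the test input used in the simulation
x = np.array([1, 1, 1, 1, 1, 1, 1, 0], dtype=float)
X = np.fft.fft(x)
print(np.round(X, 3))
```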
Fig. 8 Simulation results of dynamic obfuscation of FFT with correct key value
Fig. 9 Simulation results of dynamic obfuscation of FFT with incorrect key value
Table 1 shows the delay and power values of the fixed obfuscated FFT design, the dynamic obfuscated FFT design, and the design without obfuscation, i.e., the simple FFT. It can be seen that there is a minor increase in delay and power values with the addition of the obfuscation circuit.
Table 2 shows the device utilization analysis of fixed and dynamic obfuscated designs
when compared to the design without obfuscation.
6 Conclusion
This paper explains the necessity of incorporating hardware security into digital designs to prevent IC counterfeiting and illegal overproduction. We demonstrate the potential of fixed and dynamic obfuscation techniques in providing hardware security to digital designs through an 8-point Fast Fourier Transform. Dynamic obfuscation differs from fixed obfuscation in that the obfuscating signals keep changing with time. Dynamic obfuscation is advantageous over fixed obfuscation in terms of time to attack, even for shorter keys, and hence results in stronger obfuscation. The area, power, and delay overheads have also been analyzed: dynamic obfuscation provides more security than fixed obfuscation with only a small percentage increase in delay and power.
Abstract For online transactions, the user needs to carry a credit or debit card at commercial places. In the existing swipe machine system, only a personal identification number (PIN) is used to authenticate a user. Sometimes the user may forget the PIN or enter wrong PINs consecutively; the card then gets blocked, and the user needs to visit the bank frequently. There are also chances that fraudsters, as an act of phishing, may steal personal information such as the user ID, credit or debit card number, CVV number, and card expiry date using skimming devices. To overcome these problems, a system is proposed which uses biometric authorization along with a personal identification number (PIN) to make a transaction. The user can make a transaction with any one of the user's bank accounts without using a credit or debit card, and the system does not require a swipe machine. The transaction information is sent to the server over a secure network using the Internet, where further processing is done. The proposed system enhances the customer experience and increases security.
1 Introduction
Many countries are moving toward complete digitization to reduce the amount of cash flow in the form of notes and to make money exchange between people easier. Swipe
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 673
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_61
674 E. Mahammad et al.
2 Existing System
In the existing system, the user's credit or debit card is swiped through the machine, and the user is asked to enter the PIN. After successful verification of the PIN, the entered amount is debited if it is available in the user's account. Since only a PIN is used for authentication, the system is not very secure. Hence, fingerprint biometrics is used to authenticate the user's identity and increase the security of electronic payment. The flowchart of the existing swipe machine authentication system is shown in Fig. 1.
3 Proposed System
4 System Design
In the proposed system, an external power supply is given to the microcontroller and the GSM module. The microcontroller is connected to the fingerprint module, matrix keypad, and GSM module. The microcontroller can be connected to a mobile or laptop through the Internet via the Wi-Fi module of the Raspberry Pi. The details of the transaction are updated as the laptop or mobile is connected to the IoT platform. The detailed architecture is shown in Fig. 3.
A. Raspberry Pi: The microcontroller is a small computer on a single integrated
circuit containing a processor, memory, and programmable I/O peripherals.
Raspberry Pi consists of onboard CPU, RAM, USB ports, Wi-Fi module,
camera interface to connect camera module, etc. New models of Raspberry
Pi are released with updated features.
676 E. Mahammad et al.
B. GSM Module: The GSM module is a wireless modem that works with the GSM wireless network. It requires a SIM card, which serves as a digital identity to link with the cellular phone network. GSM is used for transmitting mobile voice and data services.
C. Fingerprint Module: The fingerprint module reads the fingerprint of the user.
It converts the fingerprint to a template. The user template is sent to the micro-
controller. It performs a comparison with existing templates and authenticates
the user [4–7].
D. Matrix Keypad: The matrix keypad has built-in push-button contacts connected to the row and column lines. The microcontroller scans these lines to detect a button press.
E. IoT Platform: IoT platform can be used to perform data storage, data analytics,
etc. IoT platforms are secure, fast, reliable, and scalable. These platforms collect
data from different sensors and devices and perform required operations on data.
In the IoT system, the data transfer happens over a network without requiring
human to human or human to computer interaction.
5 Results
Figure 4 shows the hardware module. Raspberry Pi is interfaced with GSM module,
fingerprint module, and matrix keypad.
When the fingerprint of the user is scanned, the user is recognized and the name is displayed. The user then enters the PIN to continue. When the PIN is verified, all the existing bank accounts of the user are displayed, and the user selects the bank with which to make the transaction. After selecting a bank, the amount is entered and the transaction is processed.
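The flow above can be sketched as follows; all function names, templates, and account data are hypothetical placeholders, not the system's actual software.

```python
# Hypothetical sketch of the authorization flow: fingerprint match, then PIN
# check, then bank-account selection. All user data here is made up.
USERS = {
    "template-001": {"name": "Alice", "pin": "4321",
                     "accounts": {"Bank A": 5000.0, "Bank B": 1200.0}},
}

def authorize(template, pin):
    user = USERS.get(template)           # fingerprint template lookup
    if user is None or user["pin"] != pin:
        return None                      # unknown finger or wrong PIN
    return sorted(user["accounts"])      # banks the user can choose from

def debit(template, bank, amount):
    accounts = USERS[template]["accounts"]
    if accounts.get(bank, 0.0) < amount:
        return False                     # insufficient balance
    accounts[bank] -= amount
    return True

print(authorize("template-001", "4321"))  # ['Bank A', 'Bank B']
```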
6 Future Scope
In the future, the system can be extended with a camera module that captures the user's face and gestures for identification. However, the current system has the limitation that only the registered user can make a transaction with his or her own bank account.
7 Conclusions
With the increase in transactions using credit or debit cards, a secure and fast transaction system is required. It is therefore very important to authenticate users properly, and new digital technologies should be adopted to match the increased demand. By using a combination of biometric and PIN authentication, without the need for a swipe machine, the proposed model increases security and improves the customer experience.
Acknowledgments The authors would like to thank Management and Principal of Vardhaman
College of Engineering, Shamshabad, Hyderabad, India for continuous support and encouragement.
Internet of Things-based Card Less Banking … 679
References
1. A. Singh, S. Singh, R. Kumar, Secure swipe machine with help of biometric security, in 2016
International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)
(Chennai, 2016), pp. 1056–1061
2. L. Honnegowda, Security enhancement for magnetic data transaction in electronic payment and healthcare systems. Int. J. Eng. Technol., 331–335 (2013). https://doi.org/10.7763/IJET.2013.V5.569
3. M. Dutta, K.K. Psyche, T. Khatun, M.A. Islam, M.A. Islam, ATM card security using bio-metric
and message authentication technology, in 2018 IEEE International Conference on Computer
and Communication Engineering Technology (CCET) (Beijing, 2018), pp. 280–285
4. S. Barman, S. Chattopadhyay, D. Samanta, S. Bag, G. Show, An efficient fingerprint matching
approach based on minutiae to minutiae distance using indexing with effectively lower time
complexity, in International Conference of Information Technology (IEEE, 2014), pp. 179–183
5. M.O. Onyesolu, I.M. Ezeani, ATM security using fingerprint bio-metric identifier: an investiga-
tive study. Int. J. Adv. Comput. Sci. Appl. 3(4), 68–72 (2012)
6. C. Ashwini, P. Shashank, S.M. Nayak, S.S. Yadav, M. Sumukh, Cardless multi-banking ATM
system services using biometrics and face recognition. Int. J. Eng. Res. Technol. (IJERT)
NCCDS—2020 8(13) (2020)
7. A.K. Jain, A. Ross, S. Prabhakar, An introduction to biometric recognition. IEEE Trans. Circuits
Syst. Video Technol. 14(1), 4–20 (2004)
Cluster Adaptive Stationary Wavelet
Diffusion
1 Introduction
In the last few decades, image noise reduction has received a lot of attention, and a variety of approaches have been developed. Nonlinear anisotropic diffusion algorithms work efficiently for image denoising. Perona and Malik [1] first proposed diffusion-based denoising in the early 1990s; Perona-Malik anisotropic diffusion (PMAD) is the original diffusion model. During the denoising process, the approach constructs a family of restored signals by taking a noisy signal and evolving it locally according to the PMAD equation. Since then, PMAD has been modified for various objectives [2–10].
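For reference, a minimal NumPy sketch of PMAD with the classic edge-stopping diffusivity g(s) = 1/(1 + (s/K)²) is given below; the parameter values are illustrative, not tuned.

```python
import numpy as np

def perona_malik(image, iterations=20, K=0.1, dt=0.2):
    """Perona-Malik anisotropic diffusion with g(s) = 1 / (1 + (s/K)^2)."""
    g = lambda d: 1.0 / (1.0 + (d / K) ** 2)  # edge-stopping diffusivity
    u = image.astype(float).copy()
    for _ in range(iterations):
        p = np.pad(u, 1, mode="edge")         # replicate borders
        north = p[:-2, 1:-1] - u
        south = p[2:, 1:-1] - u
        east = p[1:-1, 2:] - u
        west = p[1:-1, :-2] - u
        # Explicit update: neighbor flows weighted by the diffusivity
        u += dt * (g(np.abs(north)) * north + g(np.abs(south)) * south
                   + g(np.abs(east)) * east + g(np.abs(west)) * west)
    return u

flat = np.ones((5, 5))
print(np.allclose(perona_malik(flat), flat))  # True: a constant image is a fixed point
```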
Chao and Tsai [2] have suggested a diffusion model that incorporates both the local gradient and the gray-level variance in order to maintain edges and fine details while effectively eliminating noise. The main disadvantage of this approach is that images with a
A. K. Mandava (B)
Department of Electrical, Electronics and Communication Engineering, GITAM School of
Technology, Bengaluru, India
e-mail: amandava@gitam.edu
E. E. Regentova
Department of Electronics and Computer Engineering, University of Nevada, Las Vegas, NV
89154, USA
e-mail: emma.regentova@unlv.edu
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 681
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_62
682 A. K. Mandava and E. E. Regentova
high noise level cannot be handled, since such noisy pixels usually have larger gray levels and gradients than the edges and details.
Yu et al. [3] suggested a SUSAN-controlled diffusion, in which the edge detector uses local knowledge from a pseudo-global perspective to find image features. SUSAN directs the diffusion process with its noise insensitivity and structure-preserving characteristics. Many parameters need to be tuned for the desired diffusivity. However, despite the expense of comprehensive image analysis, it can be a powerful tool for finding reasonably good results on closed contours.
The correlation between explicit one-dimensional nonlinear diffusion schemes and discrete Haar wavelet shrinkage was investigated by Mrazek et al. [4]. Weickert et al. [4, 5] explored the correlation between discrete diffusion filtering and Haar wavelet shrinkage, including an analytic four-pixel technique, but only for 1D and isotropic 2D scenarios with scalar-valued diffusivity. Compared to Perona-Malik diffusion [1], this allows for the enhancement of edges.
The relationship between multiwavelet denoising and nonlinear diffusion was examined by Alkhidhr et al. [6]. According to their findings, the multiwavelet shrinkages of the widely used CL(2) and DGHM multiwavelets are linked to a second-order nonlinear diffusion equation. They also derived higher-order nonlinear diffusion equations for multiwavelet shrinkages in general. According to the experiments, the diffusion-inspired multiwavelet shrinkage outperforms typical multiwavelet hard- and soft-thresholding shrinkages.
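The hard- and soft-thresholding shrinkage operators mentioned above can be written compactly in NumPy; the coefficient values and threshold here are arbitrary illustrations.

```python
import numpy as np

def hard_threshold(coeffs, t):
    """Keep wavelet coefficients whose magnitude exceeds t; zero the rest."""
    return np.where(np.abs(coeffs) > t, coeffs, 0.0)

def soft_threshold(coeffs, t):
    """Shrink coefficient magnitudes toward zero by t (sign-preserving)."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

c = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
print(hard_threshold(c, 0.5))
print(soft_threshold(c, 0.5))
```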
Nonlinear diffusion in the stationary wavelet domain was exploited by Zhong and Sun [7]. They demonstrated that noise has less effect on the partial differential equation in the wavelet domain than in the raw noisy image domain, because noise diminishes with scale. On the finest scale, this method employs minimum mean squared error filtering followed by anisotropic diffusion in the stationary scale-space.
Nikpour and Hassanpour [8] diffuse the wavelet transform approximations while shrinkage is applied at different levels. The decomposition employs a five-level wavelet transform with the db10 mother wavelet. The method was compared with a median filter, wavelet thresholding, anisotropic diffusion, and a fourth-order PDE.
Mandava and Regentova [9] introduced a context-adaptive nonlinear diffusion
approach termed context-based diffusion in the stationary wavelet domain (SWCD)
for image denoising in the wavelet domain. The diffusivity function is applied to
the wavelet coefficients of a stationary wavelet transform as a weighting function.
The expected results in this strategy are based on iterative threshold processing
implementation.
Zhang and Feng developed a new wavelet domain denoising approach in [10].
In the dual-tree CWT domain, the algorithm is implemented. Each diffusion stage
employs the local Wiener filter. In terms of noise variance, the suitable stopping time
is stated. Multiple-step local Wiener filter (MSLWF) is the name of this technique.
MSLWF has the best performance among wavelet domain approaches, according to
the testing.
The LFAD technique proposed in [11] has the best performance in the class of
advanced diffusion-based methods. The method uses superpixel segmentation for
performing the diffusion. On the other hand, this approach requires extensive
computations, which reduces its efficiency for online applications.
Cluster Adaptive Stationary Wavelet Diffusion 683
Wavelet domain diffusion can achieve better denoising results than nonlinear
diffusion in the spatial domain. This is because wavelet transforms preserve image
details better than spatial-domain diffusion methods, and more effective wavelet
coefficients can be used in the diffusion process compared with wavelet shrinkage
approaches. The major disadvantage of all diffusion-based denoising is that all
regions are diffused for an equal number of iterations. Even after the best result
is obtained for texture and edge pixels, the process continues to diffuse these pixels
as long as the remaining pixels of the image keep improving, which blurs the image.
To overcome this problem, we diffuse each cluster or region for a different
number of iterations, until the best result is obtained for each cluster. In addition,
the incorporation of cluster-based nonlinear diffusion in the wavelet domain is
investigated. For robust region categorization, wavelet features are computed in the
approximation subband at the second level of the stationary wavelet transform, and
the detail coefficients at the first level are then diffused. This technique is named
cluster-based wavelet diffusion (CWD). The approach is introduced in Sect. 2 after a
theoretical basis; the experimental findings are presented in Sect. 3, followed by a conclusion.
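The stopping rule just described, where each cluster diffuses only for as long as it keeps improving, can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: it assumes a reference image is available for scoring (the paper tracks the best PSNR per cluster), and `step` stands for one diffusion iteration.

```python
import numpy as np

def diffuse_per_cluster(noisy, reference, labels, step, max_iters=50):
    """For every cluster, keep the diffusion iterate at which that cluster's
    mean squared error against the reference was lowest."""
    cur = noisy.copy()
    out = noisy.copy()
    best = {k: np.mean((noisy[labels == k] - reference[labels == k]) ** 2)
            for k in np.unique(labels)}
    for _ in range(max_iters):
        cur = step(cur)                       # one global diffusion iteration
        for k in np.unique(labels):
            m = labels == k
            err = np.mean((cur[m] - reference[m]) ** 2)
            if err < best[k]:                 # this cluster still improves
                best[k] = err
                out[m] = cur[m]
    return out
```

Texture and edge clusters typically stop early, while smooth clusters accept many more iterations, which is exactly the behavior fixed-iteration schemes cannot provide.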
Perona and Malik [1] defined the first nonlinear diffusion technique. Their process
inhibits inter-region smoothing while facilitating intra-region smoothing. The
Perona-Malik diffusion process is mathematically defined as
∂I(x, y, t)/∂t = ∇ · (c(x, y, t) ∇I) (1)
Here, I(x, y, t) represents the image, t is the iteration number, and c(x, y, t) is the
so-called diffusivity function. Perona and Malik proposed two diffusivity functions:
c1(x, y, t) = exp(−(|∇I(x, y, t)| / k)²) (2)
and
c2(x, y, t) = 1 / (1 + (|∇I(x, y, t)| / k)²) (3)
684 A. K. Mandava and E. E. Regentova
Here, k is the diffusion constant. Depending on the diffusivity function used, Eq. (1)
can encompass a wide range of filters. The discrete diffusion scheme can be described
as follows:
I_{i,j}^{n+1} = I_{i,j}^{n} + Δt · [c_N(∇_N I_{i,j}^{n}) · ∇_N I_{i,j}^{n} + c_S(∇_S I_{i,j}^{n}) · ∇_S I_{i,j}^{n} + c_E(∇_E I_{i,j}^{n}) · ∇_E I_{i,j}^{n} + c_W(∇_W I_{i,j}^{n}) · ∇_W I_{i,j}^{n}] (4)
The subscripts N (north), S (south), E (east), and W (west) denote the local gradient
directions, and the local gradients are computed using nearest-neighbor differences.
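As an illustration, the scheme of Eq. (4) fits in a few lines. The sketch below is a minimal version that assumes periodic image borders (via `np.roll`) and uses the exponential diffusivity c1 of Eq. (2); the parameter defaults are arbitrary.

```python
import numpy as np

def perona_malik_step(I, k=15.0, dt=0.2):
    """One iteration of the discrete Perona-Malik scheme of Eq. (4):
    nearest-neighbor differences in the four directions, each weighted by
    the diffusivity c1 of Eq. (2) evaluated at that gradient."""
    north = np.roll(I, 1, axis=0) - I
    south = np.roll(I, -1, axis=0) - I
    east = np.roll(I, -1, axis=1) - I
    west = np.roll(I, 1, axis=1) - I
    c = lambda g: np.exp(-(np.abs(g) / k) ** 2)     # diffusivity c1
    return I + dt * (c(north) * north + c(south) * south
                     + c(east) * east + c(west) * west)
```

With dt ≤ 0.25 the explicit scheme is stable; a small k preserves edges more aggressively, while a large k approaches linear (heat-equation) smoothing.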
In this study, the stationary wavelet transform (SWT) was used to establish a new
method for combining spectral and spatial information simultaneously [12]. The SWT
extracts the signal frequency components using specified low- and high-pass filters
and generates four distinct sets of wavelet features, namely the approximation, hori-
zontal, vertical, and diagonal coefficients. Another essential feature of the SWT
method is that several wavelet families can be used to better fit the type of signals being
investigated. For instance, the well-known Haar wavelet is the first and simplest.
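To make the undecimated structure concrete, here is a one-level 2-D stationary Haar transform sketch: no downsampling, so every subband keeps the full image size. The 1/2 normalization and periodic borders are simplifying assumptions for the sketch, not the filters of a production wavelet library.

```python
import numpy as np

def swt2_haar(I):
    """One-level 2-D stationary (undecimated) Haar transform.  Low- and
    high-pass Haar filtering along columns and then rows, without any
    downsampling, yields four subbands of the same size as the image."""
    lo = lambda x, ax: (x + np.roll(x, -1, axis=ax)) / 2.0    # Haar low-pass
    hi = lambda x, ax: (x - np.roll(x, -1, axis=ax)) / 2.0    # Haar high-pass
    L, H = lo(I, 0), hi(I, 0)        # filter columns
    cA, cH = lo(L, 1), hi(L, 1)      # then rows: approximation + details
    cV, cD = lo(H, 1), hi(H, 1)
    return cA, cH, cV, cD
```

With this normalization the transform is trivially invertible: cA + cH + cV + cD reconstructs the original image exactly.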
2.3 Clustering
The two main types of clustering algorithms in the literature are crisp clustering
methods, in which each data point belongs to exactly one cluster, and fuzzy
clustering methods, in which each data point belongs to every cluster with a certain
degree of membership. The key advantage of FCM over the k-means algorithm is that
it assigns to each pattern a degree of membership in each cluster. Fuzzy c-means
clustering was therefore employed to separate clusters that share similar
structures [13].
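A compact sketch of the fuzzy c-means iteration of Bezdek [13], alternating the membership and centroid updates for fuzzifier m, might look like this (the function name and defaults are illustrative):

```python
import numpy as np

def fcm(X, c, m=2.0, iters=50, seed=0):
    """Fuzzy c-means on X of shape (n_samples, n_features): returns the
    membership matrix U (rows sum to 1) and the cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # fuzzy-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        U = d ** (-2.0 / (m - 1.0))                    # membership update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers
```

Unlike k-means, every pattern receives a graded membership in every cluster; hardening by an argmax over rows recovers a crisp labeling when one is needed.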
3 Experiment
The denoising quality is evaluated using the peak signal-to-noise ratio,
PSNR = 10 log₁₀(255²/MSE), where MSE stands for the mean square error, and
using the universal image quality index (UIQI) given by
Q = 4 σ_xy x̄ ȳ / [(σ_x² + σ_y²) ((x̄)² + (ȳ)²)] (8)
where x̄ and ȳ are the means, σ_x and σ_y are the standard deviations, and σ_xy
denotes the covariance of the two images x and y. As mentioned in [15], the mean
subjective ranks of observers correspond well with the average quality index UIQI.
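Equation (8) translates directly into code. The sketch below computes Q globally over whole images, whereas [15] evaluates it in sliding windows and averages the local values:

```python
import numpy as np

def uiqi(x, y):
    """Universal image quality index Q of Eq. (8), computed globally."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()     # sigma_xy
    return 4.0 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))
```

Q lies in [−1, 1] and equals 1 only when the two images are identical (for non-constant images with nonzero mean).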
Tables 1, 2, and 3 show the PSNR results for different numbers of clusters (i.e.,
5, 10, and 15). From these tables one can conclude that for low noise levels such as
σ = 10 and 20, ten clusters yield the best results, while for high noise levels such as
σ = 30 and 40, fifteen clusters produce the best PSNR values. Based on the PSNR
values and the algorithm's execution time, ten clusters represent an optimal value for
practical implementation. Table 4 provides the PSNR results of stationary wavelet
diffusion (SWD) without clustering to emphasize the effect of the cluster-based diffusion.
The proposed method improves PSNR by 0.3–3 dB on average compared with the SWD.
Table 5 presents the UIQI values attained by CWD (ten clusters) for benchmark images
with additive white Gaussian noise. Based on the findings in Tables 2 and 4, cluster-based
diffusion provides superior results. Table 6 shows PSNR values for the methods under
comparison for the “Lena” image. The diffusivity constant used
in [13] is λ = 10. Figure 1 shows the clustering results of the “house” and “Lena” images
based on the pixel intensity values and the clustering based on wavelet approximation
and energy features. Major trends and discontinuities contribute to large wavelet
coefficients at the second level, while noise produces only small coefficients, which
characterizes the smooth regions.
Fig. 1 First column: cluster map of “house” and “Lena” based on intensity values; second column:
cluster map of “house” and “Lena” based on wavelet approximation value and energy features.
Number of clusters = 10
Fig. 2 First row: “House” with noise level σ = 20 and CWD result; second row: “House” with noise
level σ = 40 and CWD result
Figures 2 and 3 show the noisy “house” and “Lena” images
for σ = 20 and 40 and their denoised versions, respectively. The proposed technique
produces a higher level of visual quality in the smoother regions, which are deliberately
diffused to a greater extent. Figure 4 shows a synthetic image with noise level σ = 40
and the denoising results of BM3D and CWD. Although BM3D is superior PSNR-wise,
better edge preservation can be observed with CWD.
4 Conclusion
A novel cluster-adaptive diffusion approach is proposed in this paper. In the developed
scheme, the image is first partitioned into ten clusters using fuzzy c-means with the
wavelet energy features calculated in the detail subbands of the second level. Then,
diffusion is performed on each cluster until the best PSNR is obtained. According to
the experiments, the proposed approach exhibits fairly good performance in terms
of both objective and visual quality.
Fig. 3 First row: “Lena” with noise σ = 20 and CWD result; second row: “Lena” with noise σ =
40 and CWD result
Fig. 4 First row: Synthetic image and synthetic with noise level σ = 40 (PSNR = 16.56, UIQI =
0.150); second row: BM3D (PSNR = 33.89, UIQI = 0.251) and CWD (PSNR = 31.94, UIQI =
0.242) result
References
1. P. Perona, J. Malik, Scale-space and edge detection using anisotropic diffusion. IEEE Trans.
Pattern Anal. Mach. Intell. 12(7), 629–639 (1990)
2. S.-M. Chao, D.-M. Tsai, An improved anisotropic diffusion model for detail- and edge-
preserving smoothing. Pattern Recogn. Lett. 31(13), 2012–2023 (2010)
3. J. Yu, J. Tan, Y. Wang, Ultrasound speckle reduction by a SUSAN-controlled anisotropic
diffusion method. Pattern Recogn. 3083–3092 (2010)
4. P. Mrazek, J. Weickert, G. Steidl, Diffusion inspired shrinkage functions and stability results
for wavelet denoising. Int. J. Comput. Vis. 64, 171–186 (2005)
5. M. Welk, J. Weickert, G. Steidl, A four-pixel scheme for singular differential equations, in
Scale-Space and PDE Methods in Computer Vision, ed. by R. Kimmel, N. Sochen, J. Weickert.
Lecture Notes in Computer Science, vol 3459 (Springer, Berlin, 2005), pp. 585–597
6. H. Alkhidhr, Q. Jiang, Correspondence between multiwavelet shrinkage and non-linear
diffusion. J. Comput. Appl. Math. 382, 45–61 (2020)
7. J. Zhong, H. Sun, Wavelet-based multiscale anisotropic diffusion with adaptive statistical
analysis for image restoration. IEEE Trans. Circ. Syst. I Regul. Pap. 55(9), 2716–2725 (2008)
8. M. Nikpour, H. Hassanpour, Using diffusion equations for improving performance of wavelet-
based image denoising techniques. IET-IPR 4(6), 452–462 (2010)
9. A.K. Mandava, E.E. Regentova, Image denoising based on adaptive non-linear diffusion in
wavelet domain. J. Electron. Imaging 20(3), 033016 (2011)
10. X. Zhang, X. Feng, Multiple-step local Wiener filter with proper stopping in wavelet domain.
J. Vis. Commun. Image R. 25(2), 254–262 (2014)
11. A.K. Mandava, E.E. Regentova, G. Bebis, LFAD: locally- and feature-adaptive diffusion based
image denoising. Appl. Math. Inf. Sci. 1, 1–12 (2014)
12. G.P. Nason, B.W. Silverman, The stationary wavelet transform and some statistical applications,
Lecture Notes in Statistics, vol 103 (1995), pp. 281–299
13. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algoritms (Plenum Press,
New York, 1981)
14. K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3D transform-domain
collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)
15. Z. Wang, A.C. Bovik, A universal image quality index. IEEE Signal Process. Lett. 9(3), 81–84
(2002)
16. Y. Wang, L. Zhang, P. Li, Local variance controlled forward-and-backward diffusion for image
enhancement and noise reduction. IEEE Trans. Image Process. 16(7) (2007)
17. G. Gilboa, N. Sochen, Y.Y. Zeevi, Forward-and-backward diffusion processes for adaptive
image enhancement and denoising. IEEE Trans. Image Process. 11(7), 689–703 (2002)
Low Complexity and High Speed
Montgomery Multiplication Based
on FFT
Abstract Modular multiplication is the most time-consuming operation in
number-theoretic cryptographic algorithms such as RSA and Diffie-Hellman. There
are fast multiplier structures that minimize delay and increase throughput using
parallelism and pipelining, but such structures are large and of limited efficiency. In
this work, we incorporate an improved fast Fourier transform (FFT) technique
into McLaughlin's framework and the Montgomery algorithm to achieve high area-
time efficiency. Compared with previous FFT-based structures, we avoid the zero-
padding operation by computing the modular multiplication steps directly
using cyclic and negacyclic convolutions. Moreover, supported by the number-
theoretic weighted transform, the FFT algorithm is used to provide fast convolution
computation. The results show that our design has better efficiency than
state-of-the-art FFT-based MMM designs for operand sizes of 1024 bits
and above.
1 Introduction
The focus of this work is the hardware implementation of the RSA algorithm [1] with
modulus sizes larger than 1024 bits. Early implementations of RSA [2] assumed that a
512-bit modulus would be sufficient; advances in factorization techniques pushed the
modulus length to 1024 bits. NIST recommends [3] 4096-bit keys for the near
future in order to keep RSA secure. Obviously, larger key sizes lead to
longer processing time and more hardware resources during computation, because
the RSA algorithm requires modular exponentiation.
Montgomery multiplication is an efficient strategy to compute modular multiplication
[4]. In the MMM algorithm, the time-consuming trial division is replaced by multiplications
and reductions modulo R; the reductions are inexpensive when R is chosen to be a power of 2.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 693
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_63
694 B. Jyothi et al.
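The idea of replacing trial division by operations modulo R = 2^k can be sketched in software as follows. This is a plain illustration of Montgomery reduction [4], not the FMLM3 hardware data path; it assumes an odd modulus M < R and inputs a, b < M.

```python
def montgomery_mul(a, b, M, R_bits):
    """Compute a*b*R^{-1} mod M for R = 2^R_bits.  The trial division by M
    is replaced by a mask (reduction mod R) and a shift (division by R)."""
    R = 1 << R_bits
    M_neg_inv = pow(-M, -1, R)              # M' = -M^{-1} mod R, precomputable
    t = a * b
    m = ((t & (R - 1)) * M_neg_inv) & (R - 1)
    u = (t + m * M) >> R_bits               # t + m*M is divisible by R
    return u - M if u >= M else u
```

In a full exponentiation, operands are first mapped into the Montgomery domain (multiplied by R mod M) so that the R^{-1} factors cancel across the chain of multiplications.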
The FMLM3 uses two different moduli, R = 2^{2v} − 1 and Q = 2^{2v} + 1, and
achieves a fast NWT without zero-padding.
The architecture of the FMLM3 is shown in Fig. 1. A controller block generates
all the control signals for the whole framework. The RAM block
consists of several RAM banks and stores the precomputed data, the intermediate
results, and the final modular product.
This algorithm has a more complicated data flow compared with the repeatable-
structure method, and more operation units are required to handle the modular
reductions and the conditional selections. At the top level, the operations of the
FMLM3 are computed sequentially, while pipelined structures are designed inside
every block. The forward and inverse NWTs are performed in the FFT/FFT−1
block. Component-wise multiplication and addition are performed in the multiply-adder block.
The ripple-carry adder (RCA), the subtractor, and the shift module are
responsible for the time-domain operations, such as the modulo R and Q reductions,
the conditional selections, and so on.
The architecture is designed for a high clock frequency while maintaining
a small resource cost. The proposed pipelined butterfly structure (BFS) is adopted
in our FFT/FFT−1 block to achieve this goal. Figure 2 shows the FFT/FFT−1
block.
Low Complexity and High Speed Montgomery … 695
Fig. 1 FMLM3
The shift_ctrl signals are used in the multiplication. The channel switcher reorders
the output digits. The operand inputs A and B and the channel switcher (CS), which
consists of eight 2-to-1 MUX arrays, ensure that the intermediate digits are written
into the correct RAM location.
The FFT/FFT−1 block is designed with six inputs: four inputs for the BFS
computation and two inputs for the precomputed upper bounds needed to limit
the results of the NCT−1 before accumulation. The digits are non-negative and
equal to x_n or x_n + M. Because the lower bounds are non-positive integers while
x_n ∈ [0, M), we only need to check the upper bounds and correct the x_n + M cases
by subtracting M.
An accumulator is designed to adjust the results of the CT−1. Two neighboring
digits are added in each cycle, and the corresponding outputs are produced in
two steps. In the first step, it computes
R0 = x_{i+1} · 2^u + x_i (1)
where x_i and x_{i+1} denote the two input digits. In the second step, R0 is added
to the accumulated value:
r_{i+1} = R0 + ⌊r_i / 2^{2u}⌋, X_{i+1} · 2^u + X_i = r_i mod 2^{2u} (2)
where X_i and X_{i+1} denote the two output digits, and r_i denotes the value stored
in the accumulation register (r_0 = 0).
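A purely illustrative software model of the two-step accumulation of Eqs. (1) and (2), with digit pairing, a carry register, and output-digit splitting, could look like this; the function name and digit handling are assumptions made for the sketch, not the hardware design:

```python
def accumulate_digits(xs, u):
    """Combine pairs of u-bit digits as in Eq. (1), add the carry kept in
    the accumulation register as in Eq. (2), and split the low 2u bits back
    into two normalized output digits.  Returns (digits, final_carry)."""
    mask = (1 << (2 * u)) - 1
    r, out = 0, []
    for i in range(0, len(xs), 2):
        R0 = (xs[i + 1] << u) + xs[i]            # Eq. (1)
        r = R0 + (r >> (2 * u))                  # Eq. (2): add previous carry
        low = r & mask
        out += [low & ((1 << u) - 1), low >> u]  # output digits X_i, X_{i+1}
    return out, r >> (2 * u)
```

The model is value-preserving even when input digits exceed u bits, which mirrors the x_n + M cases mentioned above.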
The control signal shift_ctrl0 transmits the twiddle factors (the powers of ω) to
control the shift operation. When c = 1, A = 2^c is an integer and ω has integer
powers. When c = 1/2, A ≡ √2 ≡ 2^{3·2^{v−3}} − 2^{2^{v−3}} (mod M); consequently,
one subtraction, two shifts, and three modulo-M reductions are needed to multiply by A.
For the case c = 1/2, the computation of the NCT and NCT−1 needs additional
operations compared with the CT and CT−1, owing to the non-integer power of ω
and the handling of the irrational factor. Considering the NCT computation when
A = √2, the twiddle factors are obtained as ω^{−⌊n/j⌋·j + j/2}, where J = 2^{v−1−j};
the lower output is selected during the last-stage computation by substituting J = 1
and ω = 2, and the shift amounts of the last stage are computed accordingly.
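The weighted-transform trick behind the NWT can be checked numerically: weighting the inputs by powers of a 2n-th root of unity turns the sign-flipping wrap-around of a negacyclic convolution into an ordinary cyclic wrap-around. The sketch below uses complex FFTs in place of a number-theoretic transform, so it illustrates the identity only:

```python
import numpy as np

def negacyclic_via_weights(a, b):
    """Negacyclic convolution as a weighted cyclic convolution: with
    w = exp(i*pi/n) a primitive 2n-th root of unity, weighting by w^i makes
    the wrap-around terms pick up the required minus sign automatically."""
    n = len(a)
    w = np.exp(1j * np.pi * np.arange(n) / n)
    ca = np.fft.ifft(np.fft.fft(a * w) * np.fft.fft(b * w))
    return np.round((ca / w).real).astype(int)

def negacyclic_direct(a, b):
    """Schoolbook negacyclic convolution: wrapped terms change sign."""
    n = len(a)
    c = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] += a[i] * b[j] if i + j < n else -a[i] * b[j]
    return c
```

In the FMLM3 the same identity is evaluated exactly over the integers modulo Q, which is why the weighted transform can replace explicit zero-padding.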
The MAU implements the component-wise multiplication and addition of the
FMLM3; it is efficient when the operand size is no larger than a few
hundred bits. The multiply-adder unit operates in pipelined mode with three (P + 1)-bit
inputs (denoted A, B, and C) and one output computing (A × B mod M) + C.
To improve the performance of the multiplication, the Karatsuba method is used.
With d^{(i)} denoting the operand size at recursion level i,
d^{(i+1)} = (d^{(i)} + 1) / 2 (5)
MA = 4 + 4n + mul (6)
where n denotes the number of recursion levels and mul denotes the pipeline depth
of the core multiplication unit.
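The Karatsuba recursion behind Eqs. (5) and (6), three half-size products per level with the operand size roughly halving as d^(i+1) = (d^(i) + 1)/2, can be sketched in software as follows (the cutoff value is an arbitrary choice):

```python
def karatsuba(a, b, cutoff=64):
    """Recursive Karatsuba multiplication of non-negative integers: split
    each operand at h bits and replace four half-size products with three."""
    if a < (1 << cutoff) or b < (1 << cutoff):
        return a * b                      # base case: native multiply
    d = max(a.bit_length(), b.bit_length())
    h = (d + 1) // 2                      # half size, as in Eq. (5)
    a1, a0 = a >> h, a & ((1 << h) - 1)
    b1, b0 = b >> h, b & ((1 << h) - 1)
    z2 = karatsuba(a1, b1, cutoff)
    z0 = karatsuba(a0, b0, cutoff)
    z1 = karatsuba(a0 + a1, b0 + b1, cutoff) - z2 - z0
    return (z2 << (2 * h)) + (z1 << h) + z0
```

Each level trades one multiplication for a handful of additions and shifts, which is what makes the approach attractive for the mid-size operands handled by the MAU.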
The RCA, shift module, and subtractor blocks are designed to execute the time-domain
operations. The data width in the RAM block must accommodate the results
of the NWT [16], and this data size is maintained during the FFT computation; we
therefore split these large operands into two segments and process one segment per
cycle to shorten the carry chain. Accordingly, each unit is designed with three pipeline
stages.
The modular multiplication steps of the FMLM3 are computed by the FFT method
directly, with no zero-padding, so lower complexity is achieved.
A modified version of the FMLM3 is presented, which reduces the number of domain
transforms from 7 to 5. A general parameter-set selection method is proposed
for given operand sizes to support the best FFT computation. Pipelined models with
single and double butterfly structures are designed and implemented in order to investigate
the relation between the cycle requirement and the number of butterfly structures. The top-
level block of the proposed design is shown in Figs. 5 and 6.
The Virtex-6 FPGA implementation results show that the proposed FMLM3, with
both one and two butterfly structures, has better area-latency efficiency than
the state-of-the-art FFT-based Montgomery modular multiplication. Furthermore, the
processing speed of the proposed multiplier is also comparable, especially
for large transform lengths (e.g., P = 64 or higher). Figures 7 and 8 show the
RTL schematic and the simulation waveforms, respectively.
5 Conclusion
The zero-padding operation is avoided, and the transform length is reduced
considerably compared with conventional FFT-based multiplication. Moreover, we
showed that in a few special cases the number of transforms can be further
reduced from 7 to 5 without additional computational effort, so the FMLM3
can be accelerated further. A general method for efficient parameter-set
selection has been derived for a given operand size.
In addition, pipelined models with one and two butterfly structures are designed
for good area-latency efficiency. We also investigated the relation
between the number of butterfly structures and the cycle requirement. The estimation
results demonstrate that a practical physical implementation can be realized which
trades area cost for higher speed.
Fig. 8 Simulation
References
1. R.L. Rivest, A. Shamir, L. Adleman, A method for obtaining digital signatures and public-key
cryptosystems. Commun. ACM 21, 120–126 (1978)
2. R.L. Rivest, A description of a single-chip implementation of the RSA cipher. Lambda 1, 4–18
(1980)
3. E. Barker, W. Barker, W. Burr, W. Polk, M. Smid, P.D. Gallagher, Recommendation for key
management–part 1: general, vol 32 (NIST Special Publication, 2012)
4. P.L. Montgomery, Modular multiplication without trial division. Math. Comput. 44, 519–521
(1985)
5. A. Karatsuba, Y. Ofman, Multiplication of multidigit numbers on automata. Soviet Physics
Doklady 7, 595–601 (1963)
Low Complexity and High Speed Montgomery … 703
6. S.A. Cook, S.O. Aanderaa, On the minimum computation time of functions. Trans. Am. Math.
Soc. 23, 291–314 (1969)
7. A. Schönhage, V. Strassen, Schnelle Multiplikation großer Zahlen. Computing 7, 281–292
(1971)
8. M. Fürer, Faster integer multiplication. SIAM J. Comput. 39, 979–1005 (2009)
9. D. Harvey, J. Van Der Hoeven, G. Lecerf, Even faster integer multiplication. arXiv preprint
arXiv:1407.3360 (2014)
10. S. Covanov, E. Thomé, Fast arithmetic for faster integer multiplication. arXiv preprint arXiv.34
(2015)
11. A.F. Tenca, C.K. Koc, A scalable architecture for modular multiplication based on Mont-
gomery’s algorithm. IEEE Trans. Comput. 52, 1215–1221 (2003)
12. M.D. Shieh, W.C. Lin, Word-based Montgomery modular multiplication algorithm for low-
latency scalable architectures. IEEE Trans. Comput. 59, 1145–1151 (2010)
13. M. Morales-Sandoval, A. Diaz-Perez, Scalable GF(p) Montgomery multiplier based on a digit–
digit computation approach. IET Comput. Digit. Tech. (2015)
14. M. Huang, K. Gaj, T. El-Ghazawi, New hardware architectures for Montgomery modular
multiplication algorithm. IEEE Trans. Comput. 60, 923–936 (2011)
15. G.C. Chow, K. Eguro, W. Luk, P. Leong, A karatsuba based Montgomery multiplier, in
IEEE International Conference in Field Programmable Logic and Applications (FPL) (2010),
pp. 434–437
16. M.K. Jaiswal, R.C.C. Cheung, Area-efficient architectures for large integer and quadruple
precision floating point multipliers, in 2012 IEEE 20th Annual International Symposium on
Field-Programmable Custom Computing Machines (FCCM) (2012), pp. 25–28
An Efficient Group Key Establishment
for Secure Communication to Multicast
Groups for WSN-IoT Nodes
Abstract Wireless sensor networks (WSNs) are an important element of Internet
of Things (IoT) technologies. Radio, multimedia, and group communications provide
an effective means of interaction between the resource-limited nodes of IoT-WSNs,
compared with device-to-device communication. Limits on the processing capacity
and power budget of sensor nodes also make it impractical to apply the encryption
strategies developed for traditional networks. This paper introduces group key
establishment protocols for secure multicast communication between IoT resource-
constrained devices. We present a new matrix-based scheme for heterogeneous wireless
sensor networks (HWSNs). In an HWSN, the cluster heads have more energy,
communication, and processing capability than the cluster members. This heterogeneity
reduces the security overhead of the clusters, since the cluster heads can relieve the
cluster members of the expensive computations. Our system has several advantages
in energy consumption compared with other classical key management systems. The
experimental study shows that our system can maintain full network connectivity,
control configurations, explicitly set up pairwise keys between neighboring cluster
members, and minimize storage overhead.
1 Introduction
The IoT has been a powerful force in the networking technology of the next decade.
Wireless sensor networks (WSNs) form a core building block of IoT applications [1].
With Internet access, sensors, and intelligent devices increasing, security service
providers have a fair opportunity. International Business Machines (IBM) recently
revealed its IoT Solutions Practice product; within this security suite, IBM
offers different security services. Cisco has estimated that the
number of Internet-connected devices would exceed 50 billion by 2022 [2].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 705
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_64
706 T. Durgam and R. Sadiwala
Many sensors (i.e., heat, pressure, humidity, and radiation sensors), IoT devices,
power actuators, monitoring equipment, and several networking devices are among
these Internet-connected devices. Data traffic grows rapidly as the number of connected devices
increases. Contributions in the relevant work have regularly been investigated and
reviewed with respect to security based on key control, trust management, and
authentication. Sensor systems with limited battery power and computing capacity
are usually called resource-constrained devices [3, 4]. Group key establishment is a
prime requirement to ensure that message transmission in such multicast
communities is integrity-protected, authenticated, and confidential [5]. In addition, group key
establishment protocols must accommodate the device and network features of IoT-powered
WSNs, including resource limits, scalability, and the creation of dynamic groups.
Similarly, data integrity and group authentication are the minimum requirements for
multicast security. The multicast messages are encrypted with the cryptographic
traffic encryption key (TEK), defined as the group key [6, 7]. The group key
controller (GKC) produces and distributes the group key to all group participants.
In order to preserve data privacy, the group key should be updated regularly; the
GKC alters and redistributes it using a group rekeying algorithm. Any group rekeying
algorithm must provide forward secrecy, which ensures that a passive adversary who
knew a subset of old group keys cannot reveal the subsequent keys, and backward
secrecy, which ensures that an adversary who knew a subset of group keys cannot
identify the previous ones. These algorithms impose communication and computation
overhead both on the GKC and on each group member during the computation of the
new group key. Secret sharing is used in various WSN protection protocols, including
key administration and confidentiality of records. The authenticated group key transfer
protocol suggested in [8] requires an online key generation center (KGC) to develop
and distribute the group key, which increases the overhead and reduces the versatility
of the mechanisms implementing it. This work paved the way for the keying schemes
in [9, 10]. A more complex system works without a trusted KGC: the group leader is
chosen from among the group members, and all stakeholders participate in the final
key derivation. However, such a scheme does not include ubiquitous cipher suites
for globally linked IoT systems and contains pairing-based calculations.
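The secret-sharing primitive referred to above can be illustrated with a minimal Shamir (t, n) scheme over a prime field. The prime, API, and parameters are illustrative choices for the sketch, not the construction of [8-10]:

```python
import random

P = 2 ** 127 - 1          # a Mersenne prime, large enough for 128-bit keys

def split(secret, n, t, seed=None):
    """Shamir (t, n) sharing over GF(P): evaluate a random degree-(t-1)
    polynomial with f(0) = secret at the points x = 1..n."""
    rnd = random.Random(seed)
    coeffs = [secret] + [rnd.randrange(P) for _ in range(t - 1)]
    f = lambda x: sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from any t shares."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total
```

Any t shares reconstruct the group key exactly, while any t − 1 shares reveal nothing about it, which is what makes the primitive attractive for distributing a TEK across cluster heads.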
We use this scheme as our core scheme and refine its model implementation,
showing that knowledge of the sensor deployment can help boost the performance
of a pairwise key predistribution scheme. Structured modeling of an adequate group
key establishment protocol within IoT-enabled WSN implementation paradigms is
required to secure multicasting. Performance metrics such as efficiency, scalability,
and integrity are compared with related work, and the applicability of the protocol is
assessed. We justify the proposal by showing that it alleviates the current security
limitations of other traditional systems while offering improved performance for
secure communication.
The main contributions and organization of this paper are summarized as follows:
In Sect. 2, we review the literature on IoT-based WSN systems. Section 3 presents
the proposed work. Section 4 presents the results and discussion. Finally, Sect. 5
concludes the paper.
2 Related Work
this issue under the assumption that the base station is not the network core and that
nodes cannot always be reached. In [15], the authors provide methods to guarantee
users' security before connections to the sensor network facilities and data are
obtained. They attempt to speed up the computation of scalar multiplications
through a concurrent strategy that distributes the calculation into individual tasks
processed in parallel by multiple nodes. The aim is a stable Crypto-ECC multicast
routing protocol that respects the limitations of the WSN. Finally, the suggested
approach is tested on TelosB sensors.
3 Materials and Methods
Fig. 1 Use case of smart industry for multicast group key management
710 T. Durgam and R. Sadiwala
The rekeying procedure usually involves decrypting secure data with the old key and
re-encrypting it with the new one, an expensive method for data center maintenance,
as shown in Fig. 3. We regenerate the master key and the key shares whenever anyone
joins or leaves the multicast network, when security needs require adjusting the number
of shares or the share threshold, or when the master key must be exchanged regularly
to comply with enforcement mandates. Besides replacing the master key, we can
also rotate the underlying encryption key the vault uses to encrypt data at rest. In
the vault, rekeying and rotation are two distinct processes. “Rekeying” is the process of
producing a new master key and applying Shamir's algorithm. “Rotation” is
the process of creating a fresh vault encryption key to encrypt data at rest.
Both vault rekeying and vault key rotation are fully online operations, as
Fig. 2(b) shows. During each of these procedures, the vault can serve applications
without interruption.
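The distinction can be sketched with a toy envelope-encryption model: per-record data keys are wrapped by one root key, so rotating the root re-wraps only the small data keys and never touches the bulk ciphertexts. The XOR keystream below stands in for a real cipher (a production vault would use something like AES-GCM), so everything here is illustrative:

```python
import hashlib
import secrets

def keystream_xor(key, data):
    """Toy XOR 'cipher' built from SHA-256 blocks (illustration only)."""
    out, ctr = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, out))

class Vault:
    """Envelope encryption: each record gets its own data key, and the
    data keys are wrapped (encrypted) under a single root key."""
    def __init__(self):
        self.root = secrets.token_bytes(32)
        self.records = {}                 # name -> (wrapped_key, ciphertext)

    def put(self, name, plaintext):
        dk = secrets.token_bytes(32)      # fresh per-record data key
        self.records[name] = (keystream_xor(self.root, dk),
                              keystream_xor(dk, plaintext))

    def get(self, name):
        wrapped, ct = self.records[name]
        dk = keystream_xor(self.root, wrapped)
        return keystream_xor(dk, ct)

    def rotate(self):
        """Rotation: replace the root key and re-wrap only the data keys;
        the bulk ciphertexts stay untouched, so the operation stays online."""
        new_root = secrets.token_bytes(32)
        for name, (wrapped, ct) in self.records.items():
            dk = keystream_xor(self.root, wrapped)
            self.records[name] = (keystream_xor(new_root, dk), ct)
        self.root = new_root
```

Rekeying in the Shamir sense would additionally regenerate the master key's shares, but the data path shown here stays the same.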
(Plot: energy cost for predistribution (J), 0–0.7 J, versus size of the network n, 0–1000, for the proposed and the traditional protocol.)
Fig. 4 Energy costs for different sizes of the network protocols used
as a result of the group key management communicating directly with the central
gateway without wasting the many resources of the group members.
5 Conclusion
IoT is spreading quickly across the Internet, which requires secure connectivity. In
reality, an appropriate security algorithm decides the privacy of both documented and
unrecorded IoT data sources. While all protocols were compared directly on
scalability and security characteristics, the proposed system often surpasses
conventional schemes in energy consumption. The traditional scheme is more suitable
for dispersed IoT implementations, where group members have to contribute significantly
and greater randomness is needed for key computation. Since more of the energy cost
falls on the responder side, the proposed approach is best suited to centralized IoT
implementations, where the major source of cryptography is an object with a
low-energy profile on the edge nodes.
714 T. Durgam and R. Sadiwala
References
1. L. Harn, C.-F. Hsu, Z. Xia, Z. He, Lightweight aggregated data encryption for wireless sensor
networks (WSNs). IEEE Sens. Lett. 5, 1–4 (2021). https://doi.org/10.1109/LSENS.2021.306
3326
2. K. Taeeun, The internet-connected device vulnerability information management system in
IoT environment. Int. J. Internet of Things Big Data 4, 17–22 (2019). https://doi.org/10.21742/
IJITBD.2019.4.1.03
3. G. Chakma, N. Skuda, C. Schuman, J. Plank, M. Dean, G. Rose, Energy and area efficiency
in neuromorphic computing for resource constrained devices, in Proceedings of the 2018 on
Great Lakes Symposium on VLSI (2018), pp. 379–383. https://doi.org/10.1145/3194554.319
4611
4. V.G. Kiran, C. Rai, FPGA implementation of simple encryption scheme for resource-
constrained devices. Int. J. Adv. Trends Comp. Sci. Eng. 9, 5631–5639 (2020). https://doi.
org/10.30534/ijatcse/2020/213942020
5. J. Carracedo, A. Corona, Cryptanalysis of a group key establishment protocol. Symmetry
13(332) (2021). https://doi.org/10.3390/sym13020332
6. M. Kumar, A. Kishor, Network traffic encryption by IPSec. Int. J. Comput. Sci. Eng. 7, 912–915
(2019). https://doi.org/10.26438/ijcse/v7i5.912915
7. O. Ahmedova, U. Mardiyev, O. Tursunov, Generation and distribution secret encryption keys
with parameter, in 2020 International Conference on Information Science and Communications
Technologies (ICISCT), vol. 1 (2020), pp. 1–4. https://doi.org/10.1109/ICISCT50599.2020.935
1446
8. P. Jaiswal, S. Tripathi, An authenticated group key transfer protocol using elliptic curve
cryptography. Peer-to-Peer Netw. Appl. 10 (2017). https://doi.org/10.1007/s12083-016-0434-7
9. C.-Y. Lee, Z.-H. Wang, L. Harn, C.-H. Chang, Secure key transfer protocol based on secret
sharing for group communications. IEICE Trans. 94-D, 2069–2076 (2011). https://doi.org/10.
1587/transinf.E94.D.2069
10. Y. Sun, Q. Wen, H. Sun, W. Li, Z. Jin, Z. Huan, An authenticated group key transfer protocol
based on secret sharing. Proc. Eng. 29, 403–408 (2012). https://doi.org/10.1016/j.proeng.2011.
12.731
11. P. Porambage, A. Braeken, C. Schmitt, A. Gurtov, M. Ylianttila, B. Stiller, Group key estab-
lishment for secure multicasting in IoT enabled wireless sensor networks (2015). https://doi.
org/10.1109/LCN.2015.7366358
12. I. Chatzigiannakis, E. Konstantinou, V. Liagkou, P. Spirakis, Design, analysis and performance
evaluation of group key establishment in wireless sensor networks. Electron. Notes Theor.
Comput. Sci. 171, 17–31 (2007). https://doi.org/10.1016/j.entcs.2006.11.007
13. M. Carlier, A. Braeken, Symmetric-key-based security for multicast communication in wireless
sensor networks. Computers 8, 27 (2019). https://doi.org/10.3390/computers8010027
14. A. Bomgni, E. Fute, G. Brel, G. Mdemaya, A. Anastasie, K. Donfack, C.T. Djamegni, A. Leo,
Energy efficient and secured geocast protocol in wireless sensor network deployed in space
(3D). Int. J. Wirel. Mobile Netw. 10(11) (2018). https://doi.org/10.5121/ijwmn.2018.10202
15. W. Jerbi, A. Guermazi, H. Trabelsi, Crypto-ECC: a rapid secure protocol for large-scale wireless
sensor networks deployed in Internet of Things (2020). https://doi.org/10.1007/978-3-030-
48256-5_29
Design of Sub-volt High Impedance Wide
Bandwidth Current Mirror for High
Performance Analog Circuit
Abstract The demand for high-performance, long-battery-life portable wearable devices has forced the electronics industry to come up with new methods of circuit realization so as to achieve better performance at sub-volt supply. In this paper, the performance of a widely used analog block, the current mirror, is enhanced. The proposed current mirror is a flipped voltage follower-based structure, whose performance enhancement is achieved in terms of output resistance. To boost the resistance, the output section of the current mirror uses a regulated cascode stage, which increases the output resistance from 880 kΩ to 32 MΩ. The regulated cascode uses a feedback concept that not only provides the resistance-boosting factor but also reduces capacitance, which improves the bandwidth; for the proposed current mirror the observed bandwidth is 2.2 GHz. The complete analysis is done using MOSFET models of 180 nm technology at a dual supply voltage of ±0.5 V.
1 Introduction
The performance of any system is decided by the circuit configurations used for its implementation. For analog systems, one such fundamental and extensively used block is the current mirror. Common applications of current mirrors include biasing of amplifiers, active loads, current amplification, filtering, level shifting, etc. [1]. The role of a current mirror is to generate an output current as a function of the input current. The ideal characteristics of a current mirror which determine its performance include wide dynamic range and bandwidth, low input resistance, and high output resistance. Apart from this, the operating voltage is also an important parameter, as it decides the amount of power consumption. However, at low supply, the fulfillment
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 715
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_65
716 P. A. Kumar et al.
of these ideal requirements becomes difficult. The main obstacle in the design of low-power current mirrors is the threshold voltage of the MOS transistor, the minimum voltage required to turn on the MOSFET. A number of techniques have been adopted in current mirror realization to achieve the desired performance [2], among which a few recently reported current mirror architectures can be found in [3–10]. In this paper, the proposed current mirror is based on the low-voltage flipped voltage follower (FVF) block [11] and a regulated cascode structure. The FVF is a modified form of the conventional source follower configuration which operates at low supply voltage. FVF-based current mirrors reported in the literature can be found in [12–18]. The current mirror proposed in this paper has a wide dynamic range, gigahertz-range bandwidth, and high output impedance compared to its conventional design. The paper is organized as follows: the proposed current mirror is analyzed and discussed in Sect. 2, which also carries the mathematical analysis; the simulation results are shown in Sect. 3, followed by the conclusion in Sect. 4.
The conventional current mirror based on flipped voltage follower is shown in Fig. 1a.
It includes four N-type MOS transistors (M1–M4). As the drain current of M3 is constant due to current source I_B1, any change in the input current is sensed by M1, which accordingly produces a suitable change in its gate-to-source voltage (V_gs,M1) that modulates the output current (I_out). V_bias is the DC voltage applied to maintain M3 and M4 in saturation.
Fig. 1 a Conventional FVF current mirror; b proposed FVF current mirror
Performing routine small-signal analysis gives the input and output resistances of the FVF current mirror as (1/g_m1) and (g_m4 r_04 r_02), respectively, where g_mi and r_0i denote the transconductance and output resistance of the related transistor. The observed output resistance of (g_m4 r_04 r_02) ranges in kilo-ohms, which is not sufficient for precision applications. In the proposed design, the output section is modified by a cascode approach, i.e., a regulated cascode, as shown in Fig. 1b. The output section is first modified using transistor M5. The voltage headroom of M2 is regulated via transistors M4 and M5. The feedback loop amplifier implemented using M5 and I_B1 prevents variations in the drain-to-source voltage of M2, ensuring better stability. The working is similar to that of a cascode; however, this configuration yields an additional multiplying factor of (g_m r_0) in the output resistance [19]. Moreover, C_gd,M4 does not appear in the input current path as it does in the traditional cascode current mirror. The reduced capacitance improves the bandwidth of the circuit. Compared to the conventional FVF current mirror, the effective output resistance is boosted by (g_m r_0), which turns out to be in the range of mega-ohms.
During the analysis, the symbols used have their usual meaning and match the standard SPICE model parameters of MOS transistors. All MOS transistors are assumed to operate in the saturation region.
At node one,

i_in = −g_m3 V_2 + (V_1 − V_2)/r_03    (1)

At node two,

g_m1 V_1 + V_2/r_01 + g_m3 V_2 + (V_2 − V_1)/r_03 = 0    (2)

Since g_m r_0 ≫ 1,

V_2 ≈ −(g_m1/g_m3) V_1    (3)

R_in,prop. = V_1/i_in ≈ 1/g_m1    (4)
From (4), it can be observed that, compared to the conventional current mirror, there is almost no change in the input resistance, as no changes were made to the input section.
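As a sanity check, Eqs. (1)–(4) can be verified symbolically. The following sketch is illustrative and not part of the paper; it solves the two node equations and confirms that the exact input resistance collapses to 1/g_m1 when g_m r_0 ≫ 1:

```python
import sympy as sp

V1, V2 = sp.symbols('V_1 V_2')
gm1, gm3, r01, r03 = sp.symbols('g_m1 g_m3 r_01 r_03', positive=True)

# KCL at node two, Eq. (2): g_m1 V_1 + V_2/r_01 + g_m3 V_2 + (V_2 - V_1)/r_03 = 0
v2 = sp.solve(sp.Eq(gm1*V1 + V2/r01 + gm3*V2 + (V2 - V1)/r03, 0), V2)[0]

# KCL at node one, Eq. (1): i_in = -g_m3 V_2 + (V_1 - V_2)/r_03
i_in = -gm3*v2 + (V1 - v2)/r03

R_in = sp.simplify(V1 / i_in)
# Letting r_01, r_03 -> infinity models g_m r_0 >> 1; the result is 1/g_m1
approx = sp.limit(sp.limit(R_in, r01, sp.oo), r03, sp.oo)
print(approx)
```

The double limit reproduces the approximation of Eq. (4) exactly.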
i_out = g_m4 V_53 + (V_4 − V_3)/r_04    (5)

At node three,

V_3 = i_out r_02    (6)

At node five,

V_5 = −g_m5 r_05 V_3    (7)
R_out = V_4/i_out = r_04 + g_m4 g_m5 r_05 r_02 r_04 + r_02    (8)

Since g_m r_0 ≫ 1,

R_out ≈ g_m4 g_m5 r_05 r_02 r_04
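To make the boost in (8) concrete, a quick numeric sketch compares the conventional term g_m4 r_04 r_02 against Eq. (8). The device values below are assumed for illustration only; they are not the paper's extracted parameters:

```python
# Illustrative check of the (g_m r_0) output-resistance boost in Eq. (8).
# All parameter values are assumed, not taken from the paper.
gm = 2e-3          # transconductance of M4/M5 (S), assumed
r0 = 1e5           # output resistance r_02 = r_04 = r_05 (ohm), assumed

r_out_cascode = gm * r0 * r0                   # conventional FVF mirror: g_m4 r_04 r_02
r_out_regulated = r0 + gm * gm * r0**3 + r0    # Eq. (8): r_04 + g_m4 g_m5 r_05 r_02 r_04 + r_02

print(f"conventional ~ {r_out_cascode:.3g} ohm")
print(f"regulated    ~ {r_out_regulated:.3g} ohm")
print(f"boost factor ~ {r_out_regulated / r_out_cascode:.0f}  (about g_m r_0 = {gm * r0:.0f})")
```

For these assumed values the regulated cascode multiplies the output resistance by roughly g_m r_0, consistent with the text.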
3 Simulation Results
The current mirror circuits shown in Fig. 1a, b are simulated using MOS models of 180 nm technology at a ±0.5 V supply. The simulation results match the mathematical analysis. The MOS widths and lengths, along with the other parameters assumed for the circuit simulations, are listed in Table 1. The input bias current is set to 65 uA, ensuring a low offset in the circuit. The output characteristic is shown in Fig. 4, where the input current is swept from 0 to 200 uA in steps of 50 uA.
Fig. 4 Output
characteristics
As seen, the proposed current mirror operates with minimal error. The frequency response, input resistance, and output resistance plots are shown in Figs. 5, 6, and 7, respectively. As seen in Fig. 5, the bandwidth is extended due to the reduced capacitance: for the proposed current mirror it is 2.2 GHz, whereas for the conventional one it is around 1.5 GHz. Also, the input resistance remains the same in both current mirrors, at 820 Ω, as seen in Fig. 6. However, a drastic improvement in output resistance can be seen in Fig. 7: for the proposed design it is found to be 32 MΩ, while for the conventional one it is 880 kΩ.
Table 2 Comparison of parameters of proposed current mirror with FVF current mirrors
Parameters [14] [15] [16] [17] [18] Conv. CM Prop. CM
Input current range (uA) 300 300 100 1000 0–500 0–200 0–200
Input resistance (ohm) 13.3 12.8 496 68.3 17 820 820
Output resistance (ohm) 34.3G 39.5G 1M 10.5G 750 K 880 K 32 M
Bandwidth (Hz) 210 M 216 M 181 M 402 M 4.5G 1.5G 2.2G
Supply (V) 1 1 0.9 1 ±0.5 ±0.5 ±0.5
Power (uW) 42.5 42.5 150 110 140 110 130
Technology (nm) 180 180 180 180 180 180 180
4 Conclusion
A low-voltage current mirror with gigahertz-range bandwidth and high output resistance has been presented in this paper. The output resistance achieved in the mega-ohm range with the help of the cascode approach suits its applicability in precision amplifiers. Also, the gigahertz-range bandwidth can find a number of applications in high-speed circuits. The complete design has been implemented in 180 nm technology at a dual supply of ±0.5 V. The micro-watt power dissipation encourages its application in low-power electronic devices.
References
1. M. Akbari, A. Javid, O. Hashemipour, A high input dynamic range, low voltage cascode current
mirror and enhanced phase-margin folded cascode amplifier. Iranian Conf. Electr. Eng. 77–81
(2014)
2. F. Khateb, S. Bay, A. Dabbous, S. Vlassis, A survey of non-conventional techniques for low-
voltage low-power analog circuit design. Radioengineering 415–427 (2013)
3. X. Zhang, E. El-Masry, A regulated body-driven CMOS current mirror for low-voltage applications. IEEE Trans. Circ. Syst. II: Express Briefs 571–577 (2004)
4. P.S. Manhas, S. Sharma, K. Pal, L.K. Mangotra, K.K.S. Jamwal, High performance FGMOS-
based low voltage current mirror. Indian J. Pure Appl. Phys. 355–358 (2008)
5. F. Esparza-Alfaro, A.J. Lopez-Martin, J. Ramírez-Anguloa, R.G. Carvajal, Low-voltage highly-
linear class AB current mirror with dynamic cascode biasing. Electron. Lett. 1336–1338 (2012)
6. N. Raj, A.K. Singh, A.K. Gupta, Low power high output impedance high bandwidth QFGMOS
current mirror. Microelectron. J. 1132–1142 (2014)
7. F. Esparza-Alfaro, A.J. Lopez-Martin, R.G. Carvajal, J. Ramirez-Angulo, Highly linear
micropower class AB current mirrors using Quasi-Floating Gate transistors. Microelectron.
J. 1261–1267 (2014)
8. N. Raj, A.K. Singh, A.K. Gupta, Low-voltage bulk-driven self-biased cascode current mirror
with bandwidth enhancement. Electron. Lett. 23–25 (2014)
9. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high output impedance bulk-driven quasi-floating
gate self-biased high-swing cascode current mirror. Circ. Syst. Signal Process. 2683–2703
(2016)
10. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high performance bulk-driven quasi-floating gate
self-biased cascode current mirror. Microelectron. J. 124–133 (2016)
11. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high bandwidth self-biased high swing cascode
current mirror. Indian J. Pure Appl. Phys. 1–7 (2017)
12. R.G. Carvajal, J. Ramirez-Angulo, A.J. Lopez-Martin, A. Torralba, J.A.G. Galan, A. Carlosena,
F.M. Chavero, The flipped voltage follower: a useful cell for low-voltage low-power circuit
design. IEEE Trans. Circ. Syst. I: Regular Papers 1276–1291 (2005)
13. V.I. Prodanov, M.M. Green, CMOS current mirrors with reduced input and output voltage
requirements. Electron. Lett. 104–105 (1996)
14. S. Azhari, H.F. Baghtash, K. Monfaredi, A novel ultra-high compliance, high output impedance
low power very accurate high performance current mirror. Microelectron. J. 432–439 (2011)
15. Y. Bastan, E. Hamzehil, P. Amiri, Output impedance improvement of a low voltage low power
current mirror based on body driven technique. Microelectron. J. 163–170 (2016)
16. L. Safari, S. Minaei, A low-voltage low-power resistor-based current mirror and its applications.
J. Circ. Syst. Comput. 175–180 (2017)
17. M.S. Doreyatim, M. Akbari, M. Nazari, A low-voltage gain boosting-based current mirror with
high input/output dynamic range. Microelectron. J. 88–95 (2019)
18. N. Raj, Low voltage FVF current mirror with high bandwidth and low input impedance. Iranian
J. Electr. Electron. Eng. 1–7 (2021)
19. A. Torralba, R.G. Carvajal, J. Ramirez-Angulo, E. Munoz, Output stage for low supply voltage
high-performance CMOS current mirrors. Electron. Lett. 1528–1529 (2002)
Low-Voltage Low-Power Design
of Operational Transconductance
Amplifier
1 Introduction
The rapid increase in demand for efficient portable equipment for biomedical applications has pushed industry to design low-voltage, low-power analog and mixed-signal ICs for long-term use. The general trend followed in SoC design is technology downscaling, which is easily implemented in digital circuits but not in analog circuits. The common approach followed is a scaled supply voltage [1]. However, the threshold voltage of the MOS transistor creates an obstacle to lowering the supply voltage beyond a certain limit: using the gate-driven (GD) technique, the supply cannot be lowered below the threshold voltage of the MOSFET. In view of this, various low-power (LP) techniques have been proposed in the literature. A few commonly known ones are bulk-driven (BD), level shifter, floating gate (FG), and quasi-floating gate (QFG) [2–5]. Among the stated techniques, BD has attracted considerable interest for low-power design due to its simple architecture. In BD MOS transistors, the gate terminal is fixed at a voltage so as to create a channel and turn the MOS transistor on, whereas the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 725
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_66
726 R. Durgam et al.
input signal is applied at the bulk, which controls the drain current I_DS. The issue with BD is low transconductance and poor frequency response; the decreased transconductance is visible as poor gain and bandwidth. The objective of this paper is to exploit the advantage of the BDQFG technique over BD, which results in enhanced transconductance and, hence, improves the unity-gain bandwidth (UGB). The effects of using the LP techniques QFG, BD, and BDQFG on the performance of a CM-based operational transconductance amplifier (OTA) have been analyzed. The paper is organized as follows: Sect. 2 covers a brief of bulk-driven and bulk-driven quasi-floating-gate MOSFETs. Section 3 details the proposed OTA design realized using BDQFG as well as GD, BD, and QFG MOSFETs. Simulation results and conclusions are given in Sects. 4 and 5, respectively.
2 Low-Power Technique
The MOS transistor is a four-terminal device whose fourth terminal is the bulk. Using the bulk terminal as the signal input, the threshold voltage limitation can be removed. Based on this, the BD technique was first reported in [6].
A few recent articles based on BD for realizing LP circuits can be found in [7–10]. Recently, a new approach named BDQFG has been proposed in the literature, which uses a QFG MOS transistor in BD mode. This approach results in high transconductance and, hence, an improvement in bandwidth over BD- and QFG-based circuits [11–19]. The schematics of the BD and BDQFG MOS transistors are shown in Fig. 1a, b, respectively. In Fig. 1b, the bulk is tied to the input of the QFG MOS transistor MN. Under DC analysis it works as a standard BD transistor, whereas under AC it combines the features of BD and QFG.
Fig. 1 N-channel: a
Bulk-driven (BD) and b
Bulk-driven QFG MOS
transistor
3 Proposed OTA
In this section, the standard OTA chosen is the current mirror-based OTA [20, 21]. The current mirror OTA uses three simple two-transistor CM topologies. The schematic of the OTA based on the standard GD approach is shown in Fig. 2. The N-type CM (M7, M8) and the P-type CMs (M3, M5) and (M4, M6) are the three basic CMs used to build the OTA. The N-type CM (M9, M10) acts as a tail current source biased using a constant current source (I_bias). The generalized schematic of the OTA design based on a low-power technique is shown in Fig. 3, where the input signal processing block is represented by an LP technique block with three terminals A, B, and C. This LP technique block can be realized with the QFG, BD, or BDQFG technique.
The design equations governing the OTA performance parameters of Fig. 3 are:
(i) Transconductance
Fig. 3 N-channel LP technique block: (i) QFG, (ii) BD, and (iii) BDQFG
G_m = √(μ_n C_ox (W/L)_2 I_10)    (1)

(ii) Output resistance

R_out = 1/(g_ds6 + g_ds8)    (2)

(iii) DC gain

A_V = G_m R_out = √(μ_n C_ox (W/L)_2 I_10) · 1/(g_ds6 + g_ds8)    (3)
From Eq. (3), the DC gain is a function of the input gate transconductance of M2 (g_m2) and the effective output resistance (R_out) of the amplifier. Using GD, the maximum transconductance is achieved, whereas use of the mentioned LP techniques results in a lower transconductance of M2. As is well known, QFG uses a capacitive divider network which attenuates the effective gate voltage of M2 and thus reduces the transconductance, whereas in the case of BD, the transconductance of M2 is the body transconductance (g_mb2), which is usually 0.2–0.4 times the gate transconductance. For BDQFG, the effective transconductance of M2 is the combined effect of the QFG gate transconductance (g_m2,QFG) and the body transconductance (g_mb2), i.e., g_m2,BDQFG = g_m2,QFG + g_mb2.
The BDQFG technique offers high transconductance, low output impedance, and high DC gain. The effect of the output impedance is visible in the dominant pole location, which is inversely proportional to the output impedance. In summary, the OTA unity-gain bandwidth (UGB) is maximized for BDQFG at a low power consumption level. The only drawback associated with the BDQFG technique is its sensitivity to nonlinear effects, which cause DC offsets. These offsets can be removed by proper matching of the input MOS transistors and gate input capacitors.
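The relative input transconductances described above can be summarized in a small sketch. The attenuation factor and body-to-gate transconductance ratio are assumed within the ranges quoted in the text, not measured values:

```python
# Illustrative comparison of the effective input transconductance for the
# four input stages; values are assumed, not the paper's simulation results.
gm = 1.0                  # GD gate transconductance (normalized), assumed
k_qfg = 0.85              # QFG capacitive attenuation, assumed in 0.8-0.9
k_body = 0.3              # body/gate transconductance ratio, assumed 0.2-0.4

gm_gd = gm
gm_qfg = k_qfg * gm
gm_bd = k_body * gm
gm_bdqfg = gm_qfg + k_body * gm    # g_m2,BDQFG = g_m2,QFG + g_mb2

for name, g in [("GD", gm_gd), ("QFG", gm_qfg), ("BD", gm_bd), ("BDQFG", gm_bdqfg)]:
    print(f"{name:6s} Gm = {g:.2f} x GD")
```

With these assumed factors, BDQFG is the only technique whose effective transconductance exceeds that of GD, consistent with the discussion above.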
4 Simulation Results
The CM OTA of Fig. 2 has been simulated in 0.18 micron technology at a dual supply of ±0.5 V. The performance of the OTA has been evaluated with the QFG, BD, and BDQFG techniques and compared to the GD approach. The dimensions of the MOS transistors and the other parameters assumed for the simulation of the OTA are listed in Table 1.
The effective transconductance of the respective OTAs is shown in Fig. 4. From the plots, it is seen that the decreased gate voltage of QFG results in a lower transconductance compared to that of the standard GD configuration, i.e., G_m,QFG ≈ (0.8–0.9)G_m, whereas for BD it is very low. However, in the case of BDQFG, the highest transconductance can be observed, which is reflected in improved gain compared to BD and QFG, as shown in Fig. 5. The decreased output impedance in BDQFG results in a better −3 dB frequency compared to the other techniques. The overall effect of high gain and decreased output impedance for BDQFG is observed in the UGB product, which is found to be equivalent to that of the conventional GD-based CM OTA but with the advantage of low power consumption. The transient analysis results are shown in Fig. 6. From the simulations, it can be concluded that BDQFG is the best among these techniques for low power consumption. The comparative performance of all the above-mentioned CM OTAs, as obtained by simulations, is summarized in Table 2.
5 Conclusion
References
1. B.J. Blalock, P.E. Allen, G.A. Rincon-Mora, Designing 1-V op amps using standard digital
CMOS technology. IEEE Trans. Circ. Syst. II Analog Digital Sig. Process. 769–780 (1998)
2. M. Gupta, R. Pandey, Low-voltage FGMOS based analog building blocks. Microelectron. J.
903–912 (2011)
3. J.M.A. Miguel, A.J. Lopez-Martin, L. Acosta, J. Ramirez-Angulo, R.G. Carvajal, Using floating
gate and quasi-floating gate techniques for rail-to-rail tunable CMOS transconductor design.
IEEE Trans. Circ. Syst. I Regul. Pap. 1604–1614 (2011)
4. C. Garcia-Alberdi, A. Lopez-Martin, L. Acosta, R.G. Carvajal, J. Ramirez-Angulo, Tunable
class AB CMOS Gm-C filter based on quasi-floating gate techniques. IEEE Trans. Circ. Syst.
I Regul. Pap. 1300–1309 (2013)
5. N. Raj, A.K. Singh, A.K. Gupta, Low power high output impedance high bandwidth QFGMOS
current mirror. Microelectron. J. 1132–1142 (2014)
6. A. Guzinski, M. Bialko, J.C. Matheau, Body driven differential amplifier for application in
continuous-time active C-filter. Proc. ECCD, Paris France 315–319 (1987)
7. N. Raj, R.K. Sharma, Modeling of human voice box in VLSI for low power biomedical
applications. IETE J. Res. 345–353 (2011)
8. H. Khameh, H. Shamsi, On the design of a low-voltage two stage OTA using bulk-driven and
positive feedback techniques. Int. J. Electron. 1309–1315 (2012)
9. L. Zuo, S.K. Islam, Low-voltage bulk-driven operational amplifier with improved transcon-
ductance. IEEE Trans. Circ. Syst. I Regul. Pap. 2084–2091 (2013)
10. J. Gak, M.R. Miguez, A. Arnaud, Nanopower OTAs with improved linearity and low input
offset using bulk degeneration. IEEE Trans. Circ. Syst. I Regul. Pap. 1–10 (2013)
11. F. Khateb, Bulk-driven floating-gate and bulk-driven quasi-floating-gate techniques for low-
voltage low-power analog circuits design. AEU-Int. J. Electron. Commun. 64–72 (2013)
12. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high output impedance bulk-driven quasi-floating
gate self-biased high-swing cascode current mirror. Circ. Syst. Sig. Process. 2683–2703 (2015)
13. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high performance bulk-driven quasi-floating gate
self-biased cascode current mirror. Microelectron. J. 52, 124–133 (2016)
14. N. Raj, A.K. Singh, A.K. Gupta, Low voltage high bandwidth self-biased high swing cascode
current mirror. Indian J. Pure Appl. Phys. 55, 245–253 (2017)
15. N. Raj, Low voltage FVF current mirror with high bandwidth and low input impedance. Iranian
J. Electr. Electron. Eng. 1–7 (2021)
16. F. Khateb, W. Jaikl, M. Kumngern, P. Prommee, Comparative study of sub-volt differential
difference current conveyors. Microelectron. J. 1278–1284 (2013)
17. N. Raj, A.K. Singh, A.K. Gupta, Low-voltage bulk-driven self-biased cascode current mirror
with bandwidth enhancement. Electron. Lett. 23–25 (2014)
18. N. Raj, A.K. Singh, A.K. Gupta, High performance current mirrors using quasi-floating bulk.
Microelectron. J. 11–22 (2016)
19. F. Khateb, The experimental results of the bulk-driven quasi-floating-gate MOS transistor. AEU
Int. J. Electron. Commun. 462–466 (2015)
20. P.E. Allen, D.R. Holberg, CMOS Analog Circuit Design (Oxford University Press, USA, 2002)
21. S. Ali, A power efficient gain enhancing technique for current mirror operational transconduc-
tance amplifiers. Microelectron. J. 183–190 (2015)
Automatic Detection of Cerebral
Microbleed Using Deep Bounding Box
Based Watershed Segmentation
from Magnetic Resonance Images
1 Introduction
T. G. Berin (B)
Ponjesly College of Engineering, Nagercoil, India
C. H. Sulochana
Department of ECE, St.Xavier’s Catholic College of Engineering, Nagercoil, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 733
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_67
734 T. G. Berin and C. H. Sulochana
disease, dementia, or simply from normal aging. MRI sequences sensitive to hemosiderin deposition, such as T2*-gradient recalled echo (GRE), will reveal the microbleeds [1, 2]. Modern imaging protocols such as susceptibility-weighted imaging (SWI), which are regularly run at high resolution (≤1 mm³) with long echo time and use the phase image to enhance contrast, are much more sensitive in identifying small bleeds than established protocols [3]. Recent publications have shown that when SWI is compared with standard gradient-echo imaging, there is a three- to six-fold increase in the number of CMBs seen [4]. The examination of microbleeds is strenuous and time-consuming, as the radiologist or expert needs to inspect the scan slice by slice while distinguishing the black dots from mimics. CMBs are prevalent in patients with cerebrovascular and cognitive diseases (such as stroke and dementia), and are also present in healthy aging individuals. Apart from indicating these vascular diseases, CMBs can also structurally affect nearby brain tissue and further cause neurologic dysfunction, cognitive impairment, and dementia [5]. The observer variability in the detection of CMBs is large [6]. Additionally, manual detection of CMBs is a time-consuming task, which can take more than two hours per traumatic brain injury (TBI) patient. A computer-aided detection (CAD) system can alleviate these drawbacks. Several CAD systems have been developed for the detection of CMBs in other patient populations. The existence of CMBs and their distribution patterns have been recognized as important diagnostic biomarkers of cerebrovascular diseases; for example, the lobar distribution of CMBs suggests probable cerebral amyloid angiopathy [7].
Image segmentation is very important and can be classified as one of the most difficult functions in image processing. Segmentation is defined as the grouping of data which share the same characteristics, such as color intensities [8]. Generally, the watershed transformation is applied to the image gradient and presents the segmentation result as watershed lines which separate the regions. This image-gradient method usually produces results with noise and poor-quality or over-segmented output [9]. To reduce the effect of over-segmentation, numerous approaches have been proposed: for example, the marker-based watershed technique [10], the scale-space method [11], the region merging method [12], partial differential equation methods for image enhancement [13], and the combined technique of wavelet and watershed transformations [14]. In watershed segmentation, the separation of the image basically depends on the image gradient. Theoretically, the image gradient corresponds to the homogeneous gray level of the image. Low-contrast images generate small gradient areas, causing distinct regions to be erroneously merged [15, 16]. Sliding-window processing has been employed, producing good generalization results [17]. To speed up network convergence in segmentation, a multimodal fusion technique was implemented [18]; a deep learning method [19] and an SVM sub-classifier [20] have also been reported. This paper discusses the segmentation of images by a deep learning bounding box and the watershed transform, in which an image enhancement technique is used to prevent over-segmentation and at the same time reduce noise.
2 Proposed Methodology
Fig. 1 Overview of
proposed framework
by an anisotropic filter, and thus the image is sharpened and smoothed. Subsequently, in the segmentation, the bounding box and watershed transformation are applied.
The first step of preprocessing is to convert the RGB image into a grayscale image; the basic purpose of the color conversion is to reduce the number of colors. High-frequency noise is present in magnetic resonance images and is usually removed by a filtering process. The anisotropic diffusion filter (ADF) was proposed to adaptively remove the noise from the CMB image while maintaining the image edges. After the image is converted to grayscale, it is given as input to the anisotropic filter. An anisotropic type of filter forms the basis of most sharpening methods: the image is sharpened when contrast is enhanced between adjoining areas with little variation in brightness or darkness. The anisotropic filter attenuates low-frequency content, which helps to keep the high-frequency information in the image. The anisotropic filter is used to increase the brightness of the center pixel of the kernel; a single positive value is found at the center of the kernel array, surrounded by negative values. The anisotropic diffusion filter is used for smoothing the magnetic resonance images. Figure 2 compares the original image with the filtered image; ADF preserves the sharpness of edges better than Gaussian blurring.
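The paper does not give its exact ADF formulation; anisotropic diffusion is commonly realized as the Perona–Malik scheme, which can be sketched as follows under that assumption (the parameters `kappa`, `lam`, and `n_iter` are illustrative):

```python
import numpy as np

def anisotropic_diffusion(img, n_iter=10, kappa=30.0, lam=0.2):
    """Perona-Malik anisotropic diffusion sketch: smooths noise in flat
    regions while preserving edges. A generic illustration, not the
    paper's exact ADF settings."""
    img = img.astype(float).copy()
    for _ in range(n_iter):
        # Finite-difference gradients toward the four neighbours
        dn = np.roll(img, -1, axis=0) - img
        ds = np.roll(img, 1, axis=0) - img
        de = np.roll(img, -1, axis=1) - img
        dw = np.roll(img, 1, axis=1) - img
        # Edge-stopping conductance: close to zero across strong edges,
        # so diffusion smooths noise but leaves boundaries intact
        c = lambda d: np.exp(-(d / kappa) ** 2)
        img += lam * (c(dn) * dn + c(ds) * ds + c(de) * de + c(dw) * dw)
    return img
```

On a noisy step image, this reduces the noise variance in flat regions while keeping the step contrast, which is the edge-preserving behaviour contrasted with Gaussian blurring above.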
is not necessary. This method is entirely different from other edge-based segmentation methods because the boundaries of a CMB image will be connected and closed. The boundaries of the regions thus obtained belong to the contour of the microbleed image. The segmentation efficiency of the above algorithm increases if the foreground objects and background regions are verified and marked separately; this concept is referred to as marker-controlled watershed segmentation (MCWS). Once the bounding box segmentation is over, the region merging process is started. Different regions of the image are merged to form a single region under some similarity criterion. Bounding box segmentation is a fast and simple technique which can efficiently group the pixels of a CMB image having similar properties into large regions or objects. This method receives a predefined set of seed pixels along with the input image, and these seed pixels point to the objects to be segmented. The seed pixel is compared to all unallocated neighboring pixels in the image, and this enables the region to grow iteratively: for each neighbor, the similarity measure s, defined as the difference between the pixel and the mean of the pixels in the region, is measured, and sufficiently similar pixels are allocated to the corresponding region. This process repeats until all the pixels in the image are allocated to one of the regions. Figure 3 shows the segmented bleeding region.
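The seeded growth described above can be sketched as follows; the threshold value and the 4-connectivity choice are assumptions, since the paper does not fix them:

```python
import numpy as np
from collections import deque

def region_grow(img, seed, thresh=10.0):
    """Seeded region growing sketch: add 4-connected neighbours whose
    intensity differs from the running region mean by less than `thresh`.
    Parameter values are illustrative, not taken from the paper."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(img[seed]), 1     # running sum/count for the mean
    q = deque([seed])
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                # s = |pixel - region mean|; allocate if similar enough
                if abs(float(img[ny, nx]) - total / count) < thresh:
                    mask[ny, nx] = True
                    total += float(img[ny, nx]); count += 1
                    q.append((ny, nx))
    return mask
```

Growing from a seed inside a bright blob returns a mask covering exactly that blob while the dissimilar background stays unallocated.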
To implement the Deep Learning Bounding Box (DLBB) algorithm after the preprocessing phase, the steps to be followed are: first, the number of DLBB clusters is selected; then the cluster centers are initialized; the elements of the partition matrix are calculated from the cluster centers; and finally the cluster center values are recomputed. The bounding box algorithm is represented by the U matrix. The stored values lie between 0 and 1, representing the membership of the data points in each cluster, whereas hard c-means uses only 0 and 1 as the two values of the membership function.
Fig. 3 Detected CMB from MR images: a original image, b filtered image, c locating the seed bounding box, d CMB region marked, e segmented bleeding region, f DBB segmented color image, g red-marked area image
J(u, v) = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m ‖X_j − V_i‖²    (1)

where u is the membership matrix; V is the cluster center matrix; n is the number of pixel points; c is the number of clusters; X_j is the jth measured pixel point; and V_i is the center of cluster i.
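The objective above is the familiar fuzzy c-means cost; a minimal alternating-update sketch for 1-D pixel intensities (generic, omitting the DLBB-specific seeding) is:

```python
import numpy as np

def fcm(X, c=2, m=2.0, n_iter=50, seed=0):
    """Fuzzy c-means sketch minimizing J = sum_ij u_ij^m ||x_j - v_i||^2
    for 1-D data X. A generic illustration; the paper's DLBB-specific
    clustering details are not reproduced here."""
    rng = np.random.default_rng(seed)
    n = len(X)
    U = rng.random((c, n)); U /= U.sum(axis=0)       # memberships in [0, 1]
    for _ in range(n_iter):
        V = (U**m @ X) / (U**m).sum(axis=1)          # cluster-centre update
        d = np.abs(X[None, :] - V[:, None]) + 1e-12  # point-centre distances
        U = d ** (-2 / (m - 1)); U /= U.sum(axis=0)  # membership update
    return U, V
```

Unlike hard c-means, every column of U sums to 1 with fractional memberships, matching the description of the partition matrix above.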
3 Results

Table 1 Performance analysis of preprocessing
Metrics Unsharp masking Bilateral filter Proposed method
PSNR 26.54 29.61 36.94
SSIM 0.7462 0.8351 0.9156
MSE 0.0058 0.0029 0.00046
Table 2 compares the mean Dice similarity coefficients of the three methods; the proposed method achieves a mean of 98% and a very high sensitivity of 96%. The SVM classifier is able to remove most of the false positives at the loss of some sensitivity. The automated processing had an overall accuracy of 98.16% and a specificity of 95.6%.
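The Dice similarity coefficient reported above measures overlap between a predicted and a reference segmentation; it can be computed from binary masks as in this illustrative sketch (the example masks are made up):

```python
import numpy as np

def dice(pred, truth):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

a = np.zeros((4, 4), dtype=bool); a[:2, :2] = True   # 4 pixels
b = np.zeros((4, 4), dtype=bool); b[:2, :] = True    # 8 pixels, 4 overlap
score = dice(a, b)                                   # 2*4 / (4+8)
```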
New Efficient Tunable Window Function
for Designing Finite Impulse Response
Digital Filter
Keywords Band pass filter (BPF) · Band stop filter (BSF) · Finite impulse
response (FIR) · High-pass filter (HPF) · Low-pass filter (LPF) · Main-lobe width
(MLW) · Side-lobe amplitude (SLA)
1 Introduction
Digital filters play a crucial role in processing digital signals. By passing a signal through them, digital filters can improve certain parameters of that signal: they pass the desired pass-band frequencies and discard the undesired frequencies while filtering. The basic sub-categories of filters are the low-pass filter (LPF), high-pass filter (HPF), band-pass filter (BPF), and band-stop filter (BSF). On the basis of impulse response, digital filters have two categories: infinite impulse response (IIR) filters and finite impulse response (FIR) filters [1]. A digital FIR filter can be designed using any of three methods: frequency truncation, windowing, and optimization. The window method is considered the simplest and most widely used of the three.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 741
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_68
742 R. Kumar and R. P. Rishishwar
2 Proposed Window
The causal window of length M is represented by the function w(nT), where n ranges over 0 ≤ n ≤ M − 1. All calculations in this paper assume a sampling period T = 1 s [4]. The value M − 1 is the order of the window, represented by N.
a_0 = 0.5363 - \frac{0.14}{M - 1}, \quad a_1 = 0.996 - a_0, \quad a_3 = 0.004        (3)
I_0 is the zero-order modified Bessel function. The tuning parameter β sets the preferred trade-off between main-lobe width (MLW) and side-lobe amplitude (SLA). The main reason for using 80% of the modified Hamming window and 20% of the Kaiser window is that a small amount of Kaiser window provides a lower leakage factor and better performance at higher orders; increasing its share further would result in larger side-lobe peaks [3, 6]. The proposed function has two varying parameters: the window order (N) and the tuning parameter (β). By changing β at different values of N, we can achieve the desired frequency-domain specifications and application requirements. The frequency transform of Eq. (1) is given by:
W(f) = 0.8 \, a_0 H_{rect}(f)
     - 0.8 \, a_1 \left[ H_{rect}\left(f - \tfrac{1}{M-1}\right) + H_{rect}\left(f + \tfrac{1}{M-1}\right) \right]
     - 0.8 \times 0.004 \left[ H_{rect}\left(f - \tfrac{3}{M-1}\right) + H_{rect}\left(f + \tfrac{3}{M-1}\right) \right]
     + 0.2 \times \frac{M}{I_0(\beta)} \cdot \frac{\sin\sqrt{\beta^2 - (M\pi f)^2}}{\sqrt{\beta^2 - (M\pi f)^2}}        (5)

where

H_{rect}(f) = \frac{\sin(\pi M f)}{\sin(\pi f)} \, e^{-i\pi f (M-1)}        (6)
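The 80/20 combination can be sketched numerically as follows. This is a hedged sketch: the paper's time-domain formula (Eqs. 1–2) is not reproduced in this excerpt, so the cosine series below is inferred from the coefficients of Eq. (3) and the shifted H_rect terms of Eq. (5).

```python
import numpy as np

def proposed_window(M, beta):
    n = np.arange(M)
    a0 = 0.5363 - 0.14 / (M - 1)                     # Eq. (3)
    a1 = 0.996 - a0
    a3 = 0.004
    # modified-Hamming-like cosine series (inferred, see lead-in above)
    mod_hamming = (a0 - a1 * np.cos(2 * np.pi * n / (M - 1))
                      - a3 * np.cos(6 * np.pi * n / (M - 1)))
    # 80% modified Hamming + 20% Kaiser, as described in the text
    return 0.8 * mod_hamming + 0.2 * np.kaiser(M, beta)

w = proposed_window(201, 7.0)    # order N = 200 -> length M = 201, beta = 7
```

The resulting window is symmetric (linear-phase) and peaks at 1.0 in the centre, like the standard Hamming and Kaiser windows it blends.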
3 Performance Evaluation
Fig. 1 Magnitude representation of frequency spectrum by simulation at order N = 200 and tuning
parameter β = 7
Since the Hamming window [5] and modified Hamming window [4] yield small peak side-lobe levels, the proposed window should be tuned to the value of β that gives comparably small peak side-lobes; β = 7 produces the desired window characteristics. We then vary the order N and evaluate the best possible results.
Comparing the frequency spectra of the proposed and Hamming windows, Fig. 2a, b shows that the proposed window has a smaller peak side-lobe at the expense of main-lobe width for N = 40 and N = 60. When the window order is increased to N = 200, it offers a smaller peak side-lobe amplitude with the same main-lobe width as the Hamming window; the comparison is shown in Fig. 2c. As the data in Table 1 show, the proposed window offers a 4–4.6 dB smaller peak side-lobe amplitude than the Hamming window, along with a smaller leakage factor (0.02%) and equal main-lobe width (for higher window orders).
Fig. 2 Magnitude variation (dB) of frequency spectrum for proposed window, modified Hamming
window, and Hamming window functions by simulation for order, a N = 40, b N = 60, c N = 200
At higher orders the proposed window retains its smaller peak side-lobe, as shown in Fig. 2c. Table 1 also shows that the proposed window has a 1.8–2 dB smaller peak side-lobe, equal main-lobe width, and a smaller leakage factor (0.02%) in comparison with the modified Hamming window (for higher window orders). Since the Hamming window has a smaller side-lobe peak than the Hann and Bartlett windows, the proposed window also has smaller side-lobe amplitude than those windows.
To concentrate maximum energy in the main lobe, as with the Kaiser window, the parameter β must be tuned below 7. This characteristic is nearly optimal in the sense of energy concentration around the main lobe. The Kaiser window function [4] contains the zero-order Bessel function. Since the proposed window behaves better at larger orders, we operate it at N = 200. Figure 3a compares the Kaiser window function and the proposed window function using frequency-spectrum specifications for N = 200 at β = 5.57; the comparison yields equal MLW and a 3.7 dB smaller side-lobe amplitude.
Fig. 3 Magnitude variation (dB) of frequency spectrum for proposed and Kaiser window function
by simulation at order N = 200, a β = 5.57, b β = 5.3
Table 2 Characteristics of FIR filters designed by different window functions for different orders at tuning parameter β = 5.57

Order     Stop-band attenuation (As) in dB
          Hamming         Modified Hamming   Kaiser          Proposed
          window filter   window filter      window filter   window filter
N = 40    −52.3           −58.2              −58.49          −58.9
N = 60    −53.4           −58.68             −58.66          −59.35
N = 200   −53.7           −59.1              −59.00          −59.8
4 Application Example
This section tests the proposed window in an application environment to assess its utility in practical use. Consider a low-pass finite impulse response filter designed by windowing the infinite impulse response of an ideal filter. The impulse response of an ideal low-pass filter is given as [4]:
h_n(nT) = \begin{cases} \dfrac{w_c T}{\pi}, & n = 0 \\ \dfrac{\sin(w_c n T)}{n\pi}, & n \neq 0 \end{cases}        (7)
where w_c is the cut-off frequency. The required FIR filter is obtained by windowing h_n with the proposed window [6].
In the application example, for cut-off frequency w_c = 0.3 × π rad/sample, the stop-band attenuation (As) of filters designed using the proposed window is compared with the standard and modified Hamming windows at different orders. Simulation is done using the MATLAB Filter Designer tool.
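The window-method design of Eq. (7) can be sketched as follows. This is an illustrative sketch (with T = 1); the Hamming window stands in for the proposed window, whose closed form is given in the paper.

```python
import numpy as np

def fir_lowpass(M, wc, window):
    """Window the ideal low-pass impulse response of Eq. (7).
    M: number of taps; wc: cut-off in rad/sample; window: length-M array."""
    n = np.arange(M) - (M - 1) / 2      # centre the response for linear phase
    # sin(wc*n)/(n*pi) == (wc/pi)*sinc(wc*n/pi), which handles n = 0 cleanly
    h_ideal = (wc / np.pi) * np.sinc(wc * n / np.pi)
    return h_ideal * window

# cut-off 0.3*pi rad/sample, 201 taps (order 200)
h = fir_lowpass(201, 0.3 * np.pi, np.hamming(201))
```

Truncating and windowing the symmetric ideal response preserves linear phase, which is why FIR filters designed this way keep constant group delay.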
Table 2 gives the simulated data for comparison. The proposed window's filters exhibit better performance at every order when the tuning parameter β = 5.57. For the higher order (N = 200), the proposed window filter gives the finest stop-band attenuation (−59.8 dB), 0.7–6.01 dB better than the other windows. Figure 4 compares the magnitude response (dB) of the proposed-window FIR low-pass filter with the standard-window low-pass filters (M = 200).
5 Conclusion
Summarizing all simulation results, we conclude that when the window specifications require small side-lobe peaks, β should be tuned to 7, which provides 2–4.1 dB smaller side-lobes. When maximum energy concentration in the main lobe is required, β is tuned to 5.57 (or below 7), which gives 3.76–4.46 dB smaller side-lobes. With the same main-lobe width and a smaller leakage factor, the proposed window performs better at higher orders. In the application example, the FIR low-pass filter of the proposed
Fig. 4 Comparing magnitude response (dB) of proposed window low-pass filter with standard
window low-pass filters for order M = 200 and tuning parameter β = 5.57, a Hamming window
filter, b Modified Hamming window filter, c Kaiser window filter
window exhibits 0.7–6.01 dB higher stop-band attenuation compared with the other windows. In general, for a general-purpose window, tuning β to 5.57 at N = 200 gives better frequency-domain specifications (0.3–3.76 dB) along with better application performance (0.7–6.01 dB). Moreover, properties such as stability and linearity are also met, except for the property discussed in the modified Hamming window paper.
Abstract Brain tumour detection is among the most important and difficult tasks in medical science, because manual segmentation can yield incorrect results and findings. It is especially difficult when a large amount of data must be categorised. The appearance of brain tumours varies greatly and, because tumour and normal tissue have a similar appearance, tumour extraction becomes challenging. In this paper, a strategy is proposed to extract the brain tumour from 2D magnetic resonance (MRI) brain images by convolutional neural network techniques. The experimental study was carried out on a real-time data set with various tumour sizes, areas, shapes and distinctive image intensities. The techniques involved are AlexNet, DenseNet, ResNet and a combination of VggNet19 and DenseNet, whose accuracies are 98.5%, 92.3%, 94.6% and 95.4%, respectively. The principal aim of this paper is to detect brain tumours and to observe which technique gives the best accuracy, using MATLAB.
1 Introduction
The vast majority of brain tumours are catastrophic. Primary brain tumours arise in the brain itself; in the secondary type, the tumour spreads to the brain from other parts of the body [1]. If the tumour is detected and treated early in its formation, the patient's chances of recovery are very good [2].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 751
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_69
752 D. Pavani et al.
A brain tumour is a collection of abnormal cells that form within the brain. It is of two types: malignant, which is cancerous, and benign, which is non-cancerous. Brain tumours are also classified as primary and secondary. Primary tumours originate in the brain, and most are benign; secondary tumours are caused when cancer cells spread to the brain from other organs such as the lungs and breast. These are called metastatic brain tumours.
A brain tumour is one of the most serious cancers affecting both children and adults. Initial recognition, classification and analysis of brain tumours are especially important for treating the tumour effectively [3]. Early detection of brain tumours can play a critical role in improving treatment options and increasing the likelihood of survival [4]. Manual segmentation of brain tumours is a time-consuming and critical task because of the huge amount of data generated in the medical field; MRI is therefore used to detect the tumour, although this too is burdensome because of the considerable amount of data. In this paper, we propose a method combining VggNet and DenseNet for brain tumour detection, alongside other convolutional neural network techniques.
Hossain [4] used the BRATS data set, which consists of tumour and non-tumour MRI images. He used FCM for image segmentation and, for classification, six traditional classifiers, namely K-nearest neighbour (KNN), support vector machine (SVM), multilayer perceptron (MLP), Naïve Bayes, random forest and logistic regression, implemented in scikit-learn, among which SVM achieved the highest accuracy, 92.42%. A second proposed method used a five-layer CNN to classify tumoured and non-tumoured images and achieved an accuracy of 97.87%. Vinoth [5] used a convolutional neural network for MRI image segmentation, locating the HGG and LGG parts of the tumour with a sensitivity of 96.54%; for tumour classification, an SVM classifier is used [6–8]. Pereira et al. [6] proposed a CNN-based method for brain tumour segmentation in which the CNN is built over 3 × 3 kernels. They also showed that better segmentation is obtained using intensity normalisation, and found LReLU more effective than ReLU for training the CNN.
2 Proposed Techniques
Transfer learning
Transfer learning has several advantages; the primary ones are reduced training time, generally better neural-network performance, and not requiring a large amount of data.
Brain Tumour Detection Using Convolutional Neural Networks … 753
One of the most useful types of machine learning is deep learning. A CNN is a special type of DNN, divided into several layers with a complex hierarchical structure. A CNN incorporates an input layer, an output layer, convolutional layers, pooling layers, normalisation layers and fully connected layers. In other words, every CNN is made up of several layers, the primary ones being the convolutional layer and the sub-sampling layer. In this paper, we have implemented four networks: AlexNet, DenseNet, ResNet and VggNet [9–12].
AlexNet:
The network is closely related to Yann LeCun et al.'s LeNet but has many more filters per layer and stacked convolutional layers. It includes convolutions, max pooling, dropout, data augmentation, ReLU activations, and SGD with momentum, with ReLU activations attached to every convolutional and fully connected layer. AlexNet is an eight-layer-deep convolutional neural network. The network has been trained on millions of images from the ImageNet database, and this pre-trained network can be loaded directly. It can categorise images into 1000 different object categories, such as pencil, mouse and so on; the network has therefore learned feature extraction for a wide range of images. The input size for this network is 227 × 227.
ResNet:
At ILSVRC 2015, Kaiming He et al. introduced an architecture with "skip connections" and heavy batch normalisation, called the residual neural network (ResNet). These skip connections, also known as gated units, strongly resemble recent effective components used in RNNs. Using this methodology, they were able to build a CNN with 152 layers whilst maintaining lower complexity than VggNet. It achieves a top-5 error rate of 3.57% on the ImageNet data set, outperforming human-level performance.
754 D. Pavani et al.
VggNet:
Simonyan and Zisserman introduced VggNet. The network is made up of 16 convolutional layers and has a fairly consistent architecture. VggNet is very similar to AlexNet but uses only 3 × 3 convolutions and contains a large number of filters. It took 2–3 weeks to train on four GPUs. VggNet is currently the most preferred network for image feature extraction, and it has been used as a feature extractor in many applications, as its weight configurations are publicly available. However, the network comprises 138 million parameters, making it tough to manage.
DenseNet:
In DenseNet, each layer receives extra information from all preceding layers and passes its own feature maps to every succeeding layer. The network uses concatenation, so each layer receives the aggregate information of the previous layers. Since each layer gets feature maps from every former layer, the network can be compact; for instance, the number of channels can be small. The growth rate k denotes the number of extra channels added per layer. DenseNet therefore has higher computational efficiency, and it uses memory effectively.
Optimization:
Optimising the network is crucial for improving CNN performance; optimization is used in both the training and testing phases. In this implementation, we have used a technique called stochastic gradient descent [13].
Stochastic Gradient Descent (SGD):
In batch gradient descent, the cost and gradient are computed over the entire training set, which is very expensive for a CNN. This motivated the use of SGD, which computes the negative gradient from only a single or a small number of training examples, overcoming the cost issue while still yielding good results. Equation (1) shows the batch update:

\theta = \theta - \alpha \nabla_\theta \, E\!\left[ J(\theta; x, y) \right]        (1)

The letter E stands for expectation: the cost and gradient averaged over the entire training set. SGD replaces this expectation with the gradient derived from a single or small number of training samples. Equation (2) shows the update:

\theta = \theta - \alpha \nabla_\theta J\!\left(\theta; x^{(i)}, y^{(i)}\right)        (2)
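The per-sample update of Eq. (2) can be sketched minimally as follows. This is illustrative only: a two-parameter least-squares cost stands in for the CNN loss, and the data are synthetic.

```python
import numpy as np

def sgd_step(theta, x, y, alpha):
    """One SGD update, Eq. (2), for the squared-error cost
    J(theta; x, y) = 0.5 * (theta . x - y)^2."""
    grad = (theta @ x - y) * x            # gradient from a single sample
    return theta - alpha * grad

rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])
theta = np.zeros(2)
for _ in range(2000):
    x = rng.normal(size=2)                # one training sample per update
    y = theta_true @ x                    # noiseless target
    theta = sgd_step(theta, x, y, alpha=0.1)
# theta converges toward theta_true without ever touching the full data set
```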
4 Proposed Methodology
The proposed method, shown in Fig. 1, uses CNN techniques to identify whether an input brain image contains a tumour. The four pre-trained models mentioned above, AlexNet, DenseNet, ResNet and VggNet, are used to achieve this.
The MRI images of the brain are taken from the Kaggle community. The data set consists of 260 images, of which 127 are positive and 133 are negative; positive denotes images that contain a brain tumour and negative denotes images that do not. The input image size is 224 × 224. With the data set in hand, the next step is to use it to train and test the convolutional neural networks. Some of the input images we have taken are shown in Fig. 2.
We used a variety of networks in our proposed strategy, with the densenet201 and vggnet19 combination providing the best results among the combinations. There are several ways of combining two networks; in our work, we trained the two networks independently and, as in an ensemble model, combined their outputs: the model returns the output of the pre-trained model with the highest probability, so that the given input image is classified more accurately.
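The score-level combination described above can be sketched as follows. The function name and the probability vectors are illustrative, not taken from the paper's MATLAB implementation.

```python
import numpy as np

def combine_max_confidence(p_densenet, p_vggnet):
    """Return the class predicted by whichever independently trained
    network is more confident (higher peak probability)."""
    winner = p_densenet if p_densenet.max() >= p_vggnet.max() else p_vggnet
    return int(np.argmax(winner))

p1 = np.array([0.30, 0.70])   # first model: class 1 with prob 0.70
p2 = np.array([0.95, 0.05])   # second model: class 0 with prob 0.95
label = combine_max_confidence(p1, p2)   # second model is more confident
```

Unlike probability averaging, this rule simply defers to the more confident of the two trained models, matching the "highest probability" description above.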
We used MATLAB R2021a to implement our proposed method. MATLAB provides several pre-trained models, trained on the huge ImageNet database, which contains 1000 object categories and 1.2 million images. MATLAB has different toolboxes; here, we used the Deep Learning Toolbox and installed all four networks from it. In the toolbox, the pre-trained AlexNet has 25 layers, ResNet has 177 layers, vggnet19 has 47 layers, and DenseNet has 708 layers. ReLU is the activation function.
The graphs below were obtained after training the networks; the x-axis shows the number of iterations, and the y-axis denotes accuracy. The graphs show how the accuracy of the networks improved, and the error rate fell, as the given 2D MRI images were trained repeatedly. For training, 130 images were used, and for validation the data set was split in a 70:30 ratio. The validation accuracies obtained for DenseNet, ResNet and VggNet are 98.33%, 91.67% and 97.50%, respectively. Figures 3, 4, 5 and 6 depict the graphs obtained after training AlexNet, DenseNet, ResNet and VggNet.
5 Experimental Results
From the database, a few images are selected in the GUI via the browse button; by calling the respective network, the system identifies whether each image contains a tumour. A few results are shown in Fig. 7.
Calculation of accuracy
From training, we obtained the results below [14]. AlexNet achieved the highest accuracy of all, 98.5%, and among the network combinations, the DenseNet and VggNet combination gave the maximum accuracy of 95.4%.
Metrics Used
To evaluate the performance of the presented models, we used four metrics: accuracy, precision, recall and F1 score. Equations (3), (4), (5) and (6) below are used to calculate accuracy, precision, recall and F1, respectively.
Accuracy = (TP + TN) / (TP + FN + FP + TN)        (3)

Precision = TP / (TP + FP)        (4)

Recall = TP / (TP + FN)        (5)

F1 = 2 × (Precision × Recall) / (Precision + Recall)        (6)
Table 1 below depicts the metrics used. Here, "TP" denotes true positive; "TN", true negative; "FP", false positive; and "FN", false negative.
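Equations (3)–(6) translate directly into code; the confusion-matrix counts in the example are illustrative values, not the paper's results.

```python
def metrics(tp, tn, fp, fn):
    """Compute the four evaluation metrics of Eqs. (3)-(6)
    from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)           # Eq. (3)
    precision = tp / (tp + fp)                           # Eq. (4)
    recall = tp / (tp + fn)                              # Eq. (5)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (6)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = metrics(tp=120, tn=125, fp=8, fn=7)  # made-up counts
```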
6 Conclusion
Analysis of medical images has become a tough task because of the large scale of the databases; analysing such large amounts of data and accurately predicting disease is the biggest problem currently facing the medical field. Using CNN techniques to predict disease reduces inaccurate human prediction and the time needed to identify the disease. In this paper, we have discussed four pre-trained CNN techniques and a combination of two models. AlexNet achieved the highest accuracy of 98.5%. The main purpose of medical image analysis is application in real life, so in future work we will try to improve the models to suit 3D images.
References
1. D.V. Gore, V. Deshpande, Comparative study of various techniques using deep learning for
brain tumour detection. in 2020 International Conference for Emerging Technology (INCET),
2020, pp. 1–4. https://doi.org/10.1109/INCET49848.2020.9154030
2. M. Siar, M. Teshnehlab, Brain tumour detection using deep neural network and machine
learning algorithm. in 2019 9th International Conference on Computer and Knowledge
Engineering (ICCKE), (2019), pp. 363–368. https://doi.org/10.1109/ICCKE48569.2019.896
4846
3. N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, M. Shoaib, A deep learning
model based on concatenation approach for the diagnosis of brain tumour. IEEE Access 8,
55135–55144 (2020). https://doi.org/10.1109/ACCESS.2020.2978629
4. T. Hossain, F.S. Shishir, M. Ashraf, M.A. Al Nasim, F. Muhammad Shah, Brain tumour detec-
tion using convolutional neural network. in 2019 1st International Conference on Advances in
Science, Engineering and Robotics Technology (ICASERT) (2019), pp. 1–6. https://doi.org/10.
1109/ICASERT.2019.8934561
5. R. Vinoth, C. Venkatesh, Segmentation and detection of tumour in MRI images using CNN and
SVM classification. Conf. Emerg. Devices Smart Syst. (ICEDSS) 2018, 21–25 (2018). https://
doi.org/10.1109/ICEDSS.2018.8544306
6. S. Pereira, A. Pinto, V. Alves, C.A. Silva, Brain tumour segmentation using convolutional
neural networks in MRI images. IEEE Trans. Med. Imaging 35(5), 1240–1251 (2016). https://
doi.org/10.1109/TMI.2016.2538465
7. Z. Sobhaninia, S. Rezaei, N. Karimi, A. Emami, S. Samavi, Brain tumor segmentation by
cascaded deep neural networks using multiple image scales. in 2020 28th Iranian Conference
on Electrical Engineering (ICEE) (2020), pp. 1–4. https://doi.org/10.1109/ICEE50131.2020.
9260876
8. S. Kumar, A. Negi, J.N. Singh, H. Verma, A deep learning for brain tumor MRI images semantic
segmentation using FCN. in 2018 4th International Conference on Computing Communication
and Automation (ICCCA) (2018), pp. 1–4. https://doi.org/10.1109/CCAA.2018.8777675
9. Z. Jia, D. Chen, Brain tumor identification and classification of MRI images using deep learning
techniques. in IEEE Access (2020), pp. 1–1. https://doi.org/10.1109/ACCESS.2020.3016319
10. H. Ucuzal, S. Yasar, C. Colak, Classification of brain tumor types by deep learning with convolu-
tional neural network on magnetic resonance images using a developed web-based interface. in
2019 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies
(ISMSIT) (2019). https://doi.org/10.1109/ismsit.2019.8932761
11. H. Mohsen, E.-S.A. El-Dahshan, E.-S.M. El-Horbaty, A.-B.M. Salem, Classification using
deep learning neural networks for brain tumors. Future Comput. Inf. J. 3(1), 68–71 (2018).
https://doi.org/10.1016/j.fcij.2017.12.001
12. A. Isn, C. Direkoglu, M. Sah, Review of MRI-based brain tumor image segmentation using
deep learning methods. Procedia Comput. Sci. 102, 317–324 (2016). https://doi.org/10.1016/
j.procs.2016.09.407
13. D.V. Gore, V. Deshpande, Comparative study of various techniques using deep Learning for
brain tumor detection. in 2020 International Conference for Emerging Technology (INCET)
(2020). https://doi.org/10.1109/incet49848.2020.9154030
14. T. Saba, A.S. Mohamed, M. El-Affendi, J. Amin, M. Sharif, Brain tumor detection using fusion
of hand crafted and deep learning features. Cognitive Syst. Res. 59, 221–230 (2020). https://
doi.org/10.1016/j.cogsys.2019.09.007
15. M. Siar, M. Teshnehlab, Brain tumor detection using deep neural network and machine learning
algorithm. in 2019 9th International Conference on Computer and Knowledge Engineering
(ICCKE) (2019), pp. 363–368. https://doi.org/10.1109/ICCKE48569.2019.8964846
Design of Circular Patch Antenna
with Square Slot for Wearable
Ultra-Wide Band Applications
Abstract In this paper, an ultra-wide band antenna made of simple circular patch
is designed. Various design evolutions of the circular patch antenna have been
discussed. A conventional circular patch with partial ground is designed to have an
ultra-wide frequency band. Since there is an impedance mismatch at mid frequencies,
a slit is etched at the ground to improve the matching. Then, a square slot is etched in
the circular patch to have better performance characteristics. The antenna is designed
on an FR4 substrate of 1.6 mm thickness. The circular patch with the square slot operates from 2.7 to 12.1 GHz, which exceeds the ultra-wide frequency range. The other antennas are applicable at WiMAX (3.2–3.8 GHz), WLAN (5.1–5.3 GHz), and X-band (8–12 GHz) frequencies. The proposed antenna is also suitable for wearable applications, as its SAR value is 1.5 W/kg. The proposed design is simulated using Ansys HFSS simulation software.
1 Introduction
An antenna is a device used to transmit and receive electromagnetic signals. The principal considerations for an antenna are robustness, flexibility, miniaturization, and low cost. Since the Federal Communications Commission (FCC) released
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 761
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_70
762 S. Rekha et al.
3.10–10.60 GHz for commercial applications, there has been massive demand for ultra-wide band devices [1] because of their wide bandwidth. These UWB bands find applications in satellites, wireless local area networks, military applications, biomedical applications, and so on.
The history of the ultra-wide band antenna goes back to the early stages of antenna development, with structures such as conical antennas and horn antennas [2]. Nowadays, microstrip antennas are preferred for attaining a broad frequency range because they are compact. The simplest of the UWB antennas is the monopole antenna, but the gain of monopole antennas is moderate [3]. Sometimes, there is a need to eliminate certain frequencies, which can be achieved using slots [4]. In some cases, a partial ground is used to achieve better performance [5]. Later, slots were introduced in the ground, a structure called a defected ground structure (DGS) [6]. A single slot or multiple slots are cut in the patch in order to achieve the necessary performance [7]. Slots are also employed to obtain notch-band characteristics [8]. A planar monopole antenna has low gain [9] compared to slot antennas. These slots take different shapes such as circular, square, rectangular, conical, semi-circular, diamond, and so on [10]. Various structures are used for different applications. The other factor that defines the performance of the antenna is the dielectric substrate used to build it [11]. Some antennas are designed in such a way that both the ground and the radiating patch are on the same plane using co-planar waveguide feeding. Certain antennas consist of more than one port [12]; these are termed MIMO antennas.
In this paper, a square-slotted circular patch antenna has been designed for ultra-wide band applications. A circular patch is simple in structure and possesses high gain. In Sect. 2, the techniques, parameters, and dimensions of the antenna design are discussed. In Sect. 3, the results (reflection coefficient and gain) are elaborated. The paper is concluded in Sect. 4.
2 Antenna Design
In this research paper, the design and evolution of a simple circular patch antenna are discussed. The circular patch antenna is presented as four antenna models, as in Fig. 1. All the designs are etched on an FR4 lossy epoxy substrate, as it is easily available and cost effective. Figure 1a is the conventional circular patch with a partial ground. Figure 1b is the circular patch with a deep cut in the ground plane in order to improve the impedance matching characteristics. Figure 1c is the circular patch with a square slot at the middle. Figure 1d is the circular patch with the square slot as well as the deep cut in the ground plane. These designs are simulated using the full-wave simulator Ansys HFSS. The total dimension of the design is 32.75 × 37.5 × 1.6 mm³. The dimensions of the proposed antenna are listed in Table 1.
The circular patch is excited through a microstrip feed with 50 Ω impedance. The addition of slots to the antenna changes the reactance (capacitance or inductance), which in turn widens the operating frequency band. A square slot (4 mm × 4 mm) is created at the center of the circular patch as in Fig. 1c, d. Later, the slot is adjusted slightly to achieve better impedance matching over the UWB frequency range. Similarly, ground plane etching is implemented to obtain good resonance and gain characteristics. The results of the antennas are discussed in Sect. 3.
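The initial patch radius can be estimated from the standard cavity-model design equation for a circular microstrip patch; the sketch below is illustrative (the 3 GHz design frequency is an assumption, not a value from the paper; εr = 4.4 and h = 1.6 mm correspond to the FR4 substrate described above):

```python
import math

def circular_patch_radius_cm(fr_hz, eps_r, h_cm):
    """Cavity-model estimate of circular patch radius (standard formula).
    fr_hz: target resonant frequency in Hz; h_cm: substrate height in cm."""
    F = 8.791e9 / (fr_hz * math.sqrt(eps_r))
    fringe = (2 * h_cm / (math.pi * eps_r * F)) * (
        math.log(math.pi * F / (2 * h_cm)) + 1.7726)
    return F / math.sqrt(1 + fringe)

# FR4 (eps_r ~ 4.4), 1.6 mm thick, illustrative 3 GHz lower-band target
a = circular_patch_radius_cm(3e9, 4.4, 0.16)
print(f"patch radius ~ {a * 10:.1f} mm")
```

The estimate (about 13.5 mm radius) is consistent in scale with a patch fitting the 32.75 mm wide board; in practice the radius is then tuned in the full-wave simulator.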
The performance of the antenna is analyzed in terms of the S-parameters and the gain. In addition, the SAR value is determined for the antenna to show that it is applicable for wearable purposes.
The simple circular patch antenna (antenna 1) with partial ground operates over 2.85–4.8 GHz and 5.8–11.84 GHz. Here, the UWB range is not completely covered. In order to improve the operating frequency range, antenna 2 is designed. Here, a rectangular slot is etched in the ground plane to match the impedance from 4.8 to 5.8 GHz. Now, the operating frequency is from 2.7 to 12.06 GHz, which covers the complete UWB range. The reflection coefficients of antennas 1 and 2 are compared in Fig. 2.
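The operating bands quoted above are the spans where the reflection coefficient stays at or below −10 dB; that threshold maps to VSWR through the standard relations (a generic conversion, not code from the paper):

```python
def s11_db_to_vswr(s11_db):
    """Convert a reflection coefficient in dB to VSWR."""
    gamma = 10 ** (s11_db / 20)      # |Γ|, linear magnitude
    return (1 + gamma) / (1 - gamma)

# The usual -10 dB matching criterion corresponds to VSWR just under 2
print(f"VSWR at -10 dB: {s11_db_to_vswr(-10):.2f}")
```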
In Fig. 3, antennas 3 and 4 are compared. In order to analyze the effects of slots on the circular patch, a simple square slot is etched at the center of the circle as shown in Fig. 1c, d.
Fig. 2 Comparison of reflection coefficient of antenna 1 and 2
Fig. 3 Comparison of reflection coefficient of antenna 3 and 4
Fig. 5 E field and H field radiation pattern at 9 GHz of a antenna 1, b antenna 2, c antenna 3, d antenna 4
The proposed antenna can also be used for wearable applications. The specific absorption rate (SAR) is the measure of the radiation absorbed by the human body while using a device. There are regulatory limits on SAR values; in India, SAR must be less than 1.6 W/kg averaged over 1 g of tissue. The SAR of the proposed antenna is less than this maximum value. The SAR plot of the proposed model (antenna 4) is given in Fig. 6, which shows that the average SAR value of antenna 4 is 1.5 W/kg, below the SAR limit. The proposed model is suitable for wearable purposes because of its SAR value and the compact size of the antenna.
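Point SAR follows from the induced field strength, tissue conductivity, and tissue density as SAR = σ|E|²/ρ; a back-of-the-envelope sketch with illustrative tissue values (the σ, E, and ρ numbers below are assumptions for illustration, not the HFSS simulation settings behind Fig. 6):

```python
def sar_w_per_kg(sigma_s_per_m, e_rms_v_per_m, rho_kg_per_m3):
    """Point SAR = sigma * |E|^2 / rho, with E the RMS field in tissue."""
    return sigma_s_per_m * e_rms_v_per_m ** 2 / rho_kg_per_m3

# Illustrative values: conductive tissue, modest induced field
print(f"{sar_w_per_kg(1.85, 30.0, 1100.0):.2f} W/kg")
```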
4 Conclusion
The proposed antenna is suitable for wearable purposes because its average SAR value is 1.5 W/kg. The proposed antenna is simple and cost effective and is applicable over the UWB range.
References
1. First report and order on ultra-wide band technology (FCC, Washington D.C., 2002)
2. B. Anupama, A. Singh, S. Kavitha, K. Shet, D. Prasad, Bowel shaped and loaded conical dielec-
tric substrate horn antenna for ultra-wide band operation. in 2019 International Conference on
Intelligent Computing and Control Systems (ICICCS), India (2019), pp. 1143–1146
3. S. Rekha, M. Nesasudha, Design of circularly polarized planar monopole antenna with
improved axial ratio bandwidth. Microw. Opt. Technol. Lett. 59(9), 2353–2358 (2017)
4. P. Thongyoy, P. Rakluea, T.N. Ayudthaya, Compact thin-film UWB antenna with round corner
rectangular slot and partial circular patch. in 2012 9th International Conference on Elec-
trical Engineering/Electronics, Computer, Telecommunications and Information Technology,
Thailand (2012), pp. 1–4
5. N.M. Awad, M.K. Abdelazeez, Circular patch UWB antenna with ground-slot. in 2013 IEEE
Antennas and Propagation Society International Symposium (APSURSI), USA (2013), pp. 442–
443
6. P. Jain, B. Singh, S. Yadav, A. Verma, M.A. Zayed, A novel compact circular slotted microstrip-
fed antenna for UWB application. in 2015 International Conference on Communication,
Control and Intelligent Systems (CCIS), India (2015), pp. 22–24
7. I.K. Kim, J. Ghimire, J. Maharjan, I. Nadeem, S.W. Kim, D.Y. Choi, Ultra-wide band (UWB)
microstrip patch antenna with adjustable notch frequencies. in 2019 IEEE International
Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT),
Indonesia (2019), pp. 70–73
8. M. Karmugil, K. Anusudha, Design of circular microstrip patch antenna for ultra-wide band
applications. in 2016 International Conference on Control, Instrumentation, Communication
and Computational Technologies (ICCICCT), India (2016), pp. 304–308
9. T.F. Nayna, E. Haque, F. Ahmed, Design of an X band defected ground circular patch antenna
with diamond shaped slot and ring in patch for UWB applications. in 2016 International
Conference on Signal Processing, Communication, Power and Embedded System (SCOPES),
India (2016), pp. 559–562
10. M.A. Kango, S. Oza-Rahurkar, Effect of dielectric materials on UWB antenna for wear-
able applications. in 2017 IEEE International Conference on Power, Control, Signals and
Instrumentation Engineering (ICPCSI), India (2017), pp. 1610–1615
11. M.M. Sharma, I.B. Sharma, R. Agarwal, Circular edge cut diminutive UWB antenna for
wireless communications. in 2019 IEEE Indian Conference on Antennas and Propagation
(InCAP), India (2019), pp. 1–4
12. X. Tang, Z. Yao, Y. Li, W. Zong, G. Liu, F. Shan, A high performance UWB MIMO antenna
with defected ground structure and U-shape branches. Int. J. RF Microwave Comput.-Aided
Eng. 31(2), e22270 (2020)
Design of Baugh-Wooley Multiplier
Using Full Swing GDI Technique
Abstract This paper presents a four-bit Baugh-Wooley multiplier using the full swing gate diffusion input (GDI) technique. In general, addition is a crucial arithmetic operation and is heavily demanded in VLSI design. Adders are widely used in digital signal processing, accumulators, microprocessors, and many other applications, so the full adder performance decides the overall system performance. The proposed design reduces area, as it contains fewer transistors than other logic designs. The proposed design shows a 44% decrease in power-delay product (PDP), and the area of the Baugh-Wooley multiplier is decreased by 6.25%.
1 Introduction
Due to heavy demand in VLSI, area and power are the vital factors in chip design. Nowadays, chips are the fundamental elements of applications such as mobile phones, televisions, and many other electronic gadgets. The full adder is a basic building block in many designs, so if we improve the performance of the full adder, the performance of the overall design automatically increases [1–3]. In this paper, we design a full adder using the full swing GDI technique, which consumes less power and less area. Using that full adder, we design a four-bit Baugh-Wooley multiplier, which also occupies less area and has low power consumption.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 769
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_71
770 V. Ponugoti et al.
We can obtain the Boolean functions listed in Table 1 using the basic GDI cell.
Table 1 shows the various logic functions which can be obtained using just two transistors [6]. In conventional CMOS design [7], these logic functions require about 6–12 transistors. These functions are obtained simply by interchanging the signals between the input terminals. In Table 1, F1 and F2 can be called universal functions for the GDI technique, just like NAND and NOR gates in CMOS technology. Using F1 and F2, we can realize any design in the GDI technique [8].
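The universality of F1 and F2 can be illustrated at the logic level; a minimal sketch assuming the standard GDI definitions F1 = A′·B and F2 = A′ + B (the paper's Table 1 itself is not reproduced here):

```python
from itertools import product

# Standard GDI cell functions (two transistors each): assumed definitions
F1 = lambda a, b: (1 - a) & b      # F1 = A'.B
F2 = lambda a, b: (1 - a) | b      # F2 = A' + B

# F1/F2 are universal: NOT, AND, OR can be built from them alone
NOT = lambda a: F2(a, 0)           # A' + 0 = A'
AND = lambda a, b: F1(NOT(a), b)   # (A')'.B = A.B
OR = lambda a, b: F2(NOT(a), b)    # (A')' + B = A + B

for a, b in product((0, 1), repeat=2):
    assert AND(a, b) == (a & b) and OR(a, b) == (a | b) and NOT(a) == 1 - a
print("F1/F2 realize NOT, AND, OR")
```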
This technique best suits manufacturing in the twin-well CMOS process and the silicon-on-insulator (SOI) process, as these process styles, when used with this technique, give less propagation delay and consume less power. The GDI cell structure is different from the usual PTL techniques and has some extraordinary features. These features allow complex circuits to be designed easily, with benefits including transistor count reduction and low power dissipation. To understand the GDI cell properties, the basic cell has to be analyzed in different cases and configurations.
The widespread problem in PTL is its low-swing output signals, caused by the threshold drop across single-channel pass transistors [9]. To overcome this problem, present-day PTL uses additional circuitry. In the same way, general GDI designs also give low-swing output signals; to obtain full swing, we add an extra transistor to the existing design. Although the number of transistors increases, we obtain a full-swing output, and the count is still less than the transistor count of CMOS designs. To understand the low-swing issue in the GDI technique, we consider the single function F1; the same reasoning applies to all designs realized in the GDI technique. In the function F1, low swing occurs when A = 0 and B = 0. At these input levels, the voltage of F1 is Vtp rather than 0, due to the PMOS pass transistor's poor high-to-low transition. The transition from A = 0, B = VDD to A = 0, B = 0 is the only situation in which the effect happens. The GDI cell operates as a regular CMOS inverter in about 50% of the cases (for B = 1), and such an inverter is widely used as a digital buffer for logic-level restoration. The full swing modification involves adding an extra transistor at the output which restores the swing. Compared to simple GDI logic, this technique improves the output voltage, control, and power-delay product. This logic style can be produced using a regular CMOS process. Using the full swing GDI technique, the threshold problem is solved and the swing degradation is corrected. This new technique adds only one transistor; although it increases the overall transistor count, that count is still less than the number of transistors used in complementary metal-oxide-semiconductor technology.
The F1 and F2 functions using the full swing GDI technique are shown in Fig. 3. These logic functions produce full-swing outputs, use less power, are energy efficient, and take up a limited amount of space.
3 Full Adder
The full adder is a basic functional block in many applications. A full adder circuit consists of three inputs (A, B, C) and two outputs (SUM, CARRY) [10]. It is a combinational circuit which performs three-bit addition. To design a full adder, XOR and XNOR gates and a multiplexer are required, as shown in Fig. 4. So, to design the full adder using the GDI technique, we first designed the XOR and XNOR gates in the GDI technique [11]. The XOR and XNOR signals, as well as their complement signals, determine the overall power consumption and propagation delay of the full adder [12].
The XOR gate is the basic functional gate in many applications such as adders, multipliers, and comparators. The expression of the XOR function is given in (1)
A ⊕ B = A′B + AB′ (1)
In the proposed XOR [13] circuit, we use four transistors as shown in Fig. 5.
There are 18 transistors in the full adder constructed using the full swing gate diffusion input technique. Our design consumes less power than the conventional CMOS full adder, which comprises 28 transistors, and takes less area as it contains fewer transistors. So, the designed full adder is an area-efficient and power-efficient full adder [14, 15]. The full adder circuit is shown in Fig. 6.
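The XOR/XNOR-plus-multiplexer structure described above can be sanity-checked at the logic level; a minimal sketch (the MUX-based carry, CARRY = C when A ⊕ B else A, is the standard structure assumed from the Fig. 4 description):

```python
from itertools import product

def full_adder_xor_mux(a, b, c):
    """Full adder in the XOR + multiplexer style: SUM = (A xor B) xor C;
    CARRY is a 2:1 MUX selecting C when A xor B is 1, else A."""
    p = a ^ b               # propagate signal from the XOR gate
    s = p ^ c               # SUM output
    carry = c if p else a   # 2:1 multiplexer for CARRY
    return s, carry

# Verify against the arithmetic definition for all 8 input combinations
for a, b, c in product((0, 1), repeat=3):
    s, carry = full_adder_xor_mux(a, b, c)
    assert (a + b + c) == (carry << 1) | s
print("all 8 cases match")
```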
4 Multiplier
The multiplier is a two-input, four-bit device. The output of the multiplier gives the multiplied value of the given inputs in binary form. There are diverse types of multipliers. The proposed design is a four-bit Baugh-Wooley multiplier using the full swing GDI technique [16]. We designed the multiplier as shown in Fig. 7 below. It is a little different from the conventional Baugh-Wooley multiplier, as it contains a greater number of full adders than the general one. The discussed design consists of two types of cells, grey and white. A grey cell comprises a full adder and a NAND gate; a white cell comprises a full adder and an AND gate. The whole structure gives outputs p1 to p8, which are the output bits of the multiplied value. In this designed multiplier, the power consumption is more than in the base paper, because the multiplier design we used contains a greater number of full adders than the referenced one. It contains a total of 20 full adders internally, which function in parallel. Our design occupies less area, as it contains fewer transistors in total. As every basic component of this multiplier is designed using the GDI technique, each one takes fewer transistors, which reduces the overall transistor count.
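The grey/white cell arithmetic can be checked with a bit-level model of the 4 × 4 Baugh-Wooley array: white cells contribute AND partial products, grey cells contribute the NAND (complemented) terms that mix a sign bit with a magnitude bit, plus the scheme's two correction ones. This is a behavioral sketch of the arithmetic only, not the transistor-level GDI design:

```python
def baugh_wooley_4x4(a, b):
    """4-bit two's-complement multiply via the Baugh-Wooley scheme.
    White cells: AND partial products; grey cells: NAND for terms that
    mix the sign bit with a magnitude bit; correction 1s are added at
    bit positions 4 and 7. Returns the 8-bit product."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    total = 0
    for i in range(4):
        for j in range(4):
            bit = A[i] & B[j]
            if (i == 3) ^ (j == 3):       # grey cell: complemented term
                bit ^= 1
            total += bit << (i + j)
    total += (1 << 4) + (1 << 7)          # Baugh-Wooley correction constants
    return total & 0xFF

# Exhaustive check against ordinary signed multiplication
for a in range(-8, 8):
    for b in range(-8, 8):
        assert baugh_wooley_4x4(a & 0xF, b & 0xF) == (a * b) & 0xFF
print("all 256 products correct")
```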
Fig. 8 Brief overview of the results of the full adder implemented with different techniques
Fig. 9 Brief overview of the power of the full adder and Baugh-Wooley multiplier (BWM) implemented with different techniques
Fig. 10 Brief overview of the results of the Baugh-Wooley multiplier (BWM) implemented with different techniques
less compared to the full adder designed in CMOS technology. When we put these full adders together and combine them into a multiplier, the power is higher because we are combining a total of 20 full adders, which are internally structured as grey and white cells. As these cells are combined in parallel, the power consumption is a little higher.
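The power-delay product comparison reported in the abstract follows the usual definition PDP = average power × propagation delay; a quick sketch with hypothetical numbers (the 44% figure is the paper's claim, while the power and delay values below are made up purely for illustration):

```python
def pdp_fj(power_uw, delay_ns):
    """Power-delay product in femtojoules (uW x ns = fJ)."""
    return power_uw * delay_ns

# Hypothetical reference vs. proposed full adder figures
ref, new = pdp_fj(100.0, 0.50), pdp_fj(70.0, 0.40)
print(f"PDP reduction: {(1 - new / ref) * 100:.0f}%")
```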
6 Conclusion
References
1. P. Kishore, P.V. Sridevi, K. Babulu, Low power and high speed optimized 4-bit array multiplier
using MOD-GDI technique. in 2017 IEEE 7th International Advance Computing Conference
(IACC) (2017). https://doi.org/10.1109/iacc.2017.0106
2. A.K. Nishad, R. Chandel, Analysis of low power high performance XOR gate using GDI tech-
nique. in 2011 International Conference on Computational Intelligence and Communication
Networks (2011). https://doi.org/10.1109/cicn.2011.37
3. N. Weste, D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, 4th edn.
(Addison-Wesley, 2011)
4. A. Morgenshtein, I. Schwartz, A. Fish, Gate diffusion input (GDI) logic in standard CMOS
nanoscale process. in 2010 IEEE 26th Convention of Electrical and Electronics Engineers in
Israel (2011)
5. A. Morgenshtein, A. Fish, I. Wagner, Gate-diffusion input (GDI): a power-efficient method
for digital combinatorial circuits. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 10(5),
566–581 (2002)
6. M. Shoba, R. Nakkeeran, GDI based full adders for energy efficient arithmetic applications.
Eng. Sci. Technol. Int. J. (2015)
7. A.M. Shams, D.K. Darwish, M.A. Bayoumi, Performance analysis of low power 1-bit CMOS
full adder cells. IEEE Trans. VLSI Syst. 10(1), 20–29 (2002)
8. V. Adler, E.G. Friedman, Delay and power expressions for a CMOS inverter driving a resistive
capacitive load. Analog Integr. Circ. Sig. Process. 14, 29–39 (1997)
9. T. Bhagyalaxmi, S. Rajendra, S. Srinivas, Power-aware alternative adder cell structure using
swing restored complementary pass transistor logic at 45nm technology, in 2nd International
Conference on Nanomaterials and Technologies (CNT 2014) (2014)
10. C.-K. Tung, Y.-C. Hung, S.-H. Shieh, G.-S. Huang, A low-power high-speed hybrid CMOS
full adder for embedded system. in 2007 IEEE Design and Diagnostics of Electronic Circuits
and Systems conference (2007)
11. E. Abu-Shama, M. Bayoumi, A new cell for low-power adders. in Proceedings of International
Midwest Symposium Circuits System (1995)
12. K.K. Chaddha, R. Chandel, Design and analysis of a modified low power CMOS full adder
using gate diffusion input technique. J. Low Power Electron. 6(4), 482–490 (2010)
13. K. Ravi Kumar, P. Mahipl Reddy, M. Sadanandam, A. Santhosh Kumar, M. Raju, Design of
2T XOR gate based full adder using GDI technique. in International Conference on Innovative
Mechanisms for Industry Applications (ICIMIA 2017) (2017)
14. M.J. Garima, H. Lohani, Design, implementation and performance comparison of multiplier
topologies in power-delay space. Eng. Sci. Technol. Int. J. (2015)
15. A. Shams, T. Darwish, M. Bayoumi, Performance analysis of low-power 1-bit CMOS full adder
cells. IEEE Trans. VLSI Syst. 10(1), 20–29 (2002)
16. O.A. Albadry, M.A. Mohamed El-Bendary, F.Z. Amer, S.M. Singy, Design of area efficient
and low power 4-bit multiplier based on full-swing GDI technique. in 2019 International
Conference on Innovative Trends in Computer Engineering (ITCE) (2019). https://doi.org/10.
1109/itce.2019.8646341
VLSI Implementation of the Low Power
Neuromorphic Spiking Neural Network
with Machine Learning Approach
1 Introduction
Biomedical systems require advanced computing methods and algorithms for the analysis of data. Most of the data is either one dimensional or two dimensional. For neuromorphic signals, hardware and software designs are complex due to their signal features. The diagnosis and classification process is an important block in the complete architecture. These signals mostly have spatiotemporal characteristics with peaks. Hence, to analyze such signals, spiking neural networks will
be efficient. These third-generation spiking neural networks are computationally powerful and suit solving biomedical signal problems well.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_72
782 K. Venkateswara Reddy and N. Balaji
They are also more bio-realistic in nature than their predecessors. The biomedical data follow spikes, binary
patterns, or noise bursts, which are apt for a neuro-spiking system. The ease of connecting biomedical information to a spiking neural network with low power, area efficiency, or high speed is a major advantage. But in very large-scale integrated circuit (VLSI) implementations [1], all parameters cannot be optimized simultaneously; only the area, power, delay, or efficiency can be managed at a time [2–7]. Since a neural architecture occupies more area, power reduction is a challenging task, but speed can be improved without compromising the performance [8]. The spiking neurons represent the analog biomedical input signal in the time domain with binary outputs. The compatibility of spiking neural networks with existing digital systems, which output analog values in the voltage or current domain, is an added advantage [8, 9]. One VLSI chip for spiking neurons is based on the asymmetric spike timing dependent synaptic plasticity (STDP) algorithm, while others use the gradient descent algorithm. Bioinspired algorithms, such as spike timing dependent plasticity and bat algorithms, are implemented or combined with the existing methods [10–12]. The next challenge lies in the training and optimization of the weight vectors in each layer of the neural network architecture. The training is performed through arithmetic calculations and computation of the particular algorithm with respect to the architecture. These conventional neural network training methods differ from the weight updating carried out in biological neural networks. So, swarm- or artificial intelligence-based learning is to be carried out, and the arithmetic network cannot be used. The main objective of this research work is to compare various spiking neural network (SNN) architectures reported in the literature with different neuron models, along with performance parameters such as power and delay.
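The pair-based STDP rule mentioned above updates a synaptic weight from the relative timing of pre- and postsynaptic spikes; a minimal sketch with illustrative constants (the A+, A−, and τ values are assumptions, not taken from the cited works):

```python
import math

def stdp_dw(dt, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Pair-based STDP weight update: potentiate when the presynaptic spike
    precedes the postsynaptic one (dt = t_post - t_pre > 0), depress
    otherwise. Parameters are illustrative, not values from the paper."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)

print(stdp_dw(5.0), stdp_dw(-5.0))  # potentiation, then depression
```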
2 Literature Survey
The previous research has presented a wide variety of model types for neuromorphic systems, including biologically inspired [13] and artificial neural network models such as the Hodgkin-Huxley model, integrate-and-fire model, resonate-and-fire model, quadratic integrate-and-fire model, Izhikevich model, FitzHugh-Nagumo model, Hindmarsh-Rose model, Morris-Lecar model, and Wilson model. The behavior and characteristics of the biological neuron are to be linked with the artificial neural model. Some of these models mimic the charge accumulation and firing of neurons. A few other models are based on non-biological principles or structures which do not have neuroscience behaviors. The models are classified as biologically plausible and biologically inspired: the former behave like biological neural systems, while the latter replicate the behavior of biological neural systems. Others may be plain neuron models including axons, dendrites, or glial cells. The McCulloch-Pitts model defines the neural network model for a spiking network which integrates the neuron inputs and fires. For the implementation of the algorithm in hardware, it replicates the cell membrane dynamics, ion channel dynamics, the neuron, delay components (axonal
models), and pre- and post-synaptic neurons (dendritic models) [14]. The biomedical circuits use small-signal neural models for diagnosing various pathologies [15]. In the literature, several spiking neural network models have been implemented [16, 17]. Feed-forward neural networks, including the multilayer perceptron, are used for the implementation of neural spike detectors with several learning methods. The learning methods are classified as supervised and unsupervised [18–20]. The implementation of a machine learning approach in hardware is a challenging task. Low-priced embedded systems are less efficient but suitable for portable devices. DSP architectures have structures specific to their purposes. The implementation is efficient only if the hardware supports higher computation blocks, the necessary memory, and faster communication. Biomedical applications have three main blocks, namely preprocessing, feature extraction, and classification. For example, in an image processing application, the input image is processed block by block using machine learning approaches. Wang et al. [21] presented a practical solution by
processing the input image using convolutional neural networks (CNNs). For implementations, integrated (VLSI) architectures were used for color-based applications. Complementary metal-oxide-semiconductor (CMOS) technology was used in dedicated chip structures. Models were used to evaluate the performance of the learning algorithm and the network in both the forward and reverse directions. Combining advanced technology in memory design, computing methods, and communication for a neuron, a design was made by Seo et al. [22]. The SRAM provides better inter-neuron communication among 256 neurons and 64 K binary synapses. The implementation was done in 45 nm SOI-CMOS. The network was extended for a neuromorphic processor design by Seo and Seok [23]. Cao et al. [24] designed spike-based hardware which was energy efficient: a deep CNN was converted into an SNN and found to be two orders of magnitude more energy efficient. The implementation was done in an FPGA. Memristor-based spiking neuromorphic networks can demonstrate biologically plausible spike-time-dependent plasticity (STDP) windows in integrated metal-oxide memristors [25]. Budinski et al. [26] introduced a Newton-type modification of the temporal Hebbian rule-based learning algorithm of a self-learning spiking neural network. The mapping of the NN to a spike system follows certain steps. Diehl et al. [27, 28] mapped an RNN onto a substrate of spiking neurons. Real-world systems need a practical spiking neuromorphic engine (SNE), which is time based [13]. In applications like pattern recognition, a parallel NN architecture is required to perform the recognition or detection of patterns. Wang et al. [29] implemented large-scale neural network hardware based on the neural engineering framework (NEF) in field programmable gate arrays (FPGAs). A reconfigurable mixed-signal spiking
neuromorphic architecture chip was developed by Luo et al. [30] with multichip communication. For synaptic storage and computing, that chip was designed with 256 × 256 static random-access memory (SRAM) cells, 256 × 256 content addressable memory (CAM) cells, 2 × 256 synapses, and 256 neurons. The integrated neuron implements spiking frequency adaptation and provides communication through the address event representation (AER) protocol. Pani et al. [31] presented a modular and efficient FPGA design of an in silico spiking neural network using the Izhikevich model. The Xilinx Virtex 6-based device uses 1440 neurons.
3 Background Methodology
Spiking neural networks (SNNs) are fast, accurate, and computationally powerful. SNN models accurately model the nervous system, and other machine learning algorithms can be incorporated efficiently. This makes the architecture better than conventional artificial neural network (ANN) architectures. In the existing work by Farsa et al. [36], a machine learning approach toward spiking neuromorphic computing using the LIF model for neural computing was taken. The reported investigation shows the trade-off between computational complexity and biological accuracy. A spiking neural network with LIF neurons for pattern recognition is depicted in Fig. 1. The network consists of 25 dummy neurons in the input layer, 5 LIF neurons in the hidden layer, and 1 LIF neuron in the output layer.
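The LIF behavior underlying the network of Fig. 1 can be sketched in discrete time; the threshold, leak factor, and weights below are illustrative, not the parameters used by Farsa et al.:

```python
def lif_neuron(inputs, weights, v_th=1.0, leak=0.9, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron. Each step, the
    membrane potential leaks, integrates the weighted input spikes, and
    emits an output spike (with reset) on crossing v_th."""
    v = 0.0
    out = []
    for spikes in inputs:                  # spikes: tuple of 0/1 inputs
        v = leak * v + sum(w * s for w, s in zip(weights, spikes))
        if v >= v_th:
            out.append(1)
            v = v_reset                    # reset after firing
        else:
            out.append(0)
    return out

# Two input channels; the neuron fires once enough weighted spikes accumulate
train = [(1, 0), (1, 1), (0, 0), (1, 1)]
print(lif_neuron(train, weights=[0.4, 0.3]))
```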
The neuromorphic system architecture presented in this literature for implementing the SNN model based on the NCHU is presented in Fig. 2. Here, the input provider unit supplies the inputs, and the control unit controls all units with proper timing, while the synaptic weights are stored in memory. The hardware design of the SNN model in RTL is shown in Fig. 3, which consists of the inputs, six NCHUs, a single SRAM block, and the output spikes. The NCHUs were implemented with a sequential, pipelined design, whose values are stored in the SRAM. The SRAM was designed using registers to save the ongoing data between two consecutive layers in the network. The NCHU executes every task based on the timing signal, and the results are stored in the SRAM.
The implementation was carried out in Xilinx for FPGA. Several parameters, such as LUT count, power dissipation, and delay, were measured and tabulated. The results are shown in Tables 1, 2, and 3 for the various blocks. For the implementation, Cyclone II, Cyclone III, Stratix II, and Stratix III devices were used. Cyclone II provided less power dissipation than the other kits, even with a smaller delay. The LUTs occupied by Stratix III for the NCHU block are more, which shows its larger area. When the SRAM is considered, similar performance is observed.
Table 1 Parameter analysis of NCHU in SNN

FPGA family          Device         LUT available  LUT used  Utilization (%)  Core dynamic (mW)  Core static (mW)  I/O thermal (mW)  Total thermal (mW)  Delay (ns)
Cyclone II (90 nm)   EP2C5F256C6    4608           133       3                13.45              18.11             49.17             80.72               6.747
Cyclone III (65 nm)  EP3C5F256C6    5136           133       3                12.21              46.17             34.53             92.91               5.627
Stratix II           EP2S15F484C3   12,480         134       <1               35.24              305.97            272.14            613.35              5.218
Stratix III          EP3SL50F484C2  38,000         134       <1               19.22              370.71            159.31            549.24              5.129
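As a quick consistency check on Table 1, the total thermal power column should be the sum of the three component powers; the values below are transcribed from the table (agreement to within rounding):

```python
# (core dynamic, core static, I/O thermal, total thermal) in mW, from Table 1
rows = {
    "Cyclone II": (13.45, 18.11, 49.17, 80.72),
    "Cyclone III": (12.21, 46.17, 34.53, 92.91),
    "Stratix II": (35.24, 305.97, 272.14, 613.35),
    "Stratix III": (19.22, 370.71, 159.31, 549.24),
}
for name, (dyn, stat, io, total) in rows.items():
    assert abs(dyn + stat + io - total) < 0.02, name
print("component powers sum to the totals")
```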
The SRAM performance for the various FPGA kits is presented in Table 2. From the table, it can be observed that the static power varies much more than the dynamic power. The delay varies by 20%, and the LUT count remains the same. The lower 65 nm technology consumes less dynamic power than the 90 nm technology, although its static power is higher. The same analysis holds for the SNN block, in which the dynamic power is reduced in the lower technology with nominal delay.
The SNN performance for the various FPGA kits is presented in Table 3. From the table, it can be observed that the static power varies much more than the dynamic power. In addition, the lower 65 nm technology consumes more total power than the 90 nm technology.
5 Conclusion
This paper presents a detailed analysis and implementation of a spiking neural network in a field programmable gate array. Several architectures were reviewed, and a comparative analysis was made. The design is suitable for biomedical applications, where efficiency in diagnosis is not compromised. Details of the architecture and its implementation are reported. For implementation, the Quartus tool with Cyclone and Stratix kits was used in 90 nm and 65 nm technologies, and the results are presented. The analysis of the neurocomputing hardware unit (NCHU), SRAM, and spiking neural network unit with respect to power, delay, and LUT count is done. From the analysis, it has been found that the power is higher when the delay is less, and vice versa. The choice can be made based on the requirements of the medical application system.
References
1. S. Roy, A. Banerjee, A. Basu, Liquid state machine with dendritically enhanced readout for
low-power, neuromorphic VLSI implementations. IEEE Trans. Biomed. Circuits Syst. 8(5),
681–695 (2014). https://doi.org/10.1109/TBCAS.2014.2362969
2. B. Deng, M. Zhang, F. Su, J. Wang, X. Wei, B. Shan, The implementation of feedforward
network on field programmable gate array. in IEEE 2014 7th International Conference on
Biomedical Engineering and Informatics (BMEI) (2014), pp. 483–487
3. P. Dondon, J. Carvalho, R. Gardere, P. Lahalle, G. Tsenov, V. Mladenov, Implementation of
a feed-forward artificial neural network in vhdl on fpga. in IEEE 2014 12th Symposium on
Neural Network Applications in Electrical Engineering (NEUREL) (2014), pp. 37–40
4. H. Mostafa, A. Khiat, A. Serb, C.G. Mayr, G. Indiveri, T. Prodromakis, Implementation of a
spike-based perceptron learning rule using TiO2−x memristors. Front. Neurosci. 9, 357 (2015)
5. G.-M. Lozito, A. Laudani, F.R. Fulginei, A. Salvini, Fpga implementations of feed forward
neural network by using floating point hardware accelerators. Adv. Electr. Electron. Eng. 12(1),
30 (2014)
6. A. Perez-Garcia, G. Tornez-Xavier, L. Flores-Nava, F. Gomez- Castaneda, J. Moreno-Cadenas,
Multilayer perceptron network with integrated training algorithm in fpga. in IEEE 2014
11th International Conference on Electrical Engineering, Computing Science and Automatic
Control (CCE) (2014), pp. 1–6
7. R. Hasan, T.M. Taha, Enabling back propagation training of memristor crossbar neuromorphic
processors. in IEEE 2014 International Joint Conference on Neural Network (IJCNN) (2014),
pp. 21–28
8. F. Castanos, A. Franci, The transition between tonic spiking and bursting in a six-transistor
neuromorphic device. in 2015 12th International Conference on Electrical Engineering,
Computing Science and Automatic Control (CCE), IEEE (2015), pp. 1–6
9. F.L.M. Huayaney, H. Tanaka, T. Matsuo, T. Morie, K. Aihara, A VLSI spiking neural network
with symmetric STDP and associative memory operation. Int. Conf. Neural Inf. Process. 381–
388 (2011). https://doi.org/10.1007/978-3-642-24965-5_43.
10. M. Nouri, M. Jalilian, M. Hayati, D. Abbott, A digital neuromorphic realization of pair-based
and triplet-based spike-timing-dependent synaptic plasticity. IEEE Trans. Circuits Syst. II
Express Briefs 65(6), 804–808 (2018). https://doi.org/10.1109/TCSII.2017.2750214
11. D. Yamashita, K. Saeki, Y. Sekine, IC implementation of spike-timing-dependent synaptic
plasticity model using low capacitance value. in 2014 IEEE Asia Pacific Conference on Circuits
and Systems (APCCAS), Ishigaki (2014), pp. 221–224. https://doi.org/10.1109/APCCAS.2014.
7032759
12. H. Hsieh, K. Tang, Hardware friendly probabilistic spiking neural network with long-term and
short-term plasticity. IEEE Trans. Neural Netw. Learn. Syst. 24(12), 2063–2074 (2013). https://
doi.org/10.1109/TNNLS.2013.2271644
13. T. Liu, W. Wen, A fast and ultra low power time-based spiking neuromorphic architecture for
embedded applications. in 2017 18th International Symposium on Quality Electronic Design
(ISQED), Santa Clara, CA (2017), pp. 19–22. https://doi.org/10.1109/ISQED.2017.7918286
14. E.M. Izhikevich, Which model to use for cortical spiking neurons? IEEE Trans. Neural Netw.
15(5), 1063–1070 (2004)
15. A. Basu, Small-signal neural models and their applications. IEEE Trans. Biomed. Circ. Syst.
6(1), 64–75 (2012)
16. F. Grassia, T. Levi, T. Kohno, S. Saighi, Silicon neuron: digital hardware implementation of
the quartic model. Artif. Life Robot. 19(3), 215–219 (2014)
17. S. Hashimoto, H. Torikai, A novel hybrid spiking neuron: bifurcations, responses, and on-chip
learning. IEEE Trans. Circ. Syst. I: Regul. Pap. 57(8), 2168–2181 (2010)
18. M. Hu, H. Li, Y. Chen, Q. Wu, G.S. Rose, R.W. Linderman, Memristor crossbar-based neuro-
morphic computing system: a case study. IEEE Trans. Neural Netw. Learn. Syst. 25(10),
1864–1878 (2014)
792 K. Venkateswara Reddy and N. Balaji
36. E.Z. Farsa, A. Ahmadi, M.A. Maleki, M. Gholami, H.N. Rad, A low-cost high-speed neuro-
morphic hardware based on spiking neural network. IEEE Trans. Circ. Syst. II Express Briefs
66(9), 1582–1586 (2019). https://doi.org/10.1109/TCSII.2019.2890846
IoT-Based Energy Saving
Recommendations by Classification
of Energy Consumption Using Machine
Learning Techniques
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 795
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6_73
796 G. S. N. Dhipti et al.
1 Introduction
1.1 Concepts
The revolution in the field of electricity can be partitioned into four stages. During the first transformation, new sources of energy were discovered for the functioning of machines; the enormous extraction of coal and the creation of steam engines were the outstanding advances of this stage [1]. The subsequent stage, characterized by heavy engineering and electric power, was a time of vast development in trade. The third upheaval introduced personal computers and the initial stage of communication technologies, which enabled the automation of supply chains [2].
An immense range of current technologies, such as communication systems, intelligent robots, the Internet of things, AI, and ML, belongs to the fourth industrial revolution [3, 4]. The Internet of things is a combination of interrelated computing devices, mechanical and digital machines, or individuals that have unique identifiers and the ability to transfer information over a network without requiring human-to-human or human-to-computer interaction. IoT has the potential to upgrade efficiency in various regions, including health services, smart cities, manufacturing, agribusiness, irrigation, and the energy sector [5].
1.2 Motivation
Energy efficiency offers financial rewards in the long run by diminishing the cost of fuel imports/supply and of energy generation, and by decreasing emissions from the energy sector. For improving energy efficiency and achieving better energy management, effective examination of the continuous data in the energy supply chain plays a key part [6]. The Internet of things uses sensors and communication technologies for sensing and transmitting real-time data, which enables quick estimations and precise interventions [7]. In this proposal, we consider the application of the Internet of things at all stages of the energy flow.
Here, the proposal aims to identify the expected contribution of the Internet of things to the effective usage of energy, the reduction of energy demand, and the expansion of the share of non-conventional sources of electric energy. The flow of energy supply is shown in Fig. 1.
2 Internet of Things
The Internet of things is an emerging technology that exploits the Web and aims to provide network connectivity among physical devices or "things" [8]. These physical devices include home appliances and industrial equipment. Taking advantage of suitable sensors and communication networks, such equipment can receive important instructions and provide authorized supervision for individuals. The major components of IoT are shown in Fig. 2: data are collected from smart devices, processed according to the relevant protocols, and taken to the cloud for analysis, so as to develop a well-defined system for efficient energy usage.
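The Fig. 2 flow (collection from smart devices, protocol handling, cloud-side analysis) can be sketched as follows; the device name, message fields, and value ranges here are hypothetical assumptions, not part of the paper:

```python
# A minimal sketch of the Fig. 2 pipeline: smart devices produce readings,
# a gateway packages them per a simple protocol, and a cloud-side step
# analyses the batch. All names and thresholds are illustrative.

import json
import random
import statistics

def read_sensors():
    """Stand-in for data collection from smart devices."""
    return {"temp_C": round(random.uniform(18, 30), 1),
            "humidity_pct": round(random.uniform(30, 60), 1)}

def to_message(reading, device_id="meter-01"):
    """Gateway step: wrap a reading in a JSON protocol envelope."""
    return json.dumps({"device": device_id, "payload": reading})

def analyse(messages):
    """Cloud step: aggregate the batch for energy-management decisions."""
    temps = [json.loads(m)["payload"]["temp_C"] for m in messages]
    return {"n": len(temps), "mean_temp_C": statistics.mean(temps)}

batch = [to_message(read_sensors()) for _ in range(10)]
print(analyse(batch))
```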
3 Implementing Technologies
Sensing components play the critical role in the Internet of things [9]. They are used to gather and transfer information in real time; they improve effectiveness and usefulness, and play a basic part in the success of IoT [10, 11]. A temperature sensor is utilized for identifying fluctuations when heating or cooling a system [12]. Temperature is a significant quantity for electrical energy usage: temperature sensors are utilized to maximize the performance of a framework when the temperature deviates from normal operating conditions. A humidity sensor is utilized to recognize dampness and moisture in the atmosphere. The ratio of the moisture present in the air to the greatest amount of moisture the air can hold at a specific temperature is called relative humidity [13].
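The definition of relative humidity above can be expressed as a short computation; the saturation values used here are rough illustrative figures, not reference data:

```python
# Relative humidity as defined above: moisture present divided by the
# maximum moisture the air can hold at that temperature, as a percentage.
# The saturation table (g/m^3) is approximate and for illustration only.

SATURATION_G_PER_M3 = {10: 9.4, 20: 17.3, 30: 30.4}

def relative_humidity(moisture_g_per_m3, temp_c):
    """Percent RH, using the nearest tabulated temperature."""
    nearest = min(SATURATION_G_PER_M3, key=lambda t: abs(t - temp_c))
    return 100.0 * moisture_g_per_m3 / SATURATION_G_PER_M3[nearest]

print(round(relative_humidity(8.65, 20), 1))  # half of ~17.3 g/m^3 -> ~50%
```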
3.2 Actuators
Processing IoT information is a difficult problem, since IoT data, known as big data, comprise a tremendous amount of structured and unstructured information created by sensors, programs, and smart or intelligent gadgets. Because of the characteristics of big data, the 3 V's (volume, velocity, and variety) [18], they should be productively handled and investigated [19]. The various steps involved in the evaluation are shown in Fig. 3.
IoT can assume a pivotal part in lessening energy losses and bringing down the release of carbon dioxide into the atmosphere [20]. IoT-based management of electrical energy can monitor real-time electric energy utilization and raise the degree of awareness of electrical energy usage at any level of the energy flow [21, 22]. IoT advances make it possible to monitor every device in a city: buildings, metropolitan systems, energy organizations, and utilities can be fitted with sensing devices. These connections can guarantee an energy-efficient smart city through the continuous checking of information accumulated from the sensors. The cooperative effect of the smart grid is displayed in Fig. 4: in an automated city equipped with Internet of things networks, the various segments of the city can be connected together [23].
The energy utilization in urban communities can be partitioned into two parts: private (domestic) structures and business (services) buildings. Domestic energy utilization in the private sector incorporates lighting, heating, cooling, and ventilation (Fig. 5). Electrical energy consumption commonly represents 50% of the consumption in commercial constructions; consequently, supervision of the heating, ventilation, and air-conditioning system is substantial in lowering the usage. IoT devices can be considered a noteworthy means to lessen the wastage of electrical energy. Using indoor thermostats that sense occupancy, unoccupied regions can be listed out; when an unused region is identified, several possible steps can be taken to bring down energy utilization.
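The occupancy-based setback idea above can be sketched as follows; the zone names and temperature setpoints are hypothetical assumptions:

```python
# Hedged sketch of the occupancy idea: if thermostat/occupancy sensors
# report a zone unoccupied, list it and apply a temperature setback.
# Zone names and setpoints are illustrative, not from the paper.

def unoccupied_zones(readings):
    """readings: {zone: occupied_bool} from connected thermostats."""
    return [zone for zone, occupied in readings.items() if not occupied]

def setback_plan(readings, comfort_c=22.0, setback_c=18.0):
    """Lower the setpoint only in zones detected as unused."""
    idle = set(unoccupied_zones(readings))
    return {z: (setback_c if z in idle else comfort_c) for z in readings}

readings = {"lobby": True, "conference": False, "storage": False}
print(setback_plan(readings))
```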
5 Results
Table 2 Summary statistics of the attributes (Appliances, lights, T1, RH_1, T2, RH_2, etc.)
Variable vars n mean sd median trimmed mad min max range skew kurtosis se
Date 1 19,735 NaN NA NA NaN NA Inf −Inf −Inf NA NA NA
Appliances 2 19,735 98 103 60 73 30 10 1080 1070 3 14 1
Lights 3 19,735 4 8 0 2 0 0 70 70 2 4 0
T1 4 19,735 22 2 22 22 1 17 26 9 0 0 0
RH_1 5 19,735 40 4 40 40 4 27 63 36 0 0 0
T2 6 19,735 20 2 20 20 2 16 30 14 1 1 0
RH_2 7 19,735 40 4 40 41 4 20 56 36 0 1 0
T3 8 19,735 22 2 22 22 2 17 29 12 0 0 0
RH_3 9 19,735 39 3 39 39 3 29 50 21 0 −1 0
T4 10 19,735 21 2 21 21 2 15 26 11 0 0 0
RH_4 11 19,735 39 4 38 39 5 28 51 23 0 −1 0
T5 12 19,735 20 2 19 19 2 15 26 10 1 0 0
RH_5 13 19,735 51 9 49 50 6 30 96 67 2 5 0
T6 14 19,735 8 6 7 8 6 −6 28 34 1 0 0
RH_6 15 19,735 55 31 55 56 40 1 100 99 0 −1 0
T7 16 19,735 20 2 20 20 2 15 26 11 0 0 0
RH_7 17 19,735 35 5 35 35 5 23 51 28 0 −1 0
T8 18 19,735 22 2 22 22 2 16 27 11 0 0 0
RH_8 19 19,735 43 5 42 43 5 30 59 29 0 0 0
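Statistics of the kind listed in Table 2 (n, mean, sd, median, min, max, range, skew) can be reproduced along the following lines; the sample values below are illustrative, not rows of the actual 19,735-record dataset:

```python
# Compute Table-2-style per-attribute summary statistics with the
# standard library. The sample consumption values are made up.

import statistics

def describe(name, values):
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    med = statistics.median(values)
    lo, hi = min(values), max(values)
    # Simple moment estimator of skewness.
    skew = sum((v - mean) ** 3 for v in values) / (n * sd ** 3)
    return {"var": name, "n": n, "mean": round(mean, 2), "sd": round(sd, 2),
            "median": med, "min": lo, "max": hi, "range": hi - lo,
            "skew": round(skew, 2)}

appliances_wh = [60, 50, 60, 70, 430, 250, 100, 90, 60, 50]  # example only
print(describe("Appliances", appliances_wh))
```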
6 Conclusion
The proposed RF model tends to achieve better results with a lower number of energy levels when contrasted with the ordinary technique. Rather than performing regression-based load forecasting as in the regular technique, the developed classifier pre-processed the numeric-valued data into levels and then predicted them using a more straightforward classification measure. Both classifiers perform better with a lower number of energy levels.
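The pre-processing step described in the conclusion, binning numeric consumption into discrete levels before classification, can be sketched as follows; the bin edges and level names are assumptions for illustration, and the paper's actual levels may differ:

```python
# Sketch of the discretization step: numeric consumption readings are
# mapped to discrete energy levels, which then serve as class labels
# for a random forest classifier instead of raw regression targets.

import bisect

LEVEL_EDGES = [50, 100, 200]  # Wh thresholds -> 4 levels (illustrative)
LEVEL_NAMES = ["low", "medium", "high", "very high"]

def to_level(consumption_wh):
    """Map a numeric reading to a discrete class label."""
    return LEVEL_NAMES[bisect.bisect_right(LEVEL_EDGES, consumption_wh)]

readings = [30, 60, 150, 420]
print([to_level(r) for r in readings])  # ['low', 'medium', 'high', 'very high']
```

With fewer, coarser levels, each class contains more examples, which is consistent with the observation that both classifiers perform better as the number of energy levels decreases.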
References
1. P.N. Stearns, Reconceptualizing the industrial revolution. J. Interdiscip. Hist. 42, 442–443
(2011)
2. M. Jensen, The modern industrial revolution, exit, and the failure of internal control systems.
J. Financ. 48, 831–880 (1993)
3. H. Kagermann, J. Helbig, A. Hellinger, W. Wahlster, Recommendations for Implementing the
Strategic Initiative Industrie 4.0: Securing the Future of German Manufacturing Industry.
Final Report of the Industrie 4.0 Working Group (Forschungsunion, Frankfurt/Main, Germany,
2013)
4. S.K. Datta, C. Bonnet, MEC and IoT based automatic agent reconfiguration in industry
4.0. in Proceedings of the 2018 IEEE International Conference on Advanced Networks and
Telecommunications Systems (ANTS), Indore, India, 16–19 December 2018, pp. 1–5
5. G.S. Naveen Kumar, V.S.K. Reddy, Detection of shot boundaries and extraction of key frames
for video retrieval. Int. J. Know.-Based Intell. Eng. Syst. 24(1), 11–17
6. Y.S. Tan, Y.T. Ng, J.S.C. Low, Internet-of-things enabled real-time monitoring of energy
efficiency on manufacturing shop floors. Procedia CIRP 61, 376–381 (2017)
7. K. Tamilselvan, P. Thangaraj, Pods—a novel intelligent energy efficient and dynamic
frequency scalings for multi-core embedded architectures in an IoT environment. Microprocess.
Microsyst. 72, 102907 (2020)
8. K. Haseeb, A. Almogren, N. Islam, I. Ud Din, Z. Jan, An energy-efficient and secure routing
protocol for intrusion avoidance in IoT-based WSN. Energies 12, 4174 (2019)
9. S.D.T. Kelly, N.K. Suryadevara, S.C. Mukhopadhyay, Towards the implementation of IoT for
environmental condition monitoring in homes. IEEE Sens. J. 13, 3846–3853 (2013)
10. E. Venkateswara Reddy, M. Ramesh, M. Jane, A comparative study of clustering techniques
for big data sets using Apache Mahout. in 3rd IEEE International Conference on Smart City
and Big Data 2016, Sultanate of Oman, April 2016
11. G. Di Francia, The development of sensor applications in the sectors of energy and environment
in Italy, 1976–2015. Sensors 17, 793 (2017)
12. ITFirmsCo. 8Types of Sensors that Coalesce Perfectly with an IoT App. (2018)
13. A.S. Morris, R. Langari, Level measurement. in A.S. Morris, R. Langari (eds) Measurement
and Instrumentation (2nd ed, Academic Press, Boston, MA, USA, 2016), pp. 531–545
14. V.R. Eluri, C. Ramesh, S.N. Dhipti, D. Sujatha, Analysis of MRI-based brain tumor detection
using RFCM clustering and SVM classifier. in International Conference on Soft Computing
and Signal Processing (ICSCSP-2018), Springer Series, June 22, 2018
15. J. Blanco, A. García, J. Morenas, Design and implementation of a wireless sensor and actuator
network to support the intelligent control of efficient energy usage. Sensors 18, 1892 (2018)
16. G.S. Naveen Kumar, V.S.K. Reddy, High-performance video retrieval based on spatio-temporal
features. in Microelectronics, electromagnetics and telecommunications (Springer, Singapore),
pp. 433–441
A
Aathukuri, Lokesh, 197
Abisheek, K., 505
Acharya, Dinesh U., 1
Adusumalli, Vyshnavi, 219
Agajyelew, Bekele Worku, 381
Agrawal, Shruti, 597
Ahuja, Akansha, 119
Ajay, K. D. K., 447
Akuri, Sree Rama Chandra Murthy, 197
Amuru, Deepthi, 663
Ankam, Praveen, 83
Arthi, K., 145

B
Bachche, Ruturaj, 295
Balaji, N., 781
Bale, Mahesh Babu, 219
Bansal, Pratosh, 175
Barlapudi, Mounika, 655
Berin, T. Grace, 733
Bharathi, B., 305
Bhatt, Nirav, 429
Bhave, Sameer, 175
Bhise, Pratibha R., 371
Bichave, Aditya, 295
Bodkhe, Aryak, 535
Boggavarapu, Venkata Bharath Krishna, 219
Bopidi, Srikanth, 769

C
Challa, Archana, 219
Chandrakala, M., 553
Chavan, Pundalik, 11
Chopra, Ashish, 597
Chourasia, Bharti, 643

D
Das, Arunava, 187
Datta, Bingi Sai, 751
Deepak, N. R., 11
Deore, Siddhesh, 295
Dey, Ranadeep, 107
Dhipti, G. Siva Naga, 795
Dileep, P., 257
Dimmita, Nandini, 197
Dinesh Kumar, R., 505
Durani, Homera, 429
Durga Devi, P., 553
Durgalaxmi, Kavali, 751
Durgam, Rajesh, 725
Durgam, Thirupathi, 705

F
Firdausi, Tauseef Jamal, 187

G
Gaikwad, Vinayak, 535
Ganatra, Amit, 35
Geetha, M., 1, 165
Giri Prasad, M. N., 457
Gottumukkala, V. S. S. P. Raju, 239
Goyal, Shimpy, 49
Gupta, Sudhanshu, 535
© The Editor(s) (if applicable) and The Author(s), under exclusive license 809
to Springer Nature Singapore Pte Ltd. 2022
V. S. Reddy et al. (eds.), Soft Computing and Signal Processing, Advances in Intelligent
Systems and Computing 1413, https://doi.org/10.1007/978-981-16-7088-6
810 Author Index
M
Madesh, M., 505
Madhan, E. S., 393
Madhura Prabha, R., 315
Mahammad, Eliyaz, 673

R
Raghavendran, Ch. V., 479
Raj, N., 715, 725
Raju, Rollakanti, 521
S
Sadiwala, Ritesh, 359, 705
Sandeep, M., 251
Sannareddy, Varshitha, 655
Sasikala, S., 315
Sathvik, Pulluri, 761
Satyanarayana, A. N., 545
Satyanarayana, D., 457
Seelam, Nagarjuna Reddy, 655
Sekhar, V. Chandra, 239
Selvaraj, Navaneethan, 393
Shah, Ronak, 535
Shankar, Vadthyavath, 633
Shastry, Nikhil S., 23
Shekar, B. H., 621
Shetty, Roopashri, 1
Shreya, K., 571
Shyamala, G., 1
Singh, Amritanshu Kumar, 145
Singh, Rajiv, 49
Siva Naga Dhipti, G., 405
Somanaidu, U., 521
Sreekanth, Nara, 283
Sreelekha, A., 257
Sreenivasu, Morukurthi, 381
Sree, Pokkuluri Kiran, 263
Srinivas Reddy, G., 521
Srinivas, T., 585
Sriram, K. G., 205

U
Uma Devi, G., 545
Upadhya, K. Jyothi, 165

V
Vaibhav, Boga, 761
Vanitha, K., 457
Velakanti, Gouthami, 83
Venkataram, Pallapa, 585
Venkateswara Reddy, E., 405
Venkateswara Reddy, K., 781
Vijayakamal, M., 337
Vijayalakshmi, V., 491
Vijay, T. K., 381
Vishal, B., 205
Vuduthuri, Gali Reddy, 655
Vuppu, Shankar, 83

W
Walia, Harnehmat, 59

Y
Yaduvanshi, Rajveer Singh, 441
Yeshwanth, G. Srinivasa, 571
Yeshwanth, K., 673