Project Final Document
INTRODUCTION
Recent advances in technology have led to the introduction of cyber-physical systems, whose improved computational and communication abilities and tight integration of physical and cyber components have enabled significant advances in many dynamic applications. This improvement, however, comes at the cost of vulnerability to cyber-attacks. Cyber-physical systems are made up of logical elements and embedded computers that communicate over channels such as the Internet of Things (IoT). More specifically, these systems include digital or cyber components, analog components, physical devices, and humans that are designed to operate across the physical and cyber parts. In other words, a cyber-physical system is any system that includes cyber components, physical components, and humans, and that can exchange information between the physical and cyber parts. The security of such systems becomes more important because of the addition of the physical part.
Physical components, including sensors that receive data from the physical environment, may be attacked and have incorrect data injected into the system. One of the most important challenges in the physical part of a cyber-physical system is the presence of a large number of sensors in the environment, which collect data of great volume and variety at high speed. The connection between the sensors, the necessary computations, and the analysis of the collected data are also among the main challenges. Therefore, one of the most important capabilities of a cyber-physical system is to communicate between these sensors, perform computation, and control the system.
LITERATURE SURVEY
[1] Kwon, Cheolhyeon, Weiyi Liu, and Inseok Hwang. ”Security analysis for cyber-physical
systems against stealthy deception attacks.” In 2013 American control conference, IEEE
(2013): 3344-3349
The security issue in the state estimation problem is investigated for a networked control system
(NCS). The communication channels between the sensors and the remote estimator in the NCS are
vulnerable to attacks from malicious adversaries. The false data injection attacks are considered.
The aim of this work is to find the so-called insecurity conditions under which the estimation
system is insecure in the sense that there exist malicious attacks that can bypass the anomaly
detector but still lead to unbounded estimation errors. In particular, a new necessary and sufficient
condition for the insecurity is derived in the case that all communication channels are
compromised by the adversary. Moreover, a specific algorithm is proposed for generating attacks
with which the estimation system is insecure. Furthermore, for the insecure system, a system
protection scheme through which only a few (rather than all) communication channels require
protection against false data injection attacks is proposed. A simulation example is utilized to
demonstrate the effectiveness of the proposed conditions/algorithms in the secure estimation
problem for a flight vehicle.
[2] Pajic, Miroslav, James Weimer, Nicola Bezzo, Oleg Sokolsky, George J. Pappas, and
Insup Lee. ”Design and implementation of attack-resilient cyberphysical systems: With a
focus on attack-resilient state estimators.” IEEE Control Systems Magazine 37, no. 2 (2017):
66-81.
Recent years have witnessed a significant increase in the number of security-related incidents in
control systems. These include high-profile attacks in a wide range of application domains, from
attacks on critical infrastructure, as in the case of the Maroochy Water breach [1], and industrial
systems (such as the StuxNet virus attack on an industrial supervisory control and data acquisition
system [2], [3] and the German Steel Mill cyberattack [4], [5]), to attacks on modern vehicles [6]-
[8]. Even high-assurance military systems were shown to be vulnerable to attacks, as illustrated in
the highly publicized downing of the RQ-170 Sentinel U.S. drone [9]-[11]. These incidents have
greatly raised awareness of the need for security in cyberphysical systems (CPSs), which feature
tight coupling of computation and communication substrates with sensing and actuation
components. However, the complexity and heterogeneity of this next generation of safety-critical,
networked, and embedded control systems have challenged the existing design methods in which
security is usually considered an afterthought.
Embedded computational resources in autonomous robotic vehicles are becoming more abundant
and have enabled improved operational effectiveness of cooperative robotic systems in civilian
and military applications. Compared to autonomous robotic vehicles that perform single tasks, cooperative teamwork offers greater efficiency and operational capability. Multirobotic vehicle systems have many potential applications, such as platooning of vehicles in urban transportation, the operation of multiple robots, autonomous underwater vehicles, and formations of aircraft in military affairs [1–3]. The design of group behaviors for multirobot systems is the main objective of the work. Group cooperative behavior signifies that individuals in the group share a common objective and act according to the interest of the whole group. Group cooperation can be efficient if individuals in the group coordinate their actions well. Each individual can coordinate with other individuals in the group to facilitate group cooperative behavior in two ways, namely local coordination and global coordination. For local coordination, individuals react only to other individuals that are close, such as fish swimming in a school.
In this work, we consider the problem of reaching a consensus among all the agents in the
networked control systems (NCS) in the presence of misbehaving agents. A reputation-based
resilient distributed control algorithm is first proposed for the leader-follower consensus network.
The proposed algorithm embeds a resilience mechanism that includes four phases (detection,
mitigation, identification, and update), into the control process in a distributed manner. At each
phase, every agent only uses local and one-hop neighbors' information to identify and isolate the
misbehaving agents, and even compensate for their effect on the system. We then extend the proposed
algorithm to the leaderless consensus network by introducing and adding two recovery schemes
(rollback and excitation recovery) into the current framework to guarantee the accurate
convergence of the well-behaving agents in NCS. The effectiveness of the proposed method is
demonstrated through case studies in multirobot formation control and wireless sensor networks.
This work focuses on resilient control of networked control systems (NCSs) under denial-of-service (DoS) attacks characterized by a Markov process. Firstly, packet dropouts are modeled as a Markov process according to the game between attack strategies and defense strategies. Then, an NCS under such game outcomes is modeled as a Markovian jump linear system, and four theorems are proved for system stability analysis and controller design. Finally, a numerical example is used to illustrate the application of these theorems. Networked control systems have received increasing attention in the past decades and are now widely applied in industrial processes, electric power networks, intelligent transportation, and so on. With the growth of NCSs, the network, as a critical element of an NCS, is vulnerable to cyber-threats that can menace the control systems.
Existing Method:
In the existing system, implementing machine learning algorithms is somewhat complex due to the lack of data visualization. Manual mathematical calculations are used in the existing system for model building, which can take a lot of time and add complexity. To overcome this, we use the machine learning packages available in the scikit-learn library.
Disadvantages:
High complexity.
Time consuming.
Proposed System:
Several machine learning models have been proposed to classify whether there will be a cyber-attack or not, but none have adequately addressed this misclassification problem. Also, similar studies that have proposed models for evaluating such classification performance mostly do not consider the heterogeneity and size of the data. Therefore, we propose Support Vector Machine, Decision Tree, Random Forest, Extra Trees, AdaBoost, and neural network classification techniques.
Advantages:
Highest accuracy
Reduces time complexity.
Block Diagram
Architecture:
1. DECISION TREE:
A decision tree is a flowchart-like tree structure where an internal node represents a feature (or attribute), a branch represents a decision rule, and each leaf node represents the outcome. The topmost node in a decision tree is known as the root node. The tree learns to partition the data on the basis of attribute values, and it partitions recursively in a process called recursive partitioning. This flowchart-like structure helps in decision making, and its visualization, much like a flowchart diagram, mimics human-level thinking. That is why decision trees are easy to understand and interpret.
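As a concrete illustration, the sketch below fits a single scikit-learn decision tree on a labelled network-traffic dataset; the CSV file name and the assumption that all columns are already numeric are illustrative placeholders, not necessarily the exact dataset used in this project.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the labelled dataset (file name is a placeholder; columns assumed numeric/encoded)
df = pd.read_csv('cyber_dataset.csv')
x = df.iloc[:, :-1]                 # feature columns
y = df.iloc[:, -1]                  # class label (attack / normal)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# criterion='entropy' makes the splits use information gain
dt = DecisionTreeClassifier(criterion='entropy')
dt.fit(x_train, y_train)
print('Decision tree accuracy:', accuracy_score(y_test, dt.predict(x_test)))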
A single decision tree usually overfits the data it is learning from because it learns from only one pathway of decisions. As a result, a single decision tree usually does not make accurate predictions on new data.
Random forest models reduce the risk of overfitting by introducing randomness by:
splitting nodes on the best split among a random subset of the features selected at every node
Extra Trees is like Random Forest, in that it builds multiple trees and splits nodes using random
subsets of features, but with two key differences: it does not bootstrap observations (meaning it
samples without replacement), and nodes are split on random splits, not best splits. So, in summary,
ExtraTrees:
builds multiple trees with bootstrap = False by default, which means it samples without
replacement
nodes are split based on random splits among a random subset of the features selected at
every node
In Extra Trees, randomness doesn’t come from bootstrapping of data, but rather comes from the
random splits of all observations.
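As a quick sketch of this difference (reusing the x_train/x_test split from the decision-tree example above), the two ensembles differ only in the bootstrap and split-selection settings:

from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.metrics import accuracy_score

# Random Forest: bootstrap sampling + best split within a random feature subset
rf = RandomForestClassifier(n_estimators=100, bootstrap=True, random_state=42)
# Extra Trees: no bootstrap (whole sample) + random split within a random feature subset
et = ExtraTreesClassifier(n_estimators=100, bootstrap=False, random_state=42)

for name, clf in [('Random Forest', rf), ('Extra Trees', et)]:
    clf.fit(x_train, y_train)
    print(name, 'accuracy:', accuracy_score(y_test, clf.predict(x_test)))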
A random forest is a machine learning technique that’s used to solve regression and classification
problems. It utilizes ensemble learning, which is a technique that combines many classifiers to
provide solutions to complex problems.
A random forest algorithm consists of many decision trees. The ‘forest’ generated by the random
forest algorithm is trained through bagging or bootstrap aggregating. Bagging is an ensemble meta-
algorithm that improves the accuracy of machine learning algorithms.
The (random forest) algorithm establishes the outcome based on the predictions of the decision
trees. It predicts by taking the average or mean of the output from various trees. Increasing the
number of trees increases the precision of the outcome.
A random forest overcomes the limitations of a single decision tree. It reduces the overfitting of datasets and increases precision, and it generates predictions without requiring many configuration settings in packages (like scikit-learn).
Features of a Random Forest Algorithm:
It’s more accurate than the decision tree algorithm.
It provides an effective way of handling missing data.
It can produce a reasonable prediction without hyper-parameter tuning.
It solves the issue of over fitting in decision trees.
In every random forest tree, a subset of features is selected randomly at the node’s splitting
point.
Decision trees are the building blocks of a random forest algorithm. A decision tree is a decision
support technique that forms a tree-like structure. An overview of decision trees will help us
understand how random forest algorithms work.
A decision tree consists of three components: decision nodes, leaf nodes, and a root node. A
decision tree algorithm divides a training dataset into branches, which further segregate into other
branches. This sequence continues until a leaf node is attained. The leaf node cannot be segregated
further.
Information theory can provide more insight into how decision trees work. Entropy and
information gain are the building blocks of decision trees. An overview of these fundamental
concepts will improve our understanding of how decision trees are built.
Entropy is a metric for calculating uncertainty. Information gain is a measure of how uncertainty
in the target variable is reduced, given a set of independent variables.
The information gain concept involves using independent variables (features) to gain information
about a target variable (class). The entropy of the target variable (Y) and the conditional entropy of
Y (given X) are used to estimate the information gain. In this case, the conditional entropy is
subtracted from the entropy of Y.
Information gain is used in the training of decision trees; it helps in reducing uncertainty in these trees. A high information gain means that a high degree of uncertainty (information entropy) has been removed.
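To make these definitions concrete, the short sketch below computes H(Y), the conditional entropy H(Y|X), and the resulting information gain for a tiny made-up binary feature; the numbers are purely illustrative.

import numpy as np

def entropy(labels):
    # H = -sum(p * log2(p)) over the class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

y = np.array([1, 1, 1, 0, 0, 1, 0, 1])   # toy target variable Y
x = np.array([0, 0, 0, 1, 1, 1, 1, 0])   # toy candidate feature X

h_y = entropy(y)
h_y_given_x = sum((x == v).mean() * entropy(y[x == v]) for v in np.unique(x))
print('H(Y) =', round(h_y, 3), ' IG =', round(h_y - h_y_given_x, 3))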
The objective of the support vector machine (SVM) algorithm is to find a hyperplane in an N-dimensional space (N being the number of features) that distinctly classifies the data points. To separate the two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane that has the maximum margin, i.e., the maximum distance between data points of both classes.
Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side of the hyperplane can be attributed to different classes. The dimension of the hyperplane depends upon the number of features: if the number of input features is 2, the hyperplane is just a line; if the number of input features is 3, the hyperplane becomes a two-dimensional plane. It becomes difficult to imagine when the number of features exceeds 3.
Support Vectors
Support vectors are data points that are closer to the hyperplane and influence the position and orientation of the hyperplane. Using these support vectors, we maximize the margin of the classifier. Deleting the support vectors will change the position of the hyperplane. These are the points that help us build our SVM.
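A minimal scikit-learn sketch, reusing the earlier train/test split, shows how the fitted support vectors can be inspected; the linear kernel and C value are illustrative choices.

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

svc = SVC(kernel='linear', C=1.0)    # C balances margin width against misclassification penalties
svc.fit(x_train, y_train)
print('SVM accuracy:', accuracy_score(y_test, svc.predict(x_test)))
print('Support vectors per class:', svc.n_support_)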
In logistic regression, we take the output of the linear function and squash the value within the range [0,1] using the sigmoid function. If the squashed value is greater than a threshold value (0.5), we assign it the label 1; otherwise, we assign it the label 0. In SVM, we take the output of the linear function, and if that output is greater than 1, we identify it with one class; if the output is -1, we identify it with the other class. Since the threshold values are changed to 1 and -1 in SVM, we obtain this reinforcement range of values ([-1, 1]) which acts as the margin.
Hinge Loss Function
The hinge loss cost is 0 if the predicted value and the actual value are of the same sign; if they are not, we then calculate the loss value. We also add a regularization parameter to the cost function. The objective of the regularization parameter is to balance margin maximization and loss. After adding the regularization parameter, the cost function looks as below.
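Since the referenced equations are not reproduced in this document, the standard hinge-loss and regularized SVM cost expressions are written out here for reference (assuming labels y_i in {-1, +1} and a linear model w·x + b):

c(x_i, y_i, f(x_i)) = \max\bigl(0,\; 1 - y_i\,(w \cdot x_i + b)\bigr)

J(w, b) = \lambda \lVert w \rVert^{2} + \frac{1}{n}\sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w \cdot x_i + b)\bigr)

Here λ is the regularization parameter that balances margin maximization against the total hinge loss.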
Gradients
When there is no misclassification, i.e., our model correctly predicts the class of a data point, we only have to update the gradient from the regularization parameter. When there is a misclassification, i.e., our model makes a mistake on the prediction of the class of a data point, we include the loss along with the regularization parameter to perform the gradient update.
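Written out explicitly, the corresponding (sub)gradient updates for w and b, assuming a learning rate α, take the standard form:

w \leftarrow w - \alpha\,(2\lambda w) \qquad \text{if } y_i\,(w \cdot x_i + b) \ge 1 \text{ (no misclassification)}

w \leftarrow w + \alpha\,(y_i x_i - 2\lambda w), \qquad b \leftarrow b + \alpha\, y_i \qquad \text{otherwise (misclassification)}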
5. Neural Network:
An artificial neural network (ANN) is a piece of a computing system designed to simulate the
way the human brain analyzes and processes information. It is the foundation of artificial
intelligence (AI) and solves problems that would prove impossible or difficult by human or
statistical standards. ANNs have self-learning capabilities that enable them to produce better
results as more data becomes available.
An ANN has hundreds or thousands of artificial neurons called processing units, which are
interconnected by nodes. These processing units are made up of input and output units. The input
units receive various forms and structures of information based on an internal weighting system,
and the neural network attempts to learn about the information presented to produce one output
report. Just like humans need rules and guidelines to come up with a result or output, ANNs also
use a set of learning rules called backpropagation, an abbreviation for backward propagation of
error, to perfect their output results.
An ANN initially goes through a training phase where it learns to recognize patterns in
data, whether visually, aurally, or textually. During this supervised phase, the network
compares its actual output produced with what it was meant to produce—the desired
output. The difference between both outcomes is adjusted using backpropagation. This
means that the network works backward, going from the output unit to the input units to
adjust the weight of its connections between the units until the difference between the
actual and desired outcome produces the lowest possible error.
Whenever we increase the number of layers in our ANN, we obtain a deep neural network.
A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the
input and output layers. There are different types of neural networks but they always consist of the same
components: neurons, synapses, weights, biases, and functions.
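A minimal Keras sketch of such a deep feed-forward network is shown below; the layer sizes, epochs, and activation choices are illustrative assumptions rather than the exact architecture used in this project.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# 41 input features, matching the prediction form used later; layer sizes are illustrative
model = Sequential([
    Dense(64, activation='relu', input_shape=(41,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),      # anomaly (0) vs. normal (1)
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=64, validation_split=0.1)
model.save('neural_network.h5')          # file name later loaded by the Flask application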
Requirements analysis is a very critical process that enables the success of a system or software project to be assessed. Requirements are generally split into two types: functional and non-functional requirements.
Functional Requirements: These are the requirements that the end user specifically
demands as basic facilities that the system should offer. All these functionalities need to be
necessarily incorporated into the system as a part of the contract. These are represented or stated
in the form of input to be given to the system, the operation performed and the output expected.
They are basically the requirements stated by the user which one can see directly in the final
product, unlike the non-functional requirements.
Examples of functional requirements:
1) Authentication of user whenever he/she logs into the system
2) System shutdown in case of a cyber-attack
3) A verification email is sent to the user whenever he/she registers for the first time on the software system.
Non-functional requirements: These are basically the quality constraints that the
system must satisfy according to the project contract. The priority or extent to which these factors
are implemented varies from one project to other. They are also called non-behavioral
requirements.
They basically deal with issues like:
Portability
Security
Scalability
Performance
Flexibility
Examples of non-functional requirements:
1) Emails should be sent with a latency of no greater than 12 hours from such an activity.
2) The processing of each request should be done within 10 seconds.
3) The site should load within 3 seconds whenever the number of simultaneous users is > 10,000.
Hardware:
RAM : 8 GB
Software:
IDE : PyCharm.
Framework : Flask
SYSTEM DESIGN:
Input Design:
In an information system, input is the raw data that is processed to produce output. During the
input design, the developers must consider the input devices such as PC, MICR, OMR, etc.
Therefore, the quality of system input determines the quality of system output. Well-designed input forms and screens have the following properties:
They should serve a specific purpose effectively, such as storing, recording, and retrieving information.
All these objectives are achieved using knowledge of basic design principles regarding:
How to design source documents for data capture or devise other data capture methods
How to design input data records, data entry screens, user interface screens, etc.
Output Design:
The design of output is the most important task of any system. During output design, developers
identify the type of outputs needed, and consider the necessary output controls and prototype
report layouts.
The objectives of output design are:
To develop an output design that serves the intended purpose and eliminates the production of unwanted output.
To develop an output design that meets the end user's requirements.
To form the output in an appropriate format and direct it to the right person.
MODULES:
1. User:
1.1 View Home page:
Here the user views the home page of the cyber-attack web application.
1.2 View about page:
On the about page, users can learn more about the cyber-attack classification.
1.3 Input Model:
The user must provide input values for certain fields in order to get results.
1.4 View Results:
The user views the results generated by the model.
1.5 View score:
Here the user has the ability to view the score as a percentage.
2. System
2.1 Working on dataset:
The system checks whether the data is available or not and loads the data from CSV files.
2.2 Pre-processing:
The data needs to be pre-processed according to the models; this helps to increase the accuracy of the model and gives better information about the data.
2.3 Training the data:
After pre-processing, the data will be split into two parts, train and test data, before training with the given algorithms (see the sketch after this module list).
2.4 Model Building
This module helps the user create a model that predicts cyber-attacks with better accuracy.
2.5 Generated Score:
Here the user views the score as a percentage.
2.6 Generate Results:
We train the machine learning algorithms and predict whether there is a cyber-attack.
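For reference, here is a short sketch of the pre-processing and train/test split described in modules 2.2 and 2.3; the 80/20 split ratio is an assumption, and df is the dataframe loaded in module 2.1.

from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

# Encode the categorical columns of the loaded dataframe
le = LabelEncoder()
for col in ['protocol_type', 'service', 'flag']:
    df[col] = le.fit_transform(df[col])

x = df.iloc[:, :-1]
y = df.iloc[:, -1]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)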
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modelling language so that they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modelling language.
5. Encourage the growth of the OO tools market.
6. Support higher level development concepts such as collaborations, frameworks, patterns
and components.
7. Integrate best practices.
CLASS DIAGRAM
SEQUENCE DIAGRAM
COLLABORATION DIAGRAM
DEPLOYMENT DIAGRAM
ACTIVITY DIAGRAM
ER DIAGRAM
Home Page:
Here the user views the home page of the cyber-attack detection web application.
ABOUT
Here we can read about our project.
Register
On this page, users need to register by entering their credentials.
Log in
On this page, users have to enter their credentials to access the cyber-attack prediction.
Load
On the load page, users can load the cyber dataset.
Model
Here we train our data with different ML algorithms.
Prediction
This page shows the detection result for the cyber-attack data.
FUTURE SCOPE
There are quite a few things that can be polished or added in future work. We have opted to use two data mining classifiers in this project, namely the ID3 and Naive Bayes classifiers. There are more classifiers, such as the Bayesian network classifier, the neural network classifier, and the C4.5 classifier. Such classifiers were not included in this project and could be added in the future to give more data to compare with.
CONCLUSION
In this project, an attempt was made to use the resilient control consensus method in complex discrete cyber-physical networks in the presence of a number of local attacks. By applying this control method, it was observed that even in the presence of cyber-attacks, the system can remain stable and isolate the attacked node, and the performance of the system is not weakened. Using the neural network employed in this project, it was observed that a deep neural network with 7 hidden layers gives the system better performance. Also, in a recurrent neural network integrated with a deep neural network, a deep-layer network with a linear activation function performs better; therefore, it can be said that the system has less complexity. With deep learning methods, systems can analyse patterns and learn from them to help prevent similar attacks and respond to changing behaviour. In short, machine learning can make cyber security simpler, more proactive, less expensive, and far more effective. After observing the state of the system reported by the neural network, the control system makes decisions based on it and, if there is an attack, detects and isolates it so as not to have a detrimental effect on the behaviour of the other agents. In future research, more attacks on agents can be considered, and data mining and other machine learning methods, such as support vector machine (SVM) algorithms or other types of neural networks such as recurrent neural networks, can be used to evaluate system performance improvements.
SOURCE CODE
# Importing necessary libraries
import pandas as pd
import numpy as np
import mysql.connector
from flask import Flask, render_template, request, session, flash
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import ExtraTreesClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from tensorflow.keras.models import load_model  # for the saved neural network model

# Database connection (credentials as in the original source)
db = mysql.connector.connect(user="root", password="", port='3306', database='cyber_attack')
cur = db.cursor()

app = Flask(__name__)
app.secret_key = "CBJcb786874wrf78chdchsdcv"

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/about')
def about():
    return render_template('about.html')

@app.route('/drug')
def drug():
    return render_template('drug.html')
@app.route('/login', methods=['POST', 'GET'])
def login():
    if request.method == 'POST':
        useremail = request.form['useremail']
        session['useremail'] = useremail
        userpassword = request.form['userpassword']
        # Assumed query and table/column names; the original omits the SQL statement
        sql = "SELECT * FROM users WHERE email=%s AND password=%s"
        cur.execute(sql, (useremail, userpassword))
        data = cur.fetchall()
        db.commit()
        if data == []:
            msg = "Invalid credentials, please try again"
            return render_template("login.html", name=msg)
        else:
            return render_template("load.html", myname=data[0][1])
    return render_template('login.html')
@app.route('/registration', methods=["POST", "GET"])
def registration():
    if request.method == 'POST':
        username = request.form['username']
        useremail = request.form['useremail']
        userpassword = request.form['userpassword']
        conpassword = request.form['conpassword']
        address = request.form['address']
        contact = request.form['contact']
        if userpassword == conpassword:
            # Assumed queries and table/column names; the original omits the SQL statements
            sql = "SELECT * FROM users WHERE email=%s"
            cur.execute(sql, (useremail,))
            data = cur.fetchall()
            db.commit()
            print(data)
            if data == []:
                sql = "INSERT INTO users (name, email, password, address, contact) VALUES (%s, %s, %s, %s, %s)"
                val = (username, useremail, userpassword, address, contact)
                cur.execute(sql, val)
                db.commit()
                flash("Registered successfully", "success")
                return render_template("login.html")
            else:
                flash("User already exists", "warning")
                return render_template("registration.html")
        else:
            flash("Passwords do not match", "warning")
            return render_template("registration.html")
    return render_template('registration.html')
@app.route('/load', methods=["GET", "POST"])
def load():
    global df, dataset
    if request.method == "POST":
        data = request.files['data']
        df = pd.read_csv(data)
        dataset = df.head(100)
    return render_template('load.html')

@app.route('/view')
def view():
    print(dataset)
    print(dataset.head(2))
    print(dataset.columns)
    # Render the uploaded rows and column names (template name assumed)
    return render_template('view.html', columns=dataset.columns.values, rows=dataset.values.tolist())
@app.route('/preprocess', methods=["GET", "POST"])  # route path assumed
def preprocess():
    global x_train, x_test, y_train, y_test
    if request.method == "POST":
        size = int(request.form['split'])  # test split percentage entered by the user
        le = LabelEncoder()
        df['protocol_type'] = le.fit_transform(df['protocol_type'])
        df['flag'] = le.fit_transform(df['flag'])
        df['service'] = le.fit_transform(df['service'])
        x = df.iloc[:, :-1]
        y = df.iloc[:, -1]
        x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=size / 100, random_state=0)
        print(x_train, x_test)
    return render_template('preprocess.html')
@app.route('/model', methods=["GET", "POST"])  # route path assumed
def model():
    msg = ''
    if request.method == "POST":
        s = int(request.form['algo'])
        if s == 0:
            msg = 'Please choose an algorithm'
        elif s == 1:
            et = ExtraTreesClassifier()
            et.fit(x_train, y_train)
            y_pred = et.predict(x_test)
            ac_et = accuracy_score(y_test, y_pred) * 100
            msg = 'The accuracy obtained by Extra Tree Classifier is ' + str(ac_et) + '%'
        elif s == 2:
            classifier = DecisionTreeClassifier()
            classifier.fit(x_train, y_train)
            y_pred = classifier.predict(x_test)
            ac_dt = accuracy_score(y_test, y_pred) * 100
            msg = 'The accuracy obtained by Decision Tree Classifier is ' + str(ac_dt) + '%'
        elif s == 3:
            svc = SVC()
            svc = svc.fit(x_train, y_train)
            y_pred = svc.predict(x_test)
            ac_svc = accuracy_score(y_test, y_pred) * 100
            msg = 'The accuracy obtained by Support Vector Classifier is ' + str(ac_svc) + '%'
        elif s == 4:
            knn = KNeighborsClassifier(n_neighbors=12)
            knn.fit(x_train, y_train)
            y_pred = knn.predict(x_test)
            ac_knn = accuracy_score(y_test, y_pred) * 100
            msg = 'The accuracy obtained by KNN Classifier is ' + str(ac_knn) + '%'
        elif s == 5:
            adb = AdaBoostClassifier()
            adb.fit(x_train, y_train)
            y_pred = adb.predict(x_test)
            ac_adb = accuracy_score(y_test, y_pred) * 100
            msg = 'The accuracy obtained by AdaBoost Classifier is ' + str(ac_adb) + '%'
        elif s == 6:
            nn_model = load_model('neural_network.h5')
            score = 0.9423418045043945  # pre-computed score kept from the original source
            msg = 'The accuracy obtained by the Neural Network is ' + str(score * 100) + '%'
    return render_template('model.html', msg=msg)
@app.route('/prediction', methods=["GET", "POST"])  # route path assumed
def prediction():
    msg = ''
    if request.method == "POST":
        f1 = float(request.form['duration'])
        f2 = float(request.form['protocol_type'])
        f3 = float(request.form['service'])
        f4 = float(request.form['flag'])
        f5 = float(request.form['src_bytes'])
        f6 = float(request.form['dst_bytes'])
        f7 = float(request.form['land'])
        f8 = float(request.form['wrong_fragment'])
        f9 = float(request.form['urgent'])
        f10 = float(request.form['hot'])
        f11 = float(request.form['num_failed_logins'])
        f12 = float(request.form['logged_in'])
        f13 = float(request.form['num_compromised'])
        f14 = float(request.form['root_shell'])
        f15 = float(request.form['su_attempted'])
        f16 = float(request.form['num_root'])
        f17 = float(request.form['num_file_creations'])
        f18 = float(request.form['num_shells'])
        f19 = float(request.form['num_access_files'])
        f20 = float(request.form['num_outbound_cmds'])
        f21 = float(request.form['is_host_login'])  # assumed field name; f21 was missing in the original
        f22 = float(request.form['is_guest_login'])
        f23 = float(request.form['count'])
        f24 = float(request.form['srv_count'])
        f25 = float(request.form['serror_rate'])
        f26 = float(request.form['srv_serror_rate'])
        f27 = float(request.form['rerror_rate'])
        f28 = float(request.form['srv_rerror_rate'])
        f29 = float(request.form['same_srv_rate'])
        f30 = float(request.form['diff_srv_rate'])
        f31 = float(request.form['srv_diff_host_rate'])
        f32 = float(request.form['dst_host_count'])
        f33 = float(request.form['dst_host_srv_count'])
        f34 = float(request.form['dst_host_same_srv_rate'])
        f35 = float(request.form['dst_host_diff_srv_rate'])
        f36 = float(request.form['dst_host_same_src_port_rate'])
        f37 = float(request.form['dst_host_srv_diff_host_rate'])
        f38 = float(request.form['dst_host_serror_rate'])
        f39 = float(request.form['dst_host_srv_serror_rate'])
        f40 = float(request.form['dst_host_rerror_rate'])
        f41 = float(request.form['dst_host_srv_rerror_rate'])
        li = [f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20,
              f21, f22, f23, f24, f25, f26, f27, f28, f29, f30, f31, f32, f33, f34, f35, f36, f37, f38,
              f39, f40, f41]
        model = ExtraTreesClassifier()
        model.fit(x_train, y_train)
        result = model.predict([li])
        print('result is ', result)
        # (Anomaly = 0, Normal = 1)
        if result == 0:
            msg = 'Cyber attack detected: the traffic is classified as an anomaly.'
        else:
            msg = 'No cyber attack detected: the traffic is classified as normal.'
    return render_template('prediction.html', msg=msg)
if __name__ == '__main__':
    app.run(debug=True)