FRAUD DETECTION IN INSURANCE CLAIMS USING

MACHINE LEARNING

ABSTRACT

Insurance companies, operating as commercial enterprises, have been experiencing fraud cases across all types of claims for many years. The amounts claimed fraudulently are significantly large and can cause serious problems; hence, alongside government bodies, various organizations are also working to detect and reduce such activities. Fraud occurs in all areas of insurance with high severity; in particular, fraud in the auto-insurance sector, typically carried out through fake accident claims, is the most widely reported and prominent type. Our aim is therefore to develop a project that works on an insurance-claim data set to detect fraudulent and fake claim amounts. The project implements machine learning algorithms to build a model that labels and classifies claims. It also presents a comparative study of the machine learning algorithms used for classification, evaluated with a confusion matrix in terms of accuracy, precision, recall, etc. For fraudulent-transaction validation, the machine learning model is built using Python libraries.
CHAPTER 1

INTRODUCTION

The insurance industry has faced numerous challenges due to fraudulent claims from the very beginning. Losses incurred due to fraud impact all the parties involved. Even one undetected fraud can lead to a huge loss, resulting in increased premium costs, process inefficiency and loss of trust. Though all insurance companies have fraud-detection systems in place, most of those processes are inefficient and time-consuming. Traditional mechanisms rely heavily on human intervention and hence do not adapt well to changing situations. A long, ongoing investigation delays pay-outs and has a negative impact on the customer. Uncaught fraudulent claims not only hinder the profitability of the firm but also encourage other policyholders to show similar behavior. Insurance fraud occurs when individuals attempt to profit by failing to fulfill the terms of the insurance agreement. Fraud can be categorized as soft fraud or hard fraud. If a policyholder intentionally stages an accident or invents a loss just to gain benefits from the insurance company, it is said to be hard fraud. However, when an actual injury or theft occurs and the insured exaggerates the claim to obtain more money from the company, that is termed soft fraud. The evolution of big data and the growth of unstructured data have enabled many fraudsters to exploit the system. If the data is not analyzed thoroughly, the chances of fraud occurring are high. Data mining and analytics have changed the fraud-detection scenario. Data can be gathered from various sources and stored in a combined repository for further use. Implementing analytical solutions requires an initial investment from insurance companies, so they often resist implementing them. However, it has been observed that machine learning and analytical capabilities have strengthened the insurance lifecycle in many ways. They have provided substantial cost benefits to companies by reducing the overall cost of fraud detection and improving its overall ROI. Insurers therefore need to start leveraging their machine learning capability in order to build more robust and risk-free systems. Hence, there is a crucial need to develop a system that can help the insurance industry identify potential fraud with a high degree of accuracy, so that other claims can be cleared rapidly while the identified cases are examined in detail.
The dataset used in this study has a class-imbalance problem, meaning that the number of instances of one class far exceeds the number of instances of the other. The class with far fewer instances becomes the minority class, the other being called the majority class. As a result, the minority class tends to be ignored during classification. To prevent minority instances from being treated as noise and the classifier from being biased toward the majority class, this data imbalance needs to be fixed. A simple way to fix it is to balance the dataset, either by oversampling instances of the minority class or by under-sampling instances of the majority class. This paper aims to develop a model that helps insurers take proactive decisions and makes them better equipped to combat fraud. We propose a procedure for auto-fraud identification using the Random Forest classification technique, before which we remove the class imbalance of the original dataset. This is done using the synthetic minority oversampling technique (SMOTE).
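Balancing by oversampling can be sketched as follows. This is a minimal illustration using random oversampling with scikit-learn's `resample`; the column names and toy values are assumptions, not the project's actual schema, and SMOTE (e.g. via the imbalanced-learn package) would generate synthetic minority samples rather than duplicating existing ones.

```python
# Minimal sketch of fixing class imbalance by random oversampling.
# Column names ("fraud_reported") and values are illustrative assumptions.
import pandas as pd
from sklearn.utils import resample

df = pd.DataFrame({
    "claim_amount": [100, 120, 90, 110, 95, 5000],
    "fraud_reported": [0, 0, 0, 0, 0, 1],
})

majority = df[df["fraud_reported"] == 0]
minority = df[df["fraud_reported"] == 1]

# Duplicate minority rows (with replacement) until both classes match.
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=42)
balanced = pd.concat([majority, minority_up])
print(balanced["fraud_reported"].value_counts())
```

After balancing, both classes contribute equally during training, so the classifier no longer treats the minority (fraud) class as noise.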

Insurance fraud occurs when an insurance provider, advisor, adjuster, or consumer intentionally deceives in order to obtain an illegal gain. There has been an increase in fraudulent insurance claims in recent years, particularly in the automobile-insurance industry. Falsifying insurance-claim information, exaggerating claims to misrepresent an accident, submitting a claim form for damage or injury that never occurred, or making a false claim of car theft are all examples of car-insurance fraud. When insurance companies use fraud-detection systems, they not only detect fraud but also save millions, if not billions, of dollars that would otherwise be paid to those who make fraudulent claims.

This project aims to suggest the most accurate and simplest way to fight fraudulent claims. The main difficulty in detecting fraudulent activity is the massive number of claims that run through companies' systems. This can also be turned into an advantage: if officials combined their claim databases, they would hold a database large enough to develop better models for flagging suspicious claims. This paper examines the different methods that have been used to solve similar problems, tests the best of them, and tries to enhance them to build a predictive model that can flag suspicious claims. By researching, testing and comparing different models, the goal is a model that is simple, time-efficient and accurate enough to flag suspicious claims without stressing the system it runs on.
CHAPTER 2

LITERATURE REVIEW

Title: Fraud Detection and Analysis for Insurance Claim using Machine Learning
Authors: Abhijeet Urunkar, Amruta Khot, Rashmi Bhat, Nandinee Mudegol
Publication: 2022 IEEE International Conference on Signal Processing Informatics
Communication and Energy Systems (SPICES)
Insurance companies, working as commercial enterprises, have for the last few years been experiencing fraud cases for all types of claims. The amounts claimed fraudulently are significantly large and may cause serious problems; hence, along with the government, different organizations are also working to detect and reduce such activities. Such fraud occurs in all areas of insurance with high severity; fraud in the auto sector, committed through fake accident claims, is the most widely claimed and prominent type. The authors aim to develop a project that works on an insurance-claim dataset to detect fraud and fake claim amounts. The project implements machine learning algorithms to build a model to label and classify claims, and presents a comparative study of all the machine learning algorithms used for classification using a confusion matrix in terms of accuracy, precision, recall, etc. For fraudulent-transaction validation, the machine learning model is built using the PySpark Python library.

Title: Detecting insurance claims fraud using machine learning techniques


Authors: Riya Roy, K. Thomas George
Publication: 2017 International Conference on Circuit Power and Computing
Technologies (ICCPCT)
The insurance industry consists of more than a thousand companies worldwide and collects more than one trillion dollars in premiums each year. When a person or entity makes false insurance claims in order to obtain compensation or benefits to which they are not entitled, it is known as insurance fraud. The total cost of insurance fraud is estimated at more than forty billion dollars, so detecting insurance fraud is a challenging problem for the insurance industry. The traditional approach to fraud detection is based on developing heuristics around fraud indicators. Auto-vehicle insurance fraud, which can be committed through fake accident claims, is the most prominent type of insurance fraud. The paper focuses on detecting auto-vehicle fraud using machine learning techniques, with performance compared by computing the confusion matrix, which helps to calculate accuracy, precision, and recall.

Title: Predicting Fraudulent Claims in Automobile Insurance


Authors: G. Kowshalya, M. Nandhini
Publication: 2018 Second International Conference on Inventive Communication
and Computational Technologies (ICICCT)
The insurance industry is a rapidly growing industry that handles vast amounts of data. The major problem in the industry is fraudulent claims. A fraudulent claim is an illegal or illicit cheat intended for personal monetary benefit. The existence of numerous fraudulent claims affects not only the insurance company but also the sincere policyholders. Generally, with the help of domain knowledge, insurance companies identify fraudulent claims using conventional techniques. In recent times, data mining has made significant contributions to the field of insurance analysis. In this paper, data mining techniques are used to predict fraudulent claims and to calculate the insurance premium amount for different customers based on their personal and financial details. This work helps in the basic screening process to investigate claims, thus minimizing human-resource and monetary losses.
Title: Detecting Fraudulent Insurance Claims Using Random Forests and Synthetic
Minority Oversampling Technique
Authors: Sonakshi Harjai, Sunil Kumar Khatri, Gurinder Singh
Publication: 2019 4th International Conference on Information Systems and
Computer Networks (ISCON)
There has been significant growth in the number of fraudulent activities by policyholders over the last couple of years. Deliberately deceiving insurance providers by omitting facts and hiding details while claiming insurance has led to a significant loss of money and customer value. To keep these risks under control, a proper framework is required for judiciously monitoring insurance fraud. The paper demonstrates a novel approach to building a machine-learning-based auto-insurance fraud detector that predicts fraudulent insurance claims from a dataset of over 15,420 car-claim records. The proposed model is built using the synthetic minority oversampling technique (SMOTE), which removes the class imbalance of the dataset, and a random forest classifier is used to classify the claim records. The data used in the experiment is taken from a publicly available auto-insurance dataset. The outcomes of the approach were compared with other existing models using various performance metrics.

Title: Blockchain Technology for Fraudulent Practices in Insurance Claim Process


Authors: Jaideep Gera, Anitha Rani Palakayala, Venkata Kishore Kumar Rejeti, Tenali Anusha
Publication: 2020 5th International Conference on Communication and Electronics
Systems (ICCES)
Bitcoin was implemented as a cryptocurrency on top of distributed computing. Bitcoin originally came with blockchain technology to protect coins from misuse: blockchain provided a distributed ledger of cryptocurrency transactions in an immutable form to protect data from malicious attacks, thus complementing the security aspect of Bitcoin. Later, blockchain evolved into a distributed-ledger technology used in different domains such as healthcare. Each domain has its own issues; in the insurance domain, for instance, there are issues related to false claims and to claims manipulated illegally by a competent authority. This problem is addressed in the paper by implementing an insurance application with blockchain technology. Consensus is used to ensure that the insurance company's claim process is carried out with integrity, accountability, and non-repudiation. In particular, every transaction is cryptographically signed and stored as a collection of blocks in the blockchain. This approach safeguards claim transactions and prevents fraudulent attempts. A prototype application is built using the IBM blockchain platform and its underlying components, and experimental results show that the proposed implementation prevents fraudulent claims in the insurance industry.

Title: A time-efficient model for detecting fraudulent health insurance claims using
Artificial neural networks
Authors: Shamitha S.K., V. Ilango
Publication: 2020 International Conference on System Computation Automation
and Networking (ICSCAN)
Health insurance has come to people's rescue by reducing their medical expenditure, which would otherwise take a high toll on their income. Both private and government-funded agencies serve in the health-insurance sector. With soaring demand among the public, healthcare is not safe from fraudsters, and the use of computerized techniques has made this area even more vulnerable. It has become essential to detect such fraud as early as possible so that the impact of the loss can be minimized. The paper presents a framework for detecting fraud with faster learning while identifying the maximum number of fraud instances. The usual problems, such as data heterogeneity and imbalanced class distributions, are also discussed. As part of developing an efficient fraud-detection framework, the authors applied several learners and optimization techniques. The framework was evaluated on a claims dataset obtained from the CMS Medicare facility. They concluded that a Multi-Layer Perceptron, a feed-forward neural network, with genetic-algorithm optimization helped enhance the results and achieve higher accuracy. PCA was applied to pick the most significant variables; PCA and other appropriate pre-processing techniques also helped reduce the training time, achieving efficiency in terms of both accuracy and speed.

Title: Insurance fraud evaluation: a fuzzy expert system


Authors: B. Stefano, F. Gisella
Publication: 10th IEEE International Conference on Fuzzy Systems. (Cat.
No.01CH37297)
All studies dealing with the Italian insurance market show that fraud is an increasingly relevant problem in that sector. Insurance companies are trying to embed real "fraud units" into their activities in order to identify suspicious cases and fraudulent patterns, either in the insuring phase or in the settlement of claims. The companies face three opposing problems: the high cost of expert activity, requests for fast settlements and (for the Italian market) the requirement to cover anyone who asks for a policy. Carrying out any audit analysis requires fraud experts, which is why fast settlement tends to generate extra costs in fraud investigations. What companies need is a standard, automatic, fast control method that filters the really suspicious cases to fraud experts, leaving the call centres free to pay the majority of claims immediately. Unsuspicious claims can thus be settled automatically, even by non-expert call-centre operators, while claims that exceed a fixed threshold value are investigated by fraud experts. Claim auditors can then dedicate their activities to potentially fraudulent claims only. The paper shows how a fuzzy logic control (FLC) model can efficiently evaluate an "index of suspects" on each claim, in order to highlight fraudulent situations to be investigated by the experts.

Title: Detecting Fraudulent Motor Insurance Claims Using Support Vector


Machines with Adaptive Synthetic Sampling Method
Authors: Charles Muranda, Ahmed Ali, Thokozani Shongwe
Publication: 2020 61st International Scientific Conference on Information
Technology and Management Science of Riga Technical University (ITMS)
Classification algorithms suffer from imbalanced training sets. In insurance fraud detection, fraud cases are rare compared to genuine ones, so fraud-detection algorithms have fewer training samples of positive cases, leading to lower performance metrics than when the classes are balanced. The paper proposes a machine learning method for detecting fraudulent claims that uses the adaptive synthetic sampling method (ADASYN) to remove imbalances in the dataset and Support Vector Machines (SVM) to classify the claim cases. The outcome of the algorithm is compared against the imbalanced datasets and other existing methods.

Title: Automobile Insurance Fraud Detection using Supervised Classifiers


Authors: Iffa Maula Nur Prasasti, Arian Dhini, Enrico Laoh
Publication: 2020 International Workshop on Big Data and Information Security
(IWBIS)
Fraudulent automobile claims lead to several consequences for the company and the policyholder, and the current detection system is costly and inefficient. The research aims to design a prediction model for detecting automobile-insurance fraud using a machine learning approach, working with real-world data from an automobile insurance company in Indonesia. The dataset has a highly imbalanced distribution between policyholders who commit fraud and legitimate ones. The research handles the imbalanced-dataset problem using the Synthetic Minority Oversampling Technique (SMOTE) and undersampling methods. The proposed supervised classifiers are Multilayer Perceptron (MLP), Decision Tree C4.5, and Random Forest (RF). The performance of the models is evaluated through the confusion matrix, the ROC curve, and parameters such as sensitivity. The research found that Random Forest outperformed the other classifiers with 98.5% accuracy.

Title: Blockchain Technology for Preventing Counterfeit in Health Insurance


Authors: Baker Alhasan, Mohammad Qatawneh, Wesam Almobaideen
Publication: 2021 International Conference on Information Technology (ICIT)
The huge volume of data and requests for health insurance has led to increased fraud (counterfeiting) in the sector, by stakeholders as well as users. The weakness of traditional systems and the lack of transparency in obtaining health-insurance cards have led to a crisis of confidence on the part of both parties, regarding both the privacy of the patient's data and fraudulent insurance cards. In addition, governments spend a lot of time and money trying to get rid of this dilemma; hence, a system that can solve the counterfeiting problem is highly needed. The paper proposes and implements a blockchain (BC) system to prevent counterfeiting in the health-insurance sector. Several experiments were carried out to demonstrate the usability and efficiency of the designed system, and the results show the system's strength and effectiveness in terms of speed, security and privacy.

Title: Fraud detection and frequent pattern matching in insurance claims using data
mining techniques
Authors: Aayushi Verma, Anu Taneja, Anuja Arora
Publication: 2017 Tenth International Conference on Contemporary Computing
(IC3)
Fraudulent insurance claims increase the burden on society. Fraud in healthcare systems not only leads to additional expenses but also degrades the quality of care that should be provided to patients. Insurance-fraud detection is quite subjective in nature and is tied to societal needs. The empirical study aims to identify and gauge fraud in health-insurance data. The contribution of the experimental study is to untangle the frequent fraud-identification patterns underlying the insurance-claim data using rule-based pattern mining. The experiment assesses fraudulent patterns in the data on the basis of two criteria: period-based claim anomalies and disease-based anomalies, with rule-based mining results analysed for both. Statistical decision rules and k-means clustering are applied to period-based claim-anomaly outlier detection, and association-rule-based mining with a Gaussian distribution is applied to disease-based anomaly outlier detection; these outliers depict fraudulent insurance claims in the data. The proposed approach was evaluated on a real-world dataset from a health-insurance organization, and the results show that it is efficient in detecting fraudulent insurance claims using rule-based mining.
Title: Health Care Insurance Fraud Detection Using Blockchain
Authors: Gokay Saldamli, Vamshi Reddy, Krishna S. Bojja, Manjunatha K. Gururaja, Yashaswi Doddaveerappa, Loai Tawalbeh
Publication: 2020 Seventh International Conference on Software Defined Systems
(SDS)
The healthcare industry is one of the important service providers that improve people's lives. As the cost of healthcare services increases, health insurance becomes the only way to get quality service in case of an accident or a major illness, since it reduces costs and provides financial and economic stability for an individual. One of the main tasks of healthcare-insurance providers is to monitor and manage data and to provide support to customers. Due to regulations and business secrecy, insurance companies do not share patients' data; because the data is not integrated or synchronized between insurance providers, there has been an increase in the number of frauds occurring in healthcare. Often, ambiguous or false information is provided to health-insurance companies in order to make them pay false claims to policyholders, and an individual policyholder may also claim benefits from multiple insurance providers. The National Health Care Anti-Fraud Association (NHCAA) estimates the resulting financial loss at billions of dollars each year. In order to prevent health-insurance fraud, it is necessary to build a system that securely manages and monitors insurance activities by integrating data from all the insurance companies. As blockchain provides immutable data maintenance and sharing, the authors propose a blockchain-based solution for health-insurance fraud detection.

Title: Agentless Insurance Model Based on Modern Artificial Intelligence


Authors: Krishanu Prabha Sinha, Mehdi Sookhak, Shaoen Wu
Publication: 2021 IEEE 22nd International Conference on Information Reuse and
Integration for Data Science (IRI)
For the past couple of years, agents have been a crucial part of the financial sector, primarily in auto insurance, with key responsibilities centered around finding new prospective customers and maintaining relationships with existing ones. With every other company streamlining its business processes with the latest technology, the insurance industry is not far behind: it has started exploring the online space, and prospective customers can now get online insurance quotes, chat with an online robot and even purchase an insurance policy online. Digitalization, automation and streamlining are key buzzwords in every business sector, and given these trends, insurance agents seem to be an unnecessary expense. The paper proposes an artificial-intelligence-driven approach that eliminates the need for a human insurance agent and thereby reduces the overall cost for the end customer. The authors propose a software application in which four statistical models are deployed: one to determine prospective customers who are likely to buy an insurance policy, one to identify customers likely to cancel a policy so they can be offered something better, one to identify customers submitting fraudulent insurance claims, and a recommendation-system model to recommend updates to the current policies of existing customers. In their experimental results, the authors identified a cluster of customers most likely to buy a product using an unsupervised statistical machine learning model.
CHAPTER 3

SYSTEM DESIGN

EXISTING SYSTEM:

Over the last couple of years there has been significant growth in fraudulent activities by policyholders. Deliberately deceiving insurance providers by omitting facts and hiding details while claiming insurance has led to a significant loss of money and customer value. To keep these risks under control, a proper framework is required for judiciously monitoring insurance fraud. The existing approach builds a machine-learning-based auto-insurance fraud detector that predicts fraudulent insurance claims from a dataset of over 15,420 car-claim records. The model is built using the synthetic minority oversampling technique (SMOTE), which removes the class imbalance of the dataset, and a random forest classifier is used to classify the claim records. The data used in the experiment is taken from a publicly available auto-insurance dataset, and the outcomes are compared with other existing models using various performance metrics.

PROPOSED SYSTEM:

Vehicle-insurance fraud involves conspiring to make false or exaggerated claims involving property damage or personal injuries following an accident. Common examples include staged accidents, where fraudsters deliberately "arrange" for accidents to occur; the use of phantom passengers, where people who were not even at the scene of the accident claim to have suffered grievous injury; and false personal-injury claims, where injuries are grossly exaggerated. In the proposed system, the algorithms are implemented in Python and operate on real data handled through Python libraries. We propose a method focusing on detecting vehicle-insurance fraud using machine learning algorithms, with the final analysis and conclusions based on the performance of the K-Nearest Neighbours (KNN) algorithm.

BLOCK DIAGRAM:
CHAPTER 4

METHODOLOGY

RANDOM FOREST:

Random forest is a supervised learning algorithm used for both classification and regression, though it is mainly used for classification problems. Just as a forest is made up of trees, and more trees mean a more robust forest, the random forest algorithm creates decision trees on data samples, gets a prediction from each of them, and finally selects the best solution by voting. It is an ensemble method that performs better than a single decision tree because it reduces over-fitting by averaging the results.

Working of Random Forest Algorithm

We can understand the working of the Random Forest algorithm with the help of
the following steps −

 Step 1 − First, select random samples from a given dataset.
 Step 2 − Next, the algorithm constructs a decision tree for every sample and obtains a prediction result from each decision tree.
 Step 3 − In this step, voting is performed over every predicted result.
 Step 4 − Finally, the most-voted prediction result is selected as the final prediction.

The following diagram illustrates its working −
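The steps above can be sketched with scikit-learn's `RandomForestClassifier`. This is a minimal illustration on synthetic data: the generated features and labels are assumptions standing in for the project's actual claim records.

```python
# Minimal random forest sketch using scikit-learn.
# The synthetic data below stands in for real claim features; in the
# project, X would hold pre-processed claim attributes and y fraud labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# 100 trees each vote; the majority class among the trees is the prediction.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

print(confusion_matrix(y_test, pred))
print("accuracy:", accuracy_score(y_test, pred))
```

Averaging the votes of many trees trained on different samples is what gives the ensemble its robustness against over-fitting.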


MODULES:

• Collection of dataset:

The dataset is collected from data-science platforms such as Kaggle, and the data is then pre-processed.

• Performing pre-processing:

Before training, the dataset is imported and pre-processing techniques are applied to enhance it, such as one-hot encoding of categorical features.
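One-hot encoding can be sketched with pandas as below; the column names and values are illustrative assumptions, not the project's actual schema.

```python
# Sketch of one-hot encoding a categorical claim field with pandas.
# Column names ("vehicle_category", "claim_amount") are assumptions.
import pandas as pd

claims = pd.DataFrame({
    "vehicle_category": ["Sedan", "Sport", "Sedan", "Utility"],
    "claim_amount": [400, 9000, 350, 1200],
})

# get_dummies replaces each categorical column with one 0/1 column
# per distinct category level.
encoded = pd.get_dummies(claims, columns=["vehicle_category"])
print(encoded.columns.tolist())
```

This turns text categories into numeric indicator columns that a classifier such as KNN can consume.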

• Training:
After pre-processing, the dataset is used to train a K-Nearest Neighbours (KNN) classifier, tuning the number of neighbours to achieve maximum accuracy. After the training procedure is completed, the fitted model, with the features extracted from the data, is stored in a .sav file.
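The training step can be sketched as follows: tune the number of neighbours with cross-validation and persist the fitted model to a `.sav` file. The synthetic data and the file name `knn_model.sav` are assumptions for illustration only.

```python
# Sketch of training a KNN classifier and saving it to a .sav file.
# The generated data stands in for the pre-processed claim dataset.
import pickle
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# Search for the best number of neighbours, as the Training module describes.
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [3, 5, 7, 9]}, cv=5)
search.fit(X_train, y_train)

# Persist the fitted model to a .sav file for the classification step.
with open("knn_model.sav", "wb") as f:
    pickle.dump(search.best_estimator_, f)
print("best n_neighbors:", search.best_params_["n_neighbors"])
```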

• Classification:

In this final part, the input parameters are given, analysed against the saved model file, and the claim status is classified.
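This classification step can be sketched as loading the persisted model and labelling a new claim. To keep the sketch self-contained, a tiny stand-in model is fitted and dumped first; the file name, two-feature layout and labels are assumptions, not the project's actual schema.

```python
# Sketch of the classification step: load a persisted .sav model file
# and predict the claim status for new input parameters.
import pickle
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for the model produced by the training module.
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
y = [0, 0, 1, 1]  # 0 = genuine claim, 1 = fraudulent claim
with open("claim_model.sav", "wb") as f:
    pickle.dump(KNeighborsClassifier(n_neighbors=3).fit(X, y), f)

# Classification: read the model file and label an incoming claim.
with open("claim_model.sav", "rb") as f:
    model = pickle.load(f)
status = model.predict([[5, 5]])[0]
print("claim status:", "fraud" if status else "genuine")
```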
CHAPTER 5

SOFTWARE DESCRIPTION

PYTHON:

PYTHON 3.7:

Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace.

Python is an easy-to-learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python's elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms, and it may be freely distributed. The Python web site also contains distributions of, and pointers to, many free third-party Python modules, programs and tools, and additional documentation. The Python interpreter is easily extended with new functions and data types implemented in C or C++ (or other languages callable from C). Python is also suitable as an extension language for customizable applications. This tutorial introduces the reader informally to the basic concepts and features of the Python language and system. It helps to have a Python interpreter handy for hands-on experience, but all examples are self-contained, so the tutorial can be read offline as well. For a description of standard objects and modules, see library-index; reference-index gives a more formal definition of the language. To write extensions in C or C++, read extending-index and c-api-index. There are also several books covering Python in depth. This tutorial does not attempt to be comprehensive and cover every single feature, or even every commonly used feature. Instead, it introduces many of Python's most noteworthy features and will give you a good idea of the language's flavor and style. After reading it, you will be able to read and write Python modules and programs, and you will be ready to learn more about the various Python library modules described in library-index. If you do much work on computers, eventually you find that there's some task you'd like to automate. For example, you may wish to perform a search-and-replace over a large number of text files, or rename and rearrange a bunch of photo files in a complicated way. Perhaps you'd like to write a small custom database, a specialized GUI application or a simple game. If you're a professional software developer, you may have to work with several C/C++/Java libraries but find the usual write/compile/test/re-compile cycle too slow. Perhaps you're writing a test suite for such a library and find writing the testing code a tedious task. Or maybe you've written a program that could use an extension language, and you don't want to design and implement a whole new language for your application.

Typing an end-of-file character (Control-D on Unix, Control-Z on Windows) at the


primary prompt causes the interpreter to exit with a zero exit status. If that doesn’t
work, you can exit the interpreter by typing the following command: quit(). The
interpreter’s line-editing features include interactive editing, history substitution
and code completion on systems that support read line. Perhaps the quickest check
to see whether command line editing is supported is typing Control-P to the first
Python prompt you get. If it beeps, you have command line editing; see Appendix
Interactive Input Editing and History Substitution for an introduction to the keys.
If nothing appears to happen, or if ^P is echoed, command line editing isn’t
available; you’ll only be able to use backspace to remove characters from the
current line. The interpreter operates somewhat like the Unix shell: when called
with standard input connected to a tty device, it reads and executes commands
interactively; when called with a file name argument or with a file as standard
input, it reads and executes a script from that file. A second way of starting the
interpreter is python -c command [arg] ..., which executes the statement(s) in
command, analogous to the shell’s -c option. Since Python statements often
contain spaces or other characters that are special to the shell, it is usually
advised to quote the command in its entirety with single quotes. Some Python
modules are also useful as scripts. These can be invoked using python -m module
[arg] ..., which
executes the source file for the module as if you had spelled out its full name on
the command line. When a script file is used, it is sometimes useful to be able to
run the script and enter interactive mode afterwards. This can be done by passing -i
before the script.
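As a quick, self-contained sketch of these start-up options, the interpreter can be driven from Python itself via the subprocess module (passing an argument list avoids shell quoting altogether; json.tool is just a convenient standard-library module to run with -m):

```python
import subprocess
import sys

# Equivalent of: python -c 'print(2 + 3)'
# Passing arguments as a list sidesteps shell quoting issues.
result = subprocess.run(
    [sys.executable, "-c", "print(2 + 3)"],
    capture_output=True, text=True,
)
print(result.stdout.strip())  # 5

# Equivalent of: echo '{"a": 1}' | python -m json.tool
result = subprocess.run(
    [sys.executable, "-m", "json.tool"],
    input='{"a": 1}', capture_output=True, text=True,
)
print(result.stdout.strip())
```

In an interactive shell you would instead quote the command, e.g. python -c 'print(2 + 3)'.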

There are tools which use docstrings to automatically produce online or printed
documentation or to let the user interactively browse through code; it’s good
practice to include docstrings in code that you write, so make a habit of it. The
execution of a function introduces a new symbol table used for the local variables
of the function. More precisely, all variable assignments in a function store the
value in the local symbol table; whereas variable references first look in the local
symbol table, then in the local symbol tables of enclosing functions, then in the
global symbol table, and finally in the table of built-in names. Thus, global
variables cannot be directly assigned a value within a function (unless named in a
global statement), although they may be referenced. The actual parameters
(arguments) to a function call are introduced in the local symbol table of the called
function when it is called; thus, arguments are passed using call by value (where
the value is always an object reference, not the value of the object).1 When a
function calls another function, a new local symbol table is created for that call. A
function definition introduces the function name in the current symbol table. The
value of the function name has a type that is recognized by the interpreter as a
user-defined function. This value can be assigned to another name which can then
also be used as a function.

Annotations are stored in the __annotations__ attribute of the function as a
dictionary and have no effect on any other part of the function. Parameter
annotations are defined by a colon after the parameter name, followed by an
expression evaluating to the value of the annotation. Return annotations are
defined by a literal ->, followed by an expression, between the parameter list and
the colon denoting the end of the def statement.
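For example (greet is a hypothetical function; the annotations are metadata only and do not change runtime behaviour):

```python
def greet(name: str, excited: bool = False) -> str:
    # Parameter annotations follow a colon; the return annotation follows ->.
    return f"Hello, {name}{'!' if excited else '.'}"

# The annotations end up in the function's __annotations__ dictionary,
# keyed by parameter name plus the special key 'return'.
print(greet.__annotations__)
print(greet("Ada", excited=True))  # Hello, Ada!
```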

The comparison operators in and not in check whether a value occurs (does not
occur) in a sequence. The operators is and is not compare whether two objects
are really the same object; this only matters for mutable objects like lists. All
comparison operators have the same priority, which is lower than that of all
numerical operators. Comparisons can be chained. For example, a < b == c tests
whether a is less than b and moreover b equals c. Comparisons may be combined
using the Boolean operators and and or, and the outcome of a comparison (or of
any other Boolean expression) may be negated with not. These have lower
priorities than comparison operators; between them, not has the highest priority
and or the lowest, so that A and not B or C is equivalent to (A and (not B)) or C.
As always, parentheses can be used to express the desired composition. The
Boolean operators and and or are so-called short-circuit operators: their
arguments are evaluated from left to right, and evaluation stops as soon as the
outcome is determined. For example, if A and C are true but B is false, A and B
and C does not evaluate the expression C. When used as a general value and not
as a Boolean, the return value of a short-circuit operator is the last evaluated
argument.
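The behaviours above can be demonstrated in a few lines (all variable names here are illustrative):

```python
a, b, c = 1, 2, 2
print(a < b == c)      # True: 1 < 2 and moreover 2 == 2 (chained comparison)
print(3 in [1, 2, 3])  # True: membership test

# Short-circuit evaluation: the result is the last evaluated operand,
# which makes `or` handy for fallback defaults.
name = "" or "default"
print(name)            # default

xs = [1, 2]
ys = xs                # both names refer to the same list object
zs = [1, 2]            # an equal but distinct list
print(ys is xs, zs is xs)  # True False
print(zs == xs)        # True: equal contents, different identity
```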
Classes provide a means of bundling data and functionality together. Creating a
new class creates a new type of object, allowing new instances of that type to be
made. Each class instance can have attributes attached to it for maintaining its
state. Class instances can also have methods (defined by its class) for modifying its
state. Compared with other programming languages, Python’s class mechanism
adds classes with a minimum of new syntax and semantics. It is a mixture of the
class mechanisms found in C++ and Modula-3. Python classes provide all the
standard features of Object Oriented Programming: the class inheritance
mechanism allows multiple base classes, a derived class can override any methods
of its base class or classes, and a method can call the method of a base class with
the same name. Objects can contain arbitrary amounts and kinds of data. As is true
for modules, classes partake of the dynamic nature of Python: they are created at
runtime, and can be modified further after creation. In C++ terminology, normally
class members (including the data members) are public (except see below Private
Variables), and all member functions are virtual. As in Modula-3, there are no
shorthands for referencing the object’s members from its methods: the method
function is declared with an explicit first argument representing the object, which
is provided implicitly by the call. As in Smalltalk, classes themselves are objects.
This provides semantics for importing and renaming. Unlike C++ and Modula-3,
built-in types can be used as base classes for extension by the user. Also, like in
C++, most built-in operators with special syntax (arithmetic operators,
subscripting etc.) can be redefined for class instances. (Lacking universally
accepted terminology to talk about classes, I will make occasional use of
Smalltalk and C++ terms. I would use Modula-3 terms, since its object-oriented
semantics are closer to those of Python than C++, but I expect that few readers
have heard of it.)
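A small sketch of these mechanisms, using hypothetical Account and SavingsAccount classes to show the explicit self argument, inheritance, overriding, and calling a base-class method of the same name:

```python
class Account:
    """A toy class: attributes hold per-instance state, methods modify it."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # The explicit first argument (self) is supplied implicitly
        # by the call acct.deposit(...).
        self.balance += amount
        return self.balance


class SavingsAccount(Account):
    """A derived class can override a method and still call the base's."""

    def deposit(self, amount):
        return Account.deposit(self, amount + 1)  # flat bonus, for illustration


acct = SavingsAccount("Ada")
print(acct.deposit(100))          # 101
print(isinstance(acct, Account))  # True
```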
Objects have individuality, and multiple names (in multiple scopes) can be bound
to the same object. This is known as aliasing in other languages. This is usually not
appreciated on a first glance at Python, and can be safely ignored when dealing
with immutable basic types (numbers, strings, tuples). However, aliasing has a
possibly surprising effect on the semantics of Python code involving mutable
objects such as lists, dictionaries, and most other types. This is usually used to the
benefit of the program, since aliases behave like pointers in some respects. For
example, passing an object is cheap since only a pointer is passed by the
implementation; and if a function modifies an object passed as an argument, the
caller will see the change — this eliminates the need for two different argument
passing mechanisms as in Pascal.
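The aliasing behaviour described above can be seen directly (append_item is a hypothetical helper):

```python
def append_item(seq, item):
    # `seq` is a reference to the caller's list, so this mutation is
    # visible to the caller: no copy is made when passing the object.
    seq.append(item)

data = [1, 2]
alias = data            # both names are now bound to the same list object
append_item(alias, 3)
print(data)             # [1, 2, 3]: the change shows through either name

t = (1, 2)              # immutable, so aliasing is harmless here
u = t
print(u is t)           # True
```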

A namespace is a mapping from names to objects. Most namespaces are currently
implemented as Python dictionaries, but that’s normally not noticeable in any way
(except for performance), and it may change in the future. Examples of
namespaces are: the set of built-in names (containing functions such as abs(), and
built-in exception names); the global names in a module; and the local names in a
function invocation. In a sense the set of attributes of an object also form a
namespace. The important thing to know about namespaces is that there is
absolutely no relation between names in different namespaces; for instance, two
different modules may both define a function maximize without confusion — users
of the modules must prefix it with the module name. By the way, I use the word
attribute for any name following a dot. For example, in the expression z.real,
real is an attribute of the object z. Strictly speaking, references to names in
modules are attribute references: in the expression modname.funcname, modname
is a module object and funcname is an attribute of it. In this case there happens to
be a straightforward mapping between the module’s attributes and the global
names defined in the module: they share the same namespace!1 Attributes may be
read-only or writable. In the latter case, assignment to attributes is possible.
Module attributes are writable: you can write modname.the_answer = 42.
Writable attributes may also be deleted with the del statement. For example, del
modname.the_answer will remove the attribute the_answer from the object
named by modname. Namespaces are created at different moments and have
different lifetimes. The namespace containing the built-in names is created when
the Python interpreter starts up, and is never deleted. The global namespace for a
module is created when the module definition is read in; normally, module
namespaces also last until the interpreter quits. The statements executed by the
top-level invocation of the interpreter, either read from a script file or
interactively, are considered part of a module called __main__, so they have their
own global namespace. (The built-in names actually also live in a module; this is
called builtins.) The local namespace for a function is created when the function
is called, and
deleted when the function returns or raises an exception that is not handled within
the function. (Actually, forgetting would be a better way to describe what actually
happens.) Of course, recursive invocations each have their own local namespace.
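A short sketch of these namespace rules; the outer/inner functions and the throwaway "demo" module are illustrative only:

```python
import builtins
import types

def outer():
    x = "enclosing"
    def inner():
        # Name lookup order: local, enclosing, global, built-in namespace.
        return x
    return inner()

print(outer())              # enclosing
print(builtins.abs is abs)  # True: built-in names live in the builtins module

# Module attributes are writable and deletable.
mod = types.ModuleType("demo")     # a throwaway module object
mod.the_answer = 42
print(mod.the_answer)              # 42
del mod.the_answer
print(hasattr(mod, "the_answer"))  # False
```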

To speed up loading modules, Python caches the compiled version of each module
in the __pycache__ directory under the name module.version.pyc, where the
version encodes the format of the compiled file; it generally contains the Python
version number. For example, in CPython release 3.3 the compiled version of
spam.py would be cached as __pycache__/spam.cpython-33.pyc. This naming
convention allows
compiled modules from different releases and different versions of Python to
coexist. Python checks the modification date of the source against the compiled
version to see if it’s out of date and needs to be recompiled. This is a completely
automatic process. Also, the compiled modules are platform-independent, so the
same library can be shared among systems with different architectures. Python
does not check the cache in two circumstances. First, it always recompiles and
does not store the result for the module that’s loaded directly from the command
line. Second, it does not check the cache if there is no source module. To support
a non-source (compiled only) distribution, the compiled module must be in the
source directory, and there must not be a source module. Some tips for experts:

You can use the -O or -OO switches on the Python command to reduce the size of
a compiled module. The -O switch removes assert statements, the -OO switch
removes both assert statements and docstrings. Since some programs may rely on
having these available, you should only use this option if you know what you’re
doing. “Optimized” modules have an opt- tag and are usually smaller. Future
releases may change the effects of optimization.

A program doesn’t run any faster when it is read from a .pyc file than when it is
read from a .py file; the only thing that’s faster about .pyc files is the speed with
which they are loaded. The compileall module can create .pyc files for all
modules in a directory. There is more detail on this process, including a flow
chart of the decisions.
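As a minimal sketch of this caching, the snippet below compiles a throwaway module (spam.py, created in a temporary directory purely for illustration) with the standard compileall module and shows the cached file appearing under __pycache__:

```python
import compileall
import pathlib
import tempfile

# Compile every module under a directory; the .pyc lands in __pycache__
# under a name that encodes the interpreter version.
with tempfile.TemporaryDirectory() as d:
    src = pathlib.Path(d) / "spam.py"
    src.write_text("VALUE = 42\n")
    compileall.compile_dir(d, quiet=1)  # quiet=1 suppresses the file listing
    cached = list((pathlib.Path(d) / "__pycache__").glob("spam.*.pyc"))
    print(len(cached))  # 1
```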

THONNY IDE:

Thonny is a small and lightweight Integrated Development Environment. It was


developed to provide a small and fast IDE, which has only a few dependencies
from other packages. Another goal was to be as independent as possible from a
special Desktop Environment like KDE or GNOME, so Thonny only requires the
GTK2 toolkit and therefore you only need the GTK2 runtime libraries installed to
run it.
For compiling Thonny yourself, you will need the GTK (>= 2.6.0) libraries and
header files. You will also need the Pango, Glib and ATK libraries and header files.
All these files are available at http://www.gtk.org. Furthermore you need, of
course, a C compiler and the Make tool; a C++ compiler is also required for the
included Scintilla library. The GNU versions of these tools are recommended.

Compiling Thonny is quite easy. The following should do it:

% ./configure

% make

% make install

The configure script supports several common options, for a detailed list, type

% ./configure --help

There are also some compile time options which can be found in src/Thonny.h.
Please see Appendix C for more information. In the case that your system lacks
dynamic linking loader support, you probably want to pass the option --disable-vte
to the configure script. This prevents

compiling Thonny with dynamic linking loader support to automatically load


libvte.so.4 if available. Thonny has been successfully compiled and tested under
Debian 3.1 Sarge, Debian 4.0 Etch, Fedora Core 3/4/5, Linux From Scratch and
FreeBSD 6.0. It also compiles under Microsoft Windows.
At startup, Thonny loads all files from the last time Thonny was launched. You
can disable this feature in the preferences dialog (see Figure 3-4). If you specify
some files on the command line, only these files will be opened, but you can find
the files from the last session in the file menu under the "Recent files" item. By
default this contains the last 10 recently opened files. You can change the amount
of recently opened files in the preferences dialog. You can start several instances
of Thonny, but only the first will load files from the last session. To run a second
instance of Thonny, do not specify any file names on the command-line, or
disable opening files in a running instance using the appropriate command line
option.

Thonny detects an already running instance of itself and opens files from the
command-line in the already running instance. So, Thonny can be used to view
and edit files by opening them from other programs such as a file manager. If you
do not like this for some reason, you can disable using the first instance by using
the appropriate command line option.

If you have installed libvte.so in your system, it is loaded automatically by Thonny


and you will have a terminal widget in the notebook at the bottom. If Thonny
cannot find libvte.so at startup, the terminal widget will not be loaded. So there is
no need to install the package containing this file in order to run Thonny.
Additionally, you can disable the use of the terminal widget by a command line
option; for more information see Section 3.2. You can use this terminal (from
now on called the VTE) nearly as a usual terminal program like xterm. There is basic
clipboard support. You can paste the contents of the clipboard by pressing the right
mouse button to open the popup menu and choosing Paste. To copy text from the
VTE, just select the desired text and then press the right mouse button and choose
Copy from the pop up menu. On systems running the X Window System you can
paste the last selected text by pressing the middle mouse button in the VTE (on 2-
button mice, the middle button can often be simulated by pressing both mouse
buttons together).

As long as a project is open, the Make and Run commands will use the project’s
settings, instead of the defaults. These will be used whichever document is
currently displayed. The current project’s settings are saved when it is closed, or
when Thonny is shut down. When restarting Thonny, the previously opened
project file that was in use at the end of the last session will be reopened.

Execute will run the corresponding executable file, shell script or interpreted script
in a terminal window. Note that the Terminal tool path must be correctly set in the
Tools tab of the Preferences dialog - you can use any terminal program that runs a
Bourne compatible shell and accepts the "-e" command line argument to start a
command. After your program or script has finished executing, you will be
prompted to press the return key. This allows you to review any text output from
the program before the terminal window is closed.

By default the Compile and Build commands invoke the compiler and linker with
only the basic arguments needed by all programs. Using Set Includes and
Arguments you can add any include paths and compile flags for the compiler, any
library names and paths for the linker, and any arguments you want to use when
running Execute.

Thonny has basic printing support. This means you can print a file by passing the
filename of the current file to a command which actually prints the file. However,
the printed document contains no syntax highlighting.
CHAPTER 6

RESULTS
CHAPTER 7

CONCLUSION

As countries around the world evolve into more economy-driven societies,
stimulating their economies is the goal. Fighting fraudsters and money launderers
was quite a complex task before the era of machine learning, but thanks to
machine learning and AI we are now able to counter these kinds of attacks. The
proposed solution can be used by insurance companies to find out whether a
given insurance claim is fraudulent or not. The model was designed after testing
multiple algorithms in order to arrive at the best model for detecting whether a
claim is fraudulent. This is aimed at insurance companies as a pitch to build a
model tailored to their own systems. The model should be simple enough to
handle big datasets, yet sophisticated enough to achieve a decent success rate.

REFERENCES:

 Abhijeet Urunkar, Amruta Khot, Rashmi Bhat, Nandinee Mudegol, "Fraud
Detection and Analysis for Insurance Claim using Machine Learning", 2022
IEEE International Conference on Signal Processing, Informatics,
Communication and Energy Systems (SPICES)
 Riya Roy, K. Thomas George, "Detecting insurance claims fraud using
machine learning techniques", 2017 International Conference on Circuit,
Power and Computing Technologies (ICCPCT)
 G. Kowshalya, M. Nandhini, "Predicting Fraudulent Claims in Automobile
Insurance", 2018 Second International Conference on Inventive
Communication and Computational Technologies (ICICCT)
 Sonakshi Harjai, Sunil Kumar Khatri, Gurinder Singh, "Detecting Fraudulent
Insurance Claims Using Random Forests and Synthetic Minority
Oversampling Technique", 2019 4th International Conference on Information
Systems and Computer Networks (ISCON)
 Jaideep Gera, Anitha Rani Palakayala, Venkata Kishore Kumar Rejeti, Tenali
Anusha, "Blockchain Technology for Fraudulent Practices in Insurance
Claim Process", 2020 5th International Conference on Communication and
Electronics Systems (ICCES)
 Shamitha S.K., V. Ilango, "A time-efficient model for detecting fraudulent
health insurance claims using Artificial neural networks", 2020 International
Conference on System, Computation, Automation and Networking
(ICSCAN)
 B. Stefano, F. Gisella, "Insurance fraud evaluation: a fuzzy expert
system", 10th IEEE International Conference on Fuzzy Systems (Cat.
No.01CH37297)
 Charles Muranda, Ahmed Ali, Thokozani Shongwe, "Detecting Fraudulent
Motor Insurance Claims Using Support Vector Machines with Adaptive
Synthetic Sampling Method", 2020 61st International Scientific Conference
on Information Technology and Management Science of Riga Technical
University (ITMS)
 Iffa Maula Nur Prasasti, Arian Dhini, Enrico Laoh, "Automobile Insurance
Fraud Detection using Supervised Classifiers", 2020 International Workshop
on Big Data and Information Security (IWBIS)
 Baker Alhasan, Mohammad Qatawneh, Wesam Almobaideen, "Blockchain
Technology for Preventing Counterfeit in Health Insurance", 2021
International Conference on Information Technology (ICIT)
 Aayushi Verma, Anu Taneja, Anuja Arora, "Fraud detection and frequent
pattern matching in insurance claims using data mining techniques", 2017
Tenth International Conference on Contemporary Computing (IC3)
 Gokay Saldamli, Vamshi Reddy, Krishna S. Bojja, Manjunatha K.
Gururaja, Yashaswi Doddaveerappa, Loai Tawalbeh, "Health Care Insurance
Fraud Detection Using Blockchain", 2020 Seventh International Conference
on Software Defined Systems (SDS)
 Krishanu Prabha Sinha, Mehdi Sookhak, Shaoen Wu, "Agentless Insurance
Model Based on Modern Artificial Intelligence", 2021 IEEE 22nd
International Conference on Information Reuse and Integration for Data
Science (IRI)
 Md Enamul Haque, Mehmet Engin Tozal, "Identifying Health Insurance
Claim Frauds Using Mixture of Clinical Concepts", IEEE Transactions on
Services Computing (Volume 15, Issue 4, July-Aug. 2022)
 Najmeddine Dhieb, Hakim Ghazzai, Hichem Besbes, Yehia Massoud, "A
Secure AI-Driven Architecture for Automated Insurance Systems: Fraud
Detection and Risk Measurement", IEEE Access (Volume 8, 25 March 2020)
 Ivan Fursov, Elizaveta Kovtun, Rodrigo Rivera-Castro, Alexey Zaytsev,
Rasul Khasyanov, Martin Spindler, Evgeny Burnaev, "Sequence Embeddings
Help Detect Insurance Fraud", IEEE Access (Volume 10, 07 February 2022)
 Aysha Alnuaimi, Amna Alshehhi, Khaled Salah, Raja Jayaraman, Ilhaam A.
Omar, Ammar Battah, "Blockchain-Based Processing of Health Insurance
Claims for Prescription Drugs", IEEE Access (Volume 10, 04 November 2022)
