Diabetes Prediction System
In [1]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import warnings
warnings.filterwarnings('ignore')
In [2]: df=pd.read_csv("diabetes.csv")
df.head()
Out[2]: Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
1 1 85 66 29 0 26.6 0.351 31 0
3 1 89 66 23 94 28.1 0.167 21 0
In [3]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Pregnancies 768 non-null int64
1 Glucose 768 non-null int64
2 BloodPressure 768 non-null int64
3 SkinThickness 768 non-null int64
4 Insulin 768 non-null int64
5 BMI 768 non-null float64
6 DiabetesPedigreeFunction 768 non-null float64
7 Age 768 non-null int64
8 Outcome 768 non-null int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB
In [5]: df.describe()
CHECKING FOR MISSING VALUES
In [6]: df.isnull().sum()
Out[6]: Pregnancies 0
Glucose 0
BloodPressure 0
SkinThickness 0
Insulin 0
BMI 0
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64
In [12]: sns.heatmap(df.isnull())
Out[12]: <AxesSubplot:>
CORRELATION MATRIX
In [14]: df.corr()
Out[14]: Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesP
X stores all the independent variables (the features), and Y stores the target variable ("Outcome").
A train-test split is a technique used in machine learning to assess how a model performs on data it
has not seen. It divides the dataset into a training set and a testing set; a test size of 0.2 means
that 20% of the data is held out for testing and 80% is used for training.
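The split described above might look roughly like the sketch below. The original split cell is not shown here, so the details are assumptions: the lowercase names x and y, the random_state value, and the synthetic stand-in DataFrame (random values with 768 rows, used only so the block runs without diabetes.csv).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same row count as diabetes.csv (768 rows).
# The values are random and for illustration only.
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    "Glucose": rng.integers(50, 200, size=768),
    "BMI": rng.uniform(18.0, 45.0, size=768).round(1),
    "Age": rng.integers(21, 80, size=768),
    "Outcome": rng.integers(0, 2, size=768),
})

x = demo_df.drop(columns="Outcome")  # independent variables (features)
y = demo_df["Outcome"]               # target variable

# test_size=0.2 holds out 20% of the rows; random_state (assumed value)
# makes the shuffle reproducible.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)

print(x_train.shape, x_test.shape)  # (614, 3) (154, 3)
```

With 768 rows, a 0.2 test size leaves 614 rows for training and 154 for testing, which agrees with the 154 predictions printed later in this notebook.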
Out[22]: LogisticRegression()
The logistic regression model is fitted on the training data (X train and y train) and stored in the
variable called model.
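A minimal sketch of the fitting step, under stated assumptions: the fitting cell itself is not shown here, so this uses scikit-learn's default LogisticRegression() settings and a small synthetic dataset (the names features and labels are hypothetical) in place of the diabetes data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-feature data standing in for the diabetes features.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
# Label is 1 when the sum of the two features is positive, else 0.
labels = (features[:, 0] + features[:, 1] > 0).astype(int)

x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

model = LogisticRegression()
model.fit(x_train, y_train)         # learns coefficients from the training set only

prediction = model.predict(x_test)  # one 0/1 label per test row
print(prediction.shape)             # (40,)
```

The test set is never passed to fit, so the predictions on x_test give an estimate of performance on unseen data.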
Making Prediction
In [23]: prediction =model.predict(x_test)
In [24]: print(prediction)
[1 0 1 0 0 1 0 0 1 1 1 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0
0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 1 1 0 1 0 1 1 0 1 0
0 0 0 1 0 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 1 0]
In [25]: accuracy=accuracy_score(y_test,prediction)
In [26]: print(accuracy)
0.7987012987012987
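Accuracy is the fraction of test rows whose predicted label equals the true label. The sketch below checks this by hand on a small pair of hypothetical label vectors (y_true and y_pred are made up for illustration) and then reproduces the reported score, which corresponds to 123 correct predictions out of the 154 test rows.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels, for illustration only.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

manual = (y_true == y_pred).mean()        # matches / total = 6 / 8
library = accuracy_score(y_true, y_pred)  # signature is (y_true, y_pred)
print(manual, library)                    # 0.75 0.75

# The notebook's score is the ratio 123 / 154.
reported = 123 / 154
print(reported)
```

Accuracy is symmetric in its two arguments, so swapping them does not change the value, but other scikit-learn metrics such as precision and recall are not, so passing the true labels first is the safer habit.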