Practical 4
Generate the Probability Density Function (PDF) and Cumulative Distribution Function (CDF) for the given Iris dataset to find the distribution of
its various attributes.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
iris = pd.read_csv('/content/Iris.csv')
iris.shape
(150, 6)
iris.describe()
plt.title('Species Count')
# Pass the column by keyword; recent seaborn versions reject positional data arguments
sns.countplot(x='Species', data=iris)
plt.figure(figsize=(17,9))
plt.title('Comparison between various species based on sepal length and width')
sns.scatterplot(x='SepalLengthCm', y='SepalWidthCm', hue='Species', data=iris, s=50)
Correlation is a statistical measure of whether a linear relationship exists between two variables, and of how strong it is. It shows whether large values
of one variable tend to occur with large values (positive correlation) or small values (negative correlation) of the other.
# Correlation coefficients between the measurement variables, computed separately for each species
iris.groupby("Species").corr()
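To make the coefficient concrete, here is a minimal sketch of the Pearson correlation computed directly from its definition. It assumes that `corr()` is using its default Pearson method. The helper name `pearson_r` and the sample values are hypothetical and are not taken from Iris.csv.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance of x and y scaled by both spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Illustrative values only (hypothetical, not read from the dataset)
length = [1.0, 2.0, 3.0, 4.0, 5.0]
width = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(length, width)
print(r)
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.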
Bi-variate Analysis
sns.pairplot(iris,hue="Species",height=4)
Checking Correlation
plt.figure(figsize=(10,11))
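# Keep only numeric columns; recent pandas versions raise an error if corr() meets the text-valued Species column
sns.heatmap(iris.select_dtypes(include='number').corr(), annot=True)
plt.show()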
# One box plot per measurement, grouped by species, on a 2x2 grid
fig, axes = plt.subplots(2, 2, figsize=(16, 9))
sns.boxplot(y="PetalWidthCm", x="Species", data=iris, orient='v', ax=axes[0, 0])
sns.boxplot(y="PetalLengthCm", x="Species", data=iris, orient='v', ax=axes[0, 1])
sns.boxplot(y="SepalLengthCm", x="Species", data=iris, orient='v', ax=axes[1, 0])
sns.boxplot(y="SepalWidthCm", x="Species", data=iris, orient='v', ax=axes[1, 1])
plt.show()
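The stated goal of the practical is a PDF and a CDF for each attribute. One common way to estimate both is from a histogram. The following is a minimal sketch under that assumption, not a definitive implementation. The helper name `empirical_pdf_cdf` and the sample values are hypothetical; in the notebook one would pass a column such as `iris['PetalLengthCm'].to_numpy()` instead.

```python
import numpy as np

def empirical_pdf_cdf(values, bins=10):
    """Histogram-based estimate of the PDF and CDF of a 1-D sample."""
    # density=True scales the bar heights so the histogram integrates to 1
    pdf, bin_edges = np.histogram(values, bins=bins, density=True)
    widths = np.diff(bin_edges)
    # The CDF at each right bin edge is the accumulated area under the PDF
    cdf = np.cumsum(pdf * widths)
    return bin_edges, pdf, cdf

# Illustrative values only (hypothetical, not read from Iris.csv)
sample = np.array([1.4, 1.5, 1.3, 4.7, 4.5, 5.1, 6.0, 5.9, 1.6, 4.9])
bin_edges, pdf, cdf = empirical_pdf_cdf(sample, bins=5)
print(pdf)
print(cdf)
```

Plotting `pdf` and `cdf` against `bin_edges[1:]` with `plt.plot` gives the two curves for one attribute. Repeating this per species shows how well that attribute separates the classes.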