
Introduction To Data Visualization With Seaborn Chapter 3


Count plots and bar plots
INTRODUCTION TO DATA VISUALIZATION WITH SEABORN

Erin Case
Data Scientist
Categorical plots
Examples: count plots, bar plots

Involve a categorical variable

Comparisons between groups


catplot()
Used to create categorical plots

Same advantages as relplot()

Easily create subplots with col= and row=


countplot() vs. catplot()
import matplotlib.pyplot as plt
import seaborn as sns

sns.countplot(x="how_masculine",
              data=masculinity_data)

plt.show()


countplot() vs. catplot()
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="how_masculine",
            data=masculinity_data,
            kind="count")

plt.show()


Changing the order
import matplotlib.pyplot as plt
import seaborn as sns

category_order = ["No answer",
                  "Not at all",
                  "Not very",
                  "Somewhat",
                  "Very"]

sns.catplot(x="how_masculine",
            data=masculinity_data,
            kind="count",
            order=category_order)

plt.show()
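
A count plot tallies how many rows fall into each category, and `order=` only changes the sequence in which those tallies are drawn. The sketch below illustrates that idea with the standard library; the `responses` list and the `ordered_counts` helper are hypothetical stand-ins, not part of the course data or of seaborn.

```python
from collections import Counter

# Hypothetical survey answers standing in for the "how_masculine" column.
responses = ["Very", "Somewhat", "Somewhat", "Not very", "Very",
             "No answer", "Somewhat", "Not at all", "Very", "Somewhat"]

category_order = ["No answer", "Not at all", "Not very", "Somewhat", "Very"]


def ordered_counts(values, order):
    """Return (category, count) pairs in the requested display order."""
    tally = Counter(values)
    # Counter returns 0 for categories that never occur.
    return [(category, tally[category]) for category in order]


counts = ordered_counts(responses, category_order)
for category, n in counts:
    print(f"{category:<12}{n}")
```

Each pair corresponds to one bar: the category is the position on the x-axis and the count is the bar height.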


Bar plots
Displays the mean of a quantitative variable per category

import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="day",
            y="total_bill",
            data=tips,
            kind="bar")

plt.show()
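
To make the bar heights concrete, here is a minimal sketch of the per-category mean that a bar plot displays. It uses only the standard library; the `rows` values are made up for illustration and `mean_per_category` is a hypothetical helper, not a seaborn function.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (day, total_bill) rows standing in for the tips data.
rows = [("Thur", 10.0), ("Thur", 20.0), ("Fri", 15.0),
        ("Sat", 30.0), ("Sat", 20.0), ("Sat", 10.0), ("Sun", 25.0)]


def mean_per_category(pairs):
    """Group values by category and return the mean of each group."""
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}


bar_heights = mean_per_category(rows)
for day, height in bar_heights.items():
    print(day, height)
```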


Confidence intervals
Lines show 95% confidence intervals for the mean

Shows uncertainty about our estimate

Assumes our data is a random sample
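
One common way to estimate such an interval is bootstrapping: resample the data with replacement many times, compute the mean of each resample, and take the middle 95% of those means. The sketch below shows the idea under stated assumptions; the `bills` values, the `bootstrap_ci` helper, and its default of 1000 resamples are illustrative choices, and the result will not exactly match the interval seaborn draws.

```python
import random
from statistics import mean

# Hypothetical bill amounts for a single category.
bills = [12.5, 18.0, 22.3, 9.8, 30.1, 15.6, 27.4, 11.2, 19.9, 24.7]


def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Approximate a confidence interval for the mean by resampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    boot_means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    # Drop an equal share of resampled means from each tail.
    lo_idx = int(n_boot * (100 - level) / 200)
    hi_idx = n_boot - 1 - lo_idx
    return boot_means[lo_idx], boot_means[hi_idx]


low, high = bootstrap_ci(bills)
print(f"mean = {mean(bills):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

A wider interval signals more uncertainty about the estimated mean, which is what the error lines on the bars convey.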


Turning off confidence intervals
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="day",
            y="total_bill",
            data=tips,
            kind="bar",
            ci=None)  # seaborn 0.12+ deprecates ci; errorbar=None is the replacement

plt.show()


Changing the orientation
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="total_bill",
            y="day",
            data=tips,
            kind="bar")

plt.show()


Let's practice!

Creating a box plot

Erin Case
Data Scientist
What is a box plot?
Shows the distribution of quantitative data

See median, spread, skewness, and outliers

Facilitates comparisons between groups


How to create a box plot
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box")

plt.show()


Change the order of categories
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                order=["Dinner",
                       "Lunch"])

plt.show()


Omitting the outliers using `sym`
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                sym="")  # if your seaborn version rejects sym, showfliers=False also hides outliers

plt.show()


Changing the whiskers using `whis`
By default, the whiskers extend to 1.5 * the interquartile range

Make them extend to 2.0 * IQR: whis=2.0

Show the 5th and 95th percentiles: whis=[5, 95]

Show min and max values: whis=[0, 100]
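
The multiplier form of `whis` can be illustrated with a short standard-library sketch: compute the quartiles, build fences at `whis` times the IQR beyond them, and draw each whisker to the most extreme data point still inside its fence. The `bills` values and the `whisker_limits` helper are hypothetical, and the percentile form (`whis=[5, 95]`) is not covered here.

```python
from statistics import quantiles

# Hypothetical bill amounts with one unusually large value.
bills = [1, 2, 3, 4, 5, 6, 7, 8, 14]


def whisker_limits(values, whis=1.5):
    """Return (lower whisker, upper whisker, outliers) for a multiplier whis."""
    data = sorted(values)
    # "inclusive" interpolates linearly between observed data points.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return inside[0], inside[-1], outliers


default_limits = whisker_limits(bills)           # whis=1.5
wider_limits = whisker_limits(bills, whis=2.0)   # whis=2.0
print(default_limits)
print(wider_limits)
```

With these numbers the value 14 lies beyond the 1.5 * IQR fence and is flagged as an outlier, but it falls inside the 2.0 * IQR fence and becomes the end of the upper whisker.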


Changing the whiskers using `whis`
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                whis=[0, 100])

plt.show()


Let's practice!

Point plots

Erin Case
Data Scientist
What are point plots?
Points show mean of quantitative variable

Vertical lines show 95% confidence intervals


Line plot: average level of nitrogen dioxide over time
Point plot: average restaurant bill, smokers vs. non-smokers


Point plots vs. line plots
Both show:

Mean of quantitative variable

95% confidence intervals for the mean

Differences:

Line plot has quantitative variable (usually time) on x-axis

Point plot has categorical variable on x-axis


Point plots vs. bar plots
Both show:

Mean of quantitative variable

95% confidence intervals for the mean



Creating a point plot
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="age",
            y="masculinity_important",
            data=masculinity_data,
            hue="feel_masculine",
            kind="point")

plt.show()


Disconnecting the points
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="age",
            y="masculinity_important",
            data=masculinity_data,
            hue="feel_masculine",
            kind="point",
            join=False)  # deprecated in seaborn 0.13; linestyle="none" is the replacement

plt.show()


Displaying the median
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point")

plt.show()


Displaying the median
import matplotlib.pyplot as plt
import seaborn as sns
from numpy import median

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point",
            estimator=median)

plt.show()
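
Switching the estimator matters most when a group contains extreme values. As a small illustration with made-up numbers (the `bills` list is hypothetical), a single large bill pulls the mean upward while the median barely moves:

```python
from statistics import mean, median

# Hypothetical bills for one group; the last value is an extreme one.
bills = [10.0, 12.0, 11.0, 13.0, 54.0]

bill_mean = mean(bills)      # sensitive to the extreme value
bill_median = median(bills)  # middle value of the sorted data

print("mean:  ", bill_mean)
print("median:", bill_median)
```

This is why the median is often preferred as the plotted statistic when the data contains many outliers.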


Customizing the confidence intervals
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point",
            capsize=0.2)

plt.show()


Turning off confidence intervals
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point",
            ci=None)  # seaborn 0.12+ deprecates ci; errorbar=None is the replacement

plt.show()


Let's practice!