
Customer Segmentation in Python Chapter 2


DataCamp Customer Segmentation in Python

CUSTOMER SEGMENTATION IN PYTHON

Introduction to RFM segmentation

Karolis Urbonas
Head of Data Science, Amazon

What is RFM segmentation?

Behavioral customer segmentation based on three metrics:

Recency (R)
Frequency (F)
Monetary Value (M)

Grouping RFM values

The RFM values can be grouped in several ways:

Percentiles (quantiles), e.g. quartiles
Pareto 80/20 cut
Custom - based on business knowledge
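For contrast with the percentile approach, a custom grouping with fixed thresholds could be sketched as below. The spend values, thresholds and tier names are made up for illustration and are not taken from the course data.

```python
import pandas as pd

# Hypothetical spend values for eight customers
spend = pd.Series([20, 75, 150, 420, 980, 60, 310, 1500], name='Spend')

# Assumed business thresholds: up to 100, 100 to 500, above 500
bins = [0, 100, 500, float('inf')]
labels = ['Low', 'Medium', 'High']

# pd.cut assigns each value to the interval it falls into
spend_tier = pd.cut(spend, bins=bins, labels=labels)

print(spend_tier.value_counts().sort_index())
```

Unlike percentile-based groups, the groups produced this way are not guaranteed to be of equal size.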

We are going to implement percentile-based grouping.



Short review of percentiles

Process of calculating percentiles:

1. Sort customers based on that metric
2. Break customers into a pre-defined number of groups of equal size
3. Assign a label to each group
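The three steps above can be sketched by hand before turning to a library helper. This is a minimal illustration that assumes the number of customers divides evenly into the number of groups; the spend values and the Spend_Group column name are hypothetical.

```python
import pandas as pd

# Hypothetical data: eight customers with made-up spend values
data = pd.DataFrame({
    'CustomerID': [1, 2, 3, 4, 5, 6, 7, 8],
    'Spend': [137, 335, 172, 355, 303, 233, 244, 229],
})

n_groups = 4

# 1. Sort customers by the metric
ranked = data.sort_values('Spend').reset_index(drop=True)

# 2. Break the sorted customers into groups of equal size
group_size = len(ranked) // n_groups  # 8 customers / 4 groups = 2 per group

# 3. Assign a label to each group (1 = lowest spend, 4 = highest spend)
ranked['Spend_Group'] = ranked.index // group_size + 1

print(ranked)
```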

Calculate percentiles with Python

Data with eight CustomerID values and randomly generated Spend values.
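A small DataFrame of this shape could be generated as follows. The seed and the value range are assumptions chosen only to make the example reproducible; they do not reproduce the values shown on the slide.

```python
import numpy as np
import pandas as pd

# Assumption: seed and value range are illustrative only
rng = np.random.default_rng(seed=42)

data = pd.DataFrame({
    'CustomerID': range(1, 9),
    # Sample without replacement so all eight Spend values are distinct
    'Spend': rng.choice(np.arange(100, 400), size=8, replace=False),
})

print(data)
```

Keeping the Spend values distinct means the quartile edges are unique, so each quartile ends up with exactly two customers.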



Calculate percentiles with Python


import pandas as pd

# Split Spend into four equal-sized groups labeled 1 (lowest) to 4 (highest)
spend_quartiles = pd.qcut(data['Spend'], q=4, labels=range(1,5))
data['Spend_Quartile'] = spend_quartiles

data.sort_values('Spend')

Assigning labels
Assign the highest score to the best value of the metric. Best is not always highest, e.g. recency.
For recency the labels are inverted: the more recent the customer, the better.

Assigning labels
# Create numbered labels
r_labels = list(range(4, 0, -1))

# Divide into groups based on quartiles
recency_quartiles = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)

# Create new column
data['Recency_Quartile'] = recency_quartiles

# Sort recency values from lowest to highest
data.sort_values('Recency_Days')

Assigning labels

As you can see, the quartile labels are reversed, since the more recent customers
are more valuable.

Custom labels

We can define a list of strings or any other values as labels, depending on the use case.

# Create string labels
r_labels = ['Active', 'Lapsed', 'Inactive', 'Churned']

# Divide into groups based on quartiles
recency_quartiles = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)

# Create new column
data['Recency_Quartile'] = recency_quartiles

# Sort values from lowest to highest
data.sort_values('Recency_Days')

Custom labels

Custom labels assigned to each quartile



Let's practice with percentiles!

Recency, Frequency, Monetary Value calculation

Definitions
Recency - days since last customer transaction
Frequency - number of transactions in the last 12 months
Monetary Value - total spend in the last 12 months

Dataset and preparations


Same online dataset as in the previous lessons
Need to do some data preparation
New TotalSum column = Quantity x UnitPrice
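As a minimal sketch, the TotalSum column can be derived with one vectorised multiplication. The three rows below are made up for illustration; only the column names come from the slides.

```python
import pandas as pd

# Hypothetical invoice lines; the real online dataset has many more rows and columns
online = pd.DataFrame({
    'InvoiceNo': ['A1', 'A1', 'A2'],
    'CustomerID': [101, 101, 102],
    'Quantity': [2, 1, 5],
    'UnitPrice': [3.50, 10.00, 0.80],
})

# Value of each invoice line = units bought x price per unit
online['TotalSum'] = online['Quantity'] * online['UnitPrice']

print(online)
```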

Data preparation steps

We're starting with a pre-processed online DataFrame with only the latest 12
months of data:

print('Min:{}; Max:{}'.format(min(online.InvoiceDate),
                              max(online.InvoiceDate)))

Min:2010-12-10; Max:2011-12-09

Let's create a hypothetical snapshot_date, as if we were running the analysis the day after the last transaction.

import datetime

snapshot_date = max(online.InvoiceDate) + datetime.timedelta(days=1)



Calculate RFM metrics


# Aggregate data on a customer level
datamart = online.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})

# Rename columns for easier interpretation
datamart.rename(columns = {'InvoiceDate': 'Recency',
                           'InvoiceNo': 'Frequency',
                           'TotalSum': 'MonetaryValue'}, inplace=True)

# Check the first rows
datamart.head()
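To see the aggregation end to end, here is a self-contained sketch on a tiny made-up dataset. It is a variation on the code above, not the course's exact method: it uses named aggregation and computes Recency from an intermediate LastPurchase column (a hypothetical helper name) instead of a lambda.

```python
import datetime

import pandas as pd

# Hypothetical transactions for three customers
online = pd.DataFrame({
    'CustomerID': [101, 101, 102, 103, 103, 103],
    'InvoiceNo': ['A1', 'A2', 'A3', 'A4', 'A5', 'A6'],
    'InvoiceDate': pd.to_datetime([
        '2011-11-01', '2011-12-09', '2011-06-15',
        '2011-01-10', '2011-03-20', '2011-09-30']),
    'TotalSum': [20.0, 35.5, 12.0, 8.0, 15.0, 30.0],
})

# Snapshot is one day after the latest transaction in the data
snapshot_date = online['InvoiceDate'].max() + datetime.timedelta(days=1)

# One row per customer: last purchase date, transaction count, total spend
datamart = online.groupby('CustomerID').agg(
    LastPurchase=('InvoiceDate', 'max'),
    Frequency=('InvoiceNo', 'count'),
    MonetaryValue=('TotalSum', 'sum'),
)

# Recency = whole days between the snapshot and the last purchase
datamart['Recency'] = (snapshot_date - datamart['LastPurchase']).dt.days
datamart = datamart[['Recency', 'Frequency', 'MonetaryValue']]

print(datamart)
```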

Final RFM values

Our table for RFM segmentation is completed!



Let's practice calculating RFM values!

Building RFM segments

Data
Dataset we created previously
Will calculate the quartile value for each column and name them R, F, M

Recency quartile
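# Lower recency is better, so labels run from 4 down to 1
r_labels = range(4, 0, -1)

# Split Recency into quartiles and attach the reversed labels
r_quartiles = pd.qcut(datamart['Recency'], 4, labels = r_labels)

# Store the labels in a new column R
datamart = datamart.assign(R = r_quartiles.values)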



Frequency and Monetary quartiles


# Higher frequency and monetary value are better, so labels run from 1 up to 4
f_labels = range(1,5)
m_labels = range(1,5)

f_quartiles = pd.qcut(datamart['Frequency'], 4, labels = f_labels)
m_quartiles = pd.qcut(datamart['MonetaryValue'], 4, labels = m_labels)

# Store the labels in new columns F and M
datamart = datamart.assign(F = f_quartiles.values)
datamart = datamart.assign(M = m_quartiles.values)
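The quartile scoring can be tried on a small stand-in table. The eight customers and their values below are invented so that every quartile edge is unique; the real datamart comes from the aggregation step.

```python
import pandas as pd

# Hypothetical RFM values for eight customers
datamart = pd.DataFrame({
    'Recency': [3, 10, 25, 40, 75, 120, 200, 330],
    'Frequency': [40, 32, 25, 18, 12, 7, 3, 1],
    'MonetaryValue': [900.0, 650.0, 700.0, 300.0, 210.0, 150.0, 60.0, 20.0],
}, index=pd.Index(range(1001, 1009), name='CustomerID'))

# Recency labels are reversed (low recency = score 4); F and M are not
r_labels = list(range(4, 0, -1))
f_labels = list(range(1, 5))
m_labels = list(range(1, 5))

datamart = datamart.assign(
    R=pd.qcut(datamart['Recency'], q=4, labels=r_labels),
    F=pd.qcut(datamart['Frequency'], q=4, labels=f_labels),
    M=pd.qcut(datamart['MonetaryValue'], q=4, labels=m_labels),
)

print(datamart)
```

On real data many customers can share the same value (for example a Frequency of 1), in which case pd.qcut may raise a ValueError about non-unique bin edges. One common workaround is to rank the values first, for instance with rank(method='first'), and cut the ranks instead.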

Build RFM Segment and RFM Score


Concatenate RFM quartile values to RFM_Segment
Sum RFM quartile values to RFM_Score

def join_rfm(x):
    return str(x['R']) + str(x['F']) + str(x['M'])

datamart['RFM_Segment'] = datamart.apply(join_rfm, axis=1)
datamart['RFM_Score'] = datamart[['R','F','M']].sum(axis=1)
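A vectorised variant of the same two steps is sketched below on three made-up customers. Casting R, F and M to int first is an assumption of this sketch; it keeps the string concatenation and the sum well defined even when the columns are categorical, as pd.qcut produces.

```python
import pandas as pd

# Hypothetical quartile scores for three customers
datamart = pd.DataFrame(
    {'R': [4, 3, 1], 'F': [4, 2, 1], 'M': [3, 2, 1]},
    index=pd.Index([1001, 1002, 1003], name='CustomerID'),
)

# Work on plain integer copies of the three score columns
rfm = datamart[['R', 'F', 'M']].astype(int)

# Segment: the three digits side by side, e.g. 4, 4, 3 -> '443'
datamart['RFM_Segment'] = (
    rfm['R'].astype(str) + rfm['F'].astype(str) + rfm['M'].astype(str)
)

# Score: the three digits added up, e.g. 4 + 4 + 3 = 11
datamart['RFM_Score'] = rfm.sum(axis=1)

print(datamart)
```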

Final result

Let's practice building RFM segments

Analyzing RFM segments

Largest RFM segments


datamart.groupby('RFM_Segment').size().sort_values(ascending=False)[:10]

Filtering on RFM segments


Select the bottom RFM segment "111" and view its first 5 rows
datamart[datamart['RFM_Segment']=='111'][:5]
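Both lookups, the largest segments and the rows of one segment, can be tried on a toy table. The six rows are invented, and value_counts is used here as an equivalent shortcut for the groupby, size and sort chain.

```python
import pandas as pd

# Hypothetical segment assignments for six customers
datamart = pd.DataFrame({
    'RFM_Segment': ['444', '111', '444', '211', '111', '444'],
    'Recency': [2, 310, 5, 150, 280, 9],
})

# Largest segments: value_counts sorts by size in descending order
segment_sizes = datamart['RFM_Segment'].value_counts().head(10)

# Customers in the bottom segment, first five rows at most
bottom = datamart[datamart['RFM_Segment'] == '111'].head(5)

print(segment_sizes)
print(bottom)
```

The comparison is against the string '111' because RFM_Segment holds concatenated text, not a number.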

Summary metrics per RFM Score


datamart.groupby('RFM_Score').agg({
    'Recency': 'mean',
    'Frequency': 'mean',
    'MonetaryValue': ['mean', 'count'] }).round(1)
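The dictionary form above yields two-level column names. As an alternative sketch, named aggregation gives flat column names; the output names (Recency_mean, Customers and so on) and the four sample rows are choices made for this example only.

```python
import pandas as pd

# Hypothetical customers with their RFM values and scores
datamart = pd.DataFrame({
    'Recency': [5, 15, 100, 300],
    'Frequency': [20, 10, 4, 1],
    'MonetaryValue': [500.0, 300.0, 80.0, 10.0],
    'RFM_Score': [12, 12, 6, 3],
})

# One row per score: average R, F, M values and the number of customers
summary = datamart.groupby('RFM_Score').agg(
    Recency_mean=('Recency', 'mean'),
    Frequency_mean=('Frequency', 'mean'),
    MonetaryValue_mean=('MonetaryValue', 'mean'),
    Customers=('MonetaryValue', 'count'),
).round(1)

print(summary)
```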

Grouping into named segments

Use RFM score to group customers into Gold, Silver and Bronze segments.

def segment_me(df):
    if df['RFM_Score'] >= 9:
        return 'Gold'
    elif (df['RFM_Score'] >= 5) and (df['RFM_Score'] < 9):
        return 'Silver'
    else:
        return 'Bronze'

datamart['General_Segment'] = datamart.apply(segment_me, axis=1)

datamart.groupby('General_Segment').agg({
    'Recency': 'mean',
    'Frequency': 'mean',
    'MonetaryValue': ['mean', 'count']
}).round(1)
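The same Gold, Silver and Bronze thresholds can also be applied without a row-wise function. The sketch below uses pd.cut and assumes scores are integers between 3 and 12, which is the range three quartile labels of 1 to 4 can sum to; the sample scores are made up.

```python
import pandas as pd

# Hypothetical RFM scores
scores = pd.Series([3, 4, 5, 8, 9, 12], name='RFM_Score')

# Right-closed intervals: (2, 4] Bronze, (4, 8] Silver, (8, 12] Gold
general_segment = pd.cut(
    scores,
    bins=[2, 4, 8, 12],
    labels=['Bronze', 'Silver', 'Gold'],
)

print(pd.concat([scores, general_segment.rename('General_Segment')], axis=1))
```

A score outside the 3 to 12 range would fall outside all bins and come back as missing, so the bin edges need adjusting if the scoring scheme changes.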

New segments and their values



Practice building custom segments
