CUSTOMER SEGMENTATION

USING MACHINE LEARNING


A Report submitted to the Rajiv Gandhi University of Knowledge and Technologies in partial fulfillment
of the degree of

Bachelor of Technology in

Electronics and Communication Engineering


Submitted by
Ch.Veerababu (S180642)

K.Bhavya (S180109)

B.Srinivasu (S180551)

V.Rupavathi (S180269)

K.Suryasri (S180291)

Under the supervision of

N.Ramesh Babu M. Tech


Asst. Professor-Department of ECE
RGUKT Srikakulam
Department of Electronics and Communication Engineering

Rajiv Gandhi University of Knowledge and Technologies, Srikakulam


S.M. Puram (V), Etcherla (M), Srikakulam (Dt) – 532410

CERTIFICATE
This is to certify that the work entitled “Customer segmentation using
k-means clustering” is the bona fide work of V. RUPAVATHI (ID No:
S180269), K. BHAVYA (ID No: S180109), B. SRINIVASU (ID No:
S180551), CH. VEERABABU (ID No: S180642) and K. SURYA SRI (ID
No: S180291), carried out under my guidance and supervision as the 4th
year major project of the Bachelor of Technology in the Department of
Electronics and Communication Engineering, RGUKT IIIT,
SRIKAKULAM. This work was done during the academic session
JANUARY 2024 - APRIL 2024.

Mr.N.Ramesh Babu, Mr.N.Ramesh Babu ,

Assistant professor, Head of the Department,

Department of ECE, Department of ECE,

RGUKT, SRIKAKULAM RGUKT, SRIKAKULAM



DECLARATION

We, Ch. Veerababu, K. Bhavya, V. Rupavathi, B. Srinivasu and K. Surya Sri, hereby declare that
this report entitled “CUSTOMER SEGMENTATION USING K-MEANS CLUSTERING” is
submitted by us under the guidance and supervision of Mr. N. Ramesh Babu.

We also declare that it has not been submitted previously, in part or in full, to this University or
any other University or Institution for the award of any degree or diploma.

V.Rupavathi (S180269)

K.Bhavya (S180109)

B.Srinivasu (S180551)

Ch.Veerababu (S180642)

K.Surya sri (S180291)



ACKNOWLEDGEMENT

We would like to express our sincere gratitude to our project guide, Mr. N. Ramesh Babu, for his
valuable suggestions and keen interest throughout the progress of our project.

We are grateful to Mr. N. Ramesh Babu, Assistant Professor of ECE, for providing excellent computing
facilities and a congenial atmosphere for progressing with our project.

We would also like to thank Rajiv Gandhi University of Knowledge and Technologies, Srikakulam,
for providing all the necessary resources for the successful completion of our course work.
Last but not least, we thank our classmates and other students for their physical and moral support.

With Sincere Regards

V.Rupavathi,

Ch.Veerababu,

K.Bhavya,

B.Srinivasu,

K.Surya sri.

ABSTRACT

Customer segmentation using machine learning is a technique that allows businesses to categorize
their customers into distinct groups based on common attributes such as demographics, behaviors,
and preferences. By using machine learning algorithms, companies can identify patterns and trends
within their customer base and use this information to create targeted marketing campaigns, improve
customer experience, and optimize business operations. This abstract provides an overview of the
benefits of customer segmentation using machine learning, including improved customer retention,
increased revenue, and more efficient resource allocation. Additionally, it outlines the main steps
involved in creating a customer segmentation model: data collection and cleaning, feature
engineering, model selection and training, and evaluation and deployment. Overall, customer
segmentation using machine learning offers businesses a tool to gain deeper insights into their
customers and make data-driven decisions that drive business growth.

Keywords: K-means clustering, segmentation, Machine learning(Unsupervised)


CONTENTS

Title
Certificate
Declaration
Acknowledgements
Abstract
List of Figures

1. INTRODUCTION
1.1 Introduction
1.2 Motivation
1.3 Problem Statement
1.4 Objectives
1.5 Goal
1.6 Scope
1.7 Applications
1.8 Limitations

2. LITERATURE SURVEY
2.1 Create Dataset
2.2 Study
2.3 Summary

3. EXISTING SYSTEM
3.1 Existing System
3.2 Disadvantages

4. PROPOSED SYSTEM
4.1 Proposed System
4.2 Advantages
4.3 System Requirements

5. METHODS AND ALGORITHMS
5.1 Algorithm
5.2 Factors

6. EXPERIMENT RESULTS

7. SOURCE CODE

8. CONCLUSION

9. REFERENCES

LIST OF FIGURES

Figure 1 - Dataset
Figure 2 - Data Preprocessing
Figure 3 - Distance
Figure 4 - Purchase Rate
Figure 5 - Gender
Figure 6 - Violin Plot
Figure 7 - Age Vs Number of Customers
Figure 8 - Purchase Rate Vs Number of Customers
Figure 9 - Distance Vs Purchase 2D Plot
Figure 10 - Distance Vs Purchase Rate (Elbow Graph)
Figure 11 - Distance Vs Purchase Rate (Centroid Graph)
Figure 12 - Distance Vs Purchase Rate Vs Age (3D Plot)

CHAPTER-1

INTRODUCTION
1.1 Introduction

Customer segmentation is the process of dividing a target market into distinct groups based
on their characteristics and behaviours. By analysing data such as demographics,
psychographics, and purchase history, businesses can identify common patterns among
customers. This segmentation allows businesses to personalize their marketing efforts,
allocate resources effectively, enhance customer satisfaction, foster loyalty, and drive
innovation. Overall, customer segmentation helps businesses understand and serve their
customers better, leading to improved business performance and growth.

1.2 Motivation

The motivation behind customer segmentation is to personalize marketing efforts, target the
right audience, optimize resources, enhance customer satisfaction and loyalty, differentiate
in the market, and drive product development and innovation. By understanding and catering
to the diverse needs of their customers, businesses can gain a competitive advantage and
achieve sustainable growth.

1.3 Problem Statement

The problem statement of customer segmentation includes challenges such as insufficient
data, complex customer behaviour, overlapping segments, inadequate segmentation criteria,
lack of real-time segmentation, integration issues, and limited resources and expertise.
Addressing these problems requires businesses to invest in data analysis capabilities,
advanced segmentation techniques, real-time approaches, collaboration, and continuous
refinement of strategies.

1.4 Objectives

The objectives of customer segmentation can be summarized as follows:

1. Targeted Marketing: Customer segmentation allows businesses to identify specific customer
segments and tailor their marketing efforts to effectively reach and engage each group. The objective
is to deliver personalized messages, promotions, and offerings that resonate with the unique needs
and preferences of each segment.

2. Improved Customer Satisfaction: By understanding the distinct characteristics and behaviours
of different customer segments, businesses can customize their products, services, and customer
experiences to better meet their specific needs. The objective is to enhance customer satisfaction by
providing relevant and personalized solutions.

3. Resource Allocation Optimization: Customer segmentation helps businesses allocate their
marketing resources more efficiently by focusing on the most profitable and high-potential segments.
The objective is to prioritize and allocate resources strategically to maximize returns on investment
and improve operational efficiency.

4. Increased Customer Loyalty: Personalized experiences and targeted marketing efforts
based on customer segmentation can foster stronger customer relationships. The objective is to
enhance customer loyalty by delivering tailored solutions, building trust, and nurturing long-term
customer engagement and advocacy.

5. Market Differentiation: By identifying unique customer segments and developing tailored
value propositions, businesses can differentiate themselves in the market. The objective is to position
the brand as a provider of specialized solutions that cater to specific customer needs, creating a
competitive advantage and attracting the right audience.

6. Product Development and Innovation: Customer segmentation provides insights into
customer preferences, pain points, and unmet needs. The objective is to leverage these insights to
drive product development and innovation, creating new offerings that address specific customer
requirements and stay ahead of competitors.

1.5 Goal

The goal of customer segmentation is to divide a target market into distinct groups based on
specific characteristics and behaviours, in order to better understand and meet the needs of
different customer segments. The primary goal is to enable businesses to deliver personalized
experiences, tailor marketing efforts, allocate resources effectively, enhance customer
satisfaction, foster customer loyalty, differentiate in the market, and drive business growth.
Ultimately, the goal of customer segmentation is to improve overall business performance
by effectively catering to the diverse needs and preferences of customers.

1.6 Scope

The scope of customer segmentation involves collecting relevant customer data, selecting
segmentation criteria, analysing the data to identify patterns, grouping customers into distinct
segments, creating segment profiles, developing targeted marketing strategies, implementing
and testing those strategies, and continuously refining the segmentation approach. It
encompasses various departments and functions within a business and aims to understand
customers, deliver personalized experiences, and improve overall business performance.

1.7 Applications

• Targeted Marketing Campaigns.
• Product Development and Customization.
• Customer Acquisition and Retention.
• Pricing and Revenue Management.
• Customer Experience and Personalization.
• Market Expansion and New Market Identification.


1.8 Limitations

1. Oversimplification: Customer segmentation involves categorizing customers into
distinct groups based on shared characteristics. However, this can lead to
oversimplification and overlooking individual differences within each segment.

2. Static Nature: Segmentation is often based on historical data and assumptions about
customer behavior. However, customer preferences and behaviors can change over time.
Segmentation models may not capture these changes effectively, resulting in outdated
and less accurate segment profiles.

3. Data Limitations: The effectiveness of customer segmentation relies on the
availability and quality of data. Businesses may face limitations in accessing
comprehensive and up-to-date data, hindering the accuracy and granularity of
segmentation efforts.

4. Cost and Resource Intensity: Implementing robust customer segmentation
requires significant resources, including data collection, analysis tools, and expertise.
Small businesses with limited budgets or lacking analytical capabilities may struggle to
fully leverage segmentation strategies.

CHAPTER-2

LITERATURE SURVEY

2.1 Create Dataset


A shopping mall store provided the dataset for clustering using the K-means algorithm. The data set
consists of six attributes and 200 tuples, representing the information of 200 customers.
The attributes are Customer ID, Gender, Age, Distance (1-200), Village name and
Purchase rate (%).

If a dataset contains null values, duplicates, or other noisy data, data cleaning must be performed.
Data cleansing ensures that information is reliable, usable, and available for analysis.
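
The cleaning checks described above can be sketched with pandas. This is a minimal illustration, not the report's own preprocessing step: the rows and village names are made-up sample values, and only the column names are taken from the dataset described in this chapter.

```python
import pandas as pd

# Hypothetical sample rows (assumption); column names follow the report's dataset.
raw = pd.DataFrame({
    "Customer": [1, 2, 2, 3, 4],
    "Gender": ["Male", "Female", "Female", "Male", None],
    "Age": [21, 34, 34, 45, 29],
    "Village": ["A", "B", "B", "C", "D"],
    "Distance(km)": [12.5, 80.0, 80.0, None, 150.0],
    "PurchaseRate": [15, 82, 82, 40, 60],
})

# Count missing values per column and fully duplicated rows.
null_counts = raw.isnull().sum()
duplicate_rows = int(raw.duplicated().sum())

# Drop exact duplicates first, then any row that still has a missing value.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)

print(null_counts)
print("duplicate rows:", duplicate_rows)
print(clean)
```

Dropping incomplete rows is only one possible policy; filling missing values (for example with a column median) may be preferable when the dataset is as small as 200 rows.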

Once the data is clean, we visualize it by comparing distance and purchase rate, broken down by
gender. The plots suggest about five groups of customers, each with its own combination of
distance and purchase rate.

We can now build a K-means model. The plots show that there are several groups, but not in great
detail, so the number of clusters has to be chosen. To do this we run k-means for a range of k
values (1 to 11 in the source code of Chapter 7) and, for each value, compute the sum of squared
distances from each point to its assigned center, known as the WCSS (Within Cluster Sum of Squares).
This is the elbow method. The silhouette score is an alternative criterion, in which one picks the
number of clusters that gives the best score. We noticed that beyond K=5 there is no rapid
decrease in WCSS, so K=5 is taken as the number of clusters.

We can then divide the plot into groups, determine which clusters should be prioritized, and assign a
label to each using the method stated above. The K-means approach assigns each customer to one of
the five clusters using distance and purchase rate.
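
Both criteria mentioned above, WCSS for the elbow method and the silhouette score, can be computed side by side with scikit-learn. The sketch below is an illustration under assumptions: it uses synthetic data (five tight blobs at arbitrary positions in a distance/purchase-rate plane) rather than the mall dataset, and the range of k values is chosen only for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic stand-in data (assumption): five well-separated blobs of 40 points each.
centers = np.array([[30, 15], [75, 45], [30, 80], [130, 80], [125, 15]], dtype=float)
X = np.vstack([c + rng.normal(scale=3.0, size=(40, 2)) for c in centers])

wcss = {}
silhouette = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    wcss[k] = model.inertia_  # within-cluster sum of squares for this k
    if k >= 2:  # the silhouette score is undefined for a single cluster
        silhouette[k] = silhouette_score(X, model.labels_)

# Elbow method: look for the k after which WCSS stops dropping quickly.
for k in wcss:
    print(k, round(wcss[k], 1), round(silhouette.get(k, float("nan")), 3))

# Silhouette method: pick the k with the highest score.
best_k = max(silhouette, key=silhouette.get)
print("best k by silhouette:", best_k)
```

On real customer data the elbow is usually less sharp than on synthetic blobs, so it can help to check that both criteria point to a similar k before settling on it.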

2.2 Study
Key Features of Customer Segmentation are

• Identification of Customer Groups


• Personalized Marketing

• Improved Targeting

• Enhanced Customer Satisfaction

• Resource Allocation Optimization

2.3 Summary
Customer segmentation is a strategic approach used by businesses to divide their target market
into distinct groups based on characteristics and behaviours. It aims to understand and meet the unique
needs of different customer segments. The benefits include improved targeting, enhanced satisfaction,
optimized resource allocation, customer loyalty, and innovation. The process involves data collection,
analysis, profile creation, and targeted marketing. However, there are limitations such as
oversimplification, data limitations, and biases. Applications include marketing, product
development, customer retention, pricing, and personalized experiences. Overall, customer
segmentation helps businesses understand customers and drive growth.

CHAPTER-3

EXISTING SYSTEM

3.1 Existing System


The existing system of Customer Segmentation is based on the Demographic factors.

Demographic factors include:

• Age

• Gender

• Income

• Education

• Ethnicity

3.2 DBSCAN

Clustering analysis, or simply clustering, is an unsupervised learning method that divides the data
points into a number of groups such that the data points in the same group have similar properties
and data points in different groups have different properties in some sense. It comprises many
different methods based on different measures of similarity, e.g. K-Means (distance between points),
Affinity propagation (graph distance), Mean-shift (distance between points), DBSCAN (distance between
nearest points), Gaussian mixtures (Mahalanobis distance to centers), Spectral clustering (graph
distance), etc.

Fundamentally, all clustering methods use the same approach: first we calculate similarities, and
then we use them to cluster the data points into groups. Here we focus on the Density-based spatial
clustering of applications with noise (DBSCAN) clustering method.

Steps Used In DBSCAN Algorithm:

1. Find all the neighbor points within eps of each point and identify the core points, i.e. the points with
more than MinPts neighbors.

2. For each core point, if it is not already assigned to a cluster, create a new cluster.

3. Recursively find all its density-connected points and assign them to the same cluster as the core point.
Two points a and b are said to be density-connected if there exists a point c that has a sufficient number of
points in its neighborhood and from which both a and b can be reached through a chain of core points, each
within eps distance of the next. This is a chaining process: if b is a neighbor of c, c is a neighbor of d, d is
a neighbor of e, and e is a neighbor of a (with c, d and e being core points), then b is density-connected to a.

4. Iterate through the remaining unvisited points in the dataset. Those points that do not belong to any cluster
are noise.
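
The steps above can be tried out with scikit-learn's `DBSCAN` class, whose `eps` and `min_samples` parameters play the roles of eps and MinPts. This is a hedged sketch on synthetic data (two dense groups plus two isolated points, all made up for the example), not part of the report's experiments.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# Synthetic data (assumption): two dense groups and two far-away outliers.
dense_a = rng.normal(loc=[20.0, 20.0], scale=1.0, size=(50, 2))
dense_b = rng.normal(loc=[60.0, 70.0], scale=1.0, size=(50, 2))
outliers = np.array([[100.0, 5.0], [0.0, 100.0]])
X = np.vstack([dense_a, dense_b, outliers])

# eps is the neighborhood radius; min_samples is the density threshold for core points.
db = DBSCAN(eps=3.0, min_samples=5).fit(X)
labels = db.labels_  # cluster index per point, -1 marks noise

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print("clusters:", n_clusters, "noise points:", n_noise)
```

Unlike k-means, the number of clusters is not given in advance here; it follows from the density parameters, and isolated points are reported as noise rather than forced into a cluster.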

3.3 Disadvantages

• Doesn’t provide any facilities for rural areas.

• Here the differentiation is limited.

• Ignoring distance factors.



CHAPTER-4

PROPOSED SYSTEM

4.1 Proposed System

The proposed system of customer segmentation uses both demographic and geographic
factors instead of only demographic factors.

Instead of Annual Income and Spending Score, we use Distance and Purchase rate.

4.2 Advantages

• Provide Home Delivery services to the customers who are in rural areas.

• Local partnerships and alliances.

• Targeting local opportunities.

• More customer retention.


4.3 System Requirements

• Python IDLE/Google Colaboratory.

• ML modules like numpy, pandas, matplotlib, seaborn, scikit-learn.

• Windows, macOS.

Hardware requirements:

• RAM: 4GB or more.


CHAPTER-5

METHODS AND ALGORITHMS

5.1 Algorithm

Clustering

Clustering is a type of unsupervised learning method. An unsupervised learning method is one
in which we draw inferences from datasets consisting of input data without labeled
responses. Generally, it is used to find meaningful structure, explanatory underlying
processes, generative features, and groupings inherent in a set of examples.

Clustering is the task of dividing the population or data points into a number of groups such that data
points in the same group are more similar to each other than to the data points in other groups. It is
basically a grouping of objects on the basis of the similarity and dissimilarity between them.

K-Means

Unsupervised machine learning is the process of letting an algorithm operate on unlabeled, unclassified
data without supervision. Without any prior training on labeled data, the machine's job is to organize
unsorted data according to similarities, patterns, and differences.

K-means clustering assigns each data point to one of K clusters depending on its distance from the cluster
centers. It starts by randomly placing the cluster centroids in the space. Each data point is then assigned
to a cluster based on its distance from that cluster's centroid. After every point has been assigned,
new cluster centroids are computed. This process runs iteratively until the assignments stop changing. In
this analysis we assume that the number of clusters is given in advance and that every point has to be
placed in one of the groups.

In some cases K is not clearly defined, and we have to think about the optimal value of K. K-means
clustering performs best when the data is well separated; when the data points overlap, it is less
suitable. K-means is fast compared to many other clustering techniques, and it provides strong coupling
between the data points of a cluster. K-means does not by itself provide clear information regarding the
quality of the clusters. Different initial assignments of the cluster centroids may lead to different
clusters. The K-means algorithm is also sensitive to noise, and it may get stuck in local minima.

Steps for the k-means algorithm

Step-1: Select the number K to decide the number of clusters.

Step-2: Select K random points as the initial centroids. (They need not be points from the input dataset.)

Step-3: Assign each data point to its closest centroid, which forms the predefined K clusters.

Step-4: Calculate the mean of the points in each cluster (the point that minimizes the within-cluster variance) and place the new centroid of that cluster there.

Step-5: Repeat the third step, that is, reassign each data point to the new closest centroid.

Step-6: If any reassignment occurs, go to Step-4; otherwise go to FINISH.

Step-7: The model is ready.
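
The steps above can be written out directly in NumPy. The sketch below is for illustration only and is not the implementation used in Chapter 7, which relies on scikit-learn; the function name `kmeans`, its parameters and the six sample points are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain k-means following Step-1 to Step-7 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Step-2: pick K distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step-3 / Step-5: assign every point to its closest centroid (Euclidean distance).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Step-6: finish when no point changes cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step-4: move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Made-up sample: three points near (1, 1) and three near (8.5, 9).
X = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 0.5],
              [8.0, 8.0], [9.0, 9.0], [8.5, 9.5]])
labels, centroids = kmeans(X, k=2)
print(labels)
print(centroids)
```

Because the initial centroids are random, the cluster numbering can differ between runs even when the grouping is the same, which is why the example fixes a seed.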

The k-means algorithm is commonly used for customer segmentation due to its simplicity and effectiveness in
identifying distinct groups within a dataset.

1. Data Preparation: Gather the customer data that includes relevant attributes for
segmentation, such as demographics, purchase history, behavior, or any other variables that provide
insights into customer characteristics.

2. Select the Number of Clusters (k): Determine the desired number of segments or
clusters based on the specific needs and goals of the segmentation analysis. The appropriate value of
k can be determined through prior knowledge, domain expertise, or by utilizing techniques like the
elbow method or silhouette score.

3. Attribute Scaling: Normalize or standardize the attribute values if they have different
scales or variances. This is important to ensure that all attributes contribute equally to the clustering
process.

4. Assign Data Points to Clusters: Measure the similarity or distance between each
data point and the cluster centers using a distance metric such as Euclidean distance. Assign each data
point to the nearest cluster center based on the minimal distance.

5. Evaluate and Interpret the Results: Analyze the resulting clusters based on the
attributes and characteristics of the customers in each cluster. Determine meaningful segment labels
or profiles for each cluster to gain insights into customer behavior, preferences, or other relevant
factors.
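
The attribute scaling step in the list above can be sketched with scikit-learn's `StandardScaler`, which rescales each column to zero mean and unit variance. The four sample rows are invented for the example; the two columns stand in for distance and purchase rate.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up rows (assumption): column 0 is a distance, column 1 a purchase rate in percent.
X = np.array([[5.0, 10.0],
              [50.0, 40.0],
              [120.0, 55.0],
              [190.0, 95.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and standard deviation 1

print(X_scaled.round(3))
print("column means:", X_scaled.mean(axis=0).round(6))
print("column stds:", X_scaled.std(axis=0).round(6))

# Cluster centers found on scaled data can be mapped back to the original units.
restored = scaler.inverse_transform(X_scaled)
```

Without scaling, the attribute with the larger numeric range dominates the Euclidean distance, so the clusters would mostly reflect that attribute alone.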

5.2 Factors

Both demographic and geographic factors are used for segmentation.


CHAPTER-6

EXPERIMENT RESULTS

Experiment Results
Mall shoppers can be divided into five groups depending on their distance from the mall and their
purchase rate. By analyzing the data, we can predict customer behaviour based on distance and purchase
rate. This cluster analysis may be applied to a number of consumer marketing methods. Using the
clustering results, we want to deliver products to customers who live far from the shopping mall,
which should increase the mall's sales. A cluster analysis may also be used to establish what
kind of things clients wish to consume, allowing for the development of more targeted marketing
efforts. The people in clusters 3 and 4 are the potential clients in this situation.


CHAPTER-7

SOURCE CODE

Python code for customer segmentation using k-means clustering

Step1: Import Libraries

import pandas as p
import numpy as n
import matplotlib.pyplot as m
import seaborn as s

Step2: Import dataset

cs = p.read_csv("/content/distance - Sheet1.csv")
cs.head()

Step4: Data Visualization


# One distribution plot per numeric column (displot creates its own figure)
for x in ['Age', 'Distance(km)', 'PurchaseRate']:
    s.displot(cs[x], bins=20)
    m.title('Displot of {}'.format(x))
m.show()

# Number of customers per gender
m.figure(figsize=(15,5))
s.countplot(y="Gender", data=cs)
m.show()

# Violin plot of each numeric column split by gender, side by side
# (counter named i so that it does not shadow the numpy alias n)
m.figure(1, figsize=(15,7))
s.set(style="whitegrid")
i = 0
for cols in ["Age", "Distance(km)", "PurchaseRate"]:
    i += 1
    m.subplot(1, 3, i)
    m.subplots_adjust(hspace=0.5, wspace=0.5)
    s.violinplot(x=cols, y="Gender", data=cs)
    m.ylabel('Gender' if i == 1 else "")
    m.title('Violin plot')
m.show()

# Split customers into age bands (lower bounds use >=)
age_18_25 = cs.Age[(cs.Age >= 18) & (cs.Age <= 25)]
age_26_35 = cs.Age[(cs.Age >= 26) & (cs.Age <= 35)]
age_36_45 = cs.Age[(cs.Age >= 36) & (cs.Age <= 45)]
age_46_55 = cs.Age[(cs.Age >= 46) & (cs.Age <= 55)]
age_55above = cs.Age[cs.Age >= 56]

agex = ["18-25", "26-35", "36-45", "46-55", "55+"]
agey = [len(age_18_25.values), len(age_26_35.values),
        len(age_36_45.values), len(age_46_55.values),
        len(age_55above.values)]

m.figure(figsize=(15,6))
s.barplot(x=agex, y=agey, palette="mako")
m.title("Number of Customer and Ages")
m.xlabel("Age")
m.ylabel("Number of Customer")
m.show()

# Split customers into purchase-rate bands
ss_1_20 = cs["PurchaseRate"][(cs["PurchaseRate"] >= 1) & (cs["PurchaseRate"] <= 20)]
ss_21_40 = cs["PurchaseRate"][(cs["PurchaseRate"] >= 21) & (cs["PurchaseRate"] <= 40)]
ss_41_60 = cs["PurchaseRate"][(cs["PurchaseRate"] >= 41) & (cs["PurchaseRate"] <= 60)]
ss_61_80 = cs["PurchaseRate"][(cs["PurchaseRate"] >= 61) & (cs["PurchaseRate"] <= 80)]
ss_81_100 = cs["PurchaseRate"][(cs["PurchaseRate"] >= 81) & (cs["PurchaseRate"] <= 100)]

# Labels and counts for the purchase-rate bands (not the age bands)
ssx = ["1-20", "21-40", "41-60", "61-80", "81-100"]
ssy = [len(ss_1_20.values), len(ss_21_40.values), len(ss_41_60.values),
       len(ss_61_80.values), len(ss_81_100.values)]

m.figure(figsize=(15,6))
s.barplot(x=ssx, y=ssy, palette="rocket")
m.title("PurchaseRate")
m.xlabel("Rate")
m.ylabel("Number of Customer having the purchase rate")
m.show()

Step5: Plot Distance Vs Purchaserate graph

s.relplot(x="Distance(km)", y="PurchaseRate", data=cs)

Step6: Plot elbow graph for Distance Vs Purchaserate


x1 = cs.loc[:, ["Distance(km)", "PurchaseRate"]].values

from sklearn.cluster import KMeans

# Fit k-means for k = 1..11 and record the WCSS (inertia) of each fit
wcss = []
for k in range(1, 12):
    kmc = KMeans(n_clusters=k, init="k-means++")
    kmc.fit(x1)
    wcss.append(kmc.inertia_)

m.figure(figsize=(6,3))
m.grid()
m.plot(range(1,12),wcss,linewidth=2,color="red",marker="8")
m.xlabel("K Value")

m.ylabel("WCSS")
m.show()

Step7: Labeling

kmc = KMeans(n_clusters=5)
lab = kmc.fit_predict(x1)
print(lab)

Step8: Print Clusters

print(kmc.cluster_centers_)

Step9: Plot Distance vs Purchaserate graph by representing centroids

m.scatter(x1[:,0], x1[:,1], c=kmc.labels_, cmap='rainbow')
m.scatter(kmc.cluster_centers_[:,0], kmc.cluster_centers_[:,1],
          color="black")
m.title("Clusters of Customers")
m.xlabel("Distance(km)")
m.ylabel("PurchaseRate")
m.show()

Step10: Plot 3D graph

clusters = kmc.fit_predict(x1)
cs["cluster_lab"] = clusters

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as m

fig = m.figure(figsize=(20, 10))
ax = fig.add_subplot(111, projection='3d')
# One scatter call per cluster label, each with its own colour
for label, colour in enumerate(['indigo', 'pink', 'black', 'yellow', 'darkgreen']):
    sel = cs.cluster_lab == label
    ax.scatter(cs.Age[sel], cs["Distance(km)"][sel],
               cs["PurchaseRate"][sel], c=colour, s=55)
ax.view_init(30, 185)
ax.set_xlabel("Age")
ax.set_ylabel("Distance(km)")
ax.set_zlabel("PurchaseRate")
m.show()

Step11: Customers with respect to their clusters.

# Groups 1-5 correspond to cluster labels 1, 2, 0, 3, 4, in that order
for group, label in enumerate([1, 2, 0, 3, 4], start=1):
    cluster = cs[cs["cluster_lab"] == label]
    print('Customers present in group{}='.format(group), len(cluster))
    print('Customers are -', cluster["Customer"].values)
    print("******************************************")

OUTPUT

Step2:

Figure 1- Dataset

Step3:

Figure 2- Data Preprocessing


Customer          int64
Gender           object
Age               int64
Village          object
Distance(km)    float64
PurchaseRate      int64
dtype: object

Customer        0
Gender          0
Age             0
Village         0
Distance(km)    0
PurchaseRate    0
dtype: int64

Step4:

Figure 3- Distance

Figure 4- Purchase Rate

Figure 5 - Gender

Figure 6- Violin Plot

Figure 7- Age Vs Number of Customers



Figure 8 - Purchase Rate Vs Number of Customers

Step5:

Figure 9 - Distance Vs Purchase 2D Plot



Step6:

Figure 10 - Distance Vs Purchase Rate (Elbow Graph)

Step7:

[2 1 0 4 4 3 3 3 4 1 0 1 4 4 3 3 2 4 0 0 2 2 2 0 4 1 0 0 2 1 2 3 2 2 0 0 4
 3 3 4 4 2 2 0 4 0 1 1 0 4 4 1 4 3 3 3 3 4 4 4 3 3 3 3 4 2 2 2 2 2 2 4 1 1
 0 2 2 3 0 1 4 0 2 1 1 2 1 1 0 2 3 0 0 2 2 0 1 0 2 0 2 3 2 2 0 2 3 1 0 2 0
 1 0 0 0 2 1 0 2 2 0 2 0 0 0 2 2 2 2 0 0 2 2 2 2 0 1 0 0 1 0 0 1 0 2 2 3 0
 0 1 2 2 0 3 2 0 2 2 2 1 0 3 1 0 0 2 0 2 0 2 1 0 2 0 2 1 2 1 0 0 1 0 0 1 1
 1 1 0 1 2 1 0 1 0 0 2 2 1 2 1]

Step8:

[[ 30.82033898 14.84745763]
[ 73.22105263 45.23684211]
[ 31.6 80.41666667]
[129.07826087 82.34782609]
[123.57 14.7 ]]

Step9:

Figure 11 - Distance Vs Purchase Rate (Centroid Graph)

Step10:

Figure 12- Distance Vs Purchase Rate Vs Age (3D Plot)

Step11:
Customers present in group1= 20
Customers are - [ 4 5 9 13 14 18 25 37 40 41 45 50 51 53 58 59 60
65 72 81]
******************************************
Customers present in group2= 59
Customers are - [ 3 11 19 20 24 27 28 35 36 44 46 49 75 79 82
89 92 93
96 98 100 105 109 111 113 114 115 118 121 123 124 125 130 131 136 138
139 141 142 144 148 149 153 156 161 164 165 167 169 172 174 179 180 182
183 188 192 194 195]
******************************************
Customers present in group3= 60
Customers are - [ 1 17 21 22 23 29 31 33 34 42 43 66 67 68 69
70 71 76
77 83 86 90 94 95 99 101 103 104 106 110 116 119 120 122 126 127
128 129 132 133 134 135 145 146 151 152 155 157 158 159 166 168 170 173
175 177 190 196 197 199]
******************************************
Customers present in group4= 23
Customers are - [ 6 7 8 15 16 32 38 39 54 55 56 57 61
62 63 64 78 91
102 107 147 154 162]
******************************************
Customers present in group5= 38
Customers are - [ 2 10 12 26 30 47 48 52 73 74 80 84 85 87 88
97 108 112
117 137 140 143 150 160 163 171 176 178 181 184 185 186 187 189 191
193
198 200]
*******************************************
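A listing of this shape can be produced by grouping customer numbers by their cluster label. The sketch below shows one way to do it. It assumes customer numbers are the 1-based positions in the label array and that group number equals label + 1; how the report actually maps labels to group numbers is not shown, so that mapping and the helper name `group_customers` are assumptions. The ten labels used are the first ten values of the Step 7 output, chosen only to keep the example short.

```python
from collections import defaultdict

def group_customers(labels):
    """Map each cluster label to the 1-based customer numbers it contains."""
    groups = defaultdict(list)
    for customer_id, label in enumerate(labels, start=1):
        groups[label].append(customer_id)
    return dict(sorted(groups.items()))

# First ten labels from the Step 7 output, used as a small example.
labels = [2, 1, 0, 4, 4, 3, 3, 3, 4, 1]
groups = group_customers(labels)

for label, members in groups.items():
    # Assumption: group numbers are the labels shifted to start at 1.
    print(f"Customers present in group{label + 1}= {len(members)}")
    print(f"Customers are - {members}")
    print("*" * 42)
```

Each group can then be exported as its own list for targeted offers or follow-up analysis.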

CHAPTER-8

CONCLUSION
• In today's highly competitive business environment, customer segmentation is an
essential strategy for companies that want to stay ahead of the curve.

• Ultimately, customer segmentation allows businesses to create more personalized
experiences for their customers, which can lead to increased loyalty, higher sales,
and improved profitability. By investing in customer segmentation, businesses can
position themselves for long-term success in an increasingly competitive
marketplace.

CHAPTER-9
FUTURE SCOPE
Customer segmentation using machine learning has a promising future with several avenues for
growth and innovation. Here are some key areas of future scope:

Personalization and Recommendation Systems: As machine learning algorithms become more
sophisticated, they can enable highly personalized recommendations and marketing strategies tailored
to individual customers. Future advancements may involve real-time segmentation based on dynamic
customer behavior, preferences, and context.

Dynamic Segmentation: Traditional segmentation methods often rely on static customer
characteristics such as demographics. Future advancements in machine learning could enable dynamic
segmentation, where customer segments are continuously updated based on evolving behavior
patterns, interactions, and feedback.
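One simple way to picture continuously updated segments is the sequential (online) form of k-means, in which each new observation moves its nearest centroid a small step toward itself. This is a sketch of that general technique, not something the project implemented. The class name `OnlineSegmenter`, the starting centroids and the counts are all hypothetical.

```python
import math

class OnlineSegmenter:
    """Sequential k-means: nudge the nearest centroid toward each new point."""

    def __init__(self, centroids, counts):
        self.centroids = [list(c) for c in centroids]
        self.counts = list(counts)  # number of points already seen per segment

    def update(self, point):
        """Assign the point to its nearest segment and update that centroid."""
        j = min(
            range(len(self.centroids)),
            key=lambda i: math.dist(point, self.centroids[i]),
        )
        self.counts[j] += 1
        step = 1.0 / self.counts[j]  # running-mean step size
        self.centroids[j] = [
            c + step * (x - c) for c, x in zip(self.centroids[j], point)
        ]
        return j

# Hypothetical starting state: two segments, each seen once so far.
segmenter = OnlineSegmenter(centroids=[(0.0, 0.0), (10.0, 10.0)], counts=[1, 1])
segment = segmenter.update((2.0, 0.0))
print(segment, segmenter.centroids[segment])
```

Because each centroid is a running mean of the points assigned to it, segments drift gradually as behavior changes, without re-clustering the full customer base.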

Multimodal Data Integration: With the proliferation of data sources such as social media, IoT devices,
and online interactions, future customer segmentation models may incorporate diverse data types,
including text, images, videos, and sensor data. Advanced techniques like multimodal learning could
enhance segmentation accuracy and depth.

Unsupervised Learning Techniques: K-means, the method used in this work, is itself an unsupervised
technique. Other unsupervised approaches such as self-organizing maps (SOMs) and autoencoders,
along with generative models such as generative adversarial networks (GANs), hold potential for
discovering hidden patterns and structures in data without the need for labeled examples.

Interpretable and Explainable Models: As the importance of transparency and interpretability in
machine learning models grows, future customer segmentation algorithms may prioritize the
development of interpretable models that provide insights into the underlying reasons for segment
assignments, helping businesses make more informed decisions.

Real-time Segmentation and Decision Making: With advancements in computational power and
streaming data processing technologies, the future of customer segmentation may involve real-time
segmentation models capable of analyzing data streams and adapting marketing strategies on the fly
to meet changing customer needs and preferences.

Ethical and Fair Segmentation Practices: As machine learning algorithms play an increasingly central
role in customer segmentation, there will be a greater emphasis on ensuring ethical and fair practices,
including mitigating biases, protecting customer privacy, and maintaining transparency in
segmentation processes.

Integration with Marketing Automation Platforms: Future customer segmentation solutions may
seamlessly integrate with marketing automation platforms, enabling businesses to automate
personalized marketing campaigns, targeted advertisements, and customer communication across
various channels.

