Customer Relationship Management

Hi everyone.

Today I am writing about CRM and I will explain the steps in a practical way.

What is CRM?

Let you think that you have a grocery. And you want to grow your business. Let your main goal be retaining your customers and gaining more customers. How many customers come to your shop in one month do you think? 50? 60? Maybe it’s more. Then how many customers come and buy goods in your shop? İf your business is small then maybe you can give answers predictably and it may be close to the correct answer. What about you have a big company and you have a lot of customers? It is not possible to know which customers like which goods they like to consume or don’t know what are the customer’s behavior pattern in each number of goods. At this point, CRM will help us. CRM(Customer Relationship Management), is a technology that we can manage relationships between customers and firms. Understand your customer consumer behavior, making various campaigns, and making strategies. You can do all these with CRM.

İmportant Steps for CRM:

1-Data preparation (making data ready for the process)

2-Creating RFM (Categorizing customers by Recency, Frequency and Monetary values)

3-Creating CLTV(Caltucalate the customers lifetime value)

4-Making CLTV prediction(Forecast your customers with CLTV values)

Before starting the steps, let’s get to know the data set

In our data set, the data set named Online Retail II includes the sales of a UK based online retail store between 01/12/2009 and 09/12/2011

InvoiceNo: Invoice number. The unique number for each transaction, namely the invoice. If this code starts with C, it indicates that the transaction has been canceled.

StockCode: Product code. Unique number for each product.

Description: Name of the product

Quantity: Quantity of good. It expresses how many of the products in the invoices are sold.

InvoiceDate: InVoice date and time

CustomerID: Unique customer number

Country: Country name. Country of residence of the customer

Firstly we will import libraries

import datetime as dt
import pandas as pd
import pymysql
import mysql.connector
from sqlalchemy import create_engine
from sklearn.preprocessing import MinMaxScaler
from lifetimes import BetaGeoFitter
from lifetimes import GammaGammaFitter

Now we will read the related dataset

df = pd.read_excel("online_retail_II.xlsx",
sheet_name="Year 2010-2011")

Now our data preparation step is finished. We can continue with RFM. But first, What is the RFM?

RFM is segmenting groups in which the categories are Recency, Frequency, and Monetary. These values give us, how recently a customer has made a purchase, how often a customer makes a purchase and how much money a customer spends on purchases. These values make us know more information about the customers and their consumer behavior.

The first step of the RFM is grouping according to customer identity.

df["TotalPrice"] = df["Quantity"] * df["Price"]
today_date = dt.datetime(2011, 12, 11)rfm = df.groupby('Customer ID').agg({'InvoiceDate': lambda date: (today_date - date.max()).days,
'Invoice': lambda num: num.nunique(),
"TotalPrice": lambda price: price.sum()})

Here we define the column names

rfm.columns = ['recency', 'frequency', "monetary"]

This step shows us the scores. We use cut functions to make labels suitable for each column.

rfm["recency_score"] = pd.qcut(rfm['recency'], 5, labels=[5, 4, 3, 2, 1])
rfm["frequency_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 5,
labels=[1, 2, 3, 4, 5])

The last station for the RFM :)

Calculating the CLTV

Now we have information about customers Recency, Frequency, Monetary ve segment values. But we still don’t know how much value the customer adds to the company in monetary terms. We will find with calculating the CLTV.

How to calculate CLTV

Follow these steps and calculate easily

Average Order Value = Total Revenue / Total Number of Orders

rfm['avg_order_value'] = rfm['monetary'] / rfm['frequency']

Purchase Frequency = Total_Number of Orders / Total Number of Customers

rfm["purchase_frequency"] = rfm['frequency'] / rfm.shape[0]

Churn Rate = 1 — Repeat Rate

repeat_rate = rfm[rfm.frequency > 1].shape[0] /rfm.shape[0] 
churn_rate = 1 - repeat_rate

Calculating the profit margin

rfm['profit_margin'] = rfm['monetary'] * 0.05

Customer Value = Average Order Value Purchase Frequency*

rfm['cv'] = (rfm['avg_order_value'] *rfm["purchase_frequency"])

CLTV = (Customer Value / Churn Rate) x Profit margin.

rfm['cltv'] = (rfm['cv'] / churn_rate) * rfm['profit_margin']

we divided it into labels using qcut function for recency score and frequency score

rfm["recency_score"] = pd.qcut(rfm['recency'], 5, labels=[5, 4, 3, 2, 1])
rfm["frequency_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5])

we named segments and columns

rfm['rfm_segment'] = rfm['recency_score'].astype(str) + rfm['frequency_score'].astype(str)seg_map = {
r'[1-2][1-2]': 'hibernating',
r'[1-2][3-4]': 'at_risk',
r'[1-2]5': 'cant_loose',
r'3[1-2]': 'about_to_sleep',
r'33': 'need_attention',
r'[3-4][4-5]': 'loyal_customers',
r'41': 'promising',
r'51': 'new_customers',
r'[4-5][2-3]': 'potential_loyalists',
r'5[4-5]': 'champions'
rfm['rfm_segment'] = rfm['rfm_segment'].replace(seg_map, regex=True)
rfm = rfm[["recency", "frequency", "monetary", "rfm_segment", "cltv"]]

first, we used the minmaxscaler function. Then we got this fit. we reused qcut function. And then we identified column names.

scaler = MinMaxScaler(feature_range=(1, 100))[["cltv"]])
rfm["cltv_c"] = scaler.transform(rfm[["cltv"]])
rfm["cltv_c_segment"] = pd.qcut(rfm["cltv_c"], 3, labels=["C","B", "A"])rfm = rfm[["recency", "frequency", "monetary", "rfm_segment",
"cltv_c", "cltv_c_segment"]]


We know the customer's CLTV values and RFM for now. But what happens to CLTV value in the future?

We can answer the question but first, we should calculate the CLTV prediction value

Here, we grouped InVoiceDate, Invoice, Total Price according to the Customer ID. Then we cleaned titles and rename columns with recency_cltv_p, T, frequency, and monetary.

today_date = dt.datetime(2011, 12, 11)##special user for recency, dinamic rfm = df.groupby('Customer ID').agg({'InvoiceDate': [lambda date: (date.max()-date.min()).days,
lambda date: (today_date - date.min()).days],
'Invoice': lambda num: num.nunique(),
'TotalPrice': lambda TotalPrice: TotalPrice.sum()})
rfm.columns = rfm.columns.droplevel(0)## recency_cltv_p
rfm.columns = ['recency_cltv_p', 'T', 'frequency', 'monetary']

We used BGNBD functions to find expectation sales values.

# BGNBD calculation of WEEKLY RECENCY and WEEKLY t 
## recency_weekly_cltv_p
rfm["recency_weekly_cltv_p"] = rfm["recency_cltv_p"] / 7
rfm["T_weekly"] = rfm["T"] / 7
rfm = rfm[rfm["monetary"] > 0]
## recency filter (more better results for cltv)
rfm = rfm[(rfm['frequency'] > 1)]
rfm["frequency"] = rfm["frequency"].astype(int) # BGNBD
bgf = BetaGeoFitter(penalizer_coef=0.01)['frequency'],
# exp_sales_1_month
rfm["exp_sales_1_month"] = bgf.predict(4,
# exp_sales_3_month
rfm["exp_sales_3_month"] = bgf.predict(12,

We used GammaGamma function to find the expected average profit

ggf = GammaGammaFitter(penalizer_coef=0.01)['frequency'], rfm['monetary'])
rfm["expected_average_profit"] = ggf.conditional_expected_average_profit(rfm['frequency'],
# 6 aylık cltv_p
cltv = ggf.customer_lifetime_value(bgf,
rfm["cltv_p"] = cltv

In summary, we first classified and rated our customers according to Recency, Frequency, Monetary values. This made it easier for us to segment customers. Then we looked at the customers’ lifetime value. By doing more, we found the expected CLTV values. In this way, we will be able to produce easier strategies in our work on customer relations.

Our grocery store will do not plan according to estimates, but according to the data and we will achieve the desired results :)