Customer Relationship Management

Hi everyone.

Today I am writing about CRM and I will explain the steps in a practical way.

What is CRM?

İmportant Steps for CRM:

1-Data preparation (making data ready for the process)

2-Creating RFM (Categorizing customers by Recency, Frequency and Monetary values)

3-Creating CLTV(Caltucalate the customers lifetime value)

4-Making CLTV prediction(Forecast your customers with CLTV values)

Before starting the steps, let’s get to know the data set

In our data set, the data set named Online Retail II includes the sales of a UK based online retail store between 01/12/2009 and 09/12/2011

InvoiceNo: Invoice number. The unique number for each transaction, namely the invoice. If this code starts with C, it indicates that the transaction has been canceled.

StockCode: Product code. Unique number for each product.

Description: Name of the product

Quantity: Quantity of good. It expresses how many of the products in the invoices are sold.

InvoiceDate: InVoice date and time

CustomerID: Unique customer number

Country: Country name. Country of residence of the customer

Firstly we will import libraries

import datetime as dt
import pandas as pd
import pymysql
import mysql.connector
from sqlalchemy import create_engine
from sklearn.preprocessing import MinMaxScaler

Now we will read the related dataset

sheet_name="Year 2010-2011")

Now our data preparation step is finished. We can continue with RFM. But first, What is the RFM?

RFM is segmenting groups in which the categories are Recency, Frequency, and Monetary. These values give us, how recently a customer has made a purchase, how often a customer makes a purchase and how much money a customer spends on purchases. These values make us know more information about the customers and their consumer behavior.

The first step of the RFM is grouping according to customer identity.

df["TotalPrice"] = df["Quantity"] * df["Price"]
today_date = dt.datetime(2011, 12, 11)rfm = df.groupby('Customer ID').agg({'InvoiceDate': lambda date: (today_date - date.max()).days,
'Invoice': lambda num: num.nunique(),
"TotalPrice": lambda price: price.sum()})

Here we define the column names

rfm.columns = ['recency', 'frequency', "monetary"]

This step shows us the scores. We use cut functions to make labels suitable for each column.

rfm["recency_score"] = pd.qcut(rfm['recency'], 5, labels=[5, 4, 3, 2, 1])
rfm["frequency_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 5,
labels=[1, 2, 3, 4, 5])

The last station for the RFM :)

Calculating the CLTV

Now we have information about customers Recency, Frequency, Monetary ve segment values. But we still don’t know how much value the customer adds to the company in monetary terms. We will find with calculating the CLTV.

How to calculate CLTV

Follow these steps and calculate easily

Average Order Value = Total Revenue / Total Number of Orders

rfm['avg_order_value'] = rfm['monetary'] / rfm['frequency']

Purchase Frequency = Total_Number of Orders / Total Number of Customers

rfm["purchase_frequency"] = rfm['frequency'] / rfm.shape

Churn Rate = 1 — Repeat Rate

repeat_rate = rfm[rfm.frequency > 1].shape /rfm.shape
churn_rate = 1 - repeat_rate

Calculating the profit margin

rfm['profit_margin'] = rfm['monetary'] * 0.05

Customer Value = Average Order Value Purchase Frequency*

rfm['cv'] = (rfm['avg_order_value'] *rfm["purchase_frequency"])

CLTV = (Customer Value / Churn Rate) x Profit margin.

rfm['cltv'] = (rfm['cv'] / churn_rate) * rfm['profit_margin']

we divided it into labels using qcut function for recency score and frequency score

rfm["recency_score"] = pd.qcut(rfm['recency'], 5, labels=[5, 4, 3, 2, 1])
rfm["frequency_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5])

we named segments and columns

rfm['rfm_segment'] = rfm['recency_score'].astype(str) + rfm['frequency_score'].astype(str)seg_map = {
r'[1-2][1-2]': 'hibernating',
r'[1-2][3-4]': 'at_risk',
r'[1-2]5': 'cant_loose',
r'33': 'need_attention',
r'[3-4][4-5]': 'loyal_customers',
r'41': 'promising',
r'51': 'new_customers',
r'[4-5][2-3]': 'potential_loyalists',
r'5[4-5]': 'champions'
}
rfm['rfm_segment'] = rfm['rfm_segment'].replace(seg_map, regex=True)
rfm = rfm[["recency", "frequency", "monetary", "rfm_segment", "cltv"]]

first, we used the minmaxscaler function. Then we got this fit. we reused qcut function. And then we identified column names.

scaler = MinMaxScaler(feature_range=(1, 100))
scaler.fit(rfm[["cltv"]])
rfm["cltv_c"] = scaler.transform(rfm[["cltv"]])
rfm["cltv_c_segment"] = pd.qcut(rfm["cltv_c"], 3, labels=["C","B", "A"])rfm = rfm[["recency", "frequency", "monetary", "rfm_segment",
"cltv_c", "cltv_c_segment"]]

CLTV FİNAL :)

We know the customer's CLTV values and RFM for now. But what happens to CLTV value in the future?

We can answer the question but first, we should calculate the CLTV prediction value

Here, we grouped InVoiceDate, Invoice, Total Price according to the Customer ID. Then we cleaned titles and rename columns with recency_cltv_p, T, frequency, and monetary.

today_date = dt.datetime(2011, 12, 11)##special user for recency, dinamic rfm = df.groupby('Customer ID').agg({'InvoiceDate': [lambda date: (date.max()-date.min()).days,
lambda date: (today_date - date.min()).days],
'Invoice': lambda num: num.nunique(),
'TotalPrice': lambda TotalPrice: TotalPrice.sum()})
rfm.columns = rfm.columns.droplevel(0)## recency_cltv_p
rfm.columns = ['recency_cltv_p', 'T', 'frequency', 'monetary']

We used BGNBD functions to find expectation sales values.

# BGNBD calculation of WEEKLY RECENCY and WEEKLY t
## recency_weekly_cltv_p
rfm["recency_weekly_cltv_p"] = rfm["recency_cltv_p"] / 7
rfm["T_weekly"] = rfm["T"] / 7
# CONTROL
rfm = rfm[rfm["monetary"] > 0]
## recency filter (more better results for cltv)
rfm = rfm[(rfm['frequency'] > 1)]
rfm["frequency"] = rfm["frequency"].astype(int) # BGNBD
bgf = BetaGeoFitter(penalizer_coef=0.01)
bgf.fit(rfm['frequency'],
rfm['recency_weekly_cltv_p'],
rfm['T_weekly'])
# exp_sales_1_month
rfm["exp_sales_1_month"] = bgf.predict(4,
rfm['frequency'],
rfm['recency_weekly_cltv_p'],
rfm['T_weekly'])
# exp_sales_3_month
rfm["exp_sales_3_month"] = bgf.predict(12,
rfm['frequency'],
rfm['recency_weekly_cltv_p'],
rfm['T_weekly'])

We used GammaGamma function to find the expected average profit

ggf = GammaGammaFitter(penalizer_coef=0.01)
ggf.fit(rfm['frequency'], rfm['monetary'])
rfm["expected_average_profit"] = ggf.conditional_expected_average_profit(rfm['frequency'],
rfm['monetary'])
# 6 aylık cltv_p
rfm['frequency'],
rfm['recency_weekly_cltv_p'],
rfm['T_weekly'],
rfm['monetary'],
time=6,
freq="W",
discount_rate=0.01)
rfm["cltv_p"] = cltv

In summary, we first classified and rated our customers according to Recency, Frequency, Monetary values. This made it easier for us to segment customers. Then we looked at the customers’ lifetime value. By doing more, we found the expected CLTV values. In this way, we will be able to produce easier strategies in our work on customer relations.

Our grocery store will do not plan according to estimates, but according to the data and we will achieve the desired results :)

REFERENCES

https://www.salesforce.com/crm/what-is-crm/

https://www.creatio.com/page/what-is-crm

https://images.app.goo.gl/kXBNPPiz7MuocBBX7

https://www.investopedia.com/terms/r/rfm-recency-frequency-monetary-value.asp

https://images.app.goo.gl/n3yP3TPJgMW3QoLL7

https://images.app.goo.gl/2JvrD1bLNsUocKVK9