Customer Churn

Introduction

The biggest problem a telecom company faces is customer churn. Churn is the term used in this industry to describe whether a customer will continue the service with the company or leave. Churn has a large impact on a company's revenue. If the company can predict customer churn with high accuracy, it gains an estimate of what its future revenues will look like, which in turn gives it the freedom to plan its finances ahead. One often-forgotten demographic is senior citizens. The majority of brands overlook this age group. This is a big mistake: seniors make up a sizable percentage of the United States population.

The 2010 Census found that 13 percent of the US population were senior citizens, and by 2050 the share is projected to increase to 20 percent. A Pew study found that 47 percent of all seniors have broadband at home. Also, in the next few decades, the “baby boomers”, the large generation born in the 1950s and 1960s, will grow old. As they do, their sheer numbers and their different attitude to age will create new markets in the world’s rich countries. Yet business remains largely obsessed with youth. Many companies seem blind to the fact that their customers are greying. Some have started, with uneven success, to market and advertise to an older population and to design products and services that meet its special needs. Few, though, see the elderly as an exciting group to sell to.

According to a recent study, roughly 70% of all the disposable income in the United States will come from this expanding group of seniors within the next five years. According to AARP, nearly 10,000 adults turn 65 every day. This age group has roughly 47 times the net worth of its younger counterparts and an eagerness to engage with technology. According to the Pew Research Center’s Internet & American Life Project, “the 74-plus demographic is the fastest growing demographic among social networks.” Another good reason to focus on this group is that the old are wealthier and healthier than ever.

According to the United States Census Bureau, the poverty rate among Americans over 65 has dropped from 35% in 1960 to 10.2% today, compared with a fall from 22% to 11.3% for the population as a whole. Senior agency International, a consultancy specializing in marketing to the elderly, says that the over-50s own three-quarters of all financial assets and account for half of all discretionary spending power in developed countries. Over two-thirds of them own their own homes, three-quarters of which are unencumbered by a mortgage. In America, they control four-fifths of the money invested in savings-and-loan associations and own two-thirds of all the shares on the stock market.

Methodology

Problem Statement

In this article, we focus on senior citizen customers who are likely to churn. We aim to predict whether a customer will churn or not on the basis of the customer's billing information and demographics.

Data Source/Description

The data we obtained is an online IBM sample data set that contains over 7,000 observations. Each observation corresponds to a client of a telecommunications company for whom information has been collected. The target variable is ‘churn’, and it indicates whether the client is still with the company or not. The data set also records the services that each customer has signed up for, which include phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies. In addition, it contains demographic information about the customers, including gender, age range, and whether they have partners and dependents.
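As a minimal sketch of how records like these might be handled, the snippet below selects the senior-citizen customers and encodes the target as 0 or 1. The rows are made up for illustration and are not taken from the data set, and the field names `senior_citizen`, `monthly_charges`, and `churn` are assumptions rather than confirmed column names.

```python
# Made-up rows for illustration only; field names are assumptions.
customers = [
    {"id": "A1", "senior_citizen": True, "monthly_charges": 70.5, "churn": "Yes"},
    {"id": "B2", "senior_citizen": False, "monthly_charges": 29.9, "churn": "No"},
    {"id": "C3", "senior_citizen": True, "monthly_charges": 99.0, "churn": "No"},
    {"id": "D4", "senior_citizen": True, "monthly_charges": 85.2, "churn": "Yes"},
]


def encode_churn(label):
    """Map the text target to 1 (churned) or 0 (stayed)."""
    return 1 if label.strip().lower() == "yes" else 0


def senior_subset(rows):
    """Keep only senior-citizen customers and attach a numeric target."""
    return [
        {**row, "churn_flag": encode_churn(row["churn"])}
        for row in rows
        if row["senior_citizen"]
    ]


seniors = senior_subset(customers)
# Share of the selected senior customers who churned.
churn_rate = sum(row["churn_flag"] for row in seniors) / len(seniors)
print(len(seniors), round(churn_rate, 2))
```

Encoding the target as a number up front keeps later modeling steps simple, since a binary classifier expects a 0/1 outcome.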

Data Preparation

Data cleansing and preparation are done in this step. Transforming a continuous variable into a meaningful factor variable can improve model performance and help reveal insights in the data. For example, include the methods of data preparation.
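One way to turn a continuous variable into a factor variable is to bin it into labeled ranges. The sketch below does this for a hypothetical customer tenure measured in months; the variable, the cut points, and the labels are assumptions chosen for illustration, not the preparation steps actually used in this study.

```python
def tenure_group(months):
    """Bin a tenure in months into a labeled factor level.

    The cut points (12, 24, 48) are illustrative assumptions.
    """
    if months < 0:
        raise ValueError("tenure cannot be negative")
    if months <= 12:
        return "0-12 months"
    if months <= 24:
        return "13-24 months"
    if months <= 48:
        return "25-48 months"
    return "over 48 months"


tenures = [1, 12, 13, 30, 72]
groups = [tenure_group(m) for m in tenures]
print(groups)
```

Grouped levels like these are easier to compare in charts and give the model one coefficient per group rather than a single slope for the whole range.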

Exploratory Data Analysis

  • Include some graphs (discuss with team members)
  • Research Variables
  • Research Methods
  • Logistic Regression

Logistic regression helps us understand the degree to which each feature affects churn, while a decision tree provides a graphical overview of the available data from which rules can be generated and customer retention strategies can be built. Logistic regression is one of the more basic classification algorithms in data analysis. It is used to predict a category or group based on an observation, and it is usually applied to binary classification (1 or 0, win or lose, true or false). The output of logistic regression is a probability, which is always a value between 0 and 1. While this output does not give a classification directly, we can choose a cutoff value so that inputs with a probability greater than the cutoff are assigned to one class, and those with a probability below the cutoff to the other. For example, if the classifier predicts a 70% probability of customer attrition and our cutoff value is 50%, then we predict that the customer will churn.

Similarly, if the model outputs a 30% chance of attrition for a customer, then we predict that the customer won’t churn. Logistic regression models the log-odds (logit) of the dependent variable as a linear function of the features, and its parameters are estimated by maximum likelihood. As shown below, α, b1, b2, …, bn are the parameters to be estimated from the training data; the equation is then used to predict P(X), the probability that the dependent variable equals 1, given the feature values x1, x2, …, xn.
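The prediction rule described above can be sketched in a few lines of Python. The sketch assumes the standard logistic form P(X) = 1 / (1 + e^-(α + b1x1 + … + bnxn)). The parameter values and the two features (monthly charges and tenure) are hypothetical, chosen only to illustrate the calculation, and are not estimates from the data set.

```python
import math


def churn_probability(alpha, coefs, features):
    """Return P(X) from the logistic function of alpha + sum(b_i * x_i)."""
    z = alpha + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))


def predict_churn(probability, cutoff=0.5):
    """Classify as churn when the probability exceeds the cutoff."""
    return probability > cutoff


# Hypothetical parameters and feature values, for illustration only.
alpha = -1.0
coefs = [0.03, -0.05]    # b1 (monthly charges), b2 (tenure in months)
features = [80.0, 10.0]  # x1, x2 for one customer

p = churn_probability(alpha, coefs, features)
print(round(p, 3), predict_churn(p))
```

With a cutoff of 0.5, a probability of 0.70 is classified as churn and a probability of 0.30 as no churn, which matches the two examples in the text. Lowering the cutoff flags more customers as likely to churn, at the cost of more false alarms.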