Tags: Neural Networks · Python · CNN · TensorFlow · EDA

Who's About to Leave?
Telco Customer Churn

Dataset: IBM WA Telco Churn
Size: 7,043 customers
Models: Neural Network + CNN
Best accuracy: 79.49% (CNN)
Category: Neural Networks
View on GitHub
Customer Records: 7,043
Churn Rate: 26%
Features After Encoding: 31
Models Built & Compared: 2
Best Accuracy (CNN): 79.49%

The question we were trying to answer

Every telecom company loses customers — but can you predict who is about to leave before they actually do? That's the churn prediction problem. If you can spot a customer who's about to cancel, you have a window to intervene — offer a discount, improve their plan, or just check in. Miss that window and they're gone.

We took 7,043 real customer records from IBM's Watson Analytics dataset and built two neural network models to answer exactly that question.

The short answer

The customers most likely to leave are the ones who've been around the least and are paying the most. New + expensive = high risk. Get a customer past the 30-month mark and the churn risk drops off sharply; most churners leave well before then.

Step 1 — Looking at the raw data

The dataset has 21 columns — customer ID, demographics, services (phone, internet, streaming, security), contract type, payment method, charges, and whether they churned. Here's what the first few rows look like straight out of the CSV:

First 5 rows of the Telco dataset
The raw dataset — 7,043 customers, 21 columns. TotalCharges is stored as text (object type) — that needed fixing before anything else.

One early catch: TotalCharges was stored as text, including blank values for brand new customers who'd never been billed. We converted it to numeric and filled blanks with the median — simpler and cleaner than dropping those customers.
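That fix is a couple of pandas calls. A minimal sketch — the mini-frame below is hypothetical, standing in for the real CSV, where the blank strings mimic brand-new customers who were never billed:

```python
import pandas as pd

# Hypothetical stand-in for the raw Telco CSV's TotalCharges column.
data = pd.DataFrame({"TotalCharges": ["29.85", " ", "1889.5", " "]})

# Coerce text to numbers: un-parseable blanks become NaN...
data["TotalCharges"] = pd.to_numeric(data["TotalCharges"], errors="coerce")

# ...then fill the NaNs with the column median instead of dropping rows.
data["TotalCharges"] = data["TotalCharges"].fillna(data["TotalCharges"].median())
```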

Step 2 — First look at the distributions

Before touching any model, we visualised how the data is shaped. These are the initial charts straight from the notebook:

Distribution of Tenure (top) and Distribution of Monthly Charges (bottom)
Top: Distribution of Tenure — how long customers have been with the company. Bottom: Distribution of Monthly Charges.

Tenure chart (top): Giant spike at month 0–2 (new customers), relatively flat through the middle, then another spike at month 70–72 (long-term loyalists). The customer base is polarised — you either just joined or you've been here forever. Very few people in between.

Monthly Charges chart (bottom): Massive spike at $20 (basic phone-only plans), dips in the middle, then climbs again at $70–$100 (premium packages with internet, streaming, security). Two very different customer profiles.

Step 3 — Does churn relate to tenure and charges?

These boxplots compare customers who stayed vs customers who left — first pass:

Churn by Tenure and Churn by Monthly Charges boxplots
Top: Churn by Tenure. Bottom: Churn by Monthly Charges. The size difference between the boxes tells the whole story.

Tenure boxplot (top): The "No" (stayed) box spans 15–60 months, median around 38. The "Yes" (churned) box is tiny and sits right at the bottom — median around 10 months. Most churners are gone before month 30. This is the clearest signal in the dataset.

Monthly Charges boxplot (bottom): The "Yes" box sits higher. Churners' median is around $79/month; non-churners' median is around $65. A consistent $14/month gap — not a fluke.
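Those medians are a one-line groupby away. A sketch with toy rows chosen to land on the post's approximate medians (column names follow the Telco CSV; the real values come from all 7,043 records):

```python
import pandas as pd

# Toy frame — real medians in the post are ~38 vs ~10 months tenure
# and ~$65 vs ~$79 monthly charges.
df = pd.DataFrame({
    "Churn": ["No", "No", "Yes", "Yes", "No"],
    "tenure": [40, 36, 8, 12, 60],
    "MonthlyCharges": [60.0, 70.0, 80.0, 78.0, 65.0],
})

# Median tenure and charges for stayers vs churners.
medians = df.groupby("Churn")[["tenure", "MonthlyCharges"]].median()
```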

Step 4 — The same charts with colour, plus the key scatter plot

Coloured charts: tenure distribution, monthly charges, and scatter plot
Top: Tenure distribution (blue). Middle: Monthly Charges distribution (green). Bottom: Scatter plot — every single customer as a dot, blue = stayed, orange = churned.

The scatter plot (bottom) is the most revealing chart in the project. Each dot is one customer. Look at the top-left corner — high charges, short tenure. That's where orange dots cluster. Bottom-right — low charges, long tenure — almost entirely blue. If you wanted a simple rule: "flag anyone paying over $70 who's been here under 12 months" — you'd catch a huge chunk of the real risk.
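That eyeball rule translates directly into a pandas filter. A sketch on a hypothetical slice of the data:

```python
import pandas as pd

# Hypothetical slice — columns named as in the Telco CSV.
customers = pd.DataFrame({
    "tenure": [2, 50, 8, 71],
    "MonthlyCharges": [89.10, 24.25, 75.30, 105.50],
})

# The scatter-plot rule: paying over $70 AND here under 12 months = flag.
at_risk = customers[(customers["MonthlyCharges"] > 70) & (customers["tenure"] < 12)]
```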

Churn by Tenure — coloured boxplot
Churn by Tenure (coloured) — green box = stayed, orange box = churned. The churners' box is dramatically smaller and lower.
Churn by Monthly Charges — coloured boxplot
Churn by Monthly Charges (coloured) — churners (yellow, right) pay noticeably more per month than non-churners (teal, left).

Step 5 — The notebook's annotated final charts

These are the charts as they appeared in the final submitted notebook, with findings written alongside each one:

Notebook chart 1 — Distribution of Tenure with annotations
Notebook — Chart 1: Distribution of Tenure. Customers cluster at two extremes — brand new and long-term loyal.
Notebook chart 2 — Distribution of Monthly Charges with annotations
Notebook — Chart 2: Distribution of Monthly Charges. Low-charge spike at $20, premium cluster at $70–100.
Notebook chart 3 — Scatter Plot Tenure vs Monthly Charges by Churn
Notebook — Chart 3: Scatter Plot. Churners (orange) concentrate in the top-left danger zone: new + expensive.
Notebook chart 4 — Boxplot Churn by Tenure
Notebook — Chart 4: Churn by Tenure. Short-tenure customers are significantly more likely to leave.
Notebook chart 5 — Boxplot Churn by Monthly Charges
Notebook — Chart 5: Churn by Monthly Charges. Higher payers churn more — the company's most valuable customers are also its most fragile.

Making the data model-ready

feature engineering
import numpy as np
from sklearn.preprocessing import StandardScaler

# Average spend per month; brand-new customers (tenure 0) divide by
# zero, so the resulting inf/NaN values are zeroed out.
data_encoded['AvgMonthlyCharges'] = (
    data_encoded['TotalCharges'] / data_encoded['tenure']
).replace([np.inf, np.nan], 0)

# Standardise the numeric columns to zero mean / unit variance.
scaler = StandardScaler()
data_encoded[numerical_features] = scaler.fit_transform(data_encoded[numerical_features])

Model 1 — Neural Network

A feedforward neural network: layers that transform the data, learning which combinations of features best predict churn over 50 training passes. The Dropout layers (30%) randomly silence neurons during training — forcing the model to learn robust patterns instead of just memorising the training data.

31Input Features
64 → 32Hidden Neurons
30%Dropout
50Epochs
79.42%Test Accuracy
neural network
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

nn_model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    Dropout(0.3),                   # silence 30% of neurons each batch
    Dense(32, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid')  # outputs a 0–1 churn probability
])
nn_model.compile(optimizer='adam', loss='binary_crossentropy',
                 metrics=['accuracy'])

Model 2 — CNN on tabular data

CNNs are usually for images — but we reshaped each customer's 31 features into a column (31×1) and let a convolutional filter scan across adjacent features looking for clusters that signal churn. "Adjacent" here just means neighbouring columns in the encoded feature matrix, so there's no real spatial structure for the filter to exploit — but in practice it still learns useful local combinations, like "fibre optic + month-to-month contract + high charges", when those columns sit near each other.

31×1Reshaped Input
32 filtersConv1D
64Dense Neurons
50Epochs
79.49%Test Accuracy ✓
cnn
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Dropout, Flatten, Dense

# Reshape each row of 31 features into a 31×1 "signal" for Conv1D.
X_train_cnn = np.expand_dims(X_train, axis=-1).astype('float32')

cnn_model = Sequential([
    Conv1D(32, kernel_size=3, activation='relu',
           input_shape=(X_train_cnn.shape[1], 1)),
    Dropout(0.3),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid')
])
cnn_model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])

Results — full breakdown

What these numbers mean in plain English

Accuracy 79.49% — about 1,120 of 1,409 test customers predicted correctly.

Precision 63.75% (churn) — when the CNN says "this person will churn," it's right 64% of the time. 36% are false alarms.

Recall 52.67% (churn) — the model caught 53% of actual churners. Half slipped through — room to improve.

AUC 70.9% — measures how well the model separates churners from non-churners. Random guessing = 50%. We're at 71% — a solid baseline.

Non-Churn Precision: 84%
Non-Churn Recall: 89%
Non-Churn F1: 86%
Churn Precision: 64%
Churn Recall: 53%
Churn F1: 58%
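Per-class precision, recall, and F1 like the figures above are what scikit-learn's classification_report prints. A sketch of how such numbers are produced — the labels below are synthetic, not the project's 1,409-customer test set:

```python
import numpy as np
from sklearn.metrics import classification_report, roc_auc_score

# Toy ground truth and predicted churn probabilities, just to show
# the call shape; the post's real numbers come from the CNN.
y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.2, 0.8, 0.3, 0.6, 0.7, 0.2])
y_pred = (y_prob > 0.5).astype(int)  # threshold the probabilities at 0.5

print(classification_report(y_true, y_pred, target_names=["No churn", "Churn"]))
print("AUC:", roc_auc_score(y_true, y_prob))  # AUC uses the raw probabilities
```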

The model is much better at predicting who stays (89% recall) than who leaves (53% recall) — because only 26% of the dataset churned. The classic imbalanced dataset problem.
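One common remedy for that imbalance is to weight the minority class more heavily during training. A sketch using scikit-learn's balanced weighting on a toy label vector with roughly the dataset's 74/26 split — the weights shown are illustrative, not tuned values from the project:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels mimicking the dataset's ~74% stay / ~26% churn split.
y_train = np.array([0] * 74 + [1] * 26)

# "balanced" weights each class inversely to its frequency.
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_train)
class_weight = dict(zip([0, 1], weights))
```

Keras accepts this dict directly, e.g. `model.fit(..., class_weight=class_weight)`, which penalises missed churners more than missed stayers and typically trades some precision for better churn recall.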

What I'd do differently