An intelligence engine that transforms raw shopping behavior into subscription insights, frequency predictions, loyalty scoring, and scenario simulation, designed for teams that want to understand not only what customers do, but why they behave the way they do.
Retail loyalty is not a single action; it is a behavioral signature that emerges from repeated decisions: purchasing rhythms, shipping preferences, discount sensitivity, past experiences, and long-term commitment tendencies.
Yet most companies reduce loyalty to naive metrics like “number of purchases” or “subscription status.” This leads to simplistic marketing decisions and predictable churn.
Subscription-Loyalty-Risk-Radar takes a more scientific view:
- Loyalty is multi-dimensional
- Behavior must be quantified
- Predictions must be explainable
- Insights must be actionable
This project builds a full-stack ML system that covers:
- Subscription prediction: who is likely to subscribe, who is unlikely, and why?
- Frequency prediction: how often will a customer buy, and what is their behavioral “intensity score”?
- Loyalty scoring: a single interpretable metric combining short-term behavior and long-term intent.
- Explainability: which features raised or lowered loyalty, and what factors shape behavior?
- Scenario simulation: what happens if you offer a discount, change shipping speed, or add a promo?
- A dashboard: a complete customer intelligence interface powered by Streamlit.
(A business narrative + a data science narrative)
E-commerce teams struggle with questions like:
- “Which customers are slipping away?”
- “Who should we target with retention offers?”
- “Which segments are discount-driven?”
- “What would increase subscription adoption?”
- “Who buys weekly vs monthly vs annually, and why?”
And crucially:
“Which levers actually change customer behavior?” (not which ones we think do)
Traditional dashboards fail because they answer what happened, but not what will happen or why it will happen.
This project fills that gap.
Most ML pipelines try to predict a single target. But loyalty is not a single target; it is the interaction of at least two dimensions:
- Subscription intent, which reflects trust, brand fit, and willingness to commit.
- Purchase frequency, which reflects habits, timing, product needs, and lifestyle cycles.
These two dimensions do not always correlate, which is why a single model is insufficient.
A customer may:
- Buy frequently but never subscribe
- Buy rarely but have high subscription tendency
- Buy seasonally yet be highly loyal
- Buy many times but be price-sensitive and churn-prone
To model loyalty correctly, we must model:
- Intent
- Behavior
- Consistency
- Sensitivity
- Predictability
This system captures all of them.
(Core design philosophy)
We build models not to label customers but to approximate their latent state.
A high churn score is meaningless unless we know the reason.
Knowing someone is “at risk” is not enough. We need to answer:
- What lever would improve their loyalty?
- What scenario reduces their risk most?
- How does discount sensitivity differ across personas?
This tool is not meant to replace analysts; it amplifies them.
Below is a conceptual high-level diagram (not code-specific):

            ┌────────────────────────────────────┐
            │        Raw Shopping Dataset        │
            └────────────────────────────────────┘
                              │
                              ▼
            ┌────────────────────────────────────┐
            │   Data Cleaning & Normalization    │
            └────────────────────────────────────┘
                              │
                              ▼
            ┌────────────────────────────────────┐
            │   Feature Engineering & Encoding   │
            └────────────────────────────────────┘
                              │
                ┌─────────────┴─────────────┐
                ▼                           ▼
    ┌────────────────────────┐ ┌──────────────────────────┐
    │   Subscription Model   │ │   Frequency Regression   │
    │ (Binary Classification)│ │ (Ordinal Behavior Score) │
    └────────────────────────┘ └──────────────────────────┘
                │                           │
                └─────────────┬─────────────┘
                              ▼
            ┌────────────────────────────────────┐
            │       Loyalty Scoring Engine       │
            │ (combine probability + frequency)  │
            └────────────────────────────────────┘
                              │
                              ▼
            ┌────────────────────────────────────┐
            │     Streamlit Intelligence UI      │
            └────────────────────────────────────┘

The models leverage a mixture of:
- Age
- Gender
- Location
- Purchase amount
- Previous purchases
- Frequency of purchases (target for frequency model)
- Review rating
- Shipping type
- Discount use
- Promo code use
- Category
- Item purchased
- Color
- Size
- Season
Together, these features reflect both identity and behavior, crucial for modeling loyalty.
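As a rough illustration of the "Feature Engineering & Encoding" step, the sketch below turns a mixed numeric and categorical record into a flat numeric vector using one-hot encoding. The column names (`age`, `purchase_amount`, `shipping_type`, `season`) and category values are assumptions made for this example; the project's actual schema and encoder are not shown in this document.

```python
from typing import Mapping, Sequence


def one_hot(value: str, categories: Sequence[str]) -> list[float]:
    """Return a 0/1 vector with a 1 at the position of `value`."""
    return [1.0 if value == category else 0.0 for category in categories]


def encode(
    customer: Mapping[str, object],
    numeric: Sequence[str],
    categorical: Mapping[str, Sequence[str]],
) -> list[float]:
    """Concatenate numeric columns with one-hot encoded categorical columns."""
    vector = [float(customer[name]) for name in numeric]
    for name, categories in categorical.items():
        vector.extend(one_hot(str(customer[name]), categories))
    return vector


# Hypothetical record and schema, for illustration only.
customer = {
    "age": 30,
    "purchase_amount": 59.0,
    "shipping_type": "Express",
    "season": "Winter",
}
vector = encode(
    customer,
    numeric=["age", "purchase_amount"],
    categorical={
        "shipping_type": ["Standard", "Express"],
        "season": ["Spring", "Summer", "Fall", "Winter"],
    },
)
print(vector)
```

A category that is missing from the provided list encodes as all zeros here; a production encoder would need an explicit policy for unseen values.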
Question: “If we removed friction, how likely is this customer to subscribe?”
The current implementation uses RandomForest, which:
- Handles non-linear relationships (“young + winter + clothing discount = subscriber”)
- Is robust to noise
- Performs well with mixed categorical + numeric data
- Resists overfitting with minimal tuning
Patterns the model can pick up include:
- Customers who buy frequently trend toward subscribing
- Promo usage may indicate value sensitivity
- Shipping preference indicates tolerance for speed vs. cost
- Location interacts with seasonality
- Certain product categories correlate with subscription behavior
Question: “How strong is this customer’s purchasing rhythm?”
The target is treated as an ordinal variable, converted to an intensity scale (1–7).
Regression is used because:
- The distance between categories matters (Weekly ≠ Fortnightly ≠ Monthly)
- It treats the output as a continuum
- It allows subtle differences between customers
It essentially measures habit strength.
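The ordinal-to-intensity conversion described above could look roughly like the sketch below. The seven labels and their ordering are assumptions chosen for illustration (only Weekly, Fortnightly, Monthly, and annual buying are mentioned in this document); the project's real label set may differ. `clip_prediction` is a hypothetical helper showing how a continuous regression output might be kept on the 1-7 scale.

```python
# Assumed label-to-intensity mapping: 1 = weakest rhythm, 7 = strongest.
# The actual labels used by the project are not shown in this document.
FREQUENCY_SCALE = {
    "Annually": 1,
    "Semiannually": 2,
    "Quarterly": 3,
    "Bimonthly": 4,
    "Monthly": 5,
    "Fortnightly": 6,
    "Weekly": 7,
}


def frequency_score(label: str) -> int:
    """Map a purchase-frequency label to its 1-7 intensity score."""
    key = label.strip().title()
    if key not in FREQUENCY_SCALE:
        raise ValueError(f"Unknown frequency label: {label!r}")
    return FREQUENCY_SCALE[key]


def clip_prediction(raw: float) -> float:
    """Clamp a raw regression output to the valid 1-7 range."""
    return max(1.0, min(7.0, raw))


print(frequency_score("Weekly"))
print(clip_prediction(7.8))
```

Because the target is numeric, a prediction such as 5.4 can be read as "somewhat more frequent than monthly", which is the kind of subtle difference a plain classifier would not express.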
We model loyalty as:
Loyalty = Intent (60%) + Behavior (40%)
Why?
- Subscription intention reflects commitment
- Frequency score reflects habit strength
Both matter, but intent is weighted higher on the assumption that it is more predictive long-term.
Then we compute:
loyalty_index = 0.6 * p_subscribe + 0.4 * (frequency_score / 7)
loyalty_risk = (1 - loyalty_index) * 100
High risk means:
- Low frequency + low subscription probability
- Inconsistent or seasonal buying pattern
- Price-sensitivity with low commitment
- Weak habit + friction sensitivity
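The loyalty index and risk formulas above translate directly into code. The sketch below is a minimal version; the input validation and the function names are additions for this example, not something the project is confirmed to do.

```python
def loyalty_index(p_subscribe: float, frequency_score: float) -> float:
    """Blend subscription probability (60%) with normalized frequency (40%)."""
    if not 0.0 <= p_subscribe <= 1.0:
        raise ValueError("p_subscribe must be in [0, 1]")
    if not 1.0 <= frequency_score <= 7.0:
        raise ValueError("frequency_score must be in [1, 7]")
    return 0.6 * p_subscribe + 0.4 * (frequency_score / 7)


def loyalty_risk(index: float) -> float:
    """Convert a loyalty index in [0, 1] to a 0-100 risk score."""
    return (1 - index) * 100


# A customer with a 50% subscription probability and a mid-scale rhythm.
index = loyalty_index(p_subscribe=0.5, frequency_score=3.5)
print(index, loyalty_risk(index))
```

One consequence of the formula as written: because the frequency score starts at 1 rather than 0, the index has a floor of 0.4 / 7 (about 0.057), so the risk score tops out near 94.3 rather than 100.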
Segment-level insights reveal patterns like:
- Winter clothing buyers may be high-frequency but low-subscriber
- Cash users may have sporadic behavior
- Express shipping demand might correlate with loyalty
- Promo-heavy shoppers may churn if discounts stop
These insights guide:
- Marketing personalization
- Pricing strategy
- Retention campaigns
- Seasonal promotions
- Subscription product design
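Segment-level views like these can be produced by averaging loyalty risk per group. The sketch below assumes each scored customer is a dict with a `loyalty_risk` field and a segment column such as `season`; both names and the sample values are illustrative, not taken from the project's data.

```python
from collections import defaultdict
from statistics import mean
from typing import Mapping, Sequence


def risk_by_segment(
    customers: Sequence[Mapping[str, object]], segment_key: str
) -> dict[str, float]:
    """Average loyalty risk per segment, highest-risk segment first."""
    groups: dict[str, list[float]] = defaultdict(list)
    for customer in customers:
        groups[str(customer[segment_key])].append(float(customer["loyalty_risk"]))
    averages = {segment: mean(risks) for segment, risks in groups.items()}
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))


# Made-up scored records, for illustration only.
scored = [
    {"season": "Winter", "loyalty_risk": 80.0},
    {"season": "Winter", "loyalty_risk": 60.0},
    {"season": "Summer", "loyalty_risk": 30.0},
]
summary = risk_by_segment(scored, "season")
print(summary)
```

The same function works for any categorical column (shipping type, category, promo use), which makes it easy to compare segments side by side.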
This is one of the most powerful features.
You can modify a customer’s attributes to answer what-if questions.
Examples:
- Change shipping from “Standard” → “Express”
- Toggle “Discount Applied: Yes → No”
- Add a promo code
- Switch payment method
The system recomputes:
- New subscription probability
- New frequency score
- New loyalty risk
- And shows the delta for each metric
This helps teams test strategies before deploying them.
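The recompute-and-compare loop described above can be sketched as a small function that scores a customer before and after a change and reports the delta per metric. `simulate_scenario` and `toy_score` are hypothetical names; `toy_score` is a placeholder rule standing in for the trained models, and its numbers carry no empirical meaning.

```python
from typing import Callable, Mapping

Customer = Mapping[str, object]
Scores = Mapping[str, float]


def simulate_scenario(
    customer: Customer,
    changes: Customer,
    score_fn: Callable[[Customer], Scores],
) -> dict[str, dict[str, float]]:
    """Score the customer before and after `changes`; report per-metric deltas."""
    baseline = score_fn(customer)
    scenario = score_fn({**customer, **changes})  # original record is not mutated
    return {
        metric: {
            "before": baseline[metric],
            "after": scenario[metric],
            "delta": scenario[metric] - baseline[metric],
        }
        for metric in baseline
    }


def toy_score(customer: Customer) -> dict[str, float]:
    """Placeholder scorer (NOT the trained models): Express shipping lifts intent."""
    p_subscribe = 0.5 if customer["shipping_type"] == "Express" else 0.25
    frequency = 4.0
    index = 0.6 * p_subscribe + 0.4 * (frequency / 7)
    return {
        "p_subscribe": p_subscribe,
        "frequency_score": frequency,
        "loyalty_risk": (1 - index) * 100,
    }


result = simulate_scenario(
    {"shipping_type": "Standard"}, {"shipping_type": "Express"}, toy_score
)
for metric, values in result.items():
    print(metric, values)
```

Passing the scorer in as a function keeps the simulation logic independent of whichever models are currently deployed.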
Marketing and product teams care about:
- “Why did the model say this customer is at risk?”
- “What drives loyalty in this segment?”
Explainability provides answers at two levels:
- What factors matter most overall?
- Which features increased or decreased a given customer’s intent, frequency, and loyalty?
This turns predictions into stories:
- “This customer buys weekly but rarely uses discounts: high loyalty.”
- “This customer buys only in winter and always uses promos: seasonal but price-sensitive.”
- “This customer prefers express shipping and leaves high reviews: strong subscription potential.”
Now the model is not a black box. It is a diagnostic tool.
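To show how per-feature contributions might become a readable story, the sketch below ranks signed contributions by magnitude and phrases the top ones. The `explain` function and the contribution values are hypothetical; how the project actually computes contributions is not described in this document.

```python
def explain(contributions: dict[str, float], top_n: int = 3) -> str:
    """Summarize the strongest signed contributions to the loyalty score."""
    nonzero = [(name, value) for name, value in contributions.items() if value != 0]
    ranked = sorted(nonzero, key=lambda item: abs(item[1]), reverse=True)[:top_n]
    parts = [
        f"{name} {'raised' if value > 0 else 'lowered'} loyalty by {abs(value):.2f}"
        for name, value in ranked
    ]
    return "; ".join(parts)


# Made-up contributions for one customer, for illustration only.
story = explain(
    {"weekly purchases": 0.20, "promo code use": -0.05, "express shipping": 0.10},
    top_n=2,
)
print(story)
```

Limiting the output to the top few drivers keeps the narrative short enough for a dashboard card.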
pip install -r requirements.txt
python -m src.cli prepare-data
python -m src.cli train-all
python -m src.cli evaluate
python -m src.cli score-customers --output data/processed/scored.parquet
streamlit run app/app.py

- Replace RandomForest with LightGBM for better performance
- Hyperparameter optimization (Optuna)
- Add ordinal regression for frequency
- Add seasonally aware models
- Persona clustering (KMeans + PCA/UMAP)
- Retention funnel modeling
- Abandonment probability model
- Price elasticity modeling
- Animated cohort transitions
- Customer “journey cards”
- Auto-generated retention recommendations
- FastAPI backend for scoring
- Docker containerization
- Full cloud deployment
- Automated monitoring + drift detection
Subscription-Loyalty-Risk-Radar is more than an ML pipeline. It is a framework for understanding customer behavior, built with:
- Mathematical clarity
- Business intuition
- System-level thinking
- Explainability
- Actionability
It shows how a data scientist:
- Designs multi-model systems
- Thinks about latent customer states
- Blends prediction with reasoning
- Turns algorithms into decisions
- Makes machine learning useful
This is not just a model; it is a loyalty intelligence engine.