This project turns raw retail transactions into a calendar intelligence system:
- Seasonality: Which days and months consistently perform better?
- Calendar interactions: Which month × day-of-week combinations are unusually strong/weak?
- Spike detection: Which days look like events (promotions, anomalies, shifts) after removing normal seasonality?
- Spike explanation: For each spike day, was it driven by transactions, units, AOV, or category mix?
It includes:
- A reproducible Python pipeline that generates metrics + figures + report artifacts
- A Streamlit dashboard to explore seasonality and drill into spike days
- Exportable outputs (CSV, JSON) designed for analyst workflows and portfolio review
- What you get
- Business questions answered
- Dataset
- Methodology
- Results & Figures
- Outputs
- How to run
- Dashboard guide
- Project structure
- Quality checks
- Limitations & assumptions
- Improvements / roadmap
- `outputs/daily_metrics.csv`: daily revenue / txns / units / AOV
- `outputs/weekly_metrics.csv`: weekly rollups + number of observed days
- `outputs/daily_scored.csv`: adds baseline expected revenue + residuals + spike scores
- `outputs/spike_days.csv`: top spike days, ranked
- `outputs/spike_cards.json`: JSON “Spike Cards” with drivers and context
- `outputs/quality_summary.json`: sanity/quality stats
Stored in reports/figures/:
- `dow_revenue.png`
- `month_revenue.png`
- `heatmap_month_dow.png`
- `spike_scatter.png`
Streamlit app (app/app.py) with:
- overview KPIs
- seasonality visuals + heatmap
- spike explorer + drilldowns
- quality summary
- Which day-of-week is consistently strongest for revenue?
- Which months are strongest/weakest within observed data?
- Is performance driven by:
- more transactions (traffic)
- more units per day (basket size)
- higher AOV (ticket size)
- Which days are unusually high/low after accounting for seasonality?
- Are spikes driven by:
- transactions spike (demand surge / traffic)
- AOV spike (higher-priced purchases)
- units spike (bulk buying)
- category mix shift (one category dominates)
Input file used by default:
data/raw/retail_sales_dataset.csv
This project uses the following Kaggle dataset:
- Retail Sales Dataset by Terence Katua
- Dataset URL: https://www.kaggle.com/datasets/terencekatua/retail-sales-dataset
Thanks to the dataset author for publishing the data and making it available for learning and analysis. All credit for the dataset goes to the original creator and the Kaggle hosting platform.
| Column | Meaning | Notes |
|---|---|---|
| `Date` | Transaction date | Parsed to daily granularity |
| `Transaction ID` | Transaction identifier | Used to count transactions per day |
| `Customer ID` | Customer identifier | Each appears once in this dataset (so no repeat-customer analysis) |
| `Gender` | Customer gender | Optional slice for analysis |
| `Age` | Customer age | Optional slice; can be bucketed |
| `Product Category` | Category label | Used for category mix analysis |
| `Quantity` | Units bought | Used for units/day and basket drivers |
| `Price per Unit` | Unit price | Used for pricing distribution |
| `Total Amount` | Revenue for that transaction | Typically Quantity × Price per Unit |
The dataset covers only the transaction-days that appear in the file.
If a date is missing:
- it does NOT mean the store had zero sales
- it means: no record exists for that day in this dataset
Why this matters:
Monthly totals are labeled as “Observed Transaction-Days” so I don’t accidentally treat missing dates as true zero-sales days.
This section explains exactly what the pipeline does and why.
I aggregate transaction-level data to a daily grain:
- Revenue (daily)
- Transactions (daily)
- Units (daily)
- AOV (average order value), implemented as the mean of per-transaction totals; because each `Transaction ID` is unique, this equals daily revenue ÷ daily transactions
I also create calendar fields:
- `dow` (day name)
- `month` (YYYY-MM)
- `week` (week period)
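A minimal sketch of that rollup in pandas (column names follow the dataset table above; the actual implementation lives in `src/aggregate_daily.py` and may differ in detail):

```python
import pandas as pd

# Sketch only: the real logic lives in src/aggregate_daily.py.
df = pd.read_csv("data/raw/retail_sales_dataset.csv", parse_dates=["Date"])

daily = (
    df.groupby(df["Date"].dt.normalize())
      .agg(
          revenue=("Total Amount", "sum"),
          transactions=("Transaction ID", "nunique"),
          units=("Quantity", "sum"),
          aov=("Total Amount", "mean"),   # mean of transaction totals = AOV per day
      )
      .rename_axis("date")
      .reset_index()
)

# Calendar fields
daily["dow"] = daily["date"].dt.day_name()
daily["month"] = daily["date"].dt.to_period("M").astype(str)   # YYYY-MM
daily["week"] = daily["date"].dt.to_period("W")
```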
I compute seasonality using observed days:
For each day-of-week:
- average daily revenue
- median daily revenue (robust to outliers)
- average txns, units, AOV
- number of observed days (`n_days`), which matters for confidence
For each month:
- total monthly revenue (observed)
- total txns, units (observed)
- average AOV
- number of observed days
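A sketch of those summaries, assuming the `daily` frame from the previous snippet (the real logic lives in `src/seasonality.py`):

```python
# Day-of-week summary with observed-day counts for confidence
dow_summary = daily.groupby("dow").agg(
    avg_revenue=("revenue", "mean"),
    median_revenue=("revenue", "median"),
    avg_txns=("transactions", "mean"),
    avg_units=("units", "mean"),
    avg_aov=("aov", "mean"),
    n_days=("revenue", "size"),
)

# Monthly summary over observed days only
month_summary = daily.groupby("month").agg(
    total_revenue=("revenue", "sum"),
    total_txns=("transactions", "sum"),
    total_units=("units", "sum"),
    avg_aov=("aov", "mean"),
    n_days=("revenue", "size"),
)
```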
I compute:
- average daily revenue per `(month, dow)` cell
This exposes interaction effects like:
“Tuesdays are not always strong, but they may be strong in May.”
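A sketch of the `(month, dow)` matrix that feeds the heatmap, again assuming the `daily` frame defined above:

```python
# Average daily revenue per (month, dow) cell; NaN cells mean no observed days
heatmap = daily.pivot_table(
    index="month",
    columns="dow",
    values="revenue",
    aggfunc="mean",
)
```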
To detect true “events”, we need an expected value for each day (a seasonal baseline).
This repo uses a simple hierarchy:
- Use (month, dow) mean if there are enough observations in that cell
- Otherwise fallback to month mean
- Otherwise fallback to dow mean
- Otherwise fallback to overall mean
This is intentionally:
- explainable (good for analysts)
- stable for small datasets
- good enough for removing first-order seasonality
Formally, the expected revenue for day $d$ is

$$\widehat{r}_d = \bar{r}_{\,\text{month}(d),\,\text{dow}(d)},$$

the mean daily revenue over observed days in that `(month, dow)` cell, with fallbacks as above when cell counts are too small.
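A sketch of that fallback hierarchy, assuming the `daily` frame above; the threshold and helper below are illustrative, not the exact code in `src/baseline.py`:

```python
MIN_CELL_DAYS = 3  # illustrative threshold for "enough observations"

cell_mean = daily.groupby(["month", "dow"])["revenue"].agg(["mean", "size"])
month_mean = daily.groupby("month")["revenue"].mean()
dow_mean = daily.groupby("dow")["revenue"].mean()
overall_mean = daily["revenue"].mean()

def expected_revenue(month: str, dow: str) -> float:
    """Fall back from (month, dow) -> month -> dow -> overall mean."""
    if (month, dow) in cell_mean.index and cell_mean.loc[(month, dow), "size"] >= MIN_CELL_DAYS:
        return cell_mean.loc[(month, dow), "mean"]
    if month in month_mean.index:
        return month_mean.loc[month]
    if dow in dow_mean.index:
        return dow_mean.loc[dow]
    return overall_mean

daily["expected_revenue"] = [
    expected_revenue(m, d) for m, d in zip(daily["month"], daily["dow"])
]
```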
I detect spikes with two complementary detectors.

The first is a global robust z-score on daily revenue. This is great for quick, obvious outliers.
Given the daily revenue series $x$:

- median: $m = \operatorname{median}(x)$
- MAD: $\mathrm{MAD} = \operatorname{median}(|x - m|)$

Robust z-score:

$$z_i = \frac{x_i - m}{\mathrm{MAD}}$$
Why robust? Because retail revenue is often heavy-tailed and standard z-scores can be distorted by spikes.
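A sketch of the robust z-score as described above (see `src/spike_detection.py` for the actual implementation):

```python
import numpy as np

def robust_z(x: np.ndarray) -> np.ndarray:
    """Median/MAD-based z-score (sketch of the detector described above)."""
    m = np.median(x)
    mad = np.median(np.abs(x - m))
    if mad == 0:
        # Degenerate case: constant series, no meaningful spread
        return np.zeros_like(x, dtype=float)
    return (x - m) / mad

# Applied to the daily revenue series from the earlier snippets
daily["revenue_robust_z"] = robust_z(daily["revenue"].to_numpy(dtype=float))
```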
The second detector scores each day's residual against the seasonal baseline. This isolates “event-like” behavior.

Residual:

$$e_d = r_d - \widehat{r}_d$$

I then compute a robust z-score on the residuals:

$$z^{\text{resid}}_d = \frac{e_d - \operatorname{median}(e)}{\mathrm{MAD}(e)}$$
Interpretation:
- large positive residual = unusual surge beyond seasonality
- large negative residual = underperformance beyond seasonality
This is the method used to rank spike days in outputs/spike_days.csv.
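A sketch of that scoring and ranking step, reusing `robust_z` and the `daily` frame from the snippets above; column names and the top-N cutoff are illustrative:

```python
# Residual spike score and top-N ranking (sketch; see src/spike_detection.py)
daily["residual"] = daily["revenue"] - daily["expected_revenue"]
daily["spike_score"] = robust_z(daily["residual"].to_numpy(dtype=float))

top_spikes = daily.sort_values("spike_score", ascending=False).head(10)
top_spikes.to_csv("outputs/spike_days.csv", index=False)
```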
For the top-N spike days, I generate a JSON Spike Card that captures:
- actual revenue vs expected revenue
- residual and spike score
- driver snapshot:
- transactions vs global baseline mean
- units vs global baseline mean
- AOV vs global baseline mean
- top category contribution and share
- notes about interpretation and coverage
This makes spike results:
- easy to paste into reports
- easy to use for dashboards
- easy to extend into “warning cards” later
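Purely for illustration, a card might look roughly like the following; every field name and value here is an assumption, not the exact schema written by `src/spike_explain.py`:

```python
import json

# Hypothetical spike card: field names and values are illustrative only.
example_card = {
    "date": "2023-05-13",
    "actual_revenue": 3520.0,
    "expected_revenue": 1810.5,
    "residual": 1709.5,
    "spike_score": 4.2,
    "drivers": {
        "transactions_vs_baseline": 1.85,  # ratio vs global baseline mean
        "units_vs_baseline": 1.72,
        "aov_vs_baseline": 1.06,
    },
    "top_category": {"name": "Electronics", "revenue_share": 0.61},
    "notes": "Surge driven mainly by transaction volume; one category dominated.",
}
print(json.dumps(example_card, indent=2))
```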
All figures are generated by the pipeline and stored in reports/figures/.
What this chart shows (`dow_revenue.png`)
Average daily revenue for each day-of-week across observed days.
How to interpret
- Use this to identify the store’s weekly rhythm.
- If Saturday is highest, that suggests weekend demand strength.
- Use median alongside average for robustness (median isn’t shown in this figure, but is computed in outputs).
Business use
- staffing + labor planning
- inventory replenishment timing
- promotions on weaker weekdays
Analyst caution
- “Average revenue” can be inflated by a few spikes. That’s why the project also uses:
- median stats
- residual-based spike logic
What this chart shows (`month_revenue.png`)
Total monthly revenue only for transaction-days that exist in the dataset.
How to interpret
- This is not guaranteed to represent full-month store revenue.
- If a month has fewer recorded days, it will appear lower.
Best practice
- Always pair monthly totals with:
- number of observed days per month
- whether the month has only partial coverage (e.g., the last month may contain only one day)
Business use
- detect months with consistent uplift/downturn (within observed coverage)
- plan seasonal campaigns and stock strategies
What this chart shows (`heatmap_month_dow.png`)
Average daily revenue by:
- month (rows)
- day-of-week (columns)
Why it matters
This reveals calendar interactions that simple bar charts miss:
- “Tuesdays are average overall”
- but “Tuesdays in May are extremely strong”
How to interpret
- Bright cells = strong month+weekday combination
- Dark cells = weak combination
- Blank/white areas often mean no observations (coverage gaps)
Business use
- precision planning (“strong Tuesdays in May”)
- promo scheduling for specific calendar windows
- identify “unexpectedly strong” weekday blocks in specific months
What this chart shows (`spike_scatter.png`)
For each day:
- residual = actual revenue − expected revenue
Expected revenue is computed from the seasonal hierarchy baseline (month+dow → month → dow → overall).
How to interpret
- Points near 0: normal days (seasonality explains them)
- Big positive spikes: event-like surges
- Big negative dips: event-like underperformance
Why residuals are better than raw revenue: raw revenue spikes may simply be “normal Saturdays.” Residuals answer:
“Was this day abnormal compared to what the calendar predicts?”
Business use
- detect promotion impact
- detect data anomalies
- detect demand shocks
| Path | Purpose |
|---|---|
| `outputs/daily_metrics.csv` | Daily rollups (revenue, txns, units, AOV) |
| `outputs/weekly_metrics.csv` | Weekly rollups and observed-day counts |
| `outputs/daily_scored.csv` | Daily rollups + expected revenue + residuals + z-scores |
| `outputs/spike_days.csv` | Ranked spike days (top N) |
| `outputs/spike_cards.json` | Exportable spike explanations in JSON |
| `outputs/quality_summary.json` | Data quality stats |
```bash
python -m venv .venv
# Windows: .venv\Scripts\activate
source .venv/bin/activate
pip install -r requirements.txt
python -m src.pipeline --input data/raw/retail_sales_dataset.csv
```

After it finishes, check:

- `outputs/`
- `reports/figures/`
- `reports/insights.md`

To launch the dashboard:

```bash
streamlit run app/app.py
```
- View KPI tiles (revenue, transactions, units, AOV)
- Explore seasonality:
  - DOW bar charts
  - monthly bars
  - month×DOW heatmap
- Spike explorer:
  - table of top spike days
  - choose a spike day → see drivers and category contribution
  - view/export Spike Card JSON
- Quality summary:
  - mismatches in `Total Amount = Quantity × Price per Unit`
  - negative/zero values, missing values
When you select a date:
- “Residual vs expected” tells you whether it’s an event-like day
- Driver deltas tell you whether the spike came from:
  - more transactions (traffic)
  - higher AOV (ticket size)
  - more units (basket size)
- The category bar tells you whether one category dominated the day
```
retail-calendar-pattern-finder/
  data/
    raw/                  # input CSV lives here
  src/
    io.py                 # loading + standardizing schema
    quality_checks.py     # sanity checks
    aggregate_daily.py    # daily metrics + category mix
    aggregate_weekly.py   # weekly metrics
    seasonality.py        # dow/month summaries + heatmap pivots
    baseline.py           # expected revenue model
    spike_detection.py    # robust z + residual z
    spike_explain.py      # spike cards JSON
    reporting.py          # markdown report generator
    pipeline.py           # CLI entrypoint
  app/
    app.py                # Streamlit dashboard
  outputs/                # generated CSV/JSON outputs
  reports/
    insights.md           # generated report
    figures/              # generated figures
```
This repo includes sanity checks designed for real analyst workflows:
- Validate the identity: `Total Amount == Quantity × Price per Unit`
- Validate positivity: quantities and prices should be > 0
- Date range profiling: min/max date
- Missing values: count rows with any missing values

Output: `outputs/quality_summary.json`
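A sketch of how such a summary could be assembled, assuming the transaction-level `df` from the first snippet; the field names below are illustrative rather than the exact contents of `quality_summary.json`:

```python
import json

# Exact equality check on the identity; a tolerance could be used for float data
mismatch = (df["Total Amount"] != df["Quantity"] * df["Price per Unit"]).sum()

summary = {
    "rows": int(len(df)),
    "amount_mismatches": int(mismatch),
    "nonpositive_quantity": int((df["Quantity"] <= 0).sum()),
    "nonpositive_price": int((df["Price per Unit"] <= 0).sum()),
    "rows_with_missing_values": int(df.isna().any(axis=1).sum()),
    "date_min": str(df["Date"].min()),
    "date_max": str(df["Date"].max()),
}

with open("outputs/quality_summary.json", "w") as f:
    json.dump(summary, f, indent=2)
```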
- Coverage gaps: missing dates mean “unknown coverage”
- No store/region dimension: cannot compare locations
- No promotion/holiday labels: spikes are statistically detected, not causally explained
- Customer-level analysis is limited: most customers appear once (no repeat behavior modeling)
- Expected revenue baseline is simple and explainable (good for analysts), but not perfect.
- Spikes are scored statistically; root cause requires business context or external signals.
If you want to upgrade this project into a standout portfolio capstone:
- Train a regression model for expected revenue using:
  - month, dow, week-of-year
  - category mix features
  - rolling averages
- Compare baseline accuracy and spike stability
- Decompose revenue changes as Revenue ≈ Transactions × AOV
- Add a “driver waterfall” for each spike day
- Create a complete calendar table and mark observed vs missing days
- Render heatmaps with both:
  - mean revenue
  - an observation-count overlay
- Add a `Makefile`/taskfile
- Add formatting/linting (ruff/black)
- Add a GitHub Actions workflow to run pipeline checks