This repository contains the code and data accompanying the paper "Recession Detection Using Classifiers on the Anticipation-Precision Frontier", written by Pascal Michaillat, and posted on arXiv in December 2025.
The paper is available at https://pascalmichaillat.org/17/.
This project implements a machine learning approach to real-time recession detection using US labor market indicators. The methodology:
- Constructs 95,832 recession indicators by systematically transforming unemployment and vacancy data through smoothing, curvature adjustment, turning point detection, and mixing operations
- Identifies "perfect classifiers" that detect the correct number of recessions in a training period (1929–2021)
- Selects optimal classifiers on the anticipation-precision frontier
- Builds a high-precision ensemble from frontier classifiers
- Computes recession probabilities using the ensemble's error distribution
- Validates the approach through placebo tests and multiple backtests (1965–2015)
The approach balances early recession detection (anticipation) with consistent timing (precision), providing probabilistic assessments suitable for real-time economic monitoring.
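The frontier selection step can be illustrated with a minimal sketch. It assumes each perfect classifier is summarized by the mean and the standard deviation of its detection errors, and that lower is better for both. The variable names (meanError, stdError, onFrontier) and the numbers are hypothetical and do not come from the repository; selectFrontierClassifier.m may proceed differently.

```matlab
% Hypothetical performance of five classifiers (illustrative numbers only)
meanError = [1.2; 0.4; 2.0; 0.9; 0.4];  % mean detection error (anticipation), lower = earlier
stdError  = [0.5; 1.5; 0.3; 0.5; 2.0];  % std. dev. of detection errors (precision), lower = more precise

% Sort by precision, then by anticipation, so that ties in stdError are
% resolved in favor of the classifier with the lower mean error
[~, order] = sortrows([stdError, meanError]);

% Walk from the most precise to the least precise classifier and keep one
% only if it strictly improves on the best mean error seen so far
onFrontier = false(size(meanError));
bestMean = Inf;
for i = order'
    if meanError(i) < bestMean
        onFrontier(i) = true;
        bestMean = meanError(i);
    end
end

% Indices of the classifiers that no other classifier dominates
frontierIndex = find(onFrontier);
disp(frontierIndex')
```

In this toy example, classifier 1 is dropped because classifier 4 is equally precise but detects earlier, and classifier 5 is dropped because classifier 2 detects equally early with more precision.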
The raw data used by the code to produce the results in the paper are stored as CSV files in the input folder.
- 20210719_cycle_dates_pasted.csv - US business cycle dates, 1857–2021 - Source: NBER (2023)
- CLF16OV.csv - US labor force level, 1948–2025 - Source: BLS (2025a)
- CompositeHWI.xlsx - Sheet1.csv - US vacancy rate, 1951–2020 - Source: Barnichon (2010)
- HistoricalSeries_JME_2020January.csv - US unemployment and vacancy rates, 1890–2017 - Source: Petrosky-Nadeau and Zhang (2021)
- INDPRO.csv - US industrial production index, 1919–2025 - Source: Federal Reserve Board (2025)
- JTSJOL.csv - US vacancy level, 2001–2025 - Source: BLS (2025b)
- UNEMPLOY.csv - US unemployment level, 1948–2025 - Source: BLS (2025d)
The results in the paper are obtained using MATLAB. The MATLAB code is located in the code folder.
The main script, main.m, orchestrates the production of the results in the paper. The script:
- Loads and processes raw labor market data (1929–2025)
- Constructs 95,832 recession indicators through systematic transformations
- Identifies perfect classifiers on the training set (1929–2021)
- Selects classifiers on the anticipation-precision frontier
- Builds a high-precision ensemble
- Computes recession probabilities
- Performs placebo tests using First Ladies' death dates
- Conducts backtests for 6 different training periods
The complete script takes approximately 14–20 hours to run due to the computational intensity of the classifier selection process. Intermediate results are saved to allow partial execution.
- getUnemployment.m - Loads and splices historical unemployment data (1890–2025)
- getVacancy.m - Loads and splices historical vacancy data (1919–2025)
- getNber.m - Loads and processes NBER recession dates
- buildIndicator.m - Constructs 4,356 indicators from a single time series through:
  - Smoothing: 22 methods (SMA windows 0–11 months, EMA weights 0.1–1.0)
  - Curvature: 11 Box-Cox-like transformations (0 = log, 1 = linear)
  - Turning point detection: 18 detection windows (1–18 months)
  - Total basic indicators: 22 (smoothing) × 11 (curving) × 18 (turning) = 4,356
- mixIndicator.m - Combines unemployment and vacancy indicators using linear and minmax mixing (11 mixing weights × 2 methods)
  - Total indicators: 4,356 (basic indicators) × 22 (mixing) = 95,832
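The indicator counts follow directly from the sizes of the parameter grids. The sketch below reproduces the arithmetic. The ranges and counts are taken from the descriptions of buildIndicator.m and mixIndicator.m, but the even spacing of the grids, the 0–1 range of the mixing weights, and all variable names are assumptions made for illustration.

```matlab
% Parameter grids (ranges and counts from the README; spacing is assumed)
smaWindow  = 0:11;                  % SMA windows in months: 12 values
emaWeight  = linspace(0.1, 1, 10);  % EMA weights: 10 values
curvature  = linspace(0, 1, 11);    % 0 = log, 1 = linear: 11 values
turnWindow = 1:18;                  % detection windows in months: 18 values
mixWeight  = linspace(0, 1, 11);    % 11 mixing weights (range assumed)
nMixMethod = 2;                     % linear and minmax mixing

% Basic indicators built from a single time series
nSmoothing = numel(smaWindow) + numel(emaWeight);
nBasic = nSmoothing * numel(curvature) * numel(turnWindow);

% Indicators obtained after mixing unemployment and vacancy indicators
nMixing = numel(mixWeight) * nMixMethod;
nTotal = nBasic * nMixing;

fprintf('Basic indicators per series: %d\n', nBasic);
fprintf('Indicators after mixing:     %d\n', nTotal);
```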
- selectPerfectClassifier.m - Identifies indicator-threshold combinations that detect exactly the correct number of recessions in the training period (computationally intensive: about 2–3 hours per training period)
- selectFrontierClassifier.m - Constructs the anticipation-precision Pareto frontier by selecting classifiers that minimize mean anticipation for each precision level
- computeRecessionProbability.m - Calculates recession probabilities using ensemble classifiers and their error distributions
- tabulateEnsemble.m - Creates tables of ensemble parameters and performance metrics
- plotData.m - Plots raw unemployment and vacancy rates
- plotIndicator.m - Visualizes all indicator variations
- plotFrontier.m - Shows anticipation-precision frontier
- plotEnsemble.m - Displays ensemble classifiers and thresholds
- plotProbability.m - Plots individual and aggregate recession probabilities
- flatten.m - Reshapes multidimensional arrays efficiently
- formatFigure.m - Configures default figure properties and defines plotting styles
All functions include comprehensive header documentation following MATLAB best practices:
- Syntax with all input/output arguments
- Detailed description of functionality
- Parameter specifications with units and ranges
- Usage examples
- Notes on computational requirements or special cases
Some intermediate results produced by the code are saved as MATLAB files (.mat) in the intermediate folder. These results take significant time to produce (2–3 hours per file) and are provided for reference.
- ensemble.mat - Perfect classifiers for main analysis (training: 1929–2021)
- ensemble_2015.mat - Perfect classifiers for 2015 backtest
- ensemble_2005.mat - Perfect classifiers for 2005 backtest
- ensemble_1995.mat - Perfect classifiers for 1995 backtest
- ensemble_1985.mat - Perfect classifiers for 1985 backtest
- ensemble_1975.mat - Perfect classifiers for 1975 backtest (too large for GitHub at 112 MB, so available only upon request)
- ensemble_1965.mat - Perfect classifiers for 1965 backtest
Each file contains the following variables:
- indexPerfect - Column indices of perfect classifiers in the indicator matrix
- thresholdPerfect - Optimal thresholds for each perfect classifier
- startPerfect - Detected recession start dates for each classifier
To skip the computationally intensive classifier selection step, comment out the selectPerfectClassifier() and save() lines in main.m, and uncomment the appropriate load() line. The analysis then starts from the frontier selection stage, using the saved perfect classifiers. The same substitution can be made in each backtest.
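Before editing main.m, it can help to check that a precomputed file is present and holds the expected variables. The following is a minimal sketch, assuming that the current folder is the code folder and that the intermediate folder sits next to it in the repository. The relative path and the names matFile and saved are assumptions, and the actual load() line in main.m may look different.

```matlab
% Assumed location of the precomputed file relative to the code folder
matFile = fullfile('..', 'intermediate', 'ensemble.mat');

if isfile(matFile)
    % List the variables stored in the file without loading them
    whos('-file', matFile)

    % Load only the three documented variables into a struct
    saved = load(matFile, 'indexPerfect', 'thresholdPerfect', 'startPerfect');
    fprintf('Loaded %d perfect classifiers.\n', numel(saved.indexPerfect));
else
    % The file is missing, so the selection step would have to be run
    warning('File %s not found; classifier selection has not been run.', matFile);
end
```

Loading into a struct rather than directly into the workspace avoids silently overwriting existing variables during inspection.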
The results produced by the code from the raw data are stored as PDF and CSV files in the output folder.
All figures are saved in two formats:
- PDF files - High-quality vector graphics for publication
- CSV files - Underlying data for replication or further analysis
- figure_data.pdf - Raw unemployment and vacancy rates (1929–2025)
- figure_unemployment.pdf - All 4,356 unemployment indicator variations
- figure_vacancy.pdf - All 4,356 vacancy indicator variations
- figure_frontier.pdf - Anticipation-precision frontier with all perfect classifiers
- figure_frontier_precision.pdf - Zoomed view of high-precision segment of frontier
- figure_ensemble_[1-N].pdf - Individual ensemble classifier plots
- figure_ensemble.pdf - Combined plot of normalized ensemble classifiers
- figure_probability_training.pdf - Recession probabilities on training period
- figure_probability_testing.pdf - Recession probabilities on testing period
- figure_frontier_placebo.pdf - Anticipation-precision frontier using First Ladies' death dates for placebo test
For each backtest year (2015, 2005, 1995, 1985, 1975, 1965):
- figure_frontier_[year].pdf - Frontier for that training period
- figure_frontier_[year]_precision.pdf - Zoomed frontier view
- figure_ensemble_[year]_[1-N].pdf - Individual classifiers
- figure_ensemble_[year].pdf - Combined ensemble
- figure_probability_[year]_training.pdf - Recession probabilities on training period
- figure_probability_[year]_testing.pdf - Recession probabilities on backtesting period
All tables are saved as CSV files with complete parameter specifications and performance metrics.
- table_ensemble.csv - Classifier ensemble parameters and errors. Columns: smoothing method, smoothing parameter, curving parameter, turning parameter, mixing method, mixing parameter, threshold, standard error, mean error, min error, max error
- table_ensemble_placebo.csv - Frontier classifiers for the placebo test
For each backtest year:
- table_ensemble_[year].csv - Classifier ensemble parameters and errors for that backtest
- MATLAB R2019b or later (some functions require recent MATLAB features)
- Sufficient memory for large matrix operations (16GB+ RAM recommended)
- Approximately 10GB free disk space for intermediate and output files
- Clone the repository to your local machine using Git or by downloading the ZIP file:
    git clone https://github.com/pmichaillat/recession-detection.git
- Open MATLAB and set the code folder as the current folder.
- To generate all the results, run the following command in the MATLAB command window:
    run('main.m')
- By default, the main script overwrites the files in the intermediate and output folders. To preserve existing files, make a copy of the folders before running the script.
- Important note: The script takes 14–20 hours to run completely due to classifier selection. Progress is displayed in the command window.
To skip the computationally intensive selection of perfect classifiers, use pre-computed results:
- Ensure that the intermediate .mat files exist in intermediate/.
- In main.m, comment out the selectPerfectClassifier() and save() lines, and uncomment the appropriate load() line. The analysis then starts from the frontier selection stage. The same substitution can be made in each backtest.
- Run the main.m script. It will use the saved classifiers instead of recomputing them.
- To change the training period, edit beginTraining and endTraining in main.m
- To adjust the precision of the ensemble, modify stdErrorMax in main.m
  - Lower values = more precise but fewer classifiers
  - Higher values = more classifiers but less precise
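The tradeoff governed by stdErrorMax can be seen in a small sketch. It assumes that the ensemble keeps the frontier classifiers whose standard error does not exceed stdErrorMax. The stdError values are hypothetical, and whether main.m uses a weak or a strict inequality is not stated here, so the comparison below is an assumption.

```matlab
% Hypothetical standard errors of five frontier classifiers
stdError = [0.3; 0.5; 0.9; 1.5; 2.0];

% Count how many classifiers survive under three candidate cutoffs
for stdErrorMax = [0.5, 1.0, 2.0]
    nKept = sum(stdError <= stdErrorMax);  % assumed selection rule
    fprintf('stdErrorMax = %.1f keeps %d of %d classifiers\n', ...
        stdErrorMax, nKept, numel(stdError));
end
```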
- To change the backtests, edit the backtest year array in main.m, adding or removing years as desired
- To customize the figures, modify properties in formatFigure.m (colors, line widths, fonts, etc.)
The results were obtained using MATLAB R2024b on macOS Tahoe (Apple silicon). The code should work on:
- MATLAB R2019b or later
- Windows, macOS, or Linux
- Both Intel and Apple silicon processors
No toolboxes are required: all code uses base MATLAB functions.
This repository is licensed under the MIT License.