Recession Probability Monitoring

Baseline: yield-curve probit • Augmented: additive macro-finance model • Hybrid: reliability-weighted mixture
Last updated: 2026-06-03 00:09 UTC
Next scheduled update: 2026-07-01 10:00 UTC
Time window: next 12 months

Overview: What this dashboard is measuring

What you are looking at: a recession risk monitor that combines a classic yield-curve signal with broader measures of financial stress.

Why a "regime-aware" approach: the yield curve is historically useful, but on its own it can send prolonged warnings without broad economic stress. This system pairs the classic curve signal with a broader macro-finance model and learns which has been more reliable out-of-sample.

Two models, one gate:

  • Baseline: a yield-curve-only probit — the classic, pure term-spread signal.
  • Augmented: a macro-finance elastic-net using the term spread together with credit stress, financial conditions, and labor-market indicators — a broader read on whether the curve signal is corroborated by real stress.
  • Reliability gate: a learned mixture weight w that decides how much to trust each model at a given time.

Reading tip: start with the probability chart, then check the mixture weight panel to see which model is driving today's reading.

Recession Probability: How risk evolves over time

Why it matters: This chart is the main "risk thermometer." It shows the system's recession probability through time.

How to read: Higher values mean the model sees conditions that historically preceded recessions. Shaded bands mark past recessions. A rising line means risk is increasing; a falling line means risk is easing.

Important: A high probability does not mean a recession is guaranteed, and a low probability does not mean one is impossible. This is a monitoring signal based on history, not a calendar prediction.

Latest update (as of 2026-04-01)
12-month recession probability: 6.9% (Elevated risk)
Yield-curve model: 10.2%
Macro-finance model: 6.1%
Reliability weight (w): 0.80, leaning toward macro-finance signals
Sensitivity (w ± 0.1): 7.3% at w = 0.70 → 6.5% at w = 0.90
[Chart: Recession Probability]
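
The latest reading can be reproduced from its published components. The sketch below assumes the hybrid is a plain linear mixture of the two model probabilities, with w as the weight on the macro-finance model; the function name `hybrid_probability` is illustrative and not taken from the dashboard's code.

```python
def hybrid_probability(p_baseline: float, p_augmented: float, w: float) -> float:
    """Linear mixture; w is the weight on the augmented (macro-finance) model."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * p_augmented + (1.0 - w) * p_baseline

# Component readings published in the latest update (as of 2026-04-01).
p_base, p_aug, w = 0.102, 0.061, 0.80

central = hybrid_probability(p_base, p_aug, w)
low_w = hybrid_probability(p_base, p_aug, w - 0.1)   # more trust in the yield curve
high_w = hybrid_probability(p_base, p_aug, w + 0.1)  # more trust in macro-finance

print(f"{central:.1%}  {low_w:.1%}  {high_w:.1%}")  # 6.9%  7.3%  6.5%
```

Because the yield-curve model currently reads higher than the macro-finance model, lowering w raises the hybrid probability and raising w lowers it, which is why the sensitivity range runs from 7.3% down to 6.5%.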

Indicator Dashboard: What inputs look like right now

What you see: the four key inputs standardized as z-scores for visual comparison.

Indicator                     What it captures                   Stress direction
Yield curve spread            10Y-3M Treasury rate difference    Lower (inverted) = more stress
Credit stress (EBP)           Excess bond premium                Higher = more stress
Financial conditions (NFCI)   Chicago Fed index                  Higher = tighter
Labor market (claims)         Initial claims signal              Higher = deteriorating
Note: The augmented model combines the term spread with these macro-finance signals in a single elastic-net. See the Model Design panel for how it relates to the baseline.
[Chart: Indicator Dashboard]
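
As a rough illustration of the z-score standardization used in this panel, the sketch below scales the latest value of a series against the mean and sample standard deviation of the observations before it. The dashboard does not state which reference window it uses, so the past-only window, the function name `zscore_latest`, and the toy numbers are all assumptions.

```python
from statistics import mean, stdev

def zscore_latest(history: list[float]) -> float:
    """Standardize the most recent value against the mean and sample
    standard deviation of the observations that precede it."""
    *past, latest = history
    if len(past) < 2:
        raise ValueError("need at least two past observations")
    sd = stdev(past)
    if sd == 0:
        raise ValueError("past values have zero variance")
    return (latest - mean(past)) / sd

# Made-up spread values in percentage points, for illustration only.
toy_spread = [1.0, 2.0, 3.0, 2.0, 1.0, 3.0, 0.0]
z = zscore_latest(toy_spread)
print(round(z, 2))  # a negative z means the spread is low relative to its past
```

Using only earlier observations keeps the scaling consistent with inputs that must not look ahead of the scoring date. A full-sample z-score would be simpler but would borrow information from the future.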

Reliability Gate (Mixture Weight): Which model the system trusts right now

What this is: The reliability gate outputs w, the weight placed on the macro-finance model when forming the hybrid probability.

How to read: When w is near 0, the system is mostly trusting the yield-curve model. When w is near 1, it is mostly trusting the macro-finance model, which reads the spread alongside credit, financial-conditions, and labor-market signals.

Why it helps: The two models can genuinely disagree — the curve can invert while broader stress stays low, or stress can build while the curve still looks benign. The gate learns, out-of-sample, which read to weight more heavily.

Scale: 0 = yield curve, 1 = macro-finance
Current w: 0.80 • Leaning toward macro-finance signals
[Chart: Reliability Gate (Mixture Weight)]
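
The dashboard does not describe how the gate learns w. Purely as an illustration of what "learning which read has been more reliable out-of-sample" could look like, the sketch below compares the two models' average log loss on past forecasts whose outcomes are already known, then maps the difference to a weight in (0, 1). The scoring rule, the logistic mapping, the `temperature` parameter, and the toy data are all hypothetical.

```python
import math

def log_loss(p: float, y: int) -> float:
    """Negative log-likelihood of one binary outcome, clipped for stability."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def reliability_weight(p_base, p_aug, outcomes, temperature=1.0):
    """HYPOTHETICAL gate: weight on the augmented model, derived from the gap
    in mean past log loss and squashed through a logistic function."""
    n = len(outcomes)
    loss_base = sum(log_loss(p, y) for p, y in zip(p_base, outcomes)) / n
    loss_aug = sum(log_loss(p, y) for p, y in zip(p_aug, outcomes)) / n
    # A lower loss for the augmented model pushes w above 0.5.
    return 1.0 / (1.0 + math.exp(-(loss_base - loss_aug) / temperature))

# Toy history: the baseline warns persistently, the augmented model only late.
outcomes = [0, 0, 0, 1]
p_base_hist = [0.30, 0.30, 0.30, 0.30]
p_aug_hist = [0.05, 0.05, 0.10, 0.40]

w_toy = reliability_weight(p_base_hist, p_aug_hist, outcomes)
print(w_toy > 0.5)  # True: the augmented model scored better on this toy history
```

Any scheme of this kind has to score only forecasts made before their outcomes were known. Otherwise the weight itself would leak future information into the hybrid probability.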

Model Design: What goes into each model, and why it stays simple

Two models: The baseline uses the yield-curve spread as its only input — the classic Estrella-style recession probit. The augmented model is an additive macro-finance specification: the same spread plus five broader stress measures.

Augmented features: SPREAD, CREDIT_Z, CREDIT_CHG3M, NFCI, NFCI_CHG3M, CLAIMS_SIG
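
The two functional forms can be sketched in a few lines. The baseline is a probit in the spread alone, and the augmented model adds its six features on the log-odds scale. Every coefficient and input value below is hypothetical and chosen only so the example runs; the fitted values are not published on this page.

```python
import math

def probit_probability(spread: float, intercept: float, slope: float) -> float:
    """Baseline form: P = Phi(intercept + slope * spread), Phi = standard normal CDF."""
    z = intercept + slope * spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def additive_logit_probability(features: dict[str, float],
                               coefs: dict[str, float],
                               intercept: float) -> float:
    """Augmented form: features enter additively on the log-odds scale."""
    log_odds = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-log_odds))

# HYPOTHETICAL coefficients: a negative slope means a lower spread raises risk.
base_inverted = probit_probability(spread=-0.5, intercept=-1.0, slope=-0.6)
base_steep = probit_probability(spread=2.0, intercept=-1.0, slope=-0.6)

coefs = {"SPREAD": -0.8, "CREDIT_Z": 0.5, "CREDIT_CHG3M": 0.3,
         "NFCI": 0.4, "NFCI_CHG3M": 0.2, "CLAIMS_SIG": 0.6}
calm = dict.fromkeys(coefs, 0.0)                      # all inputs at zero
stressed = {**calm, "SPREAD": -0.5, "CREDIT_Z": 1.5}  # inversion plus credit stress
p_calm = additive_logit_probability(calm, coefs, intercept=-2.5)
p_stressed = additive_logit_probability(stressed, coefs, intercept=-2.5)

print(base_inverted > base_steep, p_stressed > p_calm)  # True True
```

In the additive form each feature shifts the log-odds by a fixed amount regardless of the others, which is exactly what the interaction variant discussed next would have relaxed.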

Why additive — and why not interaction terms: An inverted curve plausibly means different things depending on context, so we tested an explicit interaction version — letting the spread's effect bend with credit stress, financial conditions, and labor markets (SPREAD × CREDIT_Z, and so on). In a controlled out-of-sample validation those interaction terms added no distinguishable value: the elastic-net shrank them toward zero and kept the plain spread as the dominant signal, and the variant that dropped the standalone spread was actually less well calibrated. The deployed model therefore stays additive.

What this buys: The additive model is simpler, better calibrated out-of-sample, and matches the specification in the accompanying paper. The reliability gate still has a meaningful job: the baseline reads the curve alone, the augmented model reads the curve in the company of broader stress, and the gate learns which read has been more trustworthy over time.

Regularization: The elastic-net's L1 penalty shrinks unhelpful coefficients toward zero, so the macro features have to earn their place. All inputs are constructed to be leakage-free as of each scoring date.
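
For reference, a common way to write an elastic-net penalized logistic fit is shown below. The notation is an assumption rather than the deployed specification: y_t is the recession indicator for the forecast window, x_t the feature vector, lambda the overall penalty strength, and alpha the share of the penalty placed on the L1 term. The values used in production are not stated on this page.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{T}\sum_{t=1}^{T}\Big[\, y_t \log p_t + (1-y_t)\log(1-p_t) \Big]
  + \lambda\Big[\, \alpha \lVert \beta \rVert_1
  + \tfrac{1-\alpha}{2}\,\lVert \beta \rVert_2^2 \Big],
\qquad
p_t = \frac{1}{1+e^{-(\beta_0 + x_t^{\top}\beta)}}
```

The L1 term can set weak coefficients exactly to zero, while the L2 term stabilizes the estimates when features such as NFCI and credit stress move together.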

Known Limitations: What this model cannot do

Why this section exists: Transparency about model limitations is as important as the headline probability.

1. Small sample size

The out-of-sample evaluation covers only 3 recession episodes (roughly 28 recession months out of 356 total, a base rate of about 8%). Any statistical analysis of threshold selection, lead times, or calibration is inherently limited by this sample. Results should be treated as suggestive, not definitive.

2. Additive, linear specification

The augmented model combines its features additively on the log-odds scale. It does not capture conditional effects — for example, an inversion meaning something different when credit stress is high versus low. We tested explicit interaction terms (SPREAD × CREDIT_Z, and others) and they added no distinguishable out-of-sample value, so they were left out for parsimony. A tree-based or neural approach could capture richer non-linearities, but at the cost of interpretability and with higher overfitting risk on 3 episodes.

3. Alert threshold interpretation

With only 3 episodes and a low base rate, no single threshold achieves clean separation between true and false alarms. The probability level and direction matter more than whether it crosses a specific line.
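
The trade-off can be illustrated with a toy example. The sketch below counts hits and false alarms at several thresholds on made-up monthly probabilities; the data and the helper name `alarm_rates` are invented for illustration and are not the model's actual history.

```python
def alarm_rates(probs, recession, threshold):
    """Return (hit_rate, false_alarm_rate) for alerts fired when prob >= threshold."""
    hits = sum(p >= threshold and r for p, r in zip(probs, recession))
    false_alarms = sum(p >= threshold and not r for p, r in zip(probs, recession))
    n_rec = sum(recession)
    n_calm = len(recession) - n_rec
    return hits / n_rec, false_alarms / n_calm

# Toy data: 8 months, 2 of which precede a recession (True).
probs = [0.05, 0.10, 0.30, 0.15, 0.40, 0.08, 0.25, 0.12]
recession = [False, False, False, True, True, False, False, False]

for threshold in (0.12, 0.20, 0.35):
    hit_rate, false_alarm_rate = alarm_rates(probs, recession, threshold)
    print(threshold, round(hit_rate, 2), round(false_alarm_rate, 2))
# 0.12 1.0 0.5
# 0.2 0.5 0.33
# 0.35 0.5 0.0
```

In this toy case a low threshold catches both recession months but fires on half of the calm ones, and a high threshold removes the false alarms at the cost of missing a recession month. With so few episodes, small changes in the data can move the apparently best threshold substantially.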

4. What this system is not

This is a monitoring tool based on historical statistical relationships. It does not model causal mechanisms, it cannot anticipate unprecedented shocks (e.g., a pandemic), and it should never be used as the sole basis for financial decisions. It is one input among many.