Recession Probability Monitoring

Overview What this dashboard is measuring

What you are looking at: a recession risk monitor that combines a classic yield-curve signal with broader measures of financial stress.

Why a "regime-aware" approach: the yield curve is historically useful, but on its own it can signal risk for prolonged periods when there is no broad economic stress. This system uses two deliberately distinct models and learns which has been more reliable out-of-sample.

Two models, one gate:

Baseline: a yield-curve-only probit — pure term-spread signal.
Augmented: a macro-finance elastic-net using credit stress, financial conditions, labor market indicators, and interaction terms (SPREAD × CREDIT_Z, SPREAD × NFCI, SPREAD × CLAIMS_SIG). The yield curve enters this model only through its interaction with macro context, so the two models measure genuinely different things.
Reliability gate: a learned mixture weight w that decides how much to trust each model at a given time.
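
As a rough illustration of the baseline, a probit maps the term spread to a probability through the standard normal CDF. The sketch below assumes that textbook form. The coefficients BETA0 and BETA1 are hypothetical placeholders chosen only to show the shape of the relationship; they are not the fitted values behind this dashboard.

```python
from statistics import NormalDist

# Hypothetical coefficients for illustration only; NOT the fitted values.
BETA0 = -1.2   # intercept
BETA1 = -0.6   # slope on the 10Y-3M spread (negative: inversion raises risk)

def probit_probability(spread: float) -> float:
    """P(recession within 12 months) = Phi(BETA0 + BETA1 * spread)."""
    return NormalDist().cdf(BETA0 + BETA1 * spread)

# A steeper curve should map to a lower probability, an inverted one to a higher.
for spread in (2.0, 0.0, -1.0):
    print(f"spread={spread:+.1f} -> p={probit_probability(spread):.3f}")
```

With a negative slope, the probability rises monotonically as the spread falls, which is the only behavior the baseline can express.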

Reading tip: start with the probability chart, then check the mixture weight panel to see which model is driving today's reading.

Recession Probability How risk evolves over time

Why it matters: This chart is the main "risk thermometer." It shows the system's recession probability through time.

How to read: Higher values mean the model sees conditions that historically preceded recessions. Shaded bands mark past recessions. A rising line means risk is increasing; a falling line means risk is easing.

Important: A high probability does not mean a recession is guaranteed, and a low probability does not mean one is impossible. This is a monitoring signal based on history, not a calendar prediction.

Latest update As of 2026-01-01

12-month recession probability: 6.4% (Elevated risk)

Yield-curve model: 11.0%
Macro-finance model: 3.8%
Reliability weight (w): 0.64 (Blending both models)

Sensitivity (w ± 0.1): 7.1% → 5.6%
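
The headline number can be approximately reproduced from the two model readings and w. The sketch below assumes the hybrid is a simple linear mixture with w applied to the macro-finance model, which is consistent with the displayed figures. The function name hybrid_probability is an illustrative choice, not the system's actual code.

```python
def hybrid_probability(w: float, p_yield: float, p_macro: float) -> float:
    """Linear mixture (assumed form): w is the weight on the macro-finance model."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * p_macro + (1.0 - w) * p_yield

# Rounded values as displayed in the latest update, in percent.
P_YIELD, P_MACRO, W = 11.0, 3.8, 0.64

print(round(hybrid_probability(W, P_YIELD, P_MACRO), 1))  # 6.4

# Sensitivity: shift w by 0.1 in each direction.
for delta in (-0.1, 0.1):
    shifted = hybrid_probability(W + delta, P_YIELD, P_MACRO)
    print(f"w={W + delta:.2f} -> {shifted:.2f}%")
```

Lowering w moves the reading toward the yield-curve model (higher today), and raising w moves it toward the macro-finance model (lower today). Because the inputs here are the rounded figures shown on the panel, the shifted results may differ from the displayed sensitivity band in the last digit.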

Indicator Dashboard What inputs look like right now

What you see: the four key inputs standardized as z-scores for visual comparison.

Indicator	What it captures	Stress direction
Yield curve spread	10Y-3M Treasury rate difference	Lower (inverted) = more stress
Credit stress (EBP)	Excess bond premium	Higher = more stress
Financial conditions (NFCI)	Chicago Fed index	Higher = tighter
Labor market (claims)	Initial claims signal	Higher = deteriorating
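
For readers unfamiliar with z-scores, the sketch below shows one way an indicator could be standardized against its own history. The reference window (the full list passed in) and the use of the sample standard deviation are assumptions, since the dashboard does not state how its z-scores are computed. The input numbers are made up.

```python
from statistics import mean, stdev

def z_score(value: float, history: list[float]) -> float:
    """Standardize `value` against the mean and sample std of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two past observations")
    sigma = stdev(history)
    if sigma == 0:
        raise ValueError("history has zero variance")
    return (value - mean(history)) / sigma

# Made-up spread history (percentage points), for illustration only.
past_spreads = [1.5, 2.0, 1.0, 0.5, 2.5]
print(f"{z_score(-0.5, past_spreads):.2f}")
```

A z-score puts indicators with different units on a common scale, so a reading of +2 means "two standard deviations above its own typical level" for any of the four inputs.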

Note: The augmented model also uses interaction terms (SPREAD × CREDIT_Z, SPREAD × NFCI, SPREAD × CLAIMS_SIG) so the yield curve enters the macro model only through its relationship with broader stress. See the Model Design panel for details.

Reliability Gate (Mixture Weight) Which model the system trusts right now

What this is: The reliability gate outputs w, the weight placed on the macro-finance model when forming the hybrid probability.

How to read: When w is near 0, the system is mostly trusting the yield-curve model. When w is near 1, it is mostly trusting the macro-finance model (credit stress, financial conditions, and labor-market signals).

Why it helps: Because the augmented model does not contain the yield curve as a standalone feature, the two models carry largely separate information. The gate can therefore meaningfully arbitrate between "yield curve says danger" and "macro says calm", or vice versa.

Scale: 0 = Yield curve, 1 = Macro-finance

Current w: 0.64 (Blending both models)

Model Design How the two models stay independent

Key design choice: The baseline model uses the yield-curve spread as its only input. The augmented model deliberately excludes the spread as a standalone feature. Instead, the spread enters the augmented model only through interaction terms:

Augmented features: CREDIT_Z, CREDIT_CHG3M, NFCI, NFCI_CHG3M, CLAIMS_SIG,
SPREAD × CREDIT_Z, SPREAD × NFCI, SPREAD × CLAIMS_SIG
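
The feature list above can be sketched as a small builder function. This is a minimal illustration, assuming each input is already a leakage-free scalar for a single month. The function name and dictionary keys are hypothetical, and the example inputs are made up.

```python
def augmented_features(spread, credit_z, credit_chg3m, nfci, nfci_chg3m, claims_sig):
    """Build the eight augmented-model inputs; SPREAD appears only inside products."""
    return {
        "CREDIT_Z": credit_z,
        "CREDIT_CHG3M": credit_chg3m,
        "NFCI": nfci,
        "NFCI_CHG3M": nfci_chg3m,
        "CLAIMS_SIG": claims_sig,
        # Interaction terms: the only route by which the spread enters.
        "SPREAD_x_CREDIT_Z": spread * credit_z,
        "SPREAD_x_NFCI": spread * nfci,
        "SPREAD_x_CLAIMS_SIG": spread * claims_sig,
    }

# Made-up inputs for one month, for illustration only.
row = augmented_features(spread=-0.5, credit_z=1.0, credit_chg3m=0.2,
                         nfci=0.4, nfci_chg3m=0.1, claims_sig=0.8)
print(sorted(row))
```

Note that there is no standalone "SPREAD" key: if every stress input is zero, the spread has no effect on the augmented model at all.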

Why this matters: An inverted yield curve means different things depending on context. When the curve inverts alongside rising credit stress and tightening financial conditions, that historically signals genuine recession risk. When the curve inverts but credit, financial conditions, and labor markets are all calm, the inversion is less meaningful.

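By making the spread conditional on macro context, the augmented model can distinguish between these scenarios. The interaction terms capture this: SPREAD × CREDIT_Z is large and negative when the curve is inverted and credit stress is high, near zero when the curve is inverted but credit is calm, and positive when credit stress is high but the curve is steep. In that last case the alarm comes from the standalone stress features rather than from the interaction.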

Model independence: Because the two models measure genuinely different things (pure term spread vs. macro-financial stress with conditional spread effects), the reliability gate can meaningfully arbitrate between them. When they disagree, that disagreement carries real information.

Scenario	SPREAD	CREDIT_Z	SPREAD×CREDIT_Z	Model behavior
Inversion + stress (e.g., 2000 H2)	−0.42	+4.2	−1.76	Augmented model raises alarm (interaction amplifies)
Inversion + calm (e.g., 2022–23)	−1.50	−0.2	+0.30	Augmented model stays calm (interaction dampens)
Steep curve + stress (e.g., 2008 GFC)	+2.50	+4.6	+11.50	Augmented model raises alarm (macro stress dominates despite safe curve)

Regularization: The elastic-net's L1 penalty can shrink interaction coefficients to exactly zero if they are unhelpful, which limits (but does not eliminate) the risk of overfitting from adding them. Products of leakage-free features remain leakage-free.
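
For reference, one common way to write an elastic-net objective for a binary outcome is shown below. This is the glmnet-style parametrization, and the dashboard does not state which parametrization or link function it uses, so the symbols (λ for overall penalty strength, α for the L1/L2 mix, ℓ for the per-observation log-likelihood) are assumptions.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  -\frac{1}{n}\sum_{t=1}^{n} \ell\!\left(y_t,\; x_t^{\top}\beta\right)
  \;+\; \lambda\left[\,\alpha\,\lVert\beta\rVert_1
  \;+\; \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right]
```

The L1 term is what can set a coefficient exactly to zero, while the L2 term stabilizes estimates when features are correlated, as the interaction terms and their components are likely to be.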

Known Limitations What this model cannot do

Why this section exists: Transparency about model limitations is as important as the headline probability.

1. Small sample size

The out-of-sample evaluation covers only 3 recession episodes (roughly 28 recession months out of 356 total). Any statistical analysis of threshold selection, lead times, or calibration is inherently limited by this sample. Results should be treated as suggestive, not definitive.
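
The base rate implied by these counts is easy to check. The snippet below uses the approximate figures quoted above, so the results are themselves approximate.

```python
recession_months = 28   # "roughly 28" per the text above
total_months = 356
episodes = 3

base_rate = recession_months / total_months
months_per_episode = recession_months / episodes

print(f"base rate = {base_rate:.1%}")                       # base rate = 7.9%
print(f"months per episode = {months_per_episode:.1f}")     # months per episode = 9.3
```

With fewer than one month in twelve labeled as recession, and those months clustered into only three episodes, the effective number of independent events is far smaller than the month count suggests.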

2. Interaction terms are hand-selected

The interaction features (SPREAD × CREDIT_Z, SPREAD × NFCI, SPREAD × CLAIMS_SIG) were chosen based on domain knowledge. Other interactions (e.g., CREDIT_Z × NFCI) might also be informative but were left out to keep the model parsimonious. The elastic-net can zero out unhelpful terms, but it cannot discover interactions we did not provide.

3. Linear interactions only

Products capture linear interaction effects. Higher-order non-linearities (e.g., threshold effects where "SPREAD < −1 AND CREDIT_Z < 0") are not modeled. A tree-based or neural approach could capture these, but at the cost of interpretability and with higher risk of overfitting on 3 episodes.
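
To make the distinction concrete, the sketch below contrasts a product feature with the kind of threshold flag described above. The threshold feature is hypothetical and is not part of the model; it is shown only to illustrate what a product term cannot represent.

```python
def product_feature(spread: float, credit_z: float) -> float:
    """The kind of interaction the augmented model uses: a smooth product."""
    return spread * credit_z

def threshold_feature(spread: float, credit_z: float) -> int:
    """Hypothetical regime flag from the example above; NOT in the model."""
    return int(spread < -1.0 and credit_z < 0.0)

# Made-up points on either side of the thresholds.
for spread, credit_z in [(-1.5, -0.2), (-0.9, -0.2), (-1.5, 0.1)]:
    print(spread, credit_z,
          round(product_feature(spread, credit_z), 2),
          threshold_feature(spread, credit_z))
```

The product changes gradually as either input moves, whereas the flag switches abruptly when a threshold is crossed. A model built only on products cannot reproduce that kind of jump.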

4. Alert threshold interpretation

With only 3 episodes and a low base rate, no single threshold achieves clean separation between true and false alarms. The probability level and direction matter more than whether it crosses a specific line.

5. What this system is not

This is a monitoring tool based on historical statistical relationships. It does not model causal mechanisms, it cannot anticipate unprecedented shocks (e.g., a pandemic), and it should never be used as the sole basis for financial decisions. It is one input among many.