Overview: What this dashboard is measuring
What you are looking at: a recession risk monitor that combines a classic yield-curve signal with broader measures of financial stress.
Why a "regime-aware" approach: the yield curve is historically useful, but on its own it can send prolonged
warnings without broad economic stress. This system pairs the classic curve signal with a broader macro-finance model and
learns which has been more reliable out-of-sample.
Two models, one gate:
- Baseline: a yield-curve-only probit — the classic, pure term-spread signal.
- Augmented: a macro-finance elastic-net using the term spread together with credit stress,
financial conditions, and labor-market indicators — a broader read on whether the curve signal is corroborated by real stress.
- Reliability gate: a learned mixture weight w that decides how much to trust each model at a given time.
Reading tip: start with the probability chart, then check the mixture weight panel to see which model is driving today's reading.
Recession Probability: How risk evolves over time
Why it matters: This chart is the main "risk thermometer." It shows the system's recession probability through time.
How to read: Higher values mean the model sees conditions that historically preceded recessions.
Shaded bands mark past recessions. A rising line means risk is increasing; a falling line means risk is easing.
Important: A high probability does not mean a recession is guaranteed, and a low probability does not mean one is impossible.
This is a monitoring signal based on history, not a calendar prediction.
Latest update (as of 2026-04-01)
12-month recession probability: 6.9% (Elevated risk)
Yield-curve model: 10.2%
Macro-finance model: 6.1%
Reliability weight (w): 0.80 (Leaning toward macro-finance signals)
Sensitivity (w ± 0.1): 7.3% → 6.5%
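The figures in this panel are consistent with a simple linear mixture of the two model probabilities, with w as the weight on the macro-finance model. The sketch below assumes that form; the function name `hybrid_probability` is illustrative and not taken from the dashboard's code. It reproduces the headline reading and the sensitivity range from the displayed inputs.

```python
def hybrid_probability(p_curve: float, p_macro: float, w: float) -> float:
    """Blend two probabilities; w is the weight on the macro-finance model.

    Assumed form (not confirmed by the source): w * p_macro + (1 - w) * p_curve.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * p_macro + (1.0 - w) * p_curve


# Values shown in the "Latest update" panel.
P_CURVE = 0.102  # yield-curve model
P_MACRO = 0.061  # macro-finance model
W = 0.80         # reliability weight

# Headline reading plus the w +/- 0.1 sensitivity check.
for w in (W - 0.1, W, W + 0.1):
    print(f"w = {w:.2f}: {hybrid_probability(P_CURVE, P_MACRO, w):.1%}")
```

Because the macro-finance probability is currently the lower of the two, raising w pulls the hybrid reading down and lowering w pulls it up.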
Reliability Gate (Mixture Weight): Which model the system trusts right now
What this is: The reliability gate outputs w, the weight placed on the macro-finance model when forming the hybrid probability.
How to read: When w is near 0, the system is mostly trusting the yield curve alone. When w is near 1, it is mostly trusting the macro-finance model, which reads the spread alongside credit, financial-conditions, and labor-market stress signals.
Why it helps: The two models can genuinely disagree — the curve can invert while broader stress stays
low, or stress can build while the curve still looks benign. The gate learns, out-of-sample, which read to weight more heavily.
0 = Yield curve
1 = Macro-finance
Current w: 0.80 • Leaning toward macro-finance signals
Model Design: What goes into each model, and why it stays simple
Two models: The baseline uses the yield-curve spread as its only input — the classic
Estrella-style recession probit. The augmented model is an additive macro-finance specification: the same spread
plus five broader stress measures.
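A single-input probit maps the spread to a probability through the standard normal CDF. The sketch below shows only that functional form; the intercept and slope are hypothetical placeholders chosen for illustration, not the dashboard's fitted coefficients, and the function names are assumptions.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def probit_probability(spread: float, intercept: float, slope: float) -> float:
    """Recession probability from a one-variable probit: Phi(intercept + slope * spread)."""
    return normal_cdf(intercept + slope * spread)


# Hypothetical coefficients, for illustration only (not the fitted model).
INTERCEPT = -1.0
SLOPE = -0.7  # negative: a flatter or inverted curve raises the probability

for spread in (2.0, 0.0, -1.0):  # term spread in percentage points
    p = probit_probability(spread, INTERCEPT, SLOPE)
    print(f"spread = {spread:+.1f} pp -> probability = {p:.3f}")
```

With a negative slope, the probability rises monotonically as the spread falls, which is the behavior the baseline relies on.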
Augmented features: SPREAD, CREDIT_Z, CREDIT_CHG3M, NFCI, NFCI_CHG3M, CLAIMS_SIG
Why additive, and why not interaction terms: An inverted curve plausibly means different
things depending on context, so we tested an explicit interaction version that lets the spread's effect vary
with credit stress, financial conditions, and labor markets (SPREAD × CREDIT_Z, and so on). In a controlled
out-of-sample validation, those interaction terms added no distinguishable value: the elastic-net shrank them toward
zero and kept the plain spread as the dominant signal. The variant that dropped the standalone spread was
less well calibrated. The deployed model therefore stays additive.
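The difference between the two specifications can be made concrete on the log-odds scale. The sketch below uses only two of the six features and hypothetical coefficients. It shows that in an additive model a given move in the spread shifts the log-odds by the same amount whatever the level of credit stress, whereas an interaction term makes that shift depend on CREDIT_Z.

```python
def additive_log_odds(spread, credit_z, b0, b_spread, b_credit):
    """Additive specification: each feature contributes independently."""
    return b0 + b_spread * spread + b_credit * credit_z


def interaction_log_odds(spread, credit_z, b0, b_spread, b_credit, b_inter):
    """Same model plus a SPREAD x CREDIT_Z interaction term."""
    return additive_log_odds(spread, credit_z, b0, b_spread, b_credit) \
        + b_inter * spread * credit_z


# Hypothetical coefficients, for illustration only.
B0, B_SPREAD, B_CREDIT, B_INTER = -2.5, -0.8, 0.6, -0.3


def inversion_effect(log_odds_fn, credit_z, *coefs):
    """Change in log-odds when the spread moves from +1.0 to -1.0."""
    return log_odds_fn(-1.0, credit_z, *coefs) - log_odds_fn(1.0, credit_z, *coefs)


for credit_z in (-1.0, 2.0):  # calm versus stressed credit conditions
    add = inversion_effect(additive_log_odds, credit_z, B0, B_SPREAD, B_CREDIT)
    inter = inversion_effect(interaction_log_odds, credit_z,
                             B0, B_SPREAD, B_CREDIT, B_INTER)
    print(f"CREDIT_Z = {credit_z:+.1f}: additive {add:+.2f}, interaction {inter:+.2f}")
```

When the interaction coefficient is shrunk to zero, the second function collapses to the first, which is the outcome the validation described above reports.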
What this buys: The additive model is simpler, better calibrated out-of-sample, and matches the
specification in the accompanying paper. The reliability gate still has a meaningful job: the baseline reads the
curve alone, the augmented model reads the curve in the company of broader stress, and the gate learns which read
has been more trustworthy over time.
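Regularization: The elastic-net's L1 penalty shrinks unhelpful coefficients toward zero and can set them
exactly to zero, while its L2 penalty stabilizes estimates when features are correlated, so the macro features
have to earn their place. All inputs are constructed to be leakage-free, meaning each uses only data available
as of its scoring date.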
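The shrinkage described above can be illustrated with the elastic-net proximal step, the building block that proximal-gradient solvers apply to each coefficient. This is a generic sketch, not the dashboard's fitting code; the function name, penalty strengths, and example coefficients are all hypothetical.

```python
def elastic_net_prox(beta: float, step: float, l1: float, l2: float) -> float:
    """Proximal step for the penalty l1 * |b| + (l2 / 2) * b**2.

    The L1 part soft-thresholds (small coefficients become exactly zero);
    the L2 part rescales whatever survives toward zero.
    """
    shrunk = max(abs(beta) - step * l1, 0.0)
    if shrunk == 0.0:
        return 0.0
    sign = 1.0 if beta > 0 else -1.0
    return sign * shrunk / (1.0 + step * l2)


# Hypothetical pre-shrinkage coefficients, for illustration only.
raw = {"SPREAD": -0.90, "CREDIT_Z": 0.40, "NFCI_CHG3M": 0.05, "CLAIMS_SIG": -0.02}

for name, beta in raw.items():
    shrunk = elastic_net_prox(beta, step=1.0, l1=0.1, l2=0.5)
    print(f"{name:>11}: {beta:+.2f} -> {shrunk:+.3f}")
```

Coefficients whose magnitude is below `step * l1` are zeroed outright, which is the sense in which a weak feature fails to earn its place.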
Known Limitations: What this model cannot do
Why this section exists: Transparency about model limitations is as important as
the headline probability.
1. Small sample size
The out-of-sample evaluation covers only 3 recession episodes (roughly 28 recession months
out of 356 total, a base rate of about 8%). Any statistical analysis of threshold selection, lead times, or
calibration is inherently limited by this sample. Results should be treated as suggestive, not definitive.
2. Additive, linear specification
The augmented model combines its features additively on the log-odds scale. It does not capture
conditional effects — for example, an inversion meaning something different when credit stress is high
versus low. We tested explicit interaction terms (SPREAD × CREDIT_Z, and others) and they added no
distinguishable out-of-sample value, so they were left out for parsimony. A tree-based or neural approach
could capture richer non-linearities, but at the cost of interpretability and with a higher risk of overfitting
to only 3 episodes.
3. Alert threshold interpretation
With only 3 episodes and a low base rate, no single threshold achieves clean separation between true
and false alarms. The probability level and direction matter more than whether it crosses a specific line.
4. What this system is not
This is a monitoring tool based on historical statistical relationships. It does not model causal mechanisms,
it cannot anticipate unprecedented shocks (e.g., a pandemic), and it should never be used as the sole basis
for financial decisions. It is one input among many.
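The sample-size and threshold points in items 1 and 3 can be illustrated with a few lines of arithmetic. The base rate below uses the counts quoted in item 1. The probability series and labels are made-up toy values, not model output, and `alarm_counts` is a hypothetical helper; the sketch only shows how moving a threshold trades false alarms against misses.

```python
def alarm_counts(probs, labels, threshold):
    """Return (hits, false_alarms, misses) for alarms raised at p >= threshold."""
    hits = false_alarms = misses = 0
    for p, y in zip(probs, labels):
        alarm = p >= threshold
        if alarm and y:
            hits += 1
        elif alarm:
            false_alarms += 1
        elif y:
            misses += 1
    return hits, false_alarms, misses


# Base rate from the counts quoted in item 1.
base_rate = 28 / 356
print(f"base rate: {base_rate:.1%}")

# Toy data, invented for illustration only (1 = recession month).
toy_probs = [0.05, 0.12, 0.30, 0.45, 0.20, 0.08]
toy_labels = [0, 0, 0, 1, 1, 0]

for threshold in (0.10, 0.25, 0.40):
    hits, false_alarms, misses = alarm_counts(toy_probs, toy_labels, threshold)
    print(f"threshold {threshold:.2f}: hits={hits}, "
          f"false alarms={false_alarms}, misses={misses}")
```

In the toy data no threshold removes both false alarms and misses, which mirrors why the level and direction of the probability are more informative than any single cutoff.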