Methodology

How we test the framework

01

Train on pre-crisis data

Strategies are trained and optimized on data that ends before the crisis begins. For the Global Financial Crisis (GFC) study, training uses data from 2004 through 2007.
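A minimal sketch of this train/validate split in pandas. The column names, the random returns, and the exact crisis cutoff date are assumptions for illustration; the study only specifies a 2004-2007 training window.

```python
import numpy as np
import pandas as pd

# Hypothetical daily strategy returns spanning 2004-2009 (illustrative only).
dates = pd.date_range("2004-01-01", "2009-12-31", freq="B")
rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.normal(0.0, 0.01, (len(dates), 3)),
                       index=dates, columns=["strat_a", "strat_b", "strat_c"])

# Assumed crisis cutoff: everything before it is training, the rest validation.
CRISIS_START = pd.Timestamp("2007-08-01")
train = returns.loc[returns.index < CRISIS_START]
crisis = returns.loc[returns.index >= CRISIS_START]

# Training data must end strictly before the crisis window begins.
assert train.index.max() < crisis.index.min()
```

The strict inequality is the point: no observation from the crisis period may leak into training or parameter tuning.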

02

Validate against the crisis

Run the full seven-layer pipeline on the crisis period. Measure composite scores, the probability of backtest overfitting (PBO), and crisis-period Sharpe ratios.

03

Measure forward performance

Compare validation scores to realized post-crisis returns. The Spearman rank correlation between the two tells us whether the validation rankings predict real outcomes.
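The comparison in step 03 reduces to one call. The scores and returns below are made-up numbers chosen so the two rankings almost, but not exactly, agree:

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical validation scores and realized post-crisis returns per strategy.
validation_scores = np.array([0.9, 0.8, 0.7, 0.4, 0.2, 0.1])
forward_returns   = np.array([0.12, 0.05, 0.09, 0.01, -0.03, -0.06])

# The second- and third-ranked strategies swap order out of sample,
# so the rank correlation is high but below 1.
rho, pvalue = spearmanr(validation_scores, forward_returns)
print(f"Spearman rho = {rho:.3f}")  # rho ≈ 0.943 for this toy data
```

Because Spearman works on ranks, it asks only whether better-scored strategies ended up earning more, not whether the scores predicted return magnitudes.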

Results

Key findings across all studies

45.7 — Max discrimination gap (good vs. bad strategies)
0.706 — Max forward correlation (Spearman rho)
61 — Total strategies tested
30+ — Years of crisis data covered

A forward correlation of 0.706 means that for roughly three-quarters of strategy pairs, the validation ranking correctly predicted which strategy would outperform in the years following each crisis.
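The "three-quarters of pairs" reading can be sanity-checked with a back-of-envelope conversion. Assuming roughly elliptical (bivariate-normal-like) dependence, correlation maps to Kendall's tau via tau = (2/pi)·arcsin(rho), and the fraction of concordant pairs is (1 + tau)/2; this is an approximation, not part of the study's own methodology:

```python
import math

rho = 0.706  # reported max Spearman rho

# Bivariate-normal approximation linking correlation to Kendall's tau.
tau = (2 / math.pi) * math.asin(rho)

# Kendall's tau is (concordant - discordant) / total pairs,
# so the concordant share is (1 + tau) / 2.
concordant_fraction = (1 + tau) / 2
print(f"tau ≈ {tau:.3f}, concordant pairs ≈ {concordant_fraction:.1%}")
```

The result lands near 75%, which is where the "roughly three-quarters" figure comes from.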