Weekly Iron Condor (SPY) Backtest: 9/11 Gates (2026-06-15)

#	Gate	Result	Pass
01	Minimum sample	442 condor weeks	✓
02	Profit factor ≥ 1.20	PF 1.652	✓
03	Sharpe ≥ 0.6	Sharpe 0.86	✓
04	Max drawdown ≤ 12%	MaxDD -70.4%	✗
05	Positive ≥ 60% of periods	79% months positive	✓
06	Bootstrap LB Sharpe > 0	95% LB Sharpe 0.06	✓
07	Placebo beats p95	real PF 1.652 vs placebo p95 1.345	✓
08	2× cost stress PF > 1.0	2x-cost PF 1.418	✓
09	Deflated Sharpe positive	SR_hat 0.86 vs SR0 0.00, DSR=0.958	✓
10	No component > 40%	max share 86% (call 86%/put 14%, year 25%)	✗
11	Walk-forward OOS ≥ 0.9× IS	OOS PF 1.886 vs 0.9*IS 1.341 (IS PF 1.491)	✓

Date: 2026-06-25
Archetype: Defined-risk short-vol. Sell weekly ~0.10-delta call + ~0.10-delta put on SPY, with 30-pt protective wings (4-leg iron condor).
Window: 2018-01-02 → 2026-06-24 (442 condor weeks)
Capital base: $3,000 = one condor’s defined risk (wing width 30 × 100 multiplier). Returns reported as % of this fixed base.
Bench: Python BS-synthetic 4-leg pricer (free data only, VIX-as-IV); realized P&L settled on the actual SPY path to expiry, each side wing-capped.

VERDICT (read this first)

VARIANT RETIRED — the edge is REAL but the position sizing is uninvestable. 9 of 11 gates pass. The kill-shot placebo (g7) passes decisively (real PF 1.652 vs placebo p95 1.345; only 0.5% of 200 sign-permuted placebos beat real) — the variance-risk-premium harvest is genuinely distinguishable from a random sign sequence. Per pre-registered verdict logic (g7 passes, other gates fail) ⇒ variant retired + archetype-reconsideration note, NOT “retire archetype.”

The base config fails on g4 (max drawdown −70.4%, ceiling 12%) and g10 (component concentration 86%, ceiling 40%).

g4 is the real, fatal failure. Sized to a single $3,000 condor, ONE bad week loses up to ~$2,900 (97% of base). The worst week (2025-03-31, the Liberation-Day tariff crash) lost −$2,869 = −95.6% of the fixed base in five trading days. Early in the series — before any profit cushion exists — a late-2018 losing cluster drove equity from a $3,107 peak to $919, a −70.4% drawdown. In dollar terms the peak-to-trough loss reached −$3,902 = −130% of the entire collateral base. The wing caps each week, but the cap is nearly the whole account. This is the honest tail of a single defined-risk condor: the credit (~3.5% of wing) is tiny relative to the capped-but-large loss (97% of wing), so 93% small wins do not buy back the occasional near-total-loss week against a one-unit base.
g10 (86% call-side) is a weak/methodological fail. Call and put credits are near-identical ($0.515 vs $0.531/share) and breaches balanced (17 call, 20 put). The 86% figure is an artifact of the P&L attribution split (residual-cost allocation on big put-loss weeks), not a real economic imbalance. The economically meaningful concentration by calendar year is 25% — passes cleanly.

So the substantive killer is g4, and it fails by a mile, not a hair — the opposite of the VRP short-put sibling (which missed g4 by 0.1pp). The wings cap the tail in percent-of-notional terms, but because the base is one condor, a single capped loss is still catastrophic to the account.
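The g4 numbers can be reproduced from an equity curve with a running-peak calculation. The sketch below is a minimal illustration, not the report's gates.py: the function name and the toy equity path are hypothetical, and only the $3,107 peak and $919 trough are taken from the text above. It returns the worst percentage drawdown and the worst dollar drawdown separately, since the two need not come from the same episode.

```python
def max_drawdown(equity):
    """Return (worst fractional drawdown, worst dollar drawdown) from the running peak.

    Assumes every equity value is positive, so dividing by the peak is safe.
    """
    peak = equity[0]
    worst_pct = 0.0
    worst_usd = 0.0
    for value in equity:
        peak = max(peak, value)                 # highest equity seen so far
        worst_usd = min(worst_usd, value - peak)
        worst_pct = min(worst_pct, (value - peak) / peak)
    return worst_pct, worst_usd


# Toy path: only the 3107 peak and 919 trough come from the report;
# the remaining points are made up for illustration.
equity = [3000.0, 3107.0, 2400.0, 919.0, 1500.0]
pct, usd = max_drawdown(equity)
print(f"max drawdown: {pct:.1%} of peak, {usd:,.0f} dollars")
```

Measured this way, the drop from $3,107 to $919 is about −70.4% of the peak, which is the figure g4 compares against the 12% ceiling.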

1. Gate table — 9/11 PASS

Gate	Threshold	Result	Pass
g1 trades	≥100	442	PASS
g2 PF net	≥1.20	1.652	PASS
g3 Sharpe	≥0.6	0.86 (weekly, √52)	PASS
g4 Max DD	≤12%	−70.4% (−$3,902 = −130% of base in $)	FAIL
g5 months positive	≥60%	79%	PASS
g6 bootstrap LB Sharpe	>0	+0.06 (2.5th pct)	PASS
g7 placebo (KILL SHOT)	real PF > p95	1.652 vs p95 1.345 (0.5% of placebos beat real)	PASS
g8 2× cost PF	>1.0	1.418	PASS
g9 DSR	SR_hat > SR0	0.86 vs 0.00, DSR prob 0.958	PASS
g10 concentration	≤40%	86% call-side (attribution artifact); year share 25%	FAIL
g11 walk-forward	OOS PF ≥0.9×IS	OOS 1.886 vs 0.9×IS 1.341 (IS 1.491)	PASS
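The g7 placebo row compares the real profit factor with the distribution of profit factors from sign-scrambled versions of the same weekly P&L. The report does not spell out the exact permutation scheme, so the sketch below assumes one possible reading: each placebo multiplies every week's P&L by an independent random ±1. The function names, the seed, and the synthetic P&L series are all hypothetical.

```python
import random


def profit_factor(pnl):
    """Gross gains divided by gross losses (infinite if there are no losses)."""
    gains = sum(x for x in pnl if x > 0)
    losses = -sum(x for x in pnl if x < 0)
    return float("inf") if losses == 0 else gains / losses


def placebo_test(pnl, n_placebos=200, seed=0):
    """Return (real PF, placebo 95th-percentile PF, share of placebos >= real PF).

    ASSUMPTION: a placebo is the real P&L with an independent random sign
    on each week; the report's actual scheme may differ.
    """
    rng = random.Random(seed)
    real_pf = profit_factor(pnl)
    placebo_pfs = sorted(
        profit_factor([x * rng.choice((-1.0, 1.0)) for x in pnl])
        for _ in range(n_placebos)
    )
    p95 = placebo_pfs[int(0.95 * (n_placebos - 1))]  # simple rank-based percentile
    beat_rate = sum(pf >= real_pf for pf in placebo_pfs) / n_placebos
    return real_pf, p95, beat_rate


# Synthetic series (not the report's data): many small wins, a few large losses.
weekly_pnl = [100.0] * 93 + [-800.0] * 7
real_pf, p95, beat_rate = placebo_test(weekly_pnl)
print(f"real PF {real_pf:.3f}, placebo p95 {p95:.3f}, beat rate {beat_rate:.1%}")
```

Under this reading the gate passes when the real profit factor exceeds the placebo 95th percentile, which is the "real PF > p95" threshold in the table.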

2. Headline metrics

Metric	Value
Condor weeks	442
Profit factor (net)	1.652
Sharpe (weekly, ann.)	0.86
Win rate	93%
Total return on $3,000 base	+402% over 8.4y
Max drawdown (% of equity peak)	−70.4%
Max drawdown ($)	−$3,902 (−130% of base)
Avg net credit / condor	$104.6 (3.5% of wing)
ITM (breached) weeks	37 of 442
Early-closed (50% profit)	265
Weekly skew / kurtosis	−7.0 / 69.4 (fat left tail, as expected)
2× cost PF	1.418 (edge survives cost stress)
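The "weekly, ann." Sharpe in the table is a weekly ratio scaled by √52. A minimal sketch of that convention follows, assuming returns are expressed as a fraction of the fixed $3,000 base, a zero risk-free rate, and the sample standard deviation; none of these choices is confirmed by the report, and the function name is hypothetical.

```python
import math
import statistics


def annualized_sharpe(weekly_returns, periods_per_year=52):
    """Mean over sample stdev of weekly returns, scaled by sqrt(periods per year).

    ASSUMPTION: risk-free rate taken as zero; the report may net out financing.
    """
    mean = statistics.mean(weekly_returns)
    stdev = statistics.stdev(weekly_returns)  # sample (n - 1) standard deviation
    return mean / stdev * math.sqrt(periods_per_year)


# Converting weekly dollar P&L to returns on the fixed base (values are made up).
BASE_USD = 3000.0
weekly_pnl_usd = [105.0, 98.0, -400.0, 110.0, 102.0]
weekly_returns = [p / BASE_USD for p in weekly_pnl_usd]
print(round(annualized_sharpe(weekly_returns), 2))
```

Because the base is fixed rather than compounding, the Sharpe ratio is unchanged by the choice of base; the base only matters for the drawdown and total-return figures.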

3. TAIL section (the signature failure)

A short condor’s max loss per week is capped by the wings at W − credit ≈ $3,000 − $105 ≈ $2,895. That cap held in every crash, but the cap is the problem when sized to one condor:

Event	Entry week	P&L	% of $3,000 base
Apr-2025 tariff crash (worst)	2025-03-31	−$2,869	−95.6% (near full defined-risk loss)
2022 mid-year selloff	2022-06-06	−$1,607	−53.6%
COVID initial gap	2020-02-24	−$1,029	−34.3%
2022 rate shock	2022-01-18	−$996	−33.2%
Volmageddon (entry 2018-01-29)	2018-01-29	−$530	−17.7%

Wing-cap vs the uncapped short-put sibling. This is the one place the condor helps: in Apr-2025 SPY fell ~11% in a week, blowing clean through the short put to the wing — the loss was hard-capped at the 30-pt spread (−$2,869). A naked/cash-secured short put with no wing would have lost the full intrinsic with no ceiling. So in percent-of-notional the condor’s tail is genuinely bounded (VRP sibling hit −12.1% peak DD on a $31k collateral base; this condor hit −70% only because its base is 10× smaller — one condor, not a fully-funded book).
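The difference between the wing-capped condor and an uncapped short put can be sketched with an expiry payoff function. This is an illustration only: the strikes and spot levels are hypothetical, costs and early closes are ignored, and the credits ($1.046 per share for the condor, $0.531 for the put) are the averages quoted elsewhere in this report rather than the credits of any specific week.

```python
MULTIPLIER = 100  # shares per option contract


def condor_pnl(spot_at_expiry, short_put, short_call, wing_width, net_credit):
    """Dollar P&L of one short iron condor held to expiry (costs ignored)."""
    # Each side loses the short option's intrinsic value, capped at the wing width.
    put_loss = min(max(short_put - spot_at_expiry, 0.0), wing_width)
    call_loss = min(max(spot_at_expiry - short_call, 0.0), wing_width)
    return (net_credit - put_loss - call_loss) * MULTIPLIER


def naked_put_pnl(spot_at_expiry, short_put, credit):
    """Dollar P&L of one short put with no protective wing (costs ignored)."""
    return (credit - max(short_put - spot_at_expiry, 0.0)) * MULTIPLIER


# Hypothetical week: entry at 500, short put 480, short call 520, spot falls 11% to 445.
condor = condor_pnl(445.0, short_put=480.0, short_call=520.0,
                    wing_width=30.0, net_credit=1.046)
naked = naked_put_pnl(445.0, short_put=480.0, credit=0.531)
print(f"condor: {condor:,.1f}  naked put: {naked:,.1f}")
```

In this toy week the condor's loss stops at the 30-point wing while the naked put keeps losing point for point below the strike, which is the structural contrast the paragraph above describes.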

But the wing cap does not save the account when sized to one unit. The credit is ~3.5% of the wing; a single breach loses ~97% of the wing. You need ~28 clean weeks to earn back one full-loss week. Strings of losses early (late-2018) compound to −70% before a cushion forms. Skew −7.0 / kurtosis 69.4 confirm the fat left tail is in the data, not smoothed.
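The "~28 clean weeks" figure follows directly from the average credit and the wing width. A short worked calculation, using only numbers stated in this report (30-pt wing, 100 multiplier, $104.6 average net credit) and assuming a clean week keeps the full average credit:

```python
WING_WIDTH_USD = 30 * 100      # 30-point wing times 100 multiplier
AVG_NET_CREDIT_USD = 104.6     # average net credit per condor, from the metrics table


def weeks_to_recover(wing_width_usd, credit_usd):
    """Full-credit weeks needed to earn back one full defined-risk loss."""
    max_loss = wing_width_usd - credit_usd
    return max_loss / credit_usd


max_loss_usd = WING_WIDTH_USD - AVG_NET_CREDIT_USD
recovery_weeks = weeks_to_recover(WING_WIDTH_USD, AVG_NET_CREDIT_USD)
print(f"max loss {max_loss_usd:,.1f} dollars, about {recovery_weeks:.1f} full-credit weeks to recover")
```

Weeks closed early at 50% profit keep less than the full credit, so the recovery time in practice would be longer than this idealized count.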

4. Why it fails — and why that’s honest

Not cost: PF stays 1.42 at 2× modeled costs (commission + slippage + financing all doubled).
Not no-edge: placebo kill-shot decisive (0.5% beat rate, DSR 0.958); the VRP is real and harvested on both tails.
It’s tail × sizing: g4 fails catastrophically because a single defined-risk condor risks ~100% of its own collateral each week, and 93% small wins (3.5%-of-wing credit) cannot absorb the occasional near-total-loss week against a one-unit base. The g10 86% is a P&L-attribution artifact (real by-year concentration 25%).

5. Cost sensitivity

Cost level	PF
1× (base)	1.652
2× (commission + 8% slippage/leg + 2× financing)	1.418

Edge survives heavy cost stress — the failure is risk/sizing, not frictions.

6. vs VRP short-put sibling

	VRP short put (SPY+QQQ)	Iron Condor (SPY)
PF net	1.744	1.652
Sharpe	0.86	0.86
Placebo (kill shot)	PASS (p95 1.28)	PASS (p95 1.35)
Max DD	−12.1% (on $31k full-collateral base)	−70.4% (on $3,000 one-condor base)
Tail nature	uncapped per-week, but big collateral dilutes	per-week capped by wings, but base is one unit
Gates passed	9/11	9/11
Verdict	VARIANT RETIRED	VARIANT RETIRED

Both are real-edge / fails-a-risk-gate (the HL-funding class), not no-edge. The condor caps the per-week tail in notional terms (its structural selling point) but trades that for a tiny credit on a tiny base, so on an apples-to-apples one-unit basis its drawdown is far worse. The short put’s −12.1% looks better only because it is measured on a 10× larger fully-funded collateral base; normalize to capital-actually-at-risk and the two are closer.

7. Methodology & data caveats (free data only)

No paid option data. All 4 legs BS-priced off flat VIX/100; realized P&L on the actual SPY path. Shorts sold at bid (−4%), wings bought at ask (+4%).
Volatility-skew caveat (material, flagged not hidden): flat VIX under-prices the short put (real OTM puts trade at higher IV → we collect LESS than reality → conservative on the put side) but over-prices the short call (real OTM calls trade at lower IV → we collect MORE than reality → OPTIMISTIC on the call side). Real call-side credit would be thinner; the put side would credit more but also sit at a more distant true strike. No skew model was added (would fabricate edge). Net: the headline credit is somewhat optimistic on the call wing.
Pessimistic choices locked pre-registration: 0.10-delta shorts (not closer-in), IV held flat intraweek (no IV-crush help on early close), full defined-risk loss at expiry / no roll, T-bill financing charged as drag on margin, 4% slippage on every one of 4 legs, ITM unwind slippage.
Open-to-open weekly settlement simplifies real Friday/daily expiry; understates pin risk and overnight gap risk on the short strikes.
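The pricing convention in the first caveat (all four legs Black-Scholes priced off flat VIX/100, shorts sold 4% below model, wings bought 4% above) can be sketched as follows. This is not the report's engine.py: the function names, the 5/252 year fraction, the zero interest rate, and the example strikes are assumptions, and the strikes are simply chosen by hand rather than solved to 0.10 delta.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(spot, strike, vol, t_years, rate, kind):
    """Black-Scholes price of a European call or put (no dividends)."""
    sd = vol * math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t_years) / sd
    d2 = d1 - sd
    disc = math.exp(-rate * t_years)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * disc * norm_cdf(d2)
    return strike * disc * norm_cdf(-d2) - spot * norm_cdf(-d1)


def condor_net_credit(spot, short_put, short_call, wing, vix,
                      t_years=5 / 252, rate=0.0, haircut=0.04):
    """Net credit per share: shorts filled at model*(1-haircut), wings at model*(1+haircut)."""
    vol = vix / 100.0  # flat VIX used as implied vol for every leg (no skew)
    sold = (bs_price(spot, short_put, vol, t_years, rate, "put")
            + bs_price(spot, short_call, vol, t_years, rate, "call")) * (1 - haircut)
    bought = (bs_price(spot, short_put - wing, vol, t_years, rate, "put")
              + bs_price(spot, short_call + wing, vol, t_years, rate, "call")) * (1 + haircut)
    return sold - bought


# Hypothetical inputs: spot 500, shorts 15 points out, 30-point wings, VIX 20.
credit = condor_net_credit(500.0, short_put=485.0, short_call=515.0, wing=30.0, vix=20.0)
print(f"net credit per share: {credit:.3f}")
```

Because every leg uses the same flat volatility, the sketch inherits the skew caveat above: it has no mechanism to price OTM puts richer or OTM calls cheaper than the VIX level implies.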

8. Backtest-vs-live delta

Live will be worse: real Friday-expiry gap risk, four-leg fills in fast markets (the Apr-2025-type week is exactly when the wing-buy fill is worst), and the skew makes the modeled call-side credit optimistic. Expect live Sharpe ~0.5–0.6 and an even deeper realized one-condor drawdown. The −95.6%-in-a-week tail is real and would have nearly wiped a one-condor account on 2025-03-31.

9. Archetype-reconsideration note (for a pre-registered v2)

Defensible but must be pre-registered fresh, and the fix is about sizing, not signal:

g4: the only honest cure is to size the condor as a small fraction of a much larger collateral base (e.g. risk ≤2% of account per condor → ~$150k account for a 30-pt condor), exactly as the VRP sibling’s $31k base diluted its tail. On a one-condor base this archetype cannot pass g4 and should not be deployed on Brent’s $5k.
g10: fix the attribution split (assign breached-side loss cleanly) before re-scoring; the economic by-year concentration already passes (25%).
Program lesson (carry-v2): do NOT bolt on reactive tail hedges or rolls to chase g4 — pre-register and measure; reactive risk overlays have destroyed value in every prior test.
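The sizing arithmetic behind the g4 note is a single division. A minimal sketch, using the $3,000 defined risk and the 2% per-condor risk budget stated above; the function names are hypothetical:

```python
def required_account(defined_risk_usd, max_risk_fraction):
    """Account size at which one condor's defined risk equals the risk budget."""
    return defined_risk_usd / max_risk_fraction


def risk_fraction(defined_risk_usd, account_usd):
    """Share of the account put at risk by one condor."""
    return defined_risk_usd / account_usd


DEFINED_RISK_USD = 30 * 100  # 30-point wing times 100 multiplier

account_needed = required_account(DEFINED_RISK_USD, 0.02)
share_on_5k = risk_fraction(DEFINED_RISK_USD, 5000.0)
print(f"account for a 2% budget: about {account_needed:,.0f} dollars")
print(f"risk on a 5,000 dollar account: {share_on_5k:.0%}")
```

On a $5,000 account a single 30-pt condor risks 60% of capital, thirty times the 2% budget, which is why the note rules out deployment at that size.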

Files: PREREG.md, engine.py, gates.py, results.json, charts/{equity,drawdown,placebo,walkforward,weekly_pnl_hist}.png. Reuses VRP data cache (../vrp/data/*.parquet).

Frequently asked

Is Weekly Iron Condor profitable in 2026?

In this pre-registered backtest (2018-01-02 → 2026-06-15), Weekly Iron Condor (SPY) returned a profit factor of 1.65 and passed 9/11 validation gates (placebo PASS). Verdict: VARIANT RETIRED (the edge is real, but the position sizing is uninvestable). Every result is published, pass or fail.

Has Weekly Iron Condor been backtested honestly?

Yes — through The Validation Gauntlet, a pre-registered 11-gate framework (profit factor, deflated Sharpe, a random-permutation placebo, cost-stress and walk-forward) with the specification locked before any out-of-sample metric is computed. It failed and is published anyway.
