Docs
What you feed it, what it runs, and how to read the verdict.
1. Input
Two tables: your trades/fills (timestamp, symbol, side, qty, price) and the price universe they ran against. CSV, Parquet, or a DataFrame. No strategy code, no signal logic, no config required.
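A quick pre-flight check on the trades table can be sketched as below. This is a minimal illustration and not part of Validator: the function name `missing_trade_columns`, the `buy`/`sell` side encoding, and the sample rows (ticker `XYZ`, invented prices) are all assumptions. Only the five column names come from the description above.

```python
import csv
import io

# Column names taken from the input description; everything else is illustrative.
REQUIRED_TRADE_COLUMNS = ("timestamp", "symbol", "side", "qty", "price")


def missing_trade_columns(csv_text: str) -> list[str]:
    """Return the required trade columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [col for col in REQUIRED_TRADE_COLUMNS if col not in header]


# Made-up sample fills for a hypothetical ticker; the side encoding is an assumption.
sample = (
    "timestamp,symbol,side,qty,price\n"
    "2024-01-02T14:30:00Z,XYZ,buy,100,10.00\n"
    "2024-01-02T15:45:00Z,XYZ,sell,100,10.10\n"
)

print(missing_trade_columns(sample))
```

An empty list means the header carries every required field. The same check applies to a Parquet file or a DataFrame by comparing its column names against `REQUIRED_TRADE_COLUMNS`.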
2. The battery
26 methods across four families: statistical edge (deflated Sharpe, probabilistic Sharpe (PSR), multiple-testing correction), robustness (walk-forward, IS/OOS decay, regime split, parameter stability), integrity (look-ahead, survivorship, point-in-time, fill/slippage realism), and risk (drawdown stress, tail risk, turnover and cost sensitivity).
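To make one of the statistical-edge checks concrete, here is a minimal sketch of the probabilistic Sharpe ratio using the published Bailey and López de Prado formula. It is not Validator's implementation. It assumes a per-period (non-annualized) Sharpe ratio, population moments, and a hypothetical function name `probabilistic_sharpe`.

```python
import math
from statistics import NormalDist, mean, pstdev


def probabilistic_sharpe(returns, benchmark_sr=0.0):
    """Probability that the true per-period Sharpe exceeds benchmark_sr.

    PSR = Phi((SR - SR*) * sqrt(n - 1) / sqrt(1 - skew*SR + (kurt - 1)/4 * SR^2))
    where kurt is the raw (non-excess) kurtosis of the returns.
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two return observations")
    mu = mean(returns)
    sigma = pstdev(returns)  # population std; an assumption of this sketch
    if sigma == 0:
        raise ValueError("returns have zero variance")
    sr = mu / sigma
    standardized = [(r - mu) / sigma for r in returns]
    skew = mean(z ** 3 for z in standardized)
    kurt = mean(z ** 4 for z in standardized)
    variance_term = 1 - skew * sr + (kurt - 1) / 4 * sr ** 2
    if variance_term <= 0:
        raise ValueError("degenerate return distribution")
    z_score = (sr - benchmark_sr) * math.sqrt(n - 1) / math.sqrt(variance_term)
    return NormalDist().cdf(z_score)
```

A value near 1 says the observed Sharpe is unlikely to sit below the benchmark given the sample length, skew, and kurtosis. The deflated Sharpe applies the same formula with a benchmark raised to account for the number of trials.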
3. The verdict
One headline call (tradeable, borderline, or do-not-deploy), plus the per-check breakdown, the flagged failures, and a suggested next test. Exportable as a reviewer-ready report.
4. Integrate
Drop it into your research pipeline so every candidate strategy auto-runs the battery before promotion, and nothing reaches paper/live without a passing card.
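A promotion gate of this kind might look like the sketch below. The `Card` structure, the `PromotionBlocked` exception, and the `gate_promotion` function are hypothetical names for illustration, not Validator's API. Treating only `tradeable` as a passing card, so that `borderline` is also blocked, is an assumption of this sketch.

```python
from dataclasses import dataclass, field

# The three headline calls named in the verdict section.
VERDICTS = ("tradeable", "borderline", "do-not-deploy")


@dataclass
class Card:
    """Hypothetical container for a strategy's verdict and failed checks."""
    strategy_id: str
    verdict: str
    failed_checks: list = field(default_factory=list)


class PromotionBlocked(Exception):
    """Raised when a candidate strategy lacks a passing card."""


def gate_promotion(card: Card) -> str:
    """Return the strategy id if it may be promoted, otherwise raise."""
    if card.verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {card.verdict!r}")
    if card.verdict != "tradeable":  # assumption: only 'tradeable' passes
        raise PromotionBlocked(
            f"{card.strategy_id}: {card.verdict}, failed={card.failed_checks}"
        )
    return card.strategy_id
```

Calling `gate_promotion` as the last step before a paper or live deployment job means a blocked candidate fails loudly instead of being skipped silently.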
The four families (overview)
| Family | Question it answers | Representative checks |
|---|---|---|
| Statistical edge | Is the headline Sharpe real, or trial-luck? | Deflated Sharpe, probabilistic Sharpe, multiple-testing correction |
| Robustness | Does it survive out-of-sample & across regimes? | Walk-forward, IS/OOS decay, regime split, parameter stability |
| Integrity | Did the backtest cheat? | Look-ahead, survivorship, point-in-time, fill/slippage realism |
| Risk | What's the real downside? | Drawdown stress, tail risk, turnover & cost sensitivity |
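As an illustration of the robustness family, the following is a minimal sketch of rolling walk-forward splits. It shows the general technique only. The function name, the fixed-size rolling window, and the step equal to the test length are assumptions, not a description of how Validator partitions data.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train, test) index ranges that roll forward by test_size.

    Each test window starts right after its training window, so no test
    observation ever precedes the data used to fit it.
    """
    if min(n_obs, train_size, test_size) <= 0:
        raise ValueError("all sizes must be positive")
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


for train, test in walk_forward_splits(n_obs=10, train_size=4, test_size=2):
    print(list(train), list(test))
```

Comparing a performance metric on each `train` window with the same metric on the following `test` window gives the raw material for an IS/OOS decay estimate.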
Secret-sauce boundary: Validator ships the methodology and the scoring math. It does not contain — and will never share — our own strategies, signal definitions, parameter thresholds, or trading data. Symmetrically, your trades and universe stay in your environment on the on-prem/VPC plans. This is a methodology product, not a data product.