Backtest Results

WhaleClaw doesn’t ask you to trust a black box. The signal pipeline has been backtested with walk-forward validation, a widely used method for checking that a strategy holds up on data it was not tuned on.

Summary

588 Signals Analyzed

+101.4R Total Return

46% Win Rate

6 Profitable KOLs

What was tested

The backtest covers the end-to-end signal pipeline:

Data collection: Raw KOL messages from Telegram (Jul 2025 – present, 4,500+ messages)
Signal parsing: GPT-4o-mini extraction of entry, stop-loss, and target from each message
KOL filtering: Only signals from KOLs with positive walk-forward performance
Trade simulation: Each signal tested against actual BTC hourly OHLC data

This isn’t a hypothetical model. These are real messages from real traders, parsed and validated against real market data.
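The trade simulation step can be sketched as follows. This is a minimal illustration, not WhaleClaw's actual simulator: the `Signal` and `Candle` types and the `simulate_r` function are hypothetical names, the sketch assumes the trade is filled at the stated entry price, and it ignores slippage. When a single hourly candle touches both the stop and the target, it assumes the stop was hit first, which is the conservative choice.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Signal:
    """A parsed signal (hypothetical shape): direction plus three price levels."""
    direction: str  # "long" or "short"
    entry: float
    stop: float
    target: float


@dataclass(frozen=True)
class Candle:
    """One hourly OHLC candle; only the fields this sketch needs."""
    high: float
    low: float
    close: float


def simulate_r(signal: Signal, candles: Iterable[Candle]) -> Optional[float]:
    """Return the trade result in R, or None if neither level is reached.

    1R is the distance between entry and stop, so a stopped-out trade is -1.0
    and a trade that reaches its target earns (target distance / stop distance).
    """
    if signal.direction not in ("long", "short"):
        raise ValueError("direction must be 'long' or 'short'")
    risk = abs(signal.entry - signal.stop)
    if risk == 0:
        raise ValueError("entry and stop must differ")
    reward_r = abs(signal.target - signal.entry) / risk

    for candle in candles:
        if signal.direction == "long":
            stop_hit = candle.low <= signal.stop
            target_hit = candle.high >= signal.target
        else:
            stop_hit = candle.high >= signal.stop
            target_hit = candle.low <= signal.target
        # Conservative assumption: if both levels fall inside one candle,
        # treat the stop as having been hit first.
        if stop_hit:
            return -1.0
        if target_hit:
            return reward_r
    return None  # still open when the data ends
```

For example, a long signal with entry 100, stop 95, and target 111 risks 5 to make 11, so it returns roughly +2.2 if the target is reached before the stop.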

Walk-forward methodology

Walk-forward testing is a standard safeguard against overfitting:

Dataset: Jul 2025 – Present (588 parsed signals)

┌──────────────────────────────┐
│     TRAIN (60%)              │  ← Find which KOLs are profitable
│     Jul 2025 – ~Feb 2026     │
├──────────────────────────────┤
│     TEST (40%)               │  ← Validate on unseen data
│     ~Feb 2026 – Present      │
└──────────────────────────────┘

Why this matters: Many trading systems look great on historical data but fail in real time because they were overfit to that history. Walk-forward testing splits the data: you find the strategy on one portion, then check that it still works on data the selection step has never seen.

The 6 profitable KOLs were identified in the training set and confirmed profitable on the test set. This is the validation that matters.
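The select-then-confirm logic described above can be sketched like this. It is an illustration under stated assumptions, not the actual backtest code: the `(timestamp, kol, result_r)` tuple layout and all function names are hypothetical, and the sketch assumes the 60/40 split is taken by signal count in chronological order.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

# (ISO-8601 timestamp, KOL name, trade result in R); a hypothetical record layout.
Trade = Tuple[str, str, float]


def chronological_split(
    trades: Sequence[Trade], train_fraction: float = 0.6
) -> Tuple[List[Trade], List[Trade]]:
    """Sort by time, then cut once so the test set is strictly later than train."""
    ordered = sorted(trades, key=lambda t: t[0])  # ISO timestamps sort as strings
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def total_r_by_kol(trades: Sequence[Trade]) -> Dict[str, float]:
    """Sum the R results per KOL."""
    totals: Dict[str, float] = defaultdict(float)
    for _, kol, result_r in trades:
        totals[kol] += result_r
    return dict(totals)


def confirmed_kols(trades: Sequence[Trade], train_fraction: float = 0.6) -> Set[str]:
    """KOLs that are profitable in training AND still profitable on the test set."""
    train, test = chronological_split(trades, train_fraction)
    train_totals = total_r_by_kol(train)
    test_totals = total_r_by_kol(test)
    # Selection uses training data only; the test set is consulted afterwards.
    selected = {kol for kol, total in train_totals.items() if total > 0}
    return {kol for kol in selected if test_totals.get(kol, 0.0) > 0}
```

The key design point is that `selected` is computed without looking at the test set, so a KOL that only looks good in hindsight is dropped at the confirmation step.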

The 6 profitable KOLs

The backtest identified 6 KOLs whose signals consistently produced positive returns across both training and test periods:

KOL	Style	Win rate	Total R	Signal profile
Tareeq	Macro	52%	+28.6R	High volume, consistent
Woods	Scalper	48%	+22.1R	Fast entries, tight stops
Eliz	Macro	51%	+19.4R	Swing trades, wide targets
Muzzagin	Scalper	44%	+14.8R	High frequency, positive expectancy
Binance Killers	Mixed	43%	+10.2R	Structured signals with levels
Vivian	Macro	47%	+6.3R	Selective, high conviction

These 6 form the core tier-1 KOLs that carry the highest weighting in the consensus algorithm. The other 122+ KOLs provide directional context but with lower individual influence.
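One way to picture tiered weighting is a weighted directional vote. The sketch below is purely illustrative and is not WhaleClaw's consensus algorithm: the function name, the +1/-1 vote encoding, and the weights `3.0` and `1.0` are all assumptions chosen for the example. Only the six KOL names come from the table above.

```python
from typing import Iterable, Tuple

# The six tier-1 KOLs named in the table above.
TIER1_KOLS = frozenset(
    {"Tareeq", "Woods", "Eliz", "Muzzagin", "Binance Killers", "Vivian"}
)


def consensus_score(
    votes: Iterable[Tuple[str, int]],
    tier1_weight: float = 3.0,  # hypothetical weight, not a documented value
    other_weight: float = 1.0,  # hypothetical weight, not a documented value
) -> float:
    """Weighted average of directional votes, in the range [-1.0, +1.0].

    Each vote is (kol_name, direction) with direction +1 for long, -1 for short.
    Returns 0.0 when there are no votes.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for kol, direction in votes:
        if direction not in (-1, 1):
            raise ValueError("direction must be +1 (long) or -1 (short)")
        weight = tier1_weight if kol in TIER1_KOLS else other_weight
        weighted_sum += weight * direction
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0
```

With these example weights, one tier-1 long vote outweighs two short votes from other KOLs: the score is (3 - 1 - 1) / 5, or about +0.2.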

What the numbers mean

Win rate: 46%

A 46% win rate might sound low. In trading, it’s not. What matters is how the average win compares with the average loss, both measured in R (multiples of the amount risked on a trade):

Average winning trade: +2.2R
Average losing trade: -1.0R
Expectancy per trade: +0.21R

You lose more often than you win, but the average win is more than twice the size of the average loss. Over 588 signals, that adds up to +101.4R.
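The link between win rate and payoff can be sketched with the textbook expectancy formula. This is an illustrative sketch with hypothetical function names, not WhaleClaw's reporting code. The formula ignores trading costs and works from rounded averages, so it is not meant to reproduce the per-trade figure quoted above, which presumably comes from the full trade log.

```python
def expectancy_r(win_rate: float, avg_win_r: float, avg_loss_r: float) -> float:
    """Textbook expectancy per trade in R, before costs.

    avg_loss_r is given as a positive size (e.g. 1.0 for a -1.0R average loss).
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return win_rate * avg_win_r - (1.0 - win_rate) * avg_loss_r


def breakeven_win_rate(avg_win_r: float, avg_loss_r: float) -> float:
    """Win rate at which the textbook expectancy is exactly zero."""
    if avg_win_r <= 0 or avg_loss_r <= 0:
        raise ValueError("average win and loss sizes must be positive")
    return avg_loss_r / (avg_win_r + avg_loss_r)


# With a 2.2R average win and a 1.0R average loss, break-even is 1 / 3.2.
print(f"break-even win rate: {breakeven_win_rate(2.2, 1.0):.2%}")
```

With a +2.2R average win against a 1.0R average loss, the break-even win rate is 31.25%, which is why a 46% win rate can still be profitable.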

+101.4R total return

This is the cumulative R across all 588 signals from the 6 profitable KOLs. It accounts for:

Slippage (conservative estimates)
Losing streaks (the longest was 8 consecutive losses)
Market conditions (trending and ranging periods)
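Both the cumulative total and the longest losing streak are simple to compute from a list of per-trade results in R. The sketch below uses hypothetical function names and a made-up six-trade sequence for the streak example. The per-KOL totals are copied from the table above and add up to the headline +101.4R.

```python
import math
from typing import Dict, Sequence


def cumulative_r(results: Sequence[float]) -> float:
    """Total R: the plain sum of per-trade results (R adds, it does not compound)."""
    return math.fsum(results)


def longest_losing_streak(results: Sequence[float]) -> int:
    """Length of the longest run of consecutive losing trades (result < 0)."""
    longest = current = 0
    for result in results:
        if result < 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a win or a break-even trade ends the streak
    return longest


# Per-KOL totals as listed in the table above.
PER_KOL_TOTAL_R: Dict[str, float] = {
    "Tareeq": 28.6,
    "Woods": 22.1,
    "Eliz": 19.4,
    "Muzzagin": 14.8,
    "Binance Killers": 10.2,
    "Vivian": 6.3,
}

print(f"sum of per-KOL totals: {cumulative_r(list(PER_KOL_TOTAL_R.values())):+.1f}R")

# Made-up sequence for illustration only: the longest run of losses here is 3.
example_results = [2.2, -1.0, -1.0, -1.0, 2.2, -1.0]
print(f"longest losing streak: {longest_losing_streak(example_results)}")
```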

How this maps to WhaleClaw

The backtest validates the core signal pipeline — that AI can extract tradeable signals from KOL messages and that certain KOLs consistently produce edge.

WhaleClaw goes further by:

Weighting the 6 core KOLs more heavily (based on these results)
Adding 122+ additional KOLs for broader consensus context
Layering in orderflow, structure, and macro for multi-source confirmation
Running continuously (not batch) for real-time signal updates

The backtest is the foundation. The live system is the full product built on top of it.

Limitations (we’re honest about these)

Past performance doesn’t guarantee future results. Market regimes change. KOLs go through cold streaks.
The backtest covers BTC only. We don’t test or trade altcoins.
Signal parsing isn’t perfect. Some messages are ambiguous. The AI gets ~95% of signals correct; ~5% have some parsing error.
Walk-forward is a strong test, not a perfect one. Live trading introduces execution slippage, emotional factors, and timing differences that a simulation can only approximate.

We share these limitations because trust is built on transparency, not on cherry-picked numbers.