Walk-Forward Optimization for Trading Strategies

Robert Pardo invented walk-forward analysis. That’s not an exaggeration or marketing language. Bob created the technique and described it in a book first published in 1992 as “Design, Testing, and Optimization of Trading Systems.” The updated 2008 edition, “The Evaluation and Optimization of Trading Strategies,” remains one of the most cited books in systematic trading. His career includes managing John Meriwether’s futures arbitrage operation at Salomon Brothers, developing what he describes as the first software that let traders build and test a trading strategy and generate live signals, and consulting for Goldman Sachs, among others.

His current work centres on Pardo Capital and a multi-strategy futures program called Renaissance, built on dozens of uncorrelated strategies with a Sharpe ratio and drawdown profile he describes as unlike anything he’s seen from other CTAs. The path from his early commodity trading through Salomon Brothers, into software development, through a joint venture with Daiwa Securities and a decade-plus CTA partnership, to the current independent operation, covers almost every variation of systematic trading you can imagine.

This episode focuses on the practical side of strategy development and optimization: common mistakes traders make, the problem with standard optimization methods, and how walk-forward analysis actually works and why it matters.

Why strategies look great in backtests and fail in live trading

The pattern is familiar to every serious systematic trader. You build something. It looks beautiful in testing. You go live. Within months it falls apart. Bob has heard this story from traders throughout his career and his diagnosis is consistent: curve fitting.

Standard optimization takes a strategy, runs it across a historical dataset with many different parameter combinations, and picks the best performing set of parameters. The problem is that the “best” parameters are best for that specific historical period, not for the future. The optimizer is finding the parameters that explain the past, not the ones that have genuine predictive power.
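To make the procedure concrete, here is a minimal sketch of standard in-sample optimization. The moving-average crossover rule, the function names and the parameter grids are all hypothetical choices for illustration, not anything Bob described. The point is only that every candidate is scored on the full history and the winner is chosen with hindsight.

```python
from itertools import product

def sma(prices, n, i):
    """Simple moving average of the n prices ending at index i (inclusive)."""
    return sum(prices[i - n + 1 : i + 1]) / n

def crossover_pnl(prices, fast, slow):
    """P&L of a toy long/short moving-average crossover, one unit per position.

    The position held from bar i to bar i+1 is decided from prices up to bar i.
    """
    pnl = 0.0
    for i in range(slow - 1, len(prices) - 1):
        position = 1 if sma(prices, fast, i) > sma(prices, slow, i) else -1
        pnl += position * (prices[i + 1] - prices[i])
    return pnl

def optimize_in_sample(prices, fast_values, slow_values):
    """Score every (fast, slow) pair on ALL the data and keep the best one."""
    candidates = [(f, s) for f, s in product(fast_values, slow_values) if f < s]
    return max(candidates, key=lambda p: crossover_pnl(prices, *p))
```

Whatever pair this returns is, by construction, the one that best fits the data it was scored on. Nothing in the procedure checks whether it works on data it has not seen.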

The technical term for this is in-sample overfitting. You’re essentially writing a formula that describes your historical data rather than discovering a genuine edge. When the market enters a new period with different characteristics, the optimized parameters fail because they were tuned to something that no longer exists.

Bob offers a clear diagnostic: if your backtest looks dramatically better than your live trading, you have a curve-fitting problem. There’s no ambiguity. The backtest and live performance should roughly correspond. When they don’t, the explanation isn’t bad luck. It’s that you don’t have what you thought you had.

The confidence problem

The second major issue Bob identifies is confidence. Many traders, particularly those early in their development, can’t validate their strategy thoroughly enough to believe in it. When a drawdown arrives, even one that’s within the historical parameters of the system, they abandon it. They assume the strategy has broken rather than recognising that drawdowns are part of normal system behaviour.

This creates a specific failure mode. The trader builds a system, optimizes it, starts trading it, hits a normal drawdown, panics and stops trading it, then watches it recover. Or they stop trading it just before the profitable period and conclude that systematic trading doesn’t work.

Genuine confidence in a strategy requires two things. First, a testing process rigorous enough to give you actual evidence that the strategy has edge rather than a curve-fit explanation. Second, realistic expectations about what drawdowns look like and how long they can last. Most traders have unrealistic win rate expectations. Bob mentioned that their original CTA program was right about 45% of the time. That troubled a lot of people. But 45% right with winners bigger than losers produces good long-run results. Requiring 60% or 70% win rates forces you toward systems that are either curve-fit or have very short average hold times.
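The arithmetic behind the 45% figure can be sketched as an expectancy calculation. Only the 45% win rate comes from the conversation. The average win and loss sizes below are hypothetical numbers chosen for illustration.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Average profit per trade, with avg_loss given as a positive number."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss

def breakeven_payoff_ratio(win_rate):
    """Smallest avg_win / avg_loss ratio at which expectancy is zero."""
    return (1 - win_rate) / win_rate

# Hypothetical payoff sizes, in units of risk per trade.
low_win_rate = expectancy(0.45, avg_win=2.0, avg_loss=1.0)   # about +0.35 per trade
high_win_rate = expectancy(0.65, avg_win=0.5, avg_loss=1.0)  # about -0.025 per trade
```

Under these assumed payoffs, the 45% system makes money on average while the 65% system loses. At a 45% win rate the average winner only needs to be about 1.22 times the average loser to break even.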

What walk-forward analysis is

Walk-forward analysis is Bob’s solution to the curve-fitting problem. The central idea is to judge a strategy’s performance based on how it performs out-of-sample, not in-sample.

Standard optimization is entirely in-sample. You test across all your historical data and pick the best parameters. The whole dataset is the training set, so there’s no holdout period to test against.

Walk-forward analysis divides the history into a sequence of rolling windows. Each window has an in-sample training period and an out-of-sample test period. You optimize on the training period, pick the best parameters, then test on the out-of-sample period. Then you roll the window forward, optimize again, and test again. You do this 20 or 30 or 40 times across your full history.

The walk-forward result is constructed by stringing together all those out-of-sample test periods. The performance you see in the final result is entirely out-of-sample. Every data point in that result was generated by parameters that were optimized on data before that point, not on data including that point.
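The rolling procedure described above can be sketched in a few lines. This is a minimal illustration, not Bob’s implementation: the window sizes, the toy momentum rule and every function name here are assumptions. What matters is the structure, where each test window is scored with parameters chosen only from the data before it.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train_start, train_end, test_end) index triples; ends are exclusive.

    Training covers [train_start, train_end) and the out-of-sample test covers
    [train_end, test_end). The window then rolls forward by test_size bars.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train_end = start + train_size
        yield start, train_end, train_end + test_size
        start += test_size

def momentum_pnl(prices, lookback, first, last):
    """P&L over bars first..last-1 of a toy momentum rule (hypothetical).

    The position for bar i is set at bar i-1 using only prices up to bar i-1.
    """
    pnl = 0.0
    for i in range(max(first, lookback + 1), last):
        position = 1 if prices[i - 1] > prices[i - 1 - lookback] else -1
        pnl += position * (prices[i] - prices[i - 1])
    return pnl

def best_lookback(train_prices, candidates=(5, 10, 20)):
    """In-sample step: pick the lookback with the best P&L on the training data."""
    start = max(candidates) + 1  # score every candidate over the same bars
    return max(candidates,
               key=lambda n: momentum_pnl(train_prices, n, start, len(train_prices)))

def walk_forward(prices, train_size, test_size, optimize, evaluate):
    """Re-optimize on each training window, then score the next window."""
    results = []
    for t0, t1, t2 in walk_forward_windows(len(prices), train_size, test_size):
        params = optimize(prices[t0:t1])           # sees training data only
        oos_pnl = evaluate(prices, params, t1, t2)  # scored on unseen bars
        results.append({"params": params, "test": (t1, t2), "oos_pnl": oos_pnl})
    return results
```

Calling `walk_forward(prices, 500, 100, best_lookback, momentum_pnl)` returns one entry per window, and summing the `oos_pnl` values gives the stitched out-of-sample result. The test windows are contiguous and never overlap, so each bar in that result is counted once.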

If the walk-forward result looks similar to your in-sample optimization result, you likely have a robust strategy. If the out-of-sample result collapses, you have a curve fitter. The gap between in-sample and out-of-sample performance is diagnostic. Large gap: problem. Small gap: promising.
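One way to put a number on that gap is the ratio of annualized out-of-sample profit to annualized in-sample profit, a measure often called walk-forward efficiency. The sketch below assumes simple, non-compounded profit figures and 252 bars per year. The function names and the example numbers are hypothetical.

```python
def annualized(profit, n_bars, bars_per_year=252):
    """Scale a profit earned over n_bars to a per-year figure (no compounding)."""
    return profit * bars_per_year / n_bars

def walk_forward_efficiency(is_profit, is_bars, oos_profit, oos_bars,
                            bars_per_year=252):
    """Annualized out-of-sample profit divided by annualized in-sample profit."""
    is_rate = annualized(is_profit, is_bars, bars_per_year)
    if is_rate <= 0:
        raise ValueError("in-sample profit must be positive for a meaningful ratio")
    return annualized(oos_profit, oos_bars, bars_per_year) / is_rate

# Hypothetical example: 100 units over 1000 in-sample bars,
# 15 units over 250 out-of-sample bars.
wfe = walk_forward_efficiency(100, 1000, 15, 250)  # about 0.6
```

A ratio near 1 means the strategy earned out-of-sample at roughly the rate it earned in-sample. A ratio near zero or below is the collapse described above. Where to draw the line between the two is a judgment call that this sketch does not settle.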

Why Bob invented it

Bob’s description of how walk-forward analysis came about is useful. He wanted an idiot-proof way of optimizing because he was building a money management business and needed a process that could be trusted. He’d seen what happened when people used standard optimization without out-of-sample testing. They produced equity curves that looked outstanding but didn’t correspond to anything real.

His original intent was practical rather than theoretical. He needed a method that would give him and his partners genuine confidence in a strategy before committing real capital. Walk-forward analysis produced that confidence because the out-of-sample results were real evidence of performance, not a mathematical artifact of optimization.

Building a portfolio of uncorrelated strategies

Bob’s current program, Renaissance, has dozens of strategies that are genuinely uncorrelated with each other. The result is a Sharpe ratio and drawdown profile he’s never seen matched in another CTA. This isn’t an accident. It’s the direct product of building many strategies with deliberately different approaches and then combining them at the portfolio level.
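A standard textbook result shows why correlation matters so much here. Under simplifying assumptions (equal weights, and every strategy having the same Sharpe ratio, the same volatility and the same pairwise correlation), the portfolio Sharpe ratio has a closed form. The numbers below are hypothetical and say nothing about Renaissance’s actual figures.

```python
import math

def combined_sharpe(single_sharpe, n_strategies, avg_correlation):
    """Sharpe ratio of an equal-weight portfolio of identical strategies.

    Assumes every strategy has the same Sharpe ratio and volatility, and that
    every pair of strategies has the same correlation.
    """
    denominator = 1 + (n_strategies - 1) * avg_correlation
    if denominator <= 0:
        raise ValueError("correlation too negative for this many strategies")
    return single_sharpe * math.sqrt(n_strategies / denominator)

# Hypothetical: 25 strategies, each with a Sharpe ratio of 0.5.
uncorrelated = combined_sharpe(0.5, 25, 0.0)  # about 2.5
correlated = combined_sharpe(0.5, 25, 0.5)    # about 0.69
```

With zero correlation the portfolio Sharpe grows with the square root of the number of strategies. At a correlation of 0.5, adding strategies helps very little, which is why the lack of correlation matters more than the raw count.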

The three strategy families in Renaissance each contain many sub-families using different filters and risk management profiles. Every strategy is also examined in both trend-following and mean-reverting modes, and both versions run at the same time. Bob has never found a strategy that couldn’t be inverted, and running the two versions together improves consistency.

The mix of trend-following and mean-reversion strategies is weighted slightly toward mean reversion because markets spend more time ranging than trending. Trend following is excellent when trends exist and poor when they don’t. Mean reversion provides a more consistent base return, with trend following providing larger returns during trending periods.

The portfolio construction also addresses a capacity problem specific to futures. A stock hedge fund can absorb billions of dollars because there are 20,000 stocks to trade. Futures capacity is more limited. Bob’s approach to this constraint is to keep adding strategies across different markets and timeframes, increasing capacity by adding more positions rather than increasing size in existing ones.

Common mistakes at the strategy development stage

Beyond curve fitting and the confidence problem, Bob identified several other recurring issues:

Unrealistic accuracy expectations. Many traders want 60% or 70% win rates and reject anything below that threshold. A system that’s right 45% of the time with a good profit factor is perfectly tradeable. The obsession with high win rates pushes traders toward scalping approaches or toward over-fitted systems that appear to have high accuracy in backtesting.

Insufficient backtesting rigour. Most traders backtest less thoroughly than they think. They run one or two parameter sweeps, see a good curve, and start trading. Bob’s methods are exhaustive by comparison. He wants high robustness across many parameter combinations and time periods before he’ll consider a strategy ready.

Difficulty finding strategies in the first place. Some traders simply haven’t yet found anything that works. Bob’s advice for those starting out: take a known-working strategy, like the original Turtle system, understand it thoroughly, and then try to improve it systematically. Starting from something with documented historical edge is more productive than starting from zero.

The open-mindedness deficit. Bob attributed part of his own success to a willingness to trade strategies he doesn’t fully understand, as long as they’ve passed rigorous testing. If the testing says it’s robust, the strategy can go into production even without a theoretical explanation for why it works. Many traders reject perfectly good strategies because the story isn’t clear to them.
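On the backtesting rigour point, one simple check is to look at the whole set of parameter results rather than only the best one. The sketch below is a hypothetical illustration, not Bob’s method. It assumes you already have a mapping from each parameter combination to its net profit.

```python
from statistics import median

def robustness_summary(results):
    """Summarize a parameter sweep given {parameter tuple: net profit}."""
    profits = list(results.values())
    return {
        "profitable_share": sum(p > 0 for p in profits) / len(profits),
        "median_profit": median(profits),
        "best_profit": max(profits),
    }

# Hypothetical sweep over (fast, slow) settings.
sweep = {(5, 20): 120, (5, 40): 80, (10, 20): -30, (10, 40): 60}
summary = robustness_summary(sweep)
```

If only a small share of combinations is profitable, or the best result sits far above the median, the best parameter set is more likely an isolated peak than evidence of a robust strategy.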

Markets and diversification

Renaissance currently trades all domestic US futures, covering a broad range of market sectors. When the program was in a joint venture with a large CTA, they traded international futures as well, but Bob found the portfolio became heavily weighted toward bonds and stock indices because that’s most of what’s available outside the US. That concentration created risk he wasn’t comfortable with.

Trading domestic US futures provides better sector diversification. Energy, metals, agricultural commodities, financial futures and equity indices each behave differently. A portfolio spread across all of them has lower correlation than one concentrated in any subset.

Bob also noted that constantly adding new strategies serves a diversification function at the strategy level. New strategies reduce correlation not just across markets but across time. A strategy optimized to current market conditions captures something the older strategies don’t. Over time, the portfolio becomes more robust simply through the accumulation of many different strategy types.

The key message on optimization

The core lesson from this conversation is straightforward: in-sample optimization alone is not a valid test of a strategy. It tells you that a strategy can be optimized to explain historical data. It does not tell you that the strategy has forward-looking edge.

Walk-forward analysis is the minimum rigorous test. If your strategy survives 20 to 40 out-of-sample test windows with performance that roughly matches the in-sample results, you have evidence of robustness. If it collapses out-of-sample, you have a curve fit and should start over rather than adjusting parameters to make the result better.

The discipline to discard curve-fit strategies rather than adjust them into something that looks better is one of the hardest things to develop in systematic trading. Most traders adjust. Bob’s advice is to walk away.

Want to learn more about strategy optimization, walk-forward analysis and building robust trading systems? Subscribe to the Better System Trader podcast for weekly interviews with the world’s top systematic traders and quantitative researchers.
