By Dave Walton – Statistrade.com

When I wrote my Wagner Award-winning paper “Know your System! – Turning Data Mining from Bias to Benefit,” I had two goals in mind:

- Introduce a new method to reasonably estimate the long-run expected performance of a trading system, and
- Keep the method simple enough for the average system trader to understand and employ.

I’ve subsequently realized that the paper’s focus on goal #2 was actually a bit limiting. Therefore I decided to write this addendum to the original paper to explain some of the assumptions and limitations of SPP and also describe what I now consider to be a better alternative.

First let’s rehash System Parameter Permutation (SPP). As mentioned in the original paper, SPP generates sampling distributions of system performance metrics by leveraging the system optimization process. SPP follows these general steps:

- Parameter scan ranges for the system concept are determined by the system developer.
- Each parameter scan range is divided into an appropriate number of observation points (specific parameter values).
- Exhaustive optimization (all possible parameter value combinations) is performed using a realistic portfolio-based historical simulation over the selected time period.
- The simulated results for each system variant are combined to create a sampling distribution for each performance metric of interest (e.g. CAR, max drawdown, Sharpe ratio, etc.). Each point on a distribution is the result of a historical simulation run from a single system variant.

The output of SPP is thus one sampling distribution per performance metric, built from the results of all system variants (combinations of parameter values). Because each point comes from a full historical simulation run, portfolio effects are accurately modeled. Via the sampling distributions, the trader may evaluate a system on any desired performance metric, and SPP uses their descriptive statistics to arrive at performance estimates and measures of statistical significance.
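The SPP steps above can be sketched in a few lines of Python. Here `run_backtest` is a hypothetical stand-in for the realistic portfolio-based historical simulation (a real backtesting platform would replace it), and the parameter names, observation points, and toy return formula are illustrative only:

```python
import itertools
import statistics

# Hypothetical stand-in for a portfolio-based historical simulation:
# takes one parameter combination, returns a performance metric (CAR).
def run_backtest(fast_ma, slow_ma, stop_pct):
    # Toy deterministic placeholder so the sketch is runnable.
    return (slow_ma - fast_ma) * 0.01 - stop_pct

# Steps 1-2: scan ranges divided into observation points.
fast_ma_points = [10, 20, 30, 40]
slow_ma_points = [50, 100, 150, 200, 250]
stop_pct_points = [0.02, 0.04, 0.06]

# Step 3: exhaustive optimization over every parameter combination.
car_distribution = [
    run_backtest(f, s, stop)
    for f, s, stop in itertools.product(
        fast_ma_points, slow_ma_points, stop_pct_points)
]

# Step 4: the combined results form the sampling distribution;
# its descriptive statistics drive the performance estimates.
print(len(car_distribution))                    # 4 * 5 * 3 = 60 variants
print(statistics.median(car_distribution))
```

Note that the number of variants (60 here) is fixed as soon as the observation points are chosen, which is what makes the exhaustive run exactly repeatable.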

As mentioned before, I chose the specific steps in the process to keep things easy to implement for the average system trader:

- The process is simple to understand and can be implemented in just about any commercially available backtesting platform that supports exhaustive optimization (which is nearly all of them).
- Exhaustive permutation results in exact repeatability.
- The number of variants (parameter combinations) is known up front once the system developer has set the parameter ranges and observation points.

However, embedded in these steps are some assumptions and limitations:

- Assumption: The resulting distribution contains all (important) information about the system within the choice of parameter ranges. In other words, space between observation points doesn’t have an impact.
- Assumption: The number of permutations is not too large — to the point of becoming intractable.
- Limitation: One cannot choose a standard number of iterations to run, because the required number depends on the parameter combinations of each specific system.

In reality, dealing with **assumption #1** is not trivial, and I specifically did not address it in the paper. An observation point is a specific value within the parameter scan range that you plan to evaluate. For example, consider moving average lengths: if your scan range begins at 50 and ends at 250, you could choose to evaluate the specific values 50, 75, 100, 150, 200, and 250. These six values are the observation points.

Observation points are important because they determine the actual parameter combinations tested during optimization and SPP, so their spacing should be well thought out. If the spacing is too wide, significant variation may occur between points to which you will be blind (those values are never tested). If the spacing is too narrow, the tested variants carry redundant information. Kaufman (2013) and Pardo (2008) provide guidelines on choosing observation point spacing, but it would certainly be easier not to have to deal with it at all.

Depending on the trading system, **assumption #2** can become problematic. A system with many rules and parameters, combined with a fine-grained selection of observation points, can quickly become computationally infeasible for the average system trader, even on a powerful PC. This wouldn’t be a problem for a hedge fund or institution, but I developed the SPP method specifically for the “average” system trader, not for sophisticated investors with deep pockets.

Somewhat related is **limitation #3**. SPP must be customized to each trading system and thus the number of permutations can vary significantly. It is both time consuming and tedious to go through the process for a large number of trading systems.

In order to address all of these, I’d like to introduce a close cousin of SPP which I named System Parameter Randomization (SPR). The fundamental mechanism behind both SPP and SPR is the same, yet the implementation is very different. SPR can be thought of as a random sample of a continuous SPP distribution. The process is explained in the steps below:

- Parameter scan ranges for the system concept are determined by the system developer.
- A Monte Carlo process is used to pick individual parameter values within the scan ranges.
- A fixed number of iterations is performed using a realistic portfolio-based historical simulation over the selected time period.
- The simulated results for each system variant are combined to create a sampling distribution for each performance metric of interest (e.g. CAR, max drawdown, Sharpe ratio, etc.). Each point on a distribution is the result of a historical simulation run from a single system variant.
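The SPR loop can be sketched the same way as before. Again, `run_backtest` is a hypothetical placeholder for the portfolio-based historical simulation, and the scan ranges and iteration count are illustrative:

```python
import random
import statistics

random.seed(42)  # fixed seed so this sketch is repeatable

# Hypothetical stand-in for the portfolio-based historical simulation.
def run_backtest(fast_ma, slow_ma, stop_pct):
    return (slow_ma - fast_ma) * 0.01 - stop_pct

# Step 1: scan ranges only -- no observation points needed.
FAST_RANGE = (10, 49)      # integer parameter
SLOW_RANGE = (50, 250)     # integer parameter
STOP_RANGE = (0.01, 0.08)  # continuous parameter

# Steps 2-3: a fixed number of Monte Carlo iterations, each with
# randomly drawn parameter values from within the scan ranges.
N_ITERATIONS = 5000
car_distribution = []
for _ in range(N_ITERATIONS):
    fast = random.randint(*FAST_RANGE)
    slow = random.randint(*SLOW_RANGE)
    stop = random.uniform(*STOP_RANGE)
    car_distribution.append(run_backtest(fast, slow, stop))

# Step 4: descriptive statistics of the sampling distribution.
print(statistics.median(car_distribution))
```

Notice that `N_ITERATIONS` is chosen up front and is independent of how many parameters the system has, which is exactly what removes the tractability and customization problems.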

With the SPR method, picking appropriate observation points is no longer an issue because the Monte Carlo process chooses parameter values at random. This works very well thanks to the statistical “law of large numbers”: parameter combinations are chosen (sampled) randomly, and as the number of iterations increases, the descriptive statistics of the sample become better estimates of those of the population.

By fixing the number of iterations, the tractability concern and customization inconvenience are both solved. For example, the system developer can decide to fix the number of iterations for all systems under evaluation (should be in the thousands) or can decide to tailor the number of iterations based on compute power available. SPR is a much more flexible method than SPP.

It is clear that SPR addresses the assumptions and limitations mentioned above but there is another assumption behind SPP which is also applicable to SPR:

- Assumption: There is enough variation in both the number of parameters and their ranges to allow randomness to produce sufficient (reasonable) dispersion of potential trading results.

This assumption is the subject of several critiques of the SPP paper. SPP and SPR (I’ll refer to them as SPx) are not designed to evaluate indicators outside of a more comprehensive trading system. In brief, just evaluating a single indicator (or a couple) does not enable enough random interactions. For example, the system used in the SPP paper had four parameters and resulted in over 4,000 different combinations. It is unlikely you can get enough meaningful samples just using a single indicator.

SPx is a method intended for application to a complete system in a portfolio context, meaning inclusion of filters, setups, entries, exits, signal ranking rules, and money management rules and also including commissions and slippage.

One of the finer points I make in the paper is that variation in the parameter values in the optimization ranges leads to variations in the way **ALL** the system components interact. This variation leads to randomness in trading results. Some entries are earlier, some later, some trades are shorter, some longer, etc. You can do the same thought exercise for all system components. This randomness allows the creation of the SPx distribution from which you calculate probabilities.

The method is somewhat similar to stochastic modeling that is used extensively in insurance applications. A stochastic model is a tool for estimating probability distributions of potential outcomes by allowing for random variation in one or more inputs over time. The random variation is usually based on fluctuations observed in historical data for a selected period using standard time-series techniques. Distributions of potential outcomes are derived from a large number of simulations (stochastic projections) which reflect the random variation in the input(s).

Based on a set of random outcomes, the experience of the policy/portfolio/company is projected, and the outcome is noted. Then this is done again with a new set of random variables. In fact, this process is repeated thousands of times. At the end, a distribution of outcomes is available which shows not only the most likely estimate but what ranges are reasonable too.

The same thing can be done for a trading system. There are many possible stochastic inputs, but the obvious one is price action. Although the direct approach of randomizing input price action is possible, it can be problematic and difficult in practice. So instead of directly randomizing price action, we can turn things around and randomly select parameter values.

Why? As system developers we have to understand that the best we can hope for is that there is some signal in all the price action noise, and that our trading system will pick up on that signal to generate wins that outweigh the losses. Whether or not you subscribe to the efficient market hypothesis, it is very well understood by both academics and practitioners that price action contains a large noise component.

When you optimize on historical price action containing this much noise, your selected optimal parameter set is at least partially fit to a noise component that will never repeat. Randomness has therefore influenced your chosen parameter set, and the historical performance is positively biased. Here is what Perry Kaufman has to say in his excellent book *Trading Systems and Methods*:

> Use the average of all test results. Specifying reasonable parameter ranges is important when evaluating the test results. Nearly all sets of tests will show losses, but hopefully, there will be some areas of attractive profits. If you tested 1,000 cases and 30% of the tests showed returns of about 25% per year, 30% showed breakeven results, and the last 40% showed various profits and losses, you might say that the 30% profitable tests are a broad area from which parameter values can be chosen. That assumes that the market will continue to perform in a way that allows those parameters to generate profits during the next year. It is better to assume that the price patterns change; you cannot tell which combination of parameters will be the best. Regardless of the past returns for the parameters you choose, your expectations should be the average performance of all tests.
>
> It would be optimistic to expect the average return of all tests to be highly profitable; however, that is the correct goal. When comparing systems, the best one has the highest average of all tests as well as the most profitable number of tests. If you accept the premise that actual trading performance is represented by the average of all tests, then your expectations are realistic.

Using the average (or the median, as I suggest for SPx) means there is no selection and thus no selection bias. Better yet, you can use the entire SPx distribution to draw conclusions. It comes down to two things:

- Market conditions change. The optimal parameter values for one period are highly probable to be sub-optimal in another.
- The performance of the system with optimal parameter values in the backtest is positively biased due to data mining bias (DMB).

Projecting these conclusions into the future, we know that whatever parameter set we choose from the past will be sub-optimal because of the random noise we will encounter in future price action. We can simulate this effect by randomly choosing parameter combinations and applying them to historical data. This is exactly what SPR does.
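Once an SPx distribution exists, the unbiased estimates fall straight out of its descriptive statistics. A small sketch, using made-up CAR values purely for illustration:

```python
import statistics

# Hypothetical SPx output: CAR results, one per randomly drawn
# parameter combination (values are illustrative only).
car_distribution = [0.12, -0.03, 0.07, 0.21, 0.05, 0.09,
                    -0.01, 0.15, 0.04, 0.11, 0.02, 0.08]

# Long-run estimate with no parameter selection, and thus no
# selection bias: the median across ALL variants.
median_car = statistics.median(car_distribution)

# The full distribution also gives reasonable ranges, e.g. quartiles.
q1, q2, q3 = statistics.quantiles(car_distribution, n=4)

print(f"median CAR: {median_car:.3f}")
print(f"interquartile range: {q1:.3f} to {q3:.3f}")
```

In practice you would report not just the median but also the tails of the distribution (e.g. a worst-case drawdown percentile), since the whole point is to know what range of outcomes is reasonable, not just the most likely one.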

If you would like more information, I recently gave a presentation at the 2016 MTA Annual Symposium on the application of stochastic modeling to trading systems. It goes into the background of stochastic modeling, alternatives, and applications. The concept of SPR is covered throughout. The video is available on the MTA website.

You may also want to check out the podcast interview here.

**References**

Kaufman, Perry J., 2013, *Trading Systems and Methods, + Website*, 5th Edition, John Wiley & Sons, Inc., Hoboken, NJ, 1232p.

Pardo, Robert, 2008, *The Evaluation and Optimization of Trading Strategies*, 2nd Edition, John Wiley & Sons, Inc., Hoboken, NJ, 334p.