Bull and Bear Market Indicators: David Aronson’s Logistic Regression Research

David Aronson has been thinking about the intersection of statistics and markets for longer than most quantitative traders have been alive. He started keeping point-and-figure charts at 15, joined Merrill Lynch in 1973, and by 1980 had founded his own software company, Raden Research Group, to develop pattern recognition tools for trading systems. His book Evidence-Based Technical Analysis is one of the more rigorous treatments of which technical analysis methods hold up to statistical scrutiny and which don’t.

When this episode was recorded there was active debate about whether the stock market was transitioning from a bull to a bear market. David had been doing something relevant: testing hundreds of indicators against 120 years of market data to see which ones could reliably identify current market state. The results are specific, and some of them are surprising in ways that matter for systematic traders.


What logistic regression does and why it’s useful here

Before getting into the research findings, David takes time to explain the statistical method he’s using. Standard multiple regression analysis takes a set of indicators and combines them to produce a continuous forecast of a target variable, a price return for example. Logistic regression is a variant that does something slightly different: instead of predicting a quantity, it estimates the probability that a binary condition is true.

In this context, the target variable is market state: bull or bear. Using historical data where we know, with the benefit of hindsight, when bear markets occurred, you assign a value of 1 to every day that was historically in a bear market and 0 to every other day. The logistic regression model then looks for indicators that can push the probability estimate for that binary condition significantly above or below the historical base rate.

The base rate matters. Looking at the Dow Jones from 1896 forward, bear markets have been in effect about 17% of the time. If you have no other information, your best estimate that you’re currently in a bear market is 17%. The value of any indicator is measured by how much it can move that probability estimate, and how reliably it does so across different market periods.
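
To make the mechanics concrete, here is a minimal sketch of a one-indicator logistic regression in plain Python. The day counts are invented for illustration (they are not David's data) and are chosen only so that the base rate comes out at 17%. The names `fit_logistic` and `cells` are hypothetical, and Newton's method is just one reasonable way to fit the model.

```python
import math

def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(cells, iterations=25):
    """Fit P(bear = 1 | x) = sigmoid(b0 + b1 * x) by Newton's method.

    cells: list of (x, y, count) tuples, where x is the indicator value,
    y is the hindsight label (1 = bear-market day, 0 = any other day),
    and count is how many days fall in that cell.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, n in cells:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += n * (y - p)        # gradient of the log-likelihood
            g1 += n * (y - p) * x
            h00 += n * w             # negative Hessian entries
            h01 += n * w * x
            h11 += n * w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts: (indicator on/off, bear label, number of days).
cells = [
    (0, 0, 625), (0, 1, 55),
    (1, 0, 205), (1, 1, 115),
]
total_days = sum(n for _, _, n in cells)
base_rate = sum(n for _, y, n in cells if y == 1) / total_days

b0, b1 = fit_logistic(cells)
p_off = sigmoid(b0)        # estimated bear probability when the indicator is off
p_on = sigmoid(b0 + b1)    # estimated bear probability when the indicator is on

print(f"base rate: {base_rate:.3f}")
print(f"P(bear | indicator off): {p_off:.3f}")
print(f"P(bear | indicator on):  {p_on:.3f}")
```

With a single on/off indicator the fitted probabilities simply reproduce the bear-day frequency in each group. The framework earns its keep when several continuous indicators are combined, because each one then shifts the estimate up or down from the base rate.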

David frames this as “nowcasting” rather than forecasting. You’re not trying to predict where the market goes next. You’re trying to identify the state of the market right now, as quickly and accurately as possible. If a bear market tends to persist after you detect it, you’re implicitly making a forward prediction, but the first step is accurate detection of current state.

The 50-200 day moving average: what the data shows

David started with the most obvious candidate: the golden cross / death cross, that is, whether the 50-day moving average sits above or below the 200-day moving average. This is probably the most widely discussed market state indicator, and his results give it some empirical grounding.

Using Dow Jones data back to 1896, here’s what the logistic regression found:

50-day vs 200-day position | Bear market probability
50 is above 200 (golden cross) | ~8% (well below 17% base rate)
50 is below 200 (death cross) | ~36% (roughly double the base rate)

So the sign of the 50-200 spread produces roughly a four-fold difference in bear market probability (about 36% versus about 8%) depending on which side of the threshold you’re on. When the 50 is above the 200, you’re roughly half as likely to be in a bear market as the historical average. When it’s below, you’re roughly twice as likely.
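
The arithmetic behind those comparisons can be checked directly. The sketch below treats the rounded figures quoted above (17%, 8%, 36%) as exact, which they are not, so every derived number is approximate. The implied logistic regression coefficients and the implied share of death-cross days are back-of-envelope inferences, not figures reported in the episode.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

base_rate = 0.17   # bear-market days as a share of all days
p_golden = 0.08    # bear probability when the 50-day is above the 200-day
p_death = 0.36     # bear probability when the 50-day is below the 200-day

# Probability ratios quoted in the text.
death_vs_golden = p_death / p_golden    # about 4.5
golden_vs_base = p_golden / base_rate   # a bit under one half
death_vs_base = p_death / base_rate     # a bit over two

# In a logistic regression with a single 0/1 "death cross" input, the
# intercept is the log-odds in the golden-cross state and the coefficient
# is the change in log-odds when the indicator flips.
intercept = logit(p_golden)
coefficient = logit(p_death) - logit(p_golden)
odds_ratio = math.exp(coefficient)

# If all three figures come from the same sample, the base rate is a
# weighted average of the two conditional probabilities, which pins down
# the share of days spent in the death-cross state.
share_death_cross = (base_rate - p_golden) / (p_death - p_golden)

print(f"death vs golden: {death_vs_golden:.2f}x")
print(f"golden vs base:  {golden_vs_base:.2f}x")
print(f"death vs base:   {death_vs_base:.2f}x")
print(f"intercept {intercept:.3f}, coefficient {coefficient:.3f}, odds ratio {odds_ratio:.2f}")
print(f"implied share of death-cross days: {share_death_cross:.1%}")
```

Note that the odds ratio is larger than the probability ratio. The two are different quantities, and logistic regression coefficients are expressed in log-odds, not in probabilities.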

This is a meaningful result, though David is careful about what it doesn’t tell you. A 36% probability of a bear market still means there’s a 64% chance you’re not in one. And the number of truly independent samples in 120 years of data is smaller than it appears, because bear markets have high serial correlation. Once you’re in a bear market state, the next day’s reading is highly predictable from today’s. The actual number of independent bull-to-bear or bear-to-bull transitions in the data set is modest.

Does the magnitude of the spread add information?

The follow-up question is natural: if the sign of the 50-200 spread tells you the direction, does the size of the spread tell you anything additional about probability?

David tested this. The degree of separation does shift the probability somewhat: a wide gap between the two moving averages moves the estimate further from the base rate than a narrow gap does. But the additional information beyond the simple above/below reading is limited. In David’s words: “Most of the information is: is it above or is it below?”

He did find one refinement that adds value: incorporating the 65-day slope of the oscillator. Whether the spread is increasing or decreasing adds a modest amount of information on top of the direction alone. It’s not a large effect, but it’s measurable.
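
As a rough illustration of the three inputs discussed here (sign, size, and slope of the spread), the sketch below computes them from a list of daily closes. Several details are assumptions and not taken from the episode: the spread is expressed as a fraction of the 200-day average, the "65-day slope" is taken to be a least-squares slope over the last 65 spread values, and the function names are hypothetical.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = [None] * len(values)
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        if i >= window - 1:
            out[i] = running / window
    return out

def ols_slope(ys):
    """Least-squares slope of ys regressed on 0, 1, ..., n-1."""
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def spread_features(closes, fast=50, slow=200, slope_window=65):
    """Per-day dict of spread features, or None while history is too short."""
    fast_ma = sma(closes, fast)
    slow_ma = sma(closes, slow)
    # Assumption: normalise the spread by the slow average.
    spread = [None if s is None else (f - s) / s
              for f, s in zip(fast_ma, slow_ma)]
    rows = []
    for i in range(len(closes)):
        window = spread[max(0, i - slope_window + 1): i + 1]
        if len(window) < slope_window or window[0] is None:
            rows.append(None)
            continue
        rows.append({
            "below": 1 if spread[i] < 0 else 0,  # death-cross side
            "spread": spread[i],                 # size of the gap
            "slope": ols_slope(window),          # is the gap widening or narrowing?
        })
    return rows

# Synthetic price paths, for illustration only.
rising = [100.0 + i for i in range(300)]
falling = [400.0 - i for i in range(300)]
rising_rows = spread_features(rising)
falling_rows = spread_features(falling)
print(rising_rows[-1])
print(falling_rows[-1])
```

Each non-empty row could then serve as one observation in a logistic regression, with the hindsight bull/bear label as the target.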

Testing 200 indicators: what else works

After establishing the 50-200 moving average spread as a baseline, David assembled roughly 200 different indicators from his software library and ran them through the logistic regression routine. Two findings stand out.

The first is the Follow-Through Index (FTI) indicator, developed by a researcher named Govinda Khalsa and published in a now-obscure 1985 monograph. David found the monograph through an ad in Barron’s and may have one of the only existing copies. The indicator uses signal processing mathematics to measure the degree of follow-through in a market trend relative to counter-trend noise. Despite its complexity in derivation, it outperformed many simpler alternatives in the logistic regression test. David found it worked well in combination with the 50-200 spread, adding meaningful incremental information.

The second is the Welles Wilder RSI, tested across a range of lookback periods from approximately 35 days to 250 days. RSI showed up as useful in the bear market detection context, though the specific best lookback varied. The general finding was that longer-period RSI values (reading overall trend strength rather than short-term overbought/oversold conditions) had more predictive value for market state classification than shorter periods.
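
For reference, here is a minimal sketch of RSI using Wilder's smoothing, written so the lookback is a parameter. The function names are hypothetical, and returning 100 when there are no losses in the window is a common convention, not something specified in the episode.

```python
def rsi_value(avg_gain, avg_loss):
    """Convert smoothed average gain and loss into an RSI reading."""
    if avg_loss == 0:
        return 100.0  # convention when the window contains no losses
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

def wilder_rsi(closes, period=14):
    """RSI with Wilder's smoothing; the first `period` entries are None."""
    if len(closes) <= period:
        return [None] * len(closes)
    gains, losses = [], []
    for prev, cur in zip(closes, closes[1:]):
        change = cur - prev
        gains.append(max(change, 0.0))
        losses.append(max(-change, 0.0))
    # Seed with a simple average of the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    rsi = [None] * period
    rsi.append(rsi_value(avg_gain, avg_loss))
    # Wilder's recursive smoothing for every later change.
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
        rsi.append(rsi_value(avg_gain, avg_loss))
    return rsi

# Synthetic series, for illustration only.
rising = [100.0 + i for i in range(60)]
falling = [100.0 - i for i in range(60)]
choppy = [100.0 + (i % 2) for i in range(20)]

rsi_rising = wilder_rsi(rising, period=35)
rsi_falling = wilder_rsi(falling, period=35)
rsi_choppy = wilder_rsi(choppy, period=4)
print(rsi_rising[-1], rsi_falling[-1], rsi_choppy[4])
```

A lookback such as `period=250` needs more than 250 closes before it produces a reading. At those lengths the indicator behaves more like a slow trend-strength gauge than an overbought/oversold oscillator.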

The data mining problem and why it matters here

Testing 200 indicators raises an obvious concern. If you test enough indicators, some will appear to work just through random chance. David has written extensively about this in Evidence-Based Technical Analysis, and his research on market state indicators has to grapple with the same issue.

His approach: use very long data series (120 years for the Dow, back to 1928 for the S&P 500) to maximize the number of observations. Require that indicators show meaningful results across both data sets, not just one. Apply corrections for multiple hypothesis testing where possible. Be explicit about the distinction between in-sample discovery and out-of-sample validation.
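
The episode does not say which multiple-testing correction David applies, so the sketch below uses the Holm-Bonferroni procedure purely as a stand-in to show the idea. The p-values are invented, and `holm_bonferroni` is a hypothetical helper name.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Returns a list of booleans, True where the null hypothesis
    ("this indicator carries no information") is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value faces alpha/m, the next alpha/(m-1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

alpha = 0.05
# Made-up p-values for 200 indicators: four look "significant" in isolation.
p_values = [0.0001, 0.0004, 0.03, 0.04] + [0.5] * 196

# With 200 pure-noise indicators, about this many would pass by luck alone.
expected_false_positives = len(p_values) * alpha

naive_hits = sum(1 for p in p_values if p <= alpha)
corrected_hits = sum(holm_bonferroni(p_values, alpha))

print(f"expected false positives from noise alone: {expected_false_positives:.0f}")
print(f"indicators passing the naive 5% test: {naive_hits}")
print(f"indicators surviving Holm-Bonferroni:  {corrected_hits}")
```

Procedures like this assume each p-value is itself valid. With serially correlated daily data that assumption is shaky, which is one more reason to lean on long histories and a second data set.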

He acknowledges the serial correlation problem as a genuine limit on statistical inference. When a bear market lasts several years, the daily readings within that period are not independent observations. The effective sample size is closer to the number of distinct bull/bear transitions, which is far smaller than the number of daily observations in 120 years. This limits how confident you can be in any single indicator’s results.
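
A quick way to see the gap between daily observations and distinct regimes is to count the runs in a labelled state series. The series below is synthetic and the helper name `count_regimes` is hypothetical. It illustrates the counting only, not the actual Dow history.

```python
def count_regimes(states):
    """Return (number of daily observations, number of distinct regimes).

    states: sequence of 0/1 labels, 1 = bear-market day.
    A new regime starts every time the label changes.
    """
    if not states:
        return 0, 0
    regimes = 1
    for prev, cur in zip(states, states[1:]):
        if cur != prev:
            regimes += 1
    return len(states), regimes

# Synthetic example: long bull runs interrupted by two bear markets.
states = [0] * 1000 + [1] * 300 + [0] * 1500 + [1] * 200
n_days, n_regimes = count_regimes(states)
n_transitions = n_regimes - 1

print(f"{n_days} daily observations, {n_regimes} regimes, {n_transitions} transitions")
```

Three thousand daily rows here contain only three transitions, which is why a fit on daily data can look far more certain than it really is.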

What makes machine learning useful for market state detection

David’s career trajectory is worth understanding here. He got interested in machine learning in 1976 after a conversation with a retired Boeing engineer who talked about “statistical pattern recognition.” He spent two decades building pattern recognition software before the term “machine learning” was widely used.

His argument for machine learning in market analysis is straightforward: human pattern recognition is encumbered by cognitive biases and limited by how many variables we can hold in mind simultaneously. Machines can systematically search much larger spaces. Logistic regression is one of the simplest machine learning methods, but its age doesn’t reduce its validity. It’s been around long enough to understand its failure modes, which is a genuine advantage over newer, more opaque techniques.

Signal filtering, which was Raden’s commercial business for years, is a related application: taking the trade signals from an existing system and building a predictive model that estimates, at the time of each signal, whether it’s likely to succeed or fail. That model can then be used to filter out low-probability signals. The same logistic regression framework applies.

Practical implications for systematic traders

What does this research mean in practice? A few things worth carrying away:

  • The 50-200 moving average spread is a statistically grounded market state indicator, not just a popular signal. The four-fold difference in bear market probability above and below the threshold is meaningful over 120 years of data.
  • Testing indicators for market state classification is methodologically different from testing them for return prediction. The logistic regression approach gives you probability estimates, which are more honest than binary predictions.
  • Data mining over long histories doesn’t eliminate the multiple comparison problem, but it makes results more credible than testing over short periods.
  • The most widely cited technical indicators (50-200 MA, RSI) have more empirical support for market state detection than for trade entry/exit timing. The distinction matters.
  • Adding multiple well-validated indicators to a logistic regression model can improve classification accuracy, but the marginal benefit of each additional indicator decreases. Two or three good indicators often do most of the work.

Want to learn more about evidence-based trading system development? Subscribe to the Better System Trader podcast for weekly interviews with the world’s top systematic traders and quantitative researchers.
