Statistical Significance: The Antidote to Anecdote

Research is built on a foundation of statistics. Not just reams of numbers, but rigorous exploration by hypothesis, analysis and synthesis. Professional and academic journals are packed with math and statistics backing up findings and conclusions.

Loose Talk

Yet outside the realm of deep quants and hedge funds, statistical rigor gets virtually forgotten. Market opinion and strategy (and a lot of decision making) too often ignore even the most elementary concepts from freshman Statistics. Consider these press quotations:

"...The current environment seems similar to the 1966-82 period. (...) You could end up going sideways for a while."

"These stocks still are way ahead of themselves. I am not at all sure we have seen the bottom."

"The amount of debt taken on to buy stocks increased 8%, the biggest jump since the market's peak--a sign that investors may be getting irrationally exuberant again."

"Wednesdays have been tough on the bulls this year, with 13 up and 19 down. ...Particularly...on the third Wednesday of the month, with 2 up and 5 down...."

"Since 1926, after five-year stretches where equities trailed bonds by (6 percentage points), stocks have averaged nearly 14% annual gain, compared with 4% with bonds."

Sometimes such statements rest on some unstated rigor. But not often. The first two are bald assertions, based on...based on what? Some personal vision? The fact that these early 2003 envisionings failed to occur makes them easy targets, but that is not the problem. The problem (we believe) is that these pronouncements are without analytical foundation.

The third entry at least introduces an interesting factoid about margin debt. So we could conceivably test how often 8% margin increases lead to problems. But the warning itself gives not a shred of a nexus from fact to conclusion. How common is an 8% margin jump? When it happens, what's the average outcome? What's the range of outcomes? Over what time horizon? If margin jumps are indeed signs of "exuberance," what is the empirical implication?
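The test the warning skips is straightforward to sketch. Given a margin-debt series and subsequent stock returns, we can locate every 8%+ jump and summarize what actually followed. The numbers below are hypothetical illustration data, not actual NYSE margin figures:

```python
# Sketch of the empirical test the quote omits: find periods where margin
# debt jumped 8% or more, then summarize forward stock returns afterward.
# Both series here are hypothetical illustration data.
margin_debt = [100, 104, 113, 112, 121, 122, 124, 135, 133, 140]   # monthly levels
fwd_returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.05,
               -0.02, -0.06, 0.02, 0.01]                           # return over the following year

# Indexes where margin debt rose 8%+ versus the prior month
jumps = [i for i in range(1, len(margin_debt))
         if margin_debt[i] / margin_debt[i - 1] - 1 >= 0.08]

outcomes = [fwd_returns[i] for i in jumps]
print(f"{len(jumps)} jumps of 8%+")
print(f"average forward return: {sum(outcomes) / len(outcomes):.1%}")
print(f"range: {min(outcomes):.1%} to {max(outcomes):.1%}")
```

With real data, the frequency of jumps, the average outcome, and the spread of outcomes would answer exactly the questions the quoted warning leaves open.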

Or consider the fourth entry, about Wednesdays: with no overall daily up/down mix, and no context about other days of the week, it's impossible to assess the 13/19 finding...much less draw any usable inference.
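Even granting the quote a simple 50/50 baseline (the right baseline would be the overall daily up/down mix, which the quote doesn't supply), an exact binomial test shows 13 up and 19 down is nothing unusual:

```python
from math import comb

# Two-sided exact binomial test: are 13 up / 19 down Wednesdays surprising
# under a 50/50 baseline? (0.5 is a stand-in assumption; the proper baseline
# would be the market's overall daily up/down mix.)
n, ups = 32, 13
p_low = sum(comb(n, k) for k in range(ups + 1)) / 2 ** n  # P(13 or fewer ups)
p_two_sided = min(1.0, 2 * p_low)
print(f"p-value: {p_two_sided:.2f}")  # well above 0.05: consistent with chance
```

A split that coin flips would produce this often gives no basis for a "tough Wednesdays" story, let alone the 2-up, 5-down third-Wednesday subsample.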

That last item (the bond-stock performance gap) comes closest to providing any basis for evaluation. At least we have a specific time span and specific returns to consider. But where is the comparison data for stock and bond returns on all other occasions? A quick lookup suggests that stocks and bonds have delivered about those stated returns from 1926 overall. So what? And what about variability? Even accepting the insinuation that the historical 14% average result (following such stock-bond gaps) is discernibly above-average, how widely did results vary around that average? And how many such 6-point spread occasions have there been? A finding based on two occasions would mean far less than one based on 22.
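Sample size matters because the confidence interval around an average shrinks with the square root of the number of observations. Assuming (hypothetically) a 20% standard deviation of annual stock returns around that 14% average:

```python
from math import sqrt

def ci95(mean, sd, n):
    """Approximate 95% confidence interval for a mean from n observations."""
    se = sd / sqrt(n)  # standard error of the mean
    return mean - 1.96 * se, mean + 1.96 * se

# Hypothetical assumption: 14% average annual return, 20% standard deviation
for n in (2, 22):
    lo, hi = ci95(0.14, 0.20, 2) if n == 2 else ci95(0.14, 0.20, 22)
    print(f"n={n:2d}: 95% CI roughly {lo:+.0%} to {hi:+.0%}")
```

With only two occasions the interval spans from deeply negative to wildly positive territory; with 22 it tightens enough to say something. That is the difference the quote never discloses.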

Significance Testing

"Statistical significance" is the core criterion for analytical validity. It's the basic gold standard for empirical study in any research subject. And indeed, formal research is immersed in significance findings. But in daily strategy and practice--where perhaps it's needed most--significance is all but unknown.

Investors' tendency to conjure dubious connections and patterns is well documented. The danger is nearly universal. Although there can be no guarantees against illusion, objective significance criteria can screen out a large measure of doubtful (or outright spurious) analysis. Unless a relationship is sizeable, consistent and numerous, it simply won't pass.

Side Benefit

Reliance on significance can also sharpen our perception. Always assessing How much? How many? How consistent? steers us away from wishful formulations of imagined futures. We see the flow of market data as never definitive, but a streaming sample from the larger universe of data that might have emerged. Each sampling differs from the others, and each sampling provides an update about the state of the universe. And always the question remains: How confident are we that this is not mere chance?

Keeping It Simple

This is not a plea for arcane complexity. Where deep math is needed, it's already fully employed. But in the course of daily decisions, quite substantial assets are often deployed with little more than some generalized notions and anecdotal metaphors.

Not everyone is or should be a math geek. But if condition "A" is said to favor outcome "B," anyone should ask for probabilities. When an analyst projects a stock at $35, shouldn't we expect a confidence interval? When portfolio performance is said to beat some benchmark, shouldn't we be told if the difference is significant?
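Testing whether outperformance beats chance requires nothing arcane. A minimal sketch, using hypothetical monthly excess returns (portfolio minus benchmark): a one-sample t-statistic against zero.

```python
from math import sqrt
from statistics import mean, stdev

# Is "we beat the benchmark" statistically significant? One-sample t-test
# of monthly excess returns (portfolio minus benchmark) against zero.
# The series below is hypothetical illustration data.
excess = [0.004, -0.002, 0.007, 0.001, -0.003, 0.006,
          0.002, -0.001, 0.005, 0.003, -0.004, 0.008]
n = len(excess)
t_stat = mean(excess) / (stdev(excess) / sqrt(n))
print(f"t-statistic: {t_stat:.2f}")
# With n-1 = 11 degrees of freedom, |t| must exceed about 2.20 for
# significance at the 5% level.
```

Here, despite a positive average excess return, the t-statistic falls short of the 5% cutoff: exactly the kind of disclosure a performance claim should carry.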