Author: repoonsatad
Subject: Evaluating the discovery period: A simple method
Date: 5/22/2000 4:18 AM
Recommendations: 37
Given that Dow Dividend Approaches (DDAs) don't outperform post-discovery (the only clean control evidence available) or pre-discovery (slightly contaminated but still somewhat useful evidence), yet some people still believe in the Foolish Four, we need to consider the discovery period as well. Of course, the discovery-period evidence widely presented at this site does not show that DDAs outperform, because the inference drawn from it is incorrect.

Suppose the stellar performances of Dow Dividend Approaches (DDAs), including the Foolish Four (FF), during their discovery periods were nothing but statistical flukes with respect to dividend yield. In other words, the strategies certainly did outperform, but the outperformance had nothing to do with dividend yield. More specifically, certain portfolios of 4, 5, or 10 stocks outperformed the market and, by random chance, had high dividend yields. Any outperforming portfolio has got to have a high something; by searching through sufficiently many factors you'll find that something, and in this case it's dividend yield.

Given the above data-mining explanation, it's only the 4, 5, or 10 stocks actually selected by the strategy that outperform. The DDAs (as far as I understand) were developed using earliest-January data (I'll call this Jan 1 in the following) with annual rebalancing. Consider any given strategy in the following. What would we expect if we applied the strategy starting one calendar day after the day on which it was developed, again with annual rebalancing? For example, if the strategy was developed using Jan 1 – Jan 1 holding periods, what would we expect from the same strategy implemented Jan 2 – Jan 2? Well, if the strategy picked random stocks, there would be no reason to expect Jan 2 – Jan 2 to outperform just because Jan 1 – Jan 1 does, because there is little chance that the same stocks would be selected. In contrast, if the ranking of stocks used by the strategy didn't change on a day-to-day basis, we would likely end up selecting the same portfolio of stocks Jan 2 – Jan 2 as Jan 1 – Jan 1, and because the holding periods almost completely overlap we would get virtually identical returns. Of course, most strategies, including the DDAs, are somewhere in between these two extremes: the ranking of stocks changes over time, but not every day.

Suppose we select our Jan 1 – Jan 1 DDA (say it picks four stocks) and wait patiently for some days (say 9, i.e., until Jan 10) until the rankings by dividend yield (in some cases also by price) change; more precisely, until one stock in the DDA is replaced by another. Given the above data-mining explanation, how would we expect this portfolio to perform relative to the Jan 1 – Jan 1 portfolio? Well, there is still considerable overlap in stocks (3 out of 4) between the Jan 1 – Jan 1 and the Jan 10 – Jan 10 implementations, to go with considerable overlap in time (356 out of 365 days). However, because there is no causality between dividend yield and subsequent returns, the new stock, while picked on dividend yield, is basically picked at random. Therefore, we expect it to perform like the average stock in the universe we select from. The Jan 10 – Jan 10 portfolio thus blends three stocks with known stellar performance with one stock of average performance, so it will outperform, but clearly less than the Jan 1 – Jan 1 portfolio. If we could wait long enough to see all stocks in the Jan 1 – Jan 1 portfolio replaced by new, average-performing stocks, the new implementation of the strategy would not beat the universe it picks from.
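To make the blending arithmetic concrete, here is a minimal sketch; the 20% and 10% return figures are assumptions picked purely for illustration, not estimates from any data:

def blended_return(n_carried, n_total, r_mined, r_universe):
    # Expected return of a portfolio mixing carried-over data-mined
    # winners with newly selected (effectively random) stocks.
    w = n_carried / n_total
    return w * r_mined + (1 - w) * r_universe

r_mined = 20.0     # assumed return (%) of the data-mined Jan 1 picks
r_universe = 10.0  # assumed return (%) of the average stock in the universe

# Jan 10 - Jan 10: 3 of the 4 mined winners are still held
print(blended_return(3, 4, r_mined, r_universe))  # 17.5 -- outperforms, but less
# Complete turnover: the portfolio reverts to the universe return
print(blended_return(0, 4, r_mined, r_universe))  # 10.0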

From the above discussion, given that the Jan 1 – Jan 1 implementation of a DDA strategy only yields superior performance by chance, we expect its outperformance to fade slowly as we implement the strategy on increasingly later calendar days. This effect is the cause of the high autocorrelation between the annual returns of consecutive monthly start dates for DDAs, presented for example (for the BTD5) in: http://boards.fool.com/Message.asp?id=1030001007445003&sort=id At some point, however, we are so late in the calendar year that we start picking up stocks that will be selected at the next scheduled Jan 1 rebalancing. Because those stocks also outperform by random chance, we will be replacing average-performing stocks with better-performing stocks. This causes our returns to increase again until they peak at Jan 1, only to decrease again during the next calendar year. On average, the returns will reach their low right in the middle of the Jan 1 to Jan 1 range, i.e., for the Jul 1 – Jul 1 holding period. In summary, if the outperformance of our strategy is a 100% spurious effect of data mining, we expect to see the following intra-year seasonality in returns for different annual holding periods (monthly start dates only):

Exhibit

Month Return

JAN *********************
FEB ***************
MAR ***********
APR ********
MAY ******
JUN ****
JUL ***
AUG ****
SEP ******
OCT ********
NOV ***********
DEC ***************

The above pattern should look more like a "(" than a "<". Also, the Jul 1 – Jul 1 return will only fall all the way to the return on the universe the stocks are picked from when the Jul 1 – Jul 1 portfolio contains no stocks from the preceding Jan 1 – Jan 1 holding period and none from the subsequent one. In general, all possible intra-year annual holding periods will appear to outperform the market even if it's only the four original stocks selected in the Jan 1 – Jan 1 screen that outperform.
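As a toy illustration of why a convex decay in the expected overlap produces a "(" rather than a "<", consider the following sketch; the quadratic decay and the 20%/10% return levels are my assumptions for illustration only, not anything estimated from DDA data:

# Toy model of the intra-year pattern under the pure data-mining null.
# Assumptions (mine, for illustration): mined Jan 1 picks return 20%,
# the universe returns 10%, and the expected overlap with the nearer
# mined list decays quadratically (convexly) toward zero at mid-year.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
r_universe, r_mined = 10.0, 20.0

for m, name in enumerate(MONTHS):
    dist = min(m, 12 - m) / 6.0    # 0 at a Jan start, 1 at a Jul start
    overlap = (1.0 - dist) ** 2    # convex decay of mined-stock overlap
    r = overlap * r_mined + (1.0 - overlap) * r_universe
    print(f"{name} {'*' * round(r)}")

Because the expected overlap sits near zero for a stretch around mid-year, the bars bottom out in a flat region around July rather than coming to a sharp point.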

Interestingly, we see exactly this intra-year effect for the first Foolish Four (F4.0) and for the current one (RP4): http://www.fool.com/ddow/1998/ddow980807.htm Of course, the shape is never perfect in a limited sample. However, the more the Jan 1 – Jan 1 returns outperform the market, the more the "(" effect dominates the sampling variation, and the more perfect the shape is. This is why the RP4 has a more "("-like shape than the F4.0. Of course, a similar "(" shape may arise from a genuine seasonality in the strategy that (by chance) happens to look like the seasonality we expect from a non-effective screen. In other words, although the strategy picks winners all year, the winners picked for the Jan 1 – Jan 1 holding period outperform the most. Clearly, in that case the strategy should not rely only on the four stocks picked for the originally data-mined Jan 1 – Jan 1 holding period, but should also display an ability to pick other winners at different times during the year.

It follows from the above discussion that either:

* DDAs worked during their discovery periods, in the sense that dividend yield systematically predicted future returns, if they display an ability to pick outperforming stocks when applied on days other than the original Jan 1 used to develop them; or

* DDAs only worked during their discovery periods because the high-yielding Jan 1 stocks by chance outperformed when selected on that date, while new stocks selected on other calendar dates did not outperform. In this case, dividend yield did not predict or "cause" future outperformance; rather, chance was the reason.

The above was Qwerty1999's reason for suggesting that we use stocks selected outside the Jan 1 – Jan 1 start dates as a proxy for out-of-sample evidence during the discovery period, for example in this message: http://boards.fool.com/Message.asp?id=1030001007646000&sort=username With time I have learned to like the idea, although I would have some difficulty interpreting a small outperformance (say one or two percent) for stocks picked out-of-sample in this way during the discovery period. However, if the picked stocks do not outperform, the conclusion is clear. I believe somebody (not me, that is) should perform a detailed study of the RP4 using the monthly DDA database. The problem with the RP4 is that the 1961-1996 (or perhaps 1995?) discovery period leaves very little evidence before and after. With the above-described method there would be a large body of pseudo out-of-sample data available during the discovery period.

Finally, I'd like to illustrate the method using some data for the original Foolish Four (F4.0) that's readily available: http://marriottschool.byu.edu/emp/grm/Foolish_Four.html

The F4.0 was developed using 1973-1993 data. The above source lists the F4.0 picks for every year for both January and July starts. The July starts run from July of the previous year to July of the current year, so the first July start that lies entirely inside the discovery period is 1974 and the last is 1993. For each July start I compared the return on the stocks that were also on the data-mined previous-January list of picks to the return on the new stocks that were not on that list. I got the following table.

YEAR    JAN  #    JUL  #  ALL F4   F4.0  DOW30

74       --  0  -16.2  4   -16.2  -19.3   -2.4
75     41.5  2   -1.1  2    20.2   21.5   23.6
76     27.5  1   20.4  3    22.2   23.3   23.1
77     13.0  2   13.3  2    13.1   11.8    1.7
78       --  0   -8.2  4    -8.2   -8.7   -7.9
79     22.9  3    4.1  1    18.2   14.7    7.5
80     13.2  3   13.7  1    13.3   13.4   10.7
81     27.9  3   17.0  1    25.2   23.6   19.6
82      2.1  4     --  0     2.1    9.1  -14.5
83     41.1  2   72.5  2    56.8   54.0   60.7
84      4.9  3    1.6  1     4.1    4.2   -2.7
85     29.2  2   28.2  2    28.7   31.4   26.7
86     45.4  4     --  0    45.4   39.8   37.7
87     75.2  2   30.6  2    52.9   51.1   32.6
88    -18.2  1  -13.9  3   -14.9  -15.6   -6.0
89     20.9  3   24.6  1    21.8   24.5   18.2
90      1.2  2   12.0  2     6.6    5.3   18.7
91      6.0  2    2.4  2     4.2    7.1    5.8
92     16.1  3    9.4  1    14.4    5.3   18.6
93     41.5  2   17.6  2    29.6   31.4   13.5

AVE   22.87 2.2  12.67 1.8  16.98  16.40  14.26

All returns are in percent, measured July to July from the previous year to the current one; "--" indicates that no stocks fell in that group that year. "JAN" is the Jul-Jul return of the stocks that were also on the January 1 F4.0 list, and the following "#" is the number of such stocks. "JUL" is the Jul-Jul return of the stocks independently selected by the July screen, i.e., those not also in the previous January F4.0, again followed by the number of stocks. "ALL F4" is the equally weighted average Jul-Jul return of the F4.0 stocks, "F4.0" is the true F4.0 return, which doubles up on the second-ranked stock, and "DOW30" is the equally weighted average return of the DJIA stocks, i.e., the proper benchmark. Finally, the last row reports average returns (not CAGRs, as there are some holes in the return series).
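For concreteness, a row of this table could be computed along the following lines; the helper function and the tickers/returns in the example are hypothetical, not taken from the actual F4.0 lists:

def split_july_portfolio(jan_picks, jul_picks, jul_jul_returns):
    # Split the July picks into stocks carried over from the previous
    # January list and newly selected stocks, returning the equally
    # weighted Jul-Jul return and the stock count for each group.
    carried = [s for s in jul_picks if s in set(jan_picks)]
    fresh = [s for s in jul_picks if s not in set(jan_picks)]
    avg = lambda ss: sum(jul_jul_returns[s] for s in ss) / len(ss) if ss else None
    return avg(carried), len(carried), avg(fresh), len(fresh)

# Hypothetical example (tickers and returns are made up):
jan = ["T", "GM", "X", "KO"]
jul = ["T", "GM", "X", "S"]
rets = {"T": 25.0, "GM": 18.0, "X": 30.0, "S": 5.0}
print(split_july_portfolio(jan, jul, rets))  # (24.33..., 3, 5.0, 1)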

As you can see, on average 1.8 of the 4 stocks on the January F4.0 lists have been replaced six months later. Therefore, by construction, the July F4.0 will outperform its benchmark, as evidenced by the average 2.14% outperformance in the last two columns. There is no evidence that doubling up on the second-ranked stock helps, as a comparison of the "ALL F4" and "F4.0" columns shows, and in fact the top-ranked stock doesn't underperform (results not shown). I will therefore compare equally weighted portfolios.
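The two weighting schemes being compared amount to the following minimal sketch; the return vector is a made-up example, and the 2:1:1:1 weights are my reading of the "doubles up on the second-ranked stock" description above:

def equal_weight(r):
    # "ALL F4": equally weighted average of the four picks
    return sum(r) / len(r)

def f4_weight(r):
    # "F4.0": double weight on the second-ranked stock (2:1:1:1)
    w = [1, 2, 1, 1]
    return sum(wi * ri for wi, ri in zip(w, r)) / sum(w)

# Hypothetical returns (%) for the four picks, in rank order:
r = [8.0, 12.0, -3.0, 20.0]
print(equal_weight(r), f4_weight(r))  # 9.25 vs 9.8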

I'm not up to testing formal hypotheses here, merely illustrating a method, but it's striking that the new stocks the F4.0 picked for July starts during the discovery period returned roughly 1.5% less than their benchmark (12.67% vs. 14.26%). For all practical purposes this is average performance, consistent with the notion that there is no causal relation between F4.0 picks and subsequent returns. Only the four stocks picked for Jan 1 starts outperform, consistent with the relation between F4.0 picks and subsequent outperformance being spurious.

Incidentally, the 22.87% for the stocks on the January list that reappear on the July list is quite good, and it's a feasible strategy. May I suggest that if you rebalance your Foolish Four in July, you buy an equally weighted portfolio of the stocks that also appeared on the January 1 list? Just kidding, of course.

The above was just a snapshot of the possible analyses. One weakness of the data I had is that the proper benchmark for the new stocks in July starts is a "DOW26," because only 26 of the Dow stocks are available. I also tried eliminating the stocks that appeared on the subsequent January 1 list. As a consequence, only 0.75 new stocks per year were selected on average, with an average return of 6.81%. I attribute this to an extremely small sample and possibly some poor performance of the DOW22-26 that then becomes the proper benchmark.

I predict that the RP4, too, only picked four outperforming stocks every year during its discovery period, namely the data-mined Jan 1 picks, and that the average remaining stock picked during the year had average performance. Therefore, all the evidence suggests that data-mining biases explain everything here.

Datasnooper.