Hi Fools;

While experimenting with some of the BTD models, I discovered a slight modification that outperforms almost every model class we've considered thus far in Q1 starting periods. Before I go on, and before dwade and others start shooting methodological arrows at me, let me say that this may appear to be shameless data mining, and in a sense it is. However, I think there may be a way to address that issue at least to some extent, which I'll get into after the results are presented.

As we all know, the standard BTD models sort first for high yield, and then for low price among the top 10 high yield stocks. It turns out that a simple variation on this theme enhances returns considerably. If we examine the returns for the individual stocks in the HY sort before performing the LP sort, it becomes apparent that the highest yielding stock (regardless of price) gives the most 'bang for the buck' of any of the HY10 stocks, provided we use the UV procedure of skipping high yield/low price (HY/LP) combinations. This variation is the basis for the models described below.

My friend MontanaFool suggested the name Empirical Yield (EY) for these models, since we are letting the numbers tell us what works and what doesn't. The procedure in a nutshell is this: always buy the highest yielding stock regardless of price (except when it is also the lowest priced), and then fill out the portfolio with the usual BTD sort.

The mechanics of constructing the models are as follows. Beginning with the 30 Dow stocks, sort (descending) by yield. Among the top 10 high yield stocks, always leave the highest yield stock (HY1) in first position, and then sort the remaining nine stocks (ascending) by price. If there is a HY/LP combination, skip the HY/LP stock (HY1) and move everything down one position. EYn then buys equal dollar amounts of the first 'n' stocks.
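
For anyone who wants to check the mechanics by hand, here is a minimal Python sketch of the EY sort. The `Stock` type and the function names are made up for illustration, and I am assuming that a HY/LP combination means HY1 is strictly the lowest priced of the ten high yield stocks (how ties should be handled isn't spelled out here).

```python
from typing import List, NamedTuple


class Stock(NamedTuple):
    ticker: str
    dividend_yield: float  # e.g. 0.045 for a 4.5% yield
    price: float


def ey_ranking(dow30: List[Stock]) -> List[Stock]:
    """Return the EY buy order built from the ten highest yielding stocks."""
    # Step 1: sort descending by yield and keep the top ten (the HY10).
    hy10 = sorted(dow30, key=lambda s: s.dividend_yield, reverse=True)[:10]
    hy1, rest = hy10[0], hy10[1:]

    # Step 2: leave HY1 aside and sort the remaining nine ascending by price.
    rest_by_price = sorted(rest, key=lambda s: s.price)

    # Step 3: HY/LP check (assumption: strict comparison, ties keep HY1).
    # If HY1 is also the lowest priced, skip it and move everything down one.
    if hy1.price < rest_by_price[0].price:
        return rest_by_price
    return [hy1] + rest_by_price


def ey_portfolio(dow30: List[Stock], n: int) -> List[Stock]:
    """EYn holds the first n stocks of the EY ranking."""
    return ey_ranking(dow30)[:n]
```

Calling `ey_portfolio(stocks, 4)` on a list of 30 `Stock` records would give the EY4 picks under these assumptions.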

**PLEASE NOTE: the 'juiced' versions of the EY models are denoted EYn+, but they double the allocation of the first stock only!**
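
To make the difference between EYn and EYn+ concrete, here is a small sketch of the dollar weights. I am assuming that "double the allocation" means the first stock gets two equal shares and every other stock gets one, so EY4+ would be 40/20/20/20. The function name is hypothetical.

```python
from typing import List


def ey_weights(n: int, juiced: bool = False) -> List[float]:
    """Dollar weights for EYn (equal) or EYn+ (first stock doubled)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    shares = [1.0] * n
    if juiced:
        # Assumption: only the first position is doubled, per the note above.
        shares[0] = 2.0
    total = sum(shares)
    return [s / total for s in shares]


def portfolio_multiplier(weights: List[float], multipliers: List[float]) -> float:
    """Weighted return multiplier for one holding period."""
    return sum(w * m for w, m in zip(weights, multipliers))
```

With these weights, a juiced portfolio leans harder on whatever the first stock does, which is in line with the higher standard deviations the EYn+ models show in the Q1 and Q2 tables.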

Below are the returns in multiplier form (the H4Q rows) for these models by starting quarter for four quarter holding periods. Standard deviations and Sharpe ratios are shown below each model's returns. The Sharpe ratio is based on the one year Treasury bill rate, and has been multiplied by 100 and rounded to the nearest integer. For comparison purposes, the returns for similar UV and ERP models are also shown. Fool4 (F4 in the tables) is the 'Old' Fool4, and UV5+ has been replaced by UV5/6+. The DsPc variation of ERP introduced by RayVT is denoted DP below. The sample period is 1965-96 (last portfolio ends in 97:Q4).
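
For readers who want to build the same three rows from their own data, here is a rough sketch. The exact recipe behind the tables isn't given here, so treat the details as my assumptions: an arithmetic average of the yearly multipliers, a sample standard deviation, and a Sharpe ratio computed as the average excess return over the T-bill rate divided by that standard deviation. The input numbers in the usage note are invented.

```python
from statistics import mean, stdev
from typing import List, Tuple


def summarize(multipliers: List[float], tbill_rates: List[float]) -> Tuple[float, float, int]:
    """Return (average multiplier, standard deviation, Sharpe ratio x 100)."""
    if len(multipliers) != len(tbill_rates):
        raise ValueError("need one T-bill rate per holding period")
    # Convert multipliers (1.23 means +23%) to returns, then to excess returns.
    returns = [m - 1.0 for m in multipliers]
    excess = [r - rf for r, rf in zip(returns, tbill_rates)]
    avg_multiplier = mean(multipliers)
    sd = stdev(multipliers)  # assumption: sample (n - 1) standard deviation
    sharpe_x100 = round(100 * mean(excess) / sd)
    return avg_multiplier, sd, sharpe_x100
```

For example, `summarize([1.30, 0.90, 1.25, 1.15], [0.05, 0.06, 0.05, 0.04])` works on four made-up years. Dividing the last value by 100 gives the ratio on its original scale.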

**Start Q1 (1965-96)**

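| Model | EY2 | EY2+ | EY3 | EY3+ | EY4 | EY4+ | EY5 | EY5+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.2290 | 1.2416 | 1.1993 | 1.2169 | 1.1880 | 1.2050 | 1.1677 | 1.1859 |
| StDev | 0.2858 | 0.3345 | 0.2497 | 0.2936 | 0.2273 | 0.2645 | 0.1924 | 0.2264 |
| Sharpe | 68 | 64 | 63 | 62 | 63 | 63 | 61 | 62 |

| Model | UV2 | UV3 | UV3+ | F4 | UV4 | UV4+ | UV5 | UV5/6+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.2038 | 1.1743 | 1.1869 | 1.1846 | 1.1829 | 1.1911 | 1.1647 | 1.1779 |
| StDev | 0.2775 | 0.2462 | 0.2556 | 0.2509 | 0.2194 | 0.2332 | 0.1926 | 0.2034 |
| Sharpe | 60 | 53 | 57 | 57 | 61 | 62 | 59 | 63 |

| Model | DP2 | ERP2 | DP3 | ERP3 | DP4 | ERP4 | DP5 | ERP5 |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1692 | 1.2138 | 1.1817 | 1.1868 | 1.1694 | 1.1948 | 1.1795 | 1.1732 |
| StDev | 0.2954 | 0.2615 | 0.2340 | 0.2321 | 0.2188 | 0.2108 | 0.2030 | 0.1927 |
| Sharpe | 46 | 68 | 59 | 62 | 56 | 71 | 65 | 64 |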

**Start Q2**

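| Model | EY2 | EY2+ | EY3 | EY3+ | EY4 | EY4+ | EY5 | EY5+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1725 | 1.1730 | 1.1479 | 1.1555 | 1.1399 | 1.1486 | 1.1384 | 1.1468 |
| StDev | 0.2585 | 0.2924 | 0.2128 | 0.2460 | 0.1730 | 0.2048 | 0.1619 | 0.1844 |
| Sharpe | 52 | 48 | 48 | 47 | 50 | 49 | 52 | 52 |

| Model | UV2 | UV3 | UV3+ | F4 | UV4 | UV4+ | UV5 | UV5/6+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1439 | 1.1355 | 1.1386 | 1.1365 | 1.1294 | 1.1359 | 1.1392 | 1.1433 |
| StDev | 0.2167 | 0.1816 | 0.1907 | 0.1793 | 0.1579 | 0.1676 | 0.1592 | 0.1642 |
| Sharpe | 45 | 45 | 46 | 47 | 46 | 49 | 53 | 54 |

| Model | DP2 | ERP2 | DP3 | ERP3 | DP4 | ERP4 | DP5 | ERP5 |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1435 | 1.1398 | 1.1591 | 1.1332 | 1.1492 | 1.1456 | 1.1568 | 1.1357 |
| StDev | 0.2528 | 0.2340 | 0.2259 | 0.1867 | 0.1927 | 0.1810 | 0.1822 | 0.1574 |
| Sharpe | 41 | 42 | 51 | 45 | 52 | 53 | 58 | 52 |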

**Start Q3**

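| Model | EY2 | EY2+ | EY3 | EY3+ | EY4 | EY4+ | EY5 | EY5+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1540 | 1.1379 | 1.1347 | 1.1280 | 1.1260 | 1.1227 | 1.1333 | 1.1297 |
| StDev | 0.1981 | 0.1860 | 0.1692 | 0.1616 | 0.1687 | 0.1602 | 0.1752 | 0.1643 |
| Sharpe | 51 | 45 | 47 | 44 | 42 | 41 | 44 | 44 |

| Model | UV2 | UV3 | UV3+ | F4 | UV4 | UV4+ | UV5 | UV5/6+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1495 | 1.1311 | 1.1391 | 1.1260 | 1.1219 | 1.1319 | 1.1396 | 1.1510 |
| StDev | 0.2122 | 0.1935 | 0.1970 | 0.1741 | 0.1747 | 0.1829 | 0.1723 | 0.1756 |
| Sharpe | 48 | 41 | 45 | 41 | 38 | 43 | 49 | 55 |

| Model | DP2 | ERP2 | DP3 | ERP3 | DP4 | ERP4 | DP5 | ERP5 |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1130 | 1.1761 | 1.1572 | 1.1701 | 1.1575 | 1.1434 | 1.1387 | 1.1328 |
| StDev | 0.2091 | 0.2080 | 0.1864 | 0.1830 | 0.1715 | 0.1678 | 0.1600 | 0.1777 |
| Sharpe | 31 | 62 | 57 | 65 | 61 | 53 | 52 | 45 |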

**Start Q4**

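| Model | EY2 | EY2+ | EY3 | EY3+ | EY4 | EY4+ | EY5 | EY5+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1630 | 1.1450 | 1.1537 | 1.1436 | 1.1479 | 1.1419 | 1.1499 | 1.1451 |
| StDev | 0.2576 | 0.2801 | 0.2315 | 0.2495 | 0.2210 | 0.2321 | 0.2208 | 0.2258 |
| Sharpe | 47 | 38 | 47 | 40 | 46 | 41 | 46 | 43 |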

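| Model | UV2 | UV3 | UV3+ | F4 | UV4 | UV4+ | UV5 | UV5/6+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1756 | 1.1587 | 1.1660 | 1.1417 | 1.1479 | 1.1579 | 1.1531 | 1.1571 |
| StDev | 0.2655 | 0.2443 | 0.2499 | 0.2478 | 0.2272 | 0.2358 | 0.2246 | 0.2168 |
| Sharpe | 52 | 48 | 50 | 40 | 45 | 48 | 47 | 50 |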

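| Model | DP2 | ERP2 | DP3 | ERP3 | DP4 | ERP4 | DP5 | ERP5 |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1446 | 1.1516 | 1.1450 | 1.1519 | 1.1486 | 1.1506 | 1.1482 | 1.1412 |
| StDev | 0.2410 | 0.2003 | 0.2190 | 0.2141 | 0.2125 | 0.2002 | 0.2009 | 0.1941 |
| Sharpe | 41 | 51 | 44 | 49 | 47 | 51 | 49 | 47 |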

As with almost all high yield models, the returns start to unravel for starting periods other than Q1. It might be noted, however, that for Q3, which is the worst starting period for almost all of the models, both ERP and DP hold up relatively well.

Turning to Q1, among the two stock models, EY2+ offers by far the best return. That return comes at a price, however, namely higher volatility, which causes a lower Sharpe ratio, about which more shortly. We seldom hear about three stock models because few of them perform very well, but as the numbers show, EY3+ does quite well. It offers an expected return of more than 21% with a bit more diversification than the two stock models, and it outperforms every two or three stock model outside the class, although the difference between EY3+ and ERP2 is insignificant. Among the four stock models, EY4 outperforms F4 and UV4, but lags behind ERP4. EY4+, however, outperforms even ERP4, albeit with more volatility (and hence a lower Sharpe ratio). Among the five stock models, EY5+ shows the best returns of the lot, with DP5 and UV5/6+ running close behind. It is interesting to note that in Q1 the DP models in general do not do as well as their ERP counterparts, but DP5 is an exception to that rule.

Elan and others have remarked that the Sharpe ratios may be, if not irrelevant, certainly not the be-all and end-all for our purposes, and several of us share that opinion. If a higher return is the goal, and we have to pay a price of higher volatility to get a higher expected return, so be it. Up to a point, at least, volatility is really only an issue if one faces the prospect of having to sell to raise cash in a down market, and that should not be the case if the planning has been done correctly up front. The differences in the Sharpe ratios are also exaggerated because they have been multiplied by 100. If they are calculated as they were in Sharpe's original paper (divide the numbers above by 100), the differences are considerably less dramatic. The point of all this is that judging models by Sharpe ratio alone can be rather misleading, and perhaps that should be the *last* criterion used in evaluating the decision rather than the first.

The Fools have recently begun to show results from 1971 forward, excluding the 'flat' years of the 1960s. Let me say at the outset that I strongly disagree with this practice, partly because it is being done on the grounds that 'things were different' then, and it is far from obvious how anyone would know that. Nonetheless, the question of comparative model performance from 1971 forward will doubtless arise, so below are the returns for all three model classes for the 1971-96 sample period, starting in Q1.

**Start Q1 (1971-96)**

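| Model | EY2 | EY2+ | EY3 | EY3+ | EY4 | EY4+ | EY5 | EY5+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.2603 | 1.2761 | 1.2279 | 1.2485 | 1.2250 | 1.2429 | 1.2017 | 1.2212 |
| StDev | 0.2859 | 0.3378 | 0.2503 | 0.2965 | 0.2198 | 0.2605 | 0.1804 | 0.2182 |
| Sharpe | 78 | 73 | 73 | 71 | 79 | 76 | 80 | 78 |

| Model | UV2 | UV3 | UV3+ | F4 | UV4 | UV4+ | UV5 | UV5/6+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.2358 | 1.2116 | 1.2221 | 1.2267 | 1.2171 | 1.2247 | 1.1989 | 1.2105 |
| StDev | 0.2895 | 0.2497 | 0.2623 | 0.2483 | 0.2126 | 0.2328 | 0.1808 | 0.1995 |
| Sharpe | 68 | 66 | 67 | 72 | 77 | 75 | 78 | 79 |

| Model | DP2 | ERP2 | DP3 | ERP3 | DP4 | ERP4 | DP5 | ERP5 |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1862 | 1.2542 | 1.2084 | 1.2282 | 1.2003 | 1.2398 | 1.2152 | 1.2142 |
| StDev | 0.3026 | 0.2464 | 0.2279 | 0.2168 | 0.2109 | 0.1901 | 0.1901 | 0.1728 |
| Sharpe | 49 | 86 | 71 | 82 | 70 | 98 | 85 | 91 |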

The relative performance of the various models stays roughly the same, with all boats being lifted by the rising tide. Some of the absolute returns, however, are rather amazing. EY2+ comes in with an eye popping figure of almost 28%, truly a remarkable return for any method that restricts us only to Dow stocks. Among the two stock models, EY2 is second, with a bit over 26%, ERP2 comes in third, with a return of over 25%, and UV2 follows with a still very respectable return of over 23% (only DP2, at under 19%, trails it). Among the 'non-juiced' four stock models, ERP4 is the winner with a return of almost 24%, but the juiced EY4+ again edges it out by a small (insignificant) margin, with a return just over 24%. For what it's worth, all of the Sharpe ratios rise significantly, and that of ERP4 goes through the roof. That may not be entirely good news, however, a point we'll return to shortly.

To return momentarily to the question about Sharpe ratios and the possibility that they are of limited use at best when comparing models, look at EY4+ and EY5 in the 1971-96 results. EY5 has a slightly better Sharpe ratio (80 versus 76), but EY4+ has a higher expected return by more than 4 percentage points (1.2429 versus 1.2017). In the end, this probably boils down to one's tolerance for volatility (as opposed to 'risk'), but for some of us at least, the higher expected return is worth the higher volatility regardless of what the Sharpe ratio says.

Finally, let me turn to the issue of data mining, after which I promise I'll shut up :^). As Doug observed recently, the only *real* test of any model is to see how it performs five years or more beyond the sample period used to construct the model. In principle, I agree with that completely. Unfortunately, the practical significance is limited because we all know very well that a bunch of number crunchers aren't going to sit around for five years after discovering a promising class of models before they announce the results. Given that, how might we proceed to minimize or at least to recognize the effects of data mining?

The heart of the problem is that the entire sample period has been used to fit the models to the data. On the other hand, suppose that same model or class of models had been discovered in, say, 1989. If it still performed well by 1998 we'd be rather impressed, and would likely be convinced that the relationship was neither a fluke nor just the result of data mining. Working that logic backwards, suppose we take the model results and eliminate certain years in the sample period. Then we look at how the returns perform over the restricted sample period. If the models perform relatively the same except possibly for a 'scale' factor due to market conditions, that at least suggests that we *would* have discovered the same relationship had we been looking for promising strategies in 1989. One version of this in fact is just what was done above, restricting the sample period to 1971 forward, although because of the periods involved there are good reasons to take those results with a grain of salt. A more informative procedure might be to eliminate significant portions of the 1990s bull market and observe how the model results change. Below is another table which shows returns for the three classes of models with the 1989-97 period eliminated, again considering only Q1 starts.
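
The restricted-sample check described above is easy to script. What follows is only a sketch: the data layout (a dict of yearly multipliers per model), the function names, and the use of a simple arithmetic average are my assumptions, not a description of how the tables in this post were actually produced.

```python
from typing import Dict, List, Tuple

Window = Tuple[int, int]  # (first start year, last start year), inclusive


def average_multiplier(by_year: Dict[int, float], window: Window) -> float:
    """Arithmetic average of the yearly multipliers whose start year is in the window."""
    first, last = window
    picked = [m for year, m in by_year.items() if first <= year <= last]
    if not picked:
        raise ValueError("no observations in window")
    return sum(picked) / len(picked)


def subsample_report(models: Dict[str, Dict[int, float]],
                     windows: List[Window]) -> Dict[str, Dict[Window, float]]:
    """Average multiplier for every model over every sample window."""
    return {name: {w: average_multiplier(series, w) for w in windows}
            for name, series in models.items()}


def ranking(report: Dict[str, Dict[Window, float]], window: Window) -> List[str]:
    """Model names ordered from best to worst average multiplier in one window."""
    return sorted(report, key=lambda name: report[name][window], reverse=True)
```

If `ranking` gives roughly the same order for `(1965, 1996)` and `(1965, 1988)`, that is the kind of stability the argument above is looking for. A changed order would be a warning sign, though not proof of anything by itself.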

**Start Q1 (1965-88)**

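| Model | EY2 | EY2+ | EY3 | EY3+ | EY4 | EY4+ | EY5 | EY5+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.2290 | 1.2366 | 1.2007 | 1.2144 | 1.1854 | 1.2002 | 1.1599 | 1.1769 |
| StDev | 0.2606 | 0.2987 | 0.2291 | 0.2629 | 0.2053 | 0.2333 | 0.1844 | 0.2093 |
| Sharpe | 72 | 68 | 67 | 65 | 65 | 66 | 57 | 60 |

| Model | UV2 | UV3 | UV3+ | F4 | UV4 | UV4+ | UV5 | UV5/6+ |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.2091 | 1.1689 | 1.1856 | 1.1701 | 1.1719 | 1.1853 | 1.1555 | 1.1748 |
| StDev | 0.2400 | 0.2215 | 0.2259 | 0.2267 | 0.1995 | 0.2072 | 0.1862 | 0.1929 |
| Sharpe | 67 | 53 | 60 | 54 | 59 | 64 | 54 | 63 |

| Model | DP2 | ERP2 | DP3 | ERP3 | DP4 | ERP4 | DP5 | ERP5 |
|---|---|---|---|---|---|---|---|---|
| H4Q | 1.1740 | 1.2032 | 1.1736 | 1.1763 | 1.1604 | 1.1805 | 1.1674 | 1.1594 |
| StDev | 0.2422 | 0.2719 | 0.2178 | 0.2384 | 0.2081 | 0.2116 | 0.1886 | 0.1992 |
| Sharpe | 52 | 61 | 57 | 54 | 52 | 62 | 60 | 54 |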

Looking first at the EY models, none of them loses even a full percentage point in the returns by eliminating the bull market years, and one model (EY3) actually shows *better* returns for the 1965-88 sample period than for the full sample. Considering the run that the market has been on since 1989 (until recently, that is :^)) the models exhibit impressive stability. For what they are worth, the Sharpe ratios also exhibit quite a bit of stability, all of them differing by four points or less. All of this suggests that the same relationship existed, and would have been discovered, had we been working only with the data through 1989.
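
The EY comparison above can be checked directly from the tables. This short sketch copies the Q1 H4Q multipliers for the EY models from the 1965-96 and 1965-88 tables in this post and prints the change in percentage points; the variable and function names are just for illustration.

```python
# Q1 H4Q multipliers for the EY models, copied from the tables in this post.
FULL_1965_96 = {"EY2": 1.2290, "EY2+": 1.2416, "EY3": 1.1993, "EY3+": 1.2169,
                "EY4": 1.1880, "EY4+": 1.2050, "EY5": 1.1677, "EY5+": 1.1859}
PRE_BULL_1965_88 = {"EY2": 1.2290, "EY2+": 1.2366, "EY3": 1.2007, "EY3+": 1.2144,
                    "EY4": 1.1854, "EY4+": 1.2002, "EY5": 1.1599, "EY5+": 1.1769}


def change_in_points(full, restricted):
    """Change in return, in percentage points, going from full to restricted sample."""
    return {name: round(100 * (restricted[name] - full[name]), 2) for name in full}


changes = change_in_points(FULL_1965_96, PRE_BULL_1965_88)
for name, delta in changes.items():
    print(f"{name:5s} {delta:+.2f}")
```

Every change is smaller than one percentage point in size, and EY3 is the only model with a positive sign.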

Since they contain many of the same stocks, it isn't surprising that the UV models exhibit much of the same stability. Fool4 has the biggest problem, giving up almost 1.5%, and UV4 gives up slightly over 1% in return by eliminating the bull market, but other than those two the returns are fairly stable. MontanaFool's UV5/6+ actually outperforms F4 and UV4 with the bull market removed, which should be of more than passing interest given events of the last several weeks. The Sharpe ratios are a bit more variable, with two of them (UV2 and UV5) differing by five points or more, but that is still not a dramatic shift.

For the ERP models, the picture is a bit more troubling. In particular, ERP4 gives up over 1.4 percentage points of return when we eliminate the bull market from the sample, and ERP5 loses almost as much. Also ERP4 now underperforms EY4+ by almost two percentage points, and also underperforms EY4 and UV4+, although by smaller margins. The Sharpe ratios also seem rather volatile, spiking markedly when we restrict the sample to 1971 forward, and then falling rather dramatically when the bull market period is removed from the sample. Except for the 36 point fall in ERP4's Sharpe ratio (from 98 to 62) from one restricted sample to the other, most of these differences are not large, and frankly I'm still uncertain as to just what if anything they are telling us, but together the results *may* suggest that the returns for the ERP class in general and ERP4 in particular depend more heavily on the bull market of the 1990s for their superior historical performance than either of the other two model classes.
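
To put the Sharpe ratio swings side by side, here is a small sketch using the Q1 Sharpe figures (x100) for three of the four stock models, copied from the three Q1 tables in this post. The names are illustrative only, and the choice of these three models is mine.

```python
# Q1 Sharpe ratios (x100), copied from the 1965-96, 1971-96 and 1965-88 tables.
SHARPE = {
    "EY4+": {"1965-96": 63, "1971-96": 76, "1965-88": 66},
    "UV4+": {"1965-96": 62, "1971-96": 75, "1965-88": 64},
    "ERP4": {"1965-96": 71, "1971-96": 98, "1965-88": 62},
}


def sharpe_swing(by_sample, start="1971-96", end="1965-88"):
    """Change between two samples, as (points, percent of the starting value)."""
    points = by_sample[end] - by_sample[start]
    percent = 100 * points / by_sample[start]
    return points, percent


for model, by_sample in SHARPE.items():
    points, percent = sharpe_swing(by_sample)
    print(f"{model:5s} {points:+d} points ({percent:+.1f}%)")
```

Between the two restricted samples, ERP4 falls 36 points, which is a bit under 37% of its 1971-96 value, while EY4+ and UV4+ fall 10 and 11 points.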

As Doug has pointed out, there is more than one way to mine data. I've tried above to address one facet of the issue, but completely sidestepped the question of whether or not significant strategies exist for starts in other quarters or months. I don't have much to offer on that issue, except to say that those who believe such relationships exist should be busy looking for them :^). RayVT has suggested two such strategies (DsPc and Sig4) which do fairly well in quarters other than Q1, but neither of them in their best starting quarter can touch the best Q1 performers. If all that is solely the result of data mining, then I'd have to say that we miners have been astonishingly lucky :^). I personally doubt that non-Q1 strategies exist that will beat known Q1 results, because I believe the January Effect is real, but that is speculation and is certainly open to debate. All I can say is the EY models are another Q1 based strategy, and even with the increased volatility due to the HY1 stock, they look like they might be promising. So fire away, Doug, because in a very real sense, I probably deserve it :^).

My best to all you Fools,

TimberFool (Hinds Wilson - ffxres@erols.com)

P.S. And thanks to Orangeblood for what was indeed a posting tip to beat them all :^) [see #10537].