No. of Recommendations: 67
Executive Summary
This post details a series of backtests using a slightly modified Riding the Wave strategy -- i.e. trading monthly -- using baskets of poorly correlated investments including major developed markets, gold, currencies, bonds and commodities.

By using monthly data, a 30 year backtest is possible. In various baskets, reasonably attractive performance was seen over the past 5 years (CAGR in the mid 20s, GSD around 9, Sharpes around 0.4), but this does not hold for earlier time frames; the overall 30 year performance wasn't any better than buying the equity indices of the basket in equal weights.

---------------------------------------------------------------

Hi, folks. I'm a long-term lurker with an Excel addiction, and since I finally took the plunge into MI a year and a half ago (I started reading here back in '98 or '99), I've been wanting to help contribute something.

I've been interested in the Riding The Wave concept, so I thought I'd have my own look at it; kick the tires, so to speak, before I took it out for a spin. What I found surprised me.

Data
Since we're talking about using index funds, I thought I'd have a look at using various international stock indices as a proxy. Kenneth French's web site provides monthly historical index values for 15 major developed markets, about 97% by market cap. I extended S&P's historical S&P 500 index data back to 1976 with a historical series from Ohio State, and I found gold, T-bill and 10 year Treasury bond data on a site I Googled called Wren Investments. I added in the CCI and CRB indices, as well as the CRB Currency index from CRB (the Commodity Research Bureau). Finally, using current national market cap values from S&P, I back-calculated synthetic indices grouping the developed markets into various chunks; Asia Pacific ex Japan, for instance (i.e. Hong Kong, Australia and Singapore). In the end, I wound up with a basket of 31 different investment types for consideration.

The reason I looked to these sources is simple: These data are all available going back to 1976, permitting a 30 year backtest of this general method.

One big caveat with all of this, of course -- the input data is only monthly, so the trading process of RtW is approximated; monthly instead of weekly or daily, and the lookbacks are similarly approximated; 2 and 11 months (roughly 42/231 instead of 50/235). The results I present are consistent when small adjustments are made -- to 3/11, or 2/12, for example. This fits with descriptions of a mound of toast around the 50/235 range. There's also no go-to-cash switch; I don't know anything about switches of this nature, although bonds are included as an asset.
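For anyone who wants to try the monthly approximation themselves, here is a minimal Python sketch of the ranking step. The scoring rule (a plain average of the 2 month and 11 month returns) is an assumption for illustration only, since the exact RtW formula isn't spelled out in this post, and the function names and toy prices are hypothetical.

```python
def momentum_score(prices, short=2, long=11):
    """Average of the short- and long-lookback returns at the latest month.

    prices: list of month-end index values, oldest first.
    ASSUMPTION: equal-weight blend of the two lookbacks; the real RtW
    formula may differ.
    """
    if len(prices) <= long:
        raise ValueError("need more than `long` months of history")
    now = prices[-1]
    r_short = now / prices[-1 - short] - 1.0
    r_long = now / prices[-1 - long] - 1.0
    return (r_short + r_long) / 2.0


def pick_top(price_history, n=1, short=2, long=11):
    """Return the names of the n highest-scoring assets this month."""
    scores = {name: momentum_score(p, short, long)
              for name, p in price_history.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy data: 12 month-end values per asset (made up, not from the backtest).
history = {
    "gold":  [100 + 2 * i for i in range(12)],   # steady climb
    "bonds": [100.0] * 12,                       # flat
    "sp500": [100 - i for i in range(12)],       # steady decline
}
top = pick_top(history, n=2)
```

Changing `short` and `long` to 3/11 or 2/12 is the kind of small adjustment mentioned above.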

Choosing Components
Since there are millions of possible groups with this many components, I just ran a macro overnight to construct random groupings and save the ones with low average correlations. Virtually all of the low correlation groups include long bonds and gold, and most include one of the two commodity indices and the currency index. I wound up using the CCI - Continuous Commodity Index, which is available as an ETF from Greenhaven (Amex:GCC) and tracks an equal-weight basket of around 20 commodities. Relative to other commodity indices, it's very heavy in agricultural products (especially 'softs', such as sugar, coffee, cotton and orange juice) as well as precious metals and very light in energy and industrial metals. The Reuters/CRB currency index isn't available as a single ETF, but the five component currencies (GBP, EUR, JPY, CHF, CAD) are, and it isn't frequently chosen, for what that's worth.
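The random search can be sketched in Python roughly as follows. This is not the Excel macro itself; the function names, the threshold and the toy return series are all hypothetical, and the correlation is a plain Pearson correlation of the return series.

```python
import itertools
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def avg_pairwise_corr(returns, names):
    """Mean correlation over every pair of assets in `names`."""
    pairs = list(itertools.combinations(names, 2))
    return sum(pearson(returns[a], returns[b]) for a, b in pairs) / len(pairs)


def random_low_corr_baskets(returns, size, trials, threshold, seed=0):
    """Sample random baskets; keep those under the correlation threshold."""
    rng = random.Random(seed)
    names = sorted(returns)
    kept = {}
    for _ in range(trials):
        basket = tuple(sorted(rng.sample(names, size)))
        if basket not in kept:
            c = avg_pairwise_corr(returns, basket)
            if c < threshold:
                kept[basket] = c
    # Lowest average correlation first.
    return sorted(kept.items(), key=lambda kv: kv[1])


# Toy monthly returns (made up): B is exactly 2 x A, so A/B correlate at 1.
returns = {
    "A": [0.01, 0.02, -0.01, 0.03, 0.00],
    "B": [0.02, 0.04, -0.02, 0.06, 0.00],
    "C": [0.03, -0.01, 0.02, -0.02, 0.01],
}
results = random_low_corr_baskets(returns, size=2, trials=50, threshold=0.5)
```

Random sampling rather than exhaustive enumeration is the point here: with 31 candidates the number of possible baskets is far too large to score them all overnight.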

I wound up analyzing three baskets:

The seven element basket I used consisted of the S&P 500, Hong Kong, Spain, gold, bonds, commodities and currencies. These had an average correlation of 0.117, with the highest correlation being 0.441 between gold and CCI. (Similarly low correlations were available by swapping Hong Kong and Singapore, and/or by swapping Spain with Italy or Sweden.)

The ten element basket I used had an average correlation of 0.1802. Components were S&P 500, US Small Cap, Japan, Singapore, Spain, Italy, gold, bonds, commodities and currencies. US Small cap is from French's web site, and is defined as the smallest 30% by market cap -- as of Dec 2007, the cutoff would be about $1 B, so it's somewhat smaller than, say, the Russell 2000, but larger on average than microcap.

I also considered an 11 element basket, comprised of S&P 500, Japan, Canada, UK, France, Germany, Asia Pacific (ex Japan) and our old friends gold, bonds, commodities and currencies. Correlation is much higher; 0.259 overall and 0.464 for the seven equity markets, but the consolation is that this represents the major stock markets of the developed world and around 65% of total global equity capitalization (and as recently as 2002 represented almost 80%).

Performance
In each case, I'm comparing the historical performance of the RtW strategy on the selected basket with three other values: the S&P 500, the basket components equally-weighted, and the equity components of the basket (since the diversifying elements like gold and bonds tend to reduce return). I think the latter is the best benchmark; a strategy involving switching in and out of a handful of equities should surely beat buy and hold.

7 element basket (S&P 500, Hong Kong, Spain, Gold, 10-yr Bond, Currencies, Commodities)

Annual returns:
             RtW     RtW   All 7  Equity
Year       Top 1   Top 2   Equal   Equal S&P 500
1978        2.2%   11.4%   17.4%   14.7%    1.1%
1979       79.8%   67.5%   32.6%   30.3%   12.3%
1980        5.4%   28.7%   19.5%   32.9%   25.8%
1981       -0.7%    1.4%   -7.4%   -1.4%   -9.7%
1982        0.9%   21.0%   -8.6%  -22.1%   14.8%
1983      -20.2%   -3.2%    4.7%    7.1%   17.3%
1984       29.3%   15.8%    5.7%   27.2%    1.4%
1985       59.3%   28.5%   22.3%   44.4%   26.3%
1986      115.1%   55.0%   28.9%   61.2%   14.6%
1987      -13.2%    3.5%   18.9%   15.6%    2.0%
1988       24.5%    6.1%    6.5%   17.3%   12.4%
1989       12.1%   13.7%    7.3%   19.8%   27.3%
1990       -3.4%   -8.9%    2.5%   -0.3%   -6.6%
1991       31.9%   21.3%   11.2%   30.8%   26.3%
1992       30.2%   16.3%    1.0%    5.2%    4.5%
1993       50.8%   45.4%   24.3%   46.5%    7.1%
1994      -19.8%  -21.1%   -1.9%  -10.7%   -1.5%
1995       13.6%   17.5%   14.0%   28.9%   34.1%
1996       22.1%   13.8%   12.0%   31.3%   20.3%
1997        0.0%    1.8%   -0.6%    7.9%   31.0%
1998       29.2%   18.6%    9.3%   24.1%   26.7%
1999       20.5%   14.5%   12.0%   24.0%   19.5%
2000      -22.9%  -18.3%   -3.7%  -10.3%  -10.1%
2001       -5.6%   -7.8%   -7.8%  -13.1%  -13.0%
2002        5.2%    1.0%    1.6%  -15.8%  -23.4%
2003       30.5%   36.6%   24.1%   41.4%   26.4%
2004       19.4%   13.4%   13.0%   20.7%    9.0%
2005       15.6%   10.8%    7.4%    5.8%    3.0%
2006       43.4%   28.0%   20.0%   31.0%   13.6%
2007       30.0%   23.6%   17.3%   18.9%    3.5%

CAGR:      16.3%   13.6%    9.6%   15.4%    9.6%
GSD:       26.3%   18.3%   10.6%   19.5%   14.7%
Sharpe:     0.15    0.14    0.12    0.17    0.02

Trailing Years Performance (Top 1 RtW):

Period                   CAGR    GSD Sharpe
5 years (2003-2007)     27.4%   8.8%   0.44
10 years (1998-2007)    14.9%  20.2%   0.19
15 years (1993-2007)    13.5%  21.4%   0.16
20 years (1988-2007)    14.7%  19.5%   0.17
30 years (1978-2007)    16.3%  26.3%   0.15
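For reference, the CAGR and GSD figures can be recomputed from the annual returns with a few lines of Python. This is a sketch under one assumption: GSD is taken to be the geometric standard deviation of the annual returns, i.e. exp of the sample standard deviation of the log returns, minus one. On the 2003-2007 Top 1 figures for the 7 element basket that definition gives roughly 27.4% and 8.8%, in line with the table. The Sharpe calculation isn't described in the post, so it is not reproduced here.

```python
import math
import statistics


def cagr(annual_returns):
    """Compound annual growth rate from yearly returns (0.10 means 10%)."""
    growth = math.prod(1.0 + r for r in annual_returns)
    return growth ** (1.0 / len(annual_returns)) - 1.0


def gsd(annual_returns):
    """Geometric standard deviation of yearly returns.

    ASSUMPTION: exp(sample stdev of log returns) - 1.
    """
    logs = [math.log(1.0 + r) for r in annual_returns]
    return math.exp(statistics.stdev(logs)) - 1.0


def trailing(annual_returns, years):
    """(CAGR, GSD) over the most recent `years` entries."""
    window = annual_returns[-years:]
    return cagr(window), gsd(window)


# Top 1 RtW, 7 element basket, 2003-2007, copied from the annual table.
last5 = [0.305, 0.194, 0.156, 0.434, 0.300]
c5, g5 = trailing(last5, 5)
```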


10 element basket (S&P 500, US Small, Japan, Singapore, Spain, Italy, Gold, 10-yr Bond, Currencies, Commodities)
					
Annual Returns:
             RtW     RtW  All 10  Equity
Year       Top 1   Top 2   Equal   Equal S&P 500
1978       -5.0%   11.4%   25.7%   30.1%    1.1%
1979      114.6%   67.5%   23.1%   15.9%   12.3%
1980       26.0%   28.7%   27.4%   40.1%   25.8%
1981        4.1%    1.4%   -1.4%    5.9%   -9.7%
1982       20.6%   21.0%   -1.3%   -4.0%   14.8%
1983      -14.1%   -3.2%   11.3%   17.6%   17.3%
1984        5.9%   15.8%   -1.2%    3.8%    1.4%
1985       72.0%   28.5%   26.2%   40.1%   26.3%
1986       56.5%   55.0%   39.4%   63.6%   14.6%
1987       32.5%    3.5%   14.4%   10.7%    2.0%
1988        3.6%    6.1%   11.9%   21.1%   12.4%
1989       23.4%   13.7%   11.4%   20.7%   27.3%
1990        6.0%   -8.9%   -7.5%  -15.4%   -6.6%
1991       14.1%   21.3%   11.3%   20.7%   26.3%
1992        2.7%   16.3%   -4.7%   -6.5%    4.5%
1993       56.7%   45.4%   21.1%   29.3%    7.1%
1994       -4.3%  -21.1%    5.0%    4.9%   -1.5%
1995        2.7%   17.5%   11.4%   16.6%   34.1%
1996        9.6%   13.8%    7.6%   13.5%   20.3%
1997        0.1%    1.8%    1.5%    7.3%   31.0%
1998       27.3%   18.6%   12.1%   21.2%   26.7%
1999       42.9%   14.5%   21.7%   35.2%   19.5%
2000      -25.1%  -18.3%   -8.4%  -14.6%  -10.1%
2001        0.7%   -7.8%   -9.8%  -13.9%  -13.0%
2002       -1.4%    1.0%   -1.5%  -12.2%  -23.4%
2003       27.1%   36.6%   31.5%   45.5%   26.4%
2004       11.1%   13.4%   15.8%   21.6%    9.0%
2005       15.1%   10.8%    9.2%    9.6%    3.0%
2006       40.6%   28.0%   21.3%   27.6%   13.6%
2007       25.4%   23.6%   11.3%    8.1%    3.5%

CAGR:      16.9%   13.6%   10.5%   14.0%    9.6%
GSD:       24.1%   18.3%   12.0%   17.9%   14.7%
Sharpe:     0.16    0.14    0.13    0.16    0.02


Trailing Years Performance (Top 1 RtW):

Period                   CAGR    GSD Sharpe
5 years (2003-2007)     23.4%   9.7%   0.39
10 years (1998-2007)    14.5%  21.5%   0.18
15 years (1993-2007)    13.4%  20.8%   0.16
20 years (1988-2007)    12.4%  18.2%   0.14
30 years (1978-2007)    16.9%  24.1%   0.16


11 element basket (S&P 500, Japan, Canada, UK, France, Germany, Asia Pacific ex Japan, Gold, 10-yr Bond, Currencies, Commodities)
Annual Returns:
             RtW     RtW  All 11  Equity
Year       Top 1   Top 2   Equal   Equal S&P 500
1978       14.5%   12.4%   26.3%   30.4%    1.1%
1979       99.9%   61.4%   25.3%   20.2%   12.3%
1980       -4.4%   17.1%   18.1%   22.8%   25.8%
1981      -18.2%   -0.2%   -9.0%   -7.4%   -9.7%
1982      -10.3%    7.1%    0.9%   -0.2%   14.8%
1983      -11.6%   -7.5%   16.9%   26.0%   17.3%
1984       -6.2%   -3.1%   -2.8%    0.4%    1.4%
1985       54.5%   51.2%   32.0%   48.3%   26.3%
1986       36.2%   50.4%   30.3%   44.2%   14.6%
1987        7.8%   20.9%   14.2%   10.8%    2.0%
1988       13.2%   17.0%   14.1%   23.5%   12.4%
1989        3.4%   21.2%   15.2%   25.6%   27.3%
1990       -7.6%   -6.0%   -4.6%   -9.8%   -6.6%
1991        0.4%    4.6%   10.6%   18.1%   26.3%
1992       -4.2%   -5.3%   -3.8%   -4.7%    4.5%
1993       57.1%   43.6%   22.3%   30.4%    7.1%
1994      -21.9%   -6.9%    2.6%    1.2%   -1.5%
1995       20.1%   15.0%   12.1%   17.0%   34.1%
1996        9.6%   20.8%   10.3%   17.1%   20.3%
1997        6.4%   -0.4%    1.4%    6.5%   31.0%
1998       26.2%   28.8%   10.1%   16.9%   26.7%
1999       39.1%   43.1%   24.1%   37.2%   19.5%
2000       -5.5%   -6.2%   -6.1%  -10.3%  -10.1%
2001       -7.3%  -12.5%  -13.0%  -18.1%  -13.0%
2002       20.0%   -0.5%   -5.1%  -16.0%  -23.4%
2003       36.3%   48.4%   32.0%   44.2%   26.4%
2004       10.7%    9.2%   14.8%   19.2%    9.0%
2005       20.4%   13.7%   12.0%   14.0%    3.0%
2006       25.7%   27.4%   19.8%   24.2%   13.6%
2007       36.6%   28.3%   16.2%   16.2%    3.5%

CAGR:      12.1%   14.8%   10.5%   13.6%    9.6%
GSD:       23.6%   18.8%   12.3%   17.3%   14.7%
Sharpe:     0.11    0.16    0.13    0.16    0.02


Trailing Years Performance (Top 1 RtW):

Period                   CAGR    GSD Sharpe
5 years (2003-2007)     25.6%   9.3%   0.43
10 years (1998-2007)    19.1%  15.6%   0.29
15 years (1993-2007)    16.5%  19.8%   0.23
20 years (1988-2007)    12.4%  18.7%   0.15
30 years (1978-2007)    12.1%  23.6%   0.11

Interestingly, this third basket seems to perform relatively well out to choosing the top 5 or so ranked components.

Summary, Analysis and Conclusions

The RtW strategies outperformed the S&P 500 in all three baskets of investments. The 10 element basket outperformed the other two, but none of the RtW strategies outperformed an equal weighted investment in the underlying equities on a risk-adjusted basis -- where they had higher CAGRs, these were paid for with a large increase in GSD.

In all three baskets, the results for the last 5 years were far more impressive -- CAGRs in the mid-20s for GSDs around 9%. However, looking farther back, there was a strong mean reversion type tendency. Also, it turns out that you should have bought gold in 1979.

Personally, this helped to confirm one of my fears about the Moose / RtW approaches. The main suspicion I had is that the massive results are due to a couple of hot picks, like the recent Brazil run up (and, for the Decision Moose, a 95% return in 6 months on "gold share" ASA -- during a period when gold bullion itself went up less than 20%). The baskets I tested on included equity markets from Europe, Asia and America, plus four nonequity components, and had very low correlations overall. The second two baskets, the 10 element basket and what I call the Global 11 had major differences in the equity components (only the S&P and Japan in common), and they provided similar performance, which seems to me like it may be a more believable result than some other outliers.

The new result, though, is the instability. While the baskets presented here didn't produce world-beating results, they still produce respectable results -- for the last 5 or 6 years. But the previous 25 years tell a very different story, and one that's not particularly impressive, in my opinion.

An argument could be made that the new world market is more fluid than it's ever been, that the recent outperformance of this strategy is something that's likely to continue. But if I had a dollar for every time I heard that it's different this time, well, I'd have enough money to make up my losses from all the times I thought it was different this time.

Kevin :)

Data sources:
Kenneth French - http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_...
Ohio State - http://www.cob.ohio-state.edu/~fin/resources_data/data/spind...
S&P - http://www.google.ca/url?sa=t&ct=res&cd=3&url=http%3A%2F%2Fw...
Wren Investment Advisers - http://www.wrenresearch.com.au/downloads/index.htm
CRB - http://www.crbtrader.com/crbindex/default.asp

PS: Corrections, comments and questions are, of course, very welcome; however, I'm likely to be busy at work over the next couple of days, and out of Internet distance over the weekend. Sorry in advance for any delays in response.
No. of Recommendations: 17
Kevin: Wow! Impressive work. If I could put together a post at that level, I'd seriously consider writing a book!

There are a few thoughts that come to mind as I read through your research... not necessarily in any order of importance.

* Granularity is everything. I'm able to easily test granularity in days, weeks, months or even years, so I did test a vast number of these and found that anything less than a week hurts results, and anything more hurts results more yet. By more, I mean tests I ran from 1 to 8 week granularities. At two or even three weeks it isn't too bad, but once you get out to four or more weeks any out-performance disappears. It seems to me that, at least partly, this is what you've confirmed in your test. Why is this important? Simply because momentum in uncorrelated assets needs to identify the switch sooner, not later, and a month seems to be too late.

* Significant movers are necessary. One of the other things I've noticed is the importance of having assets that have two key characteristics: they are trenders and they are movers. In other words, they have a tendency to sharp movements that result in new trends that continue for some time. Anytime I've put currencies into the mix I've found that returns were hurt. Could it be that this particular asset group does not have these two characteristics? I'm just wondering what it is about them that makes it consistently hurt returns. It may be that there needs to be a sort at the beginning that removes any asset that hasn't seen a move of some significance in say the past three years or so. This is relatively easy for me to test, and I would think in Excel, also feasible. Does the logic in this sound valid?

* Correlation. What has a very low correlation in one period, can change to be highly correlated in another period. It is important to test correlation over a set period of time up to the period of investment / selection. I've found anywhere from one to three years to be optimal. Were you looking at the full history to arrive at the correlation numbers you shared?

* Benchmark. I agree that an equal-weighted basket is a good measure of whether the system truly works or not. But when it comes to investing in the strategy or not, part of the purpose is simply to be invested in something that is not correlated with the rest of my investments. The result of that low correlation will be a dampening of my port's overall volatility that will ultimately enhance my risk-adjusted return. In this sense, it is not so important that RtW beats out the S&P 500 or even an equal weighted basket of the asset group. What matters is that it is putting me into something that moves differently from the rest of my investments. Of course, my testing shows that there is out-performance, no matter what the period tested, but at least in theory, this thought is what interested me most and first when it comes to working on RtW.
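On the correlation point above, here is a rough Python sketch of what measuring correlation over a trailing window (rather than the full history) looks like. The window length, function names and toy series are hypothetical; with monthly data a window of 12 to 36 would correspond to the one to three years mentioned.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def trailing_corr(returns_a, returns_b, window):
    """Correlation over the `window` observations ending at each date.

    Each value uses only data available up to that date, so it could be
    used at selection time without look-ahead.
    """
    out = []
    for end in range(window, len(returns_a) + 1):
        out.append(pearson(returns_a[end - window:end],
                           returns_b[end - window:end]))
    return out


# Toy series (made up): the pair moves together at first, then oppositely.
a = [0.01, 0.02, 0.03, 0.04, 0.01, 0.02, 0.03, 0.04]
b = [0.01, 0.02, 0.03, 0.04, 0.04, 0.03, 0.02, 0.01]
rc = trailing_corr(a, b, window=4)
```

In this toy case the first window shows a correlation of 1 and the last shows -1, while the full-history number would hide the change.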

Thanks for sharing your research. A remarkable contribution indeed!
No. of Recommendations: 1
Kevin-
Great post-
Where have you been hiding???
No. of Recommendations: 2
Here is a test to illustrate what I mean regarding weekly granularity. This examines granularities from one week to eleven weeks.
Granularity (weeks)    CAGR    GSD  Sharpe  Ulcer Index  Drawdown
1                    32.21%  30.42    1.04        8.65%   -28.69%
2                    21.92%  29.64    0.75       10.02%   -31.94%
3                    28.94%  30.46    0.95       12.99%   -41.95%
4                    20.65%  30.07     0.7       10.21%   -35.70%
5                    20.51%  30.39     0.7       11.69%   -37.04%
6                    26.44%  31.44    0.86       12.24%   -35.70%
7                    23.19%  29.46    0.79       12.47%   -38.02%
8                    25.35%  32.43    0.81       11.65%   -35.70%
9                    20.33%  31.72    0.67       12.74%   -36.50%
10                   19.02%  32.21    0.63       17.04%   -49.40%
11                   19.43%  30.41    0.66       15.68%   -39.96%

It is most certainly not a smooth progression, and the drop after a one week granularity is pretty sharp.

NOTE: This is a test that is a perfect match with the DM site as to asset list, and does not use the FED indicator or any means for reducing whipsaw.

From this I conclude that the RtW strategy requires finer granularity. One month is simply too long between checks.
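To make the granularity idea concrete, here is a toy Python sketch: hold the asset with the best trailing return, but only re-rank every `check_every` weeks. It is not the test behind the table above; the single-lookback ranking, the function name and the made-up prices are all assumptions for illustration.

```python
def backtest_top1(weekly_prices, lookback, check_every):
    """Growth of 1 unit holding the top-ranked asset.

    weekly_prices: dict of name -> list of weekly closes, oldest first.
    The ranking is refreshed only every `check_every` weeks.
    """
    names = sorted(weekly_prices)
    n_weeks = len(weekly_prices[names[0]])
    value, held = 1.0, None
    for t in range(lookback, n_weeks - 1):
        if held is None or (t - lookback) % check_every == 0:
            # Rank by trailing return over `lookback` weeks.
            held = max(names, key=lambda k: weekly_prices[k][t]
                       / weekly_prices[k][t - lookback])
        value *= weekly_prices[held][t + 1] / weekly_prices[held][t]
    return value


# Toy prices: X rises 10% a week for 3 weeks, then falls 10% a week; Y is flat.
x = [100.0]
for _ in range(3):
    x.append(x[-1] * 1.1)
for _ in range(5):
    x.append(x[-1] * 0.9)
prices = {"X": x, "Y": [100.0] * 9}

weekly = backtest_top1(prices, lookback=1, check_every=1)
four_weekly = backtest_top1(prices, lookback=1, check_every=4)
```

With weekly checks the sketch leaves X one week after it tops out; with four-weekly checks it sits through an extra week of decline, which is the effect described above.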
No. of Recommendations: 8
From this I conclude that the RtW strategy requires finer granularity.

One way of looking at it:
At heart, RtW is just a momentum/relative strength technique.
So, this result is consistent with RRS: checking frequently helps a lot.
RRS189 with 1-week hold is a lot better than RRS189 with 6-week hold.
But, interestingly, it doesn't really increase the number of trades much---
mainly it just improves their timing, getting you out of the zoomers
soon after they top out on a relative basis, rather than living
through a meaningful drop before leaving.

Jim
No. of Recommendations: 1
RRS189 with 1-week hold is a lot better than RRS189 with 6-week hold.
But, interestingly, it doesn't really increase the number of trades much


What you say makes sense, but I'm getting the opposite conclusion from a GTR1 run, at least in the VL Tim=1 universe (all with 0.35% friction):

http://www.backtest.org/gtr1/h126f.35::tim.v:et1:rrs(0,189)t...
http://www.backtest.org/gtr1/h5f.35::tim.v:et1:rrs(0,189)tn5...
http://www.backtest.org/gtr1/h5f.35::tim.v:et1:rrs(0,189)tn5...

126d hold has avg CAGR/GSD 24.9/46.7 and avg annualized turnover 1.88
5d hold has avg CAGR/GSD 33.6/45.0 and avg annualized turnover 6.37
5d HTD 20 has avg CAGR/GSD 33.1/44.9 and avg annualized turnover 6.25

- Joe
No. of Recommendations: 3
What you say makes sense, but I'm getting the opposite conclusion from a GTR1 run

Your example is a 126 day hold, which is 6 months, not 6 weeks.

My point is that the turnover will of course go up when using a shorter
hold period within the "usual" range of 1-6 weeks, but it doesn't go up
nearly as much as you might think, because RRS slopes change slowly.

With 0.35% friction, 10 stocks, CAGR and annual turnover

Hold 5 31.4 / 5.72
Hold 10 31.2 / 4.89
Hold 15 31.1 / 4.45
Hold 20 31.3 / 4.16
Hold 25 30.8 / 3.93
Hold 30 30.4 / 3.73

Contrast that with a total return check tr(0,189):
Hold 5 17.4 / 11.66
Hold 30 26.5 / 4.27

In this case, the turnover rises rapidly and the friction eats any
possible improvement in the total returns.

So, the lesson is, if there is a way to calculate your relative
strength formula in a way that doesn't change the sort order too
rapidly, it can be worthwhile to have a system which checks frequently
but trades infrequently, which is what the RtW tends to do.

Frequency of trading can be brought down with any of the "usual" techniques,
including hold-till-drop or (as here) using slopes rather than hard
total return periods. If I were using RtW, I would be tempted
to replace the prices 50 days ago and 225+50 days ago with smoothed
versions of those. So, for example, replace the price 50 days ago
with the average(48-52) days ago, and replace the price 275 days ago
with the average(270-280) days ago. This reduces the influence of
data errors, unusual temporary price movements in the past, and any
over sensitivity to the tuning values of the lookback, while also
reducing the frequency of changing of sort order a little bit. This
might reduce the need for HTD and conceivably improve returns a bit.
It's much more debatable whether the current price should be
replaced with (for example) the average over the last week, since you
want to catch downturns quickly. WMA(5) might be a nice compromise.
Note, I would probably do this smoothing even if it hurt a tiny bit
in backtest, since the backtest could be slightly less in danger of
overtuning to a particular pair of lookbacks.
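
The smoothing described above might look like this in Python. It is only a sketch: the function names are hypothetical, WMA(5) is assumed to mean a linearly weighted moving average of the last five closes, and how the three smoothed points feed into the RtW score is left out because the formula isn't given here.

```python
def avg_price_ago(prices, lo, hi):
    """Mean of the closes from `lo` to `hi` days ago, inclusive.

    prices: daily closes, oldest first; prices[-1] is today (0 days ago).
    """
    window = [prices[-1 - d] for d in range(lo, hi + 1)]
    return sum(window) / len(window)


def wma(prices, n=5):
    """Linearly weighted moving average of the last n closes.

    ASSUMPTION: most recent close gets weight n, the oldest weight 1.
    """
    recent = prices[-n:]
    weights = range(1, n + 1)
    return sum(w * p for w, p in zip(weights, recent)) / sum(weights)


def smoothed_inputs(prices):
    """The three smoothed price points suggested above."""
    return {
        "now": wma(prices, 5),                    # instead of today's close
        "short": avg_price_ago(prices, 48, 52),   # instead of 50 days ago
        "long": avg_price_ago(prices, 270, 280),  # instead of 275 days ago
    }


# Toy series (made up): price equals the day index, 300 days of history.
prices = [float(i) for i in range(300)]
pts = smoothed_inputs(prices)
```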

Jim
No. of Recommendations: 0
Your example is a 126 day hold, which is 6 months, not 6 weeks.


Whoops! <blush>

- Joe
No. of Recommendations: 0
If you could check just using:
1. last month RS
2. last two months
No. of Recommendations: 8
and the drop after a one week granularity is pretty sharp.
...
From this I conclude that the RtW strategy requires finer granularity. One month is simply too long between checks.


I would be inclined to conclude that this shows that RtW is illusory, and that the reported CAGR of RtW[1] is the result of over-tuning. As I said back in the days when we were torpedoing the Foolish Four, "Reality doesn't change when the calendar changes."

BTW, the stdev's are up in nosebleed territory. That alone would dissuade me from doing RtW.
No. of Recommendations: 17
Sometimes I think you have to step back from the backtests and ask what might be different now than during the backtest period. Until the past couple of years the masses could not easily pull this strategy off and the professionals normally specialize in one of the major asset categories. Sure there are firms that cover all the bases for their clients, but jumping whole hog into commodities per se was far from the mainstream it is becoming.

I don't know what this portends for the future exactly, but one prediction is faster price movement and increased volatility.

I think the real issue for RtW-like strategies is: are you going to ride the short term momentum waves or the long term secular ones? Whichever it is suggests slightly different parameters/mechanics. Zee's method is a compromise that undeniably works as of late.

It is important to realize that there is a completely different philosophy with regard to backtesting that is not used on this board. The philosophy used here is the longer the backtest the better. The competing philosophy is basically constant reoptimization over shorter back testing periods to pick up changes in the markets as they evolve. Anyone who has backtested RtW independently understands over the past 3-5 years this strategy has become much stronger than it was historically. Will it continue, who knows?

Bottom line: I think everyone employing this strategy should go in eyes open to the fact that this is an evolving market space.
No. of Recommendations: 6
The philosophy used here is the longer the backtest the better. The competing philosophy is basically constant reoptimization over shorter back testing periods to pick up changes in the markets as they evolve. Anyone who has backtested RtW independently understands over the past 3-5 years this strategy has become much stronger than it was historically.

Anyone who believes this should read Taleb's Fooled by Randomness. A 3-5 year backtest is a curve fitting exercise that isn't worth the time of day.

Elan
No. of Recommendations: 0
Top 1 Top 2

One thing I've started to look at -- when the "Top 1" is an extreme value, it may be a good one to avoid. And vice versa -- when the "Bottom 1" is an extreme value, it may be a good one to pick up.
No. of Recommendations: 0
"Correlation. What has a very low correlation in one period, can change to be highly correlated in another period. It is important to test correlation over a set period of time up to the period of investment / selection. I've found anywhere from one to three years to be optimal."

I haven't seen this discussed much before. Are you advocating rotating your ETF selection based on changing correlation?
No. of Recommendations: 1
Cgabriel asked:
Are you advocating rotating your ETF selection based on changing correlation?

Yes, I most definitely am. In fact, I believe I've posted on that recently if you care to track it down. There are times I end up posting too much for me to even keep up with. :)
No. of Recommendations: 8
Anyone who believes this should read Taleb's Fooled by Randomness. A 3-5 year backtest is a curve fitting exercise that isn't worth the time of day.

Indeed it is and that is exactly the point. I have no intent of debating what is essentially a philosophical approach to backtesting. But if I were arguing the other side, my simple question would be...now how many screens have been developed on this board that have not worked going forward? The other thing I would put forward is the fact, not the supposition, that many things that used to work in the market no longer do.

However, for the readers out there I will also add that anyone employing the alternative backtesting method correctly also has in place a trade by trade statistical monitoring system to identify if the backtested approach is "busted." I've never seen anything like that on this board...not a complaint on my part, simply an observation of one area where this board could improve its technique. Personally, I'd go with the statistical monitoring of a screen to decide if it was a continued keeper over falling back on the hope that because it has a 30 year backtest it will remain a good screen. If the 30 years really means something it will survive the statistical monitoring, no harm no foul. If it doesn't, why continue on hope until you throw the towel in at the obvious? (probably a couple of years after the damage has been done or at the cost of opportunity lost.)

A book I finished reading recently, even though it has been out a while, is Ken Fisher's "The Only Three Questions that Count" which talks to the above point all through the book. It is one of the few investing books I've read where the author goes back and demonstrates how several of the things he used to hold as predictive through backtesting are now useless or nearly so.

Cheers Kev
No. of Recommendations: 3
I will also add that anyone employing the alternative backtesting method correctly also has in place a trade by trade statistical monitoring system to identify if the backtested approach is "busted." I've never seen anything like that on this board..

Not necessarily true. I have been doing this, and there have been numerous other posts about choosing screens that have been doing well in the recent past from a group of screens that have proven themselves over the long term. Every two weeks I run my SI Pro monthly holds (2 week offsets); the recent 9, 6 and 4 month gain and STD, along with their long term performance and lack of correlation with the other screens, go into picking which screens to select stocks from. In addition, on P123 I run my ranking systems to see if they are holding up in the current environment.

Sounds good!! Well, in actuality, in mid 2007 ranking systems and screens all started losing either all or most of their edge. As some in the P123 community have noted, there was a dramatic flight to safety. As I now know (after the fact), I was relying too much on momentum and not paying enough attention to assuring multiple dimensions of diversification. It may be nice to recognize that your selection process isn’t working now, but it doesn’t help if you are so sure that your picks are so outstanding that the environment will change next month. Well, I have to admit I went to 40% cash but am still 60% in.

You can improve your odds by verifying screens have worked in different markets, don’t deteriorate significantly when holding 10 or 15, with longer holds and penalize excessive turnover. Use a blend or at least assure that there isn’t too much correlation to the screens you are holding. Even then I still believe that you have to limit the number of stocks you have in any one sector (not back testable for a portfolio). But nothing will keep you from having a bad year. My screens haven’t done well this last 7 or 8 months but every time I start to get upset I look at my overall last 3, 5 or 10 years and feel good again.

As for the wave? I still believe that if there is any argument at all supporting the wave, the best is the Fund*X results. Very similar ranking system, 5 funds based on recent performance. Hulbert in this month's AAII Journal has them ranked first for risk adjusted return, 14.5% annualized gain 6/30/1980 to 5/31/2008! But then again that is only 2% better than the S&P 500 for the same period, and they are one of only 3 newsletters that have survived and beat the index.

RAM
No. of Recommendations: 3
However, for the readers out there I will also add that anyone employing the alternative backtesting method correctly also has in place a trade by trade statistical monitoring system to identify if the backtested approach is "busted." I've never seen anything like that on this board...not a complaint on my part, simply an observation of one area where this board could improve its technique. Personally, I'd go with the statistical monitoring of a screen to decide if it was a continued keeper over falling back on the hope that because it has a 30 year backtest it will remain a good screen. If the 30 years really means something it will survive the statistical monitoring, no harm no foul. If it doesn't, why continue on hope until you throw the towel in at the obvious? (probably a couple of years after the damage has been done or at the cost of opportunity lost.)

Robbie Geary has completed a major validation effort lately, although it may not be what you have in mind. The GTR1 backtester is IMO a very significant monitoring mechanism that "busted" some screens and validated others. It raised the validity of the backtests by an order of magnitude (whatever that means :-).

The other thing I do on an annual basis is re-test all the screens with the additional year of data, always starting in 1989. I chose 1989, incidentally, because it provides a large enough group of screens to work with. If a screen has fallen on its face in the last few years, it will inevitably drop further and further in the cumulative tests relative to other screens.

Elan
No. of Recommendations: 0
It raised the validity of the backtests by an order of magnitude (whatever that means :-).

Two numbers have the same order of magnitude if the larger number is less than ten times the smaller one.
No. of Recommendations: 0
Sounds good!! Well, in actuality, in mid 2007 ranking systems and screens all started losing either all or most of their edge. As some in the P123 community have noted, there was a dramatic flight to safety. As I now know (after the fact), I was relying too much on momentum and not paying enough attention to assuring multiple dimensions of diversification. It may be nice to recognize that your selection process isn’t working now, but it doesn’t help if you are so sure that your picks are so outstanding that the environment will change next month.

The type of monitoring I'm referring to really has nothing to do with current market conditions. It has to do with the system's/screen's statistical profile throughout its backtest period then comparing it to how the screen is behaving currently. Using Jamie's backtester as an example, he lists each screen's avg win and avg loss. One way of monitoring the screen is to use those statistics and see if they remain valid going forward. If the screen's profile changes to the negative you probably ought to dump the screen.

If the screen behaves differently in various market conditions then simply divide the performance characteristics into up and down market segments, keeping your definition of "up and down" consistent, and monitor that way. Of course screens can simply degrade and then it becomes a matter of preference as to whether you want to keep it or not.
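A minimal sketch of that segmentation, assuming each period is recorded as a (market return, screen return) pair and that "up" simply means the market return exceeded a fixed threshold. Both the data layout and the threshold are assumptions made for illustration; the point is only that the definition stays the same for the backtest and the live record.

```python
from statistics import mean

def split_by_regime(periods, threshold=0.0):
    """Average screen return in up and down market periods.

    periods: iterable of (market_return, screen_return) pairs.
    A period counts as "up" when market_return > threshold.
    Returns None for a segment that has no observations.
    """
    up = [screen for market, screen in periods if market > threshold]
    down = [screen for market, screen in periods if market <= threshold]
    return {
        "up": mean(up) if up else None,
        "down": mean(down) if down else None,
    }
```

Running it once on the backtest periods and once on the live periods gives two pairs of averages that can be compared segment by segment.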

Cheers Kev
No. of Recommendations: 0
KBGlenn,

In the type of screen monitoring that you describe, how long does it take to determine that a screen has degraded sufficiently to warrant discarding it?

Thanks,

Todd
No. of Recommendations: 2
However, for the readers out there I will also add that anyone employing the alternative backtesting method correctly also has in place a trade-by-trade statistical monitoring system to identify if the backtested approach is "busted." I've never seen anything like that on this board.

A long time ago someone did investigate how much time would be required to obtain statistical significance for one of our screens. The amount of time was quite long.

Professional trading systems have a lot more stocks and might be able to reach significance, but it would still take quite a while. From a practical standpoint, there are a lot of hedge funds and others that are trying to exploit such advantages. By the time an arbitrage advantage can be statistically verified, there are already too many funds exploiting it. One could employ a complicated and obscure method, but that may just be datamined. Thus, it makes sense that the best way to profit is using methods that cannot yet be statistically verified. These methods still should have a better chance than not of beating the market, but when they are "guaranteed" to work there will be too many people using them.

- DesertHeat
No. of Recommendations: 0
DesertHeat wrote:
By the time an arbitrage advantage can be statistically verified, there are already too many funds exploiting it. One could employ a complicated and obscure method, but that may just be datamined. Thus, it makes sense that the best way to profit is using methods that cannot yet be statistically verified. These methods still should have a better chance than not of beating the market, but when they are "guaranteed" to work there will be too many people using them.

I'd most definitely agree with this sentiment.

What do the statisticians among us think? Elan? Eric? Others?
No. of Recommendations: 4
By the time an arbitrage advantage can be statistically verified, there are already too many funds exploiting it. One could employ a complicated and obscure method, but that may just be datamined. Thus, it makes sense that the best way to profit is using methods that cannot yet be statistically verified. These methods still should have a better chance than not of beating the market, but when they are "guaranteed" to work there will be too many people using them.

I'd most definitely agree with this sentiment.

What do the statisticians among us think? Elan? Eric? Others?


There are some examples that defy this logic. For example, the RS26 screen has continued to outperform the index for almost forty years.

Elan
No. of Recommendations: 0
If the worry is that too many people/funds use the screens, why is it that when the momentum filters are removed from most of the screens, the performance declines?


Bryan
No. of Recommendations: 0
Todd,

Standard statistics would say 30 observations, with more being better. MI techniques raise some interesting questions as to what you could consider "30 observations." It would be interesting to run it both ways on screens that we don't believe are performing out of sample (e.g. Zee's recent post might be a starter). One way would be the more conservative 30 "monthly" observations while the other would be a more aggressive 30 trades. I'm not sure I'd feel very comfortable with 30 trades to make a definitive call, but if one were to divide the observations into bullish and bearish data sets I think I'd feel comfortable putting screens on a "watch list" if they were substantially different than historical norms.
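One hedged way to make the "30 observations" idea concrete is a one-sample t-statistic that compares the live average return with the backtest average. This is a standard textbook test swapped in for illustration, not a method anyone in this thread specified. The function name is hypothetical, and the observations could be either 30 monthly returns or 30 trade returns, depending on which counting convention you adopt.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(live_returns, backtest_mean):
    """One-sample t-statistic of live returns against the backtest mean.

    Uses the sample standard deviation (n - 1 in the denominator).
    Raises ZeroDivisionError if all live returns are identical.
    """
    n = len(live_returns)
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = stdev(live_returns) / sqrt(n)
    return (mean(live_returns) - backtest_mean) / standard_error
```

With around 30 observations, an absolute value of roughly 2 or more is the conventional 5% two-sided threshold, so a strongly negative result would be a reason to put a screen on the "watch list". The test assumes roughly independent, roughly normal observations, which monthly screen returns may not satisfy.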
No. of Recommendations: 5
There are some examples that defy this logic. For example, the RS26 screen has continued to outperform the index for almost forty years.

Elan


There is some logic to this. First, the outperformance of RS26 based on Sharpe is not as great as its performance based on CAGR. Thus, some of the outperformance is achieved by the assumption of more risk than the index.

Second, consider the case of merger arbitrage. Hedge funds take advantage of the difference between the stock price of the acquiring firm and that of the acquired company. In the process they bring the two prices into close equilibrium. But they cannot bring the two prices into complete equilibrium because they face costs, such as transaction costs, the cost of money (interest), and the risk of the merger breaking up. The hedge funds with lower costs are the ones that can profit, because they can take advantage of smaller discrepancies, and thus they will push the other hedge funds out of merger arbitrage. But consider the case of a private individual. Can they profit by merger arbitrage? The answer is a limited yes, if they are buying the stock anyway. If they are buying the stock, then the transaction cost is going to be paid anyway, so it can be ignored in the decision. Thus the private investor probably has a lower cost and can profit from the price discrepancy by choosing the proper stock (acquiring or acquired) to buy.

How about hedge funds exploiting RS26? What are their costs in trying to do so? There are several, but I will discuss the main one. Hedge funds have to deploy a lot of money in the market in order to make enough money to pay their operating costs and still have enough profits to satisfy their investors. This may mean they either have a lot of money invested or that they are heavily leveraged. Most hedge funds are leveraged to a significant extent. If we look at RS26, we would probably need to expect a drawdown on the order of 40% at some point. Leverage on the order of a bit more than two to one would mean that at some point the hedge fund could expect to go broke. In practice, hedge funds only employ strategies that meet a minimum Sharpe ratio or similar risk measure. Banks and brokerages use Value at Risk (VaR) to limit their risk, so if they are investors in the hedge fund they will only deploy capital if it satisfies their overall VaR. Most private investors are not investing enough money to spread their risk over a large portion of the market in order to raise their Sharpe ratio. Private investors are also not leveraged very much, if at all, and so they can tolerate higher levels of risk. Thus, private investors have a cost structure that allows them to employ strategies with a lower Sharpe ratio than would be acceptable to a hedge fund. There was an article in the Wall Street Journal several years ago that admitted that momentum strategies "work" but implied that they were too volatile for Wall Street.
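The leverage arithmetic behind the "bit more than two to one" figure can be sketched in a few lines of Python. This is a simplified model that assumes the loss on equity is just leverage times the portfolio drawdown, and it ignores financing costs and margin calls, which in practice would force a fund out earlier. The function names are hypothetical.

```python
def equity_loss(leverage, drawdown):
    """Fraction of investor equity lost when a portfolio run at the given
    leverage suffers the given drawdown (capped at a total loss of 1.0)."""
    return min(1.0, leverage * drawdown)

def wipeout_leverage(drawdown):
    """Leverage at which the given drawdown consumes all equity."""
    if drawdown <= 0:
        raise ValueError("drawdown must be positive")
    return 1.0 / drawdown
```

Under these assumptions a 40% drawdown consumes all equity at 1 / 0.40 = 2.5 to one, and at two to one it already costs about 80% of equity, while an unleveraged private investor loses the 40% and survives.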

- DesertHeat